[TAG] Tracking load average issues
Neil Youngman
ny at youngman.org.uk
Fri Jul 20 13:53:07 MSD 2007
On or around Thursday 19 July 2007 23:01, Thomas Adam reorganised a bunch of
electrons to form the message:
> On Thu, Jul 19, 2007 at 10:57:07PM +0100, Jim Jackson wrote:
> > It sort of depends on _how_ the process is "waiting" for I/O. Done the
> > sensible way, the process should be sleeping until I/O arrives, i.e.
> > doing a blocking read or using select or similar. However, a bad design
> > that spins on a non-blocking read could possibly account for it.
>
> Maybe, but that's slightly going in the wrong direction. I suspect if Neil
> can confirm if this is a persistent issue or not, that it's going to be
> hardware-related, and not software I/O (the mark of a suspect program, for
> instance.) In which case, going with the noapic suggestion both in the
> kernel and in the BIOS (as I suggested) is still something worth trying.
It's intermittent rather than persistent. The affected servers will run with a
load average around 0.3 normally for weeks, then the load average will ramp
up and settle at a relatively high level for an hour or so before returning
to normal.
There is no obvious degradation of service, so we're not panicking about it,
but we are wondering if the systems are trying to tell us something.
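One way to find out what the systems are trying to say would be to log which
processes are runnable or in uninterruptible sleep while the load is high,
since on Linux both states count towards the load average. A minimal sketch in
Python, assuming a Linux /proc filesystem; the function names, the threshold
of 1.0 and the 60 second interval are illustrative choices, not anything we
run today:

```python
import os
import time


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_stat(text):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def busy_processes(proc="/proc"):
    """List processes that are runnable (R) or in uninterruptible sleep (D)."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were looking
        if state in ("R", "D"):
            found.append((pid, comm, state))
    return found


def watch(threshold=1.0, interval=60):
    """Print a snapshot whenever the 1 minute load exceeds the threshold."""
    # threshold and interval are hypothetical defaults; tune them per server.
    while True:
        with open("/proc/loadavg") as f:
            one, five, fifteen = parse_loadavg(f.read())
        if one > threshold:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, one, five, fifteen, busy_processes())
        time.sleep(interval)
```

Leaving watch() running with its output redirected to a file should give us
something to look at after the next episode. Mostly D states would point
towards I/O or a driver, mostly R states towards genuine CPU contention.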
These aren't systems on which I can easily mess with kernel and BIOS options,
but I'll look into whether we can do something with the noapic option.
Neil