kernel problems


Subject: kernel problems
From: Matthew Schumacher (schu@schu.net)
Date: Sun Jul 14 2002 - 22:45:38 AKDT


Hello all,

Last week a mail server I manage suddenly failed, so I tried to get a
shell on the server and found that I could log in but bash would hang. I
thought this was interesting, so I passed a 'ps -ef' through ssh to the
machine to see if that would work. To my surprise it did. I got back a
list of 248 processes. After further investigation it seems that my box
hit a process limit and simply would not start a bash shell (but it did
start a ps, which is odd). Anyway, I started passing kill commands through
ssh to try to shut it down, but kill wouldn't work, kill -9 wouldn't
work, killall wouldn't work, shutdown wouldn't work; nothing I did would
kill a process. After fighting with it for about an hour I drove to the
co-lo room and hit the switch.

Does anyone know what might cause this? Why would the kernel simply
refuse to kill anything? Btw, the box normally has 110-130 processes
running, so something had to happen to cause it to hit 248. The extra
processes looked to be stale sendmail/qpopper/imap processes.
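
For anyone who wants to tally a 'ps -ef' listing the same way, here is a
minimal sketch in Python. It assumes the usual eight-column 'ps -ef'
layout (UID PID PPID C STIME TTY TIME CMD); the function name
count_by_command and the sample listing are made up for illustration and
are not output from the actual machine.

```python
import os
from collections import Counter


def count_by_command(ps_output):
    """Count processes per command name in 'ps -ef' style text.

    Assumes the first line is the header and that CMD is the 8th column.
    """
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 7)  # CMD may itself contain spaces
        if len(fields) < 8:
            continue  # blank or malformed line
        first_word = fields[7].split()[0]
        # Drop any leading path and the trailing ':' that daemons such as
        # sendmail put in their process title.
        name = os.path.basename(first_word).rstrip(":")
        counts[name] += 1
    return counts


# Hypothetical sample listing, invented for this example.
SAMPLE = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Jul07 ?        00:00:04 init
root       612     1  0 Jul07 ?        00:00:00 sendmail: accepting connections
root      4021   612  0 Jul14 ?        00:00:00 sendmail: server relay
root       701     1  0 Jul07 ?        00:00:00 /usr/sbin/sshd
pop       4100     1  0 Jul14 ?        00:00:00 /usr/sbin/in.qpopper -s
"""

if __name__ == "__main__":
    for name, n in count_by_command(SAMPLE).most_common():
        print(n, name)
```

The text captured from a 'ps -ef' passed through ssh could be handed to
count_by_command in place of SAMPLE to see which commands account for
the extra processes.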

The machine is running Red Hat 7.3 with the Red Hat 2.4.18-4 kernel. I
tried using a generic kernel I compiled, but my SCSI performance dropped
by half. After some conversation with Alan Cox, he says that Red Hat
applies some hi-mem/SCSI patches to their kernels which fix some
performance problems with the generic kernel. I also had a lot of
trouble using quotas under heavy load with the generic kernel, so alas I
am running a Red Hat kernel.

Anyway, I really can't deal with software failure on this machine.
Hopefully someone will have a suggestion on how to troubleshoot this.
If not, maybe I'll start testing FreeBSD....

Later,

schu




This archive was generated by hypermail 2a23 : Sun Jul 14 2002 - 22:50:24 AKDT