Red Hat Bugzilla – Bug 10203
Kernel panic on SMP HP Netserver
Last modified: 2008-05-01 11:37:54 EDT
On a Hewlett Packard NetServer LC3 that has been running RH Linux 6.0 since
Sep. 1999, I tried to upgrade to SMP by installing a second CPU (Intel PIII
550). At installation time the machine had already been recognised as an SMP
machine (the kernel was already SMP).
A few days later the server started to crash every 5/10 minutes.
I decided to upgrade the system BIOS and to re-install RH Linux with 2 CPUs.
After 2 days of running it started to crash every few hours.
Hewlett Packard support told me that the hardware is OK (and since this
configuration is tested by Red Hat and HP, they don't know what's wrong).
At the same time, I have installed some old ALR dual-processor Pentium 166
machines with the same problem: when running as single-CPU machines everything
is OK, but when the kernel runs in SMP mode they start to crash.
I also tried upgrading the kernel to 2.2.13, but with the same results.
I think the problem could be in some driver, but the NetServer is an "all
from HP" machine (RAM, CPUs, RAID controller, HDs, NIC), and the old ALRs
have a very different configuration.
Since I have 4 customers with SMP machines running with a single cpu, I
need a prompt reply please!
I have about the same kind of problem: I'm trying to add a second
processor to a Linux box on a dual PIII/Xeon-550MHz Intel C440GX+ board, and I
get a bunch of problems. The machine runs perfectly for about 24 hours (and
it's incredible how fast it compiles :-)), then freezes ?-(.
I'm usually not able to get any indication as to why it crashed (as it seems to
like crashing in the middle of the nightly builds :-)), but occasionally it
crashes during the day, and then I get the following behaviour:
As long as you are not accessing an NFS-mounted file system (for example when
logging in as root from the system console), everything works perfectly, but as
soon as you try to access one, you're dead :-(
As long as it is working I get occasional complaints like these:
svc: unknown program 100227 (me 100003)
svc: unknown version (3)
Note however that I also get these messages in single processor mode, so I'm
not sure they are related to the problem.
When frozen, you see the following message on the system console from time to
time:
nfs: task 37637 can't get a request slot
where the task number may change from message to message (I've seen at least
37638 and 37639).
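For anyone who wants to check how often this shows up in saved logs, here is a minimal sketch in Python that pulls the task numbers out of log lines. The function name `count_slot_stalls` and the sample lines are made up for illustration, and the pattern is only assumed from the message quoted above; it is not something the kernel or Red Hat provides.

```python
import re
from collections import Counter

# Pattern assumed from the message quoted in this report:
#   nfs: task 37637 can't get a request slot
SLOT_RE = re.compile(r"nfs: task (\d+) can't get a request slot")


def count_slot_stalls(lines):
    """Return a Counter mapping task number -> number of times it was reported.

    Hypothetical helper: 'lines' is any iterable of log lines (strings).
    """
    counts = Counter()
    for line in lines:
        match = SLOT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample only; in practice, pass the lines of a saved log file.
    sample = [
        "kernel: nfs: task 37637 can't get a request slot",
        "kernel: svc: unknown version (3)",
        "kernel: nfs: task 37638 can't get a request slot",
    ]
    for task, n in sorted(count_slot_stalls(sample).items()):
        print(f"task {task}: {n} message(s)")
```

Lines that do not match (such as the 'svc:' complaints) are simply ignored, so the same helper can be run over a whole log without filtering it first.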
At this point the CPU is idle (top reports 1 running process and 99.8% idle
CPU, with about 60 MB free memory out of a total of 1 GB and no swap at all;
swap is not even configured).
Note that all these messages are related to NFS accesses to filesystems
exported from a Solaris-2.6/PC system (running on a dual PII-450 SMP platform).
I was using kernel 2.2.12-5 from RedHat-6.0, then 2.2.12-32 from RedHat-6.1 in
uniprocessor mode, then switched to 2.2.12-32smp and now kernel 2.2.14-8smp (as
provided by Ed Schlunder on
http://www.ajusd.org/~edward/silkhat-6.1/i386/kernel-smp-2.2.14-8.i686.rpm)
on my RedHat-6.1 install. I get 'svc:' messages on all configs and crashes on
all SMP kernels.
Is there any workaround other than unplugging the second PIII-550? Even if it
were aesthetic, I don't think my boss would appreciate me displaying a $1K
processor on the wall over my desk :-(
A significant amount of SMP work was done for 2.2.16. Has the 2.2.16 errata
kernel helped?
I installed it just today (taking advantage of the fact that the whole team is
now on holiday) to experiment with kernel-2.2.16-3smp, and I'll keep you
informed of the result. However, I am also leaving for about two weeks, so don't
expect anything new before then, unless it starts crashing faster than usual :-)
Thanks for the good job :-) I've now been running the 2.2.16 errata kernel in
SMP mode for about a month and it has never crashed!
So this seems to have cured my problem. Note that I still get the "kernel:
svc: " messages from NFS, however, so those were not related to the SMP crashes
at all.
The svc message is logged when the Solaris box tries to talk NFSv3 to us. It's
probably a bit of excess verbosity on the Linux side to log this, I agree.
Glad to hear it's happier. Reopen the bug if it turns out to be luck only.
Back to my problem of the SMP kernel crashing.
As said above, I updated to kernel-2.2.16-3smp in July and all worked fine
until about the end of September. I then got one or two "silent" crashes during
October: not fun, but still not too bad except when it crashes during a weekend.
But now I'm starting a new phase in our projects and I have HUGE make batches
running for several days, getting the source files from a Solaris-7 box and
putting all resulting files on a local SCSI disk. Since then it has crashed
about twice a day consistently, with the kernel frozen with the dreaded
nfs: task xxxxx can't get a request slot
(replace xxxxx by your favorite task ID)
Note that since July several users have been compiling in parallel, but their
current directory was also NFS-mounted from the Solaris box and we seldom
crashed; the difference now is that the current directory for the make runs is
local and only the source files are picked (using VPATH) from an NFS-mounted
tree.
So it seems that the errata kernel does not fix this problem; IIRC I got this
problem when testing the build environment I'm now using, and stopped compiling
locally at about the time I installed the errata. I'm afraid I did not enforce
the "all other things equal" paradigm strictly enough :-o