Red Hat Bugzilla – Bug 128109
Dell PowerEdge 400SC grinds to a halt during startup.
Last modified: 2007-11-30 17:07:03 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET
CLR 1.0.3705; .NET CLR 1.1.4322)
Description of problem:
System begins booting normally. Then, when individual services are
starting (named, ntpd, etc.) the system will suddenly run extremely
slowly with individual services taking a few minutes each to start.
After 10-15 minutes when the system has finally started it is
impossible to log in - login eventually times out before the
Password: prompt. The only way to recover is to power off the machine
resulting in possible file system corruption - Ctrl-Alt-Del has no
The system has a single P4 2.4 hyperthreading processor. With HT
enabled and running an SMP kernel, this slow startup happens every
time. With the standard kernel I *think* it only happens after
a "shutdown -r". If I "shutdown -h" so that the power is switched
off, it starts normally.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Enable hyperthreading in BIOS
2. Boot with SMP kernel, or "warm boot" with standard kernel.
3. Problem occurs when services are starting
Actual Results: Problem as described above
Expected Results: System should start normally
When the system is in this very slow state it responds normally to
I can't see anything obvious in /var/log/messages during one of these
very slow startups other than the amount of time it takes for
services to log their startup messages.
Had to reboot the box today, so did a shutdown -h to avoid above
issue. Left it a couple of minutes then powered on. Startup
progressed normally until services were starting. It then went slow,
as described above. Powered off, left a few minutes, powered on
again. Same problem. Went through this procedure a few times. On the
5th bootup it started normally. After about 5 minutes I noticed that
it suddenly started behaving slowly again because I couldn't ssh in
from another machine. My existing ssh session was still OK, so I did
a shutdown -h. The machine took 20 minutes to shut down - each service
was taking 1-2 minutes to stop. After next bootup it was OK and is
still OK after 2 hours.
I don't think that this is a Dell issue - I am running Enterprise 3
on an old PII 400 MHz machine, too, and this same problem has
happened once on shutdown.
No improvements with the kernel from Update 3 which I installed
earlier. Cannot reboot the box at all now - have tried 5 times so far
but it grinds to a halt during startup as described above :(
I downloaded and burned a KNOPPIX 3.4 CD earlier today - it boots
I'm sure I don't know what to ask either...
Jeff Burke -- do we have this type of Dell machine in-house?
I've now discovered that if I disconnect the network cable during boot up then
it starts normally, every time. Only with the network cable connected does the
slowness occur. The network interface is detected as
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
02:0c.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet
Controller (rev 02)
It's plugged into a cheap Belkin 8-port 10/100 switch.
Maybe John has some ideas?
If you boot-up w/ the network unplugged, then plug-in the network and ifup the
interface, do you still get the problem?
Have you tried other network cards with this box, plugged-in to the same
network port? And/or other boxes plugged-in to the same port? Do they behave
If you boot-up w/ the card plugged-in (so you get the "slowness"), then unplug
the card, does the slowness disappear? If the slowness disappears after
unplugging (or if you can survive the slowness long enough), please post the
contents of /proc/interrupts.
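
If the box stays usable long enough, two snapshots of /proc/interrupts taken a few seconds apart show which IRQ line is actually busy, which a single dump does not. The following is only a rough sketch under stated assumptions: the function name irq_delta and the /tmp file names are made up for illustration, and it assumes the usual layout of one row per IRQ with per-CPU counters followed by a description.

```sh
#!/bin/sh
# irq_delta (hypothetical helper): compare two saved copies of
# /proc/interrupts and print "<delta> <irq>: <description>" for every
# IRQ whose summed per-CPU count grew, busiest first.
irq_delta() {
    awk '
        $1 !~ /:$/ { next }                 # skip the CPU header row
        {
            # add up the per-CPU counters that follow the IRQ label
            sum = 0; n = 2
            while (n <= NF && $n ~ /^[0-9]+$/) { sum += $n; n++ }
        }
        FNR == NR { before[$1] = sum; next }   # first file: remember counts
        {
            d = sum - before[$1]
            if (d > 0) {
                desc = ""
                for (; n <= NF; n++) desc = desc " " $n
                print d, $1 desc
            }
        }
    ' "$1" "$2" | sort -rn
}

# Example use (run by hand while the machine is slow):
#   cat /proc/interrupts > /tmp/irq.before
#   sleep 10
#   cat /proc/interrupts > /tmp/irq.after
#   irq_delta /tmp/irq.before /tmp/irq.after
```

A row for the e1000 interface that climbs by a very large amount in ten seconds would be consistent with the interrupt-storm guess, though it would not prove it.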
I would tend to suspect that there is a problem w/ the card, perhaps resulting
in an inordinately large number of interrupts being processed? Just a guess,
In comment #2 it was said: "I don't think that this is a Dell issue - I am
running Enterprise 3 on an old PII 400 MHz machine, too, and this same problem
has happened once on shutdown." Is this still true?
Are both of these systems plugged into the same "cheap Belkin 8-port 10/100
switch"? If so, can you move them to a different switch? Also, by the sounds of
it the Belkin does not have a management interface. If it does, could you get
the port statistics for the systems that are having the issues?
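
Where the switch offers no port statistics, the host keeps its own per-interface counters in /proc/net/dev, which can stand in as a rough substitute. The sketch below is illustrative only: nic_errors is a made-up helper name, the optional second argument exists just so a saved copy of the file can be read, and the column positions assume the standard /proc/net/dev layout (8 receive fields followed by 8 transmit fields).

```sh
#!/bin/sh
# nic_errors (hypothetical helper): print the error-related counters for
# one interface from /proc/net/dev.
#   $1 = interface name, e.g. eth1
#   $2 = file to read (optional, defaults to /proc/net/dev)
nic_errors() {
    awk -v ifname="$1" '
        # older kernels may print "eth1:12345" with no space after the
        # colon, so turn the colon into a separator before comparing
        { sub(/:/, " ") }
        $1 == ifname {
            printf "rx_errs=%s rx_drop=%s rx_frame=%s tx_errs=%s tx_drop=%s colls=%s carrier=%s\n",
                   $4, $5, $7, $12, $13, $15, $16
        }
    ' "${2:-/proc/net/dev}"
}

# Example use:
#   nic_errors eth1
```

Counters that keep rising between two runs might point at cabling or link negotiation trouble between the card and the switch; steady zeros would make that less likely.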
I would also like to confirm that this issue happens regardless of whether you
do a shutdown -h or a shutdown -r, correct?
Could you also check whether your system is running DKMS?
/sbin/chkconfig --list | grep dkms
If it is, could you send the status of the DKMS application?
It has happened on the PII 400 machine just once more, when I rebooted after
installing Update 5. This machine is connected to a Netgear 10/100 hub on a
different LAN segment. However, it sorted itself out and, after about 45
minutes, was accepting SSH sessions again.
Back to the Dell machine - the Belkin is unmanaged.
It happens less frequently with a shutdown -h (including power-off). It also
seems to happen more when the environment is hot. Sounds odd, I know, but it
has happened less in winter when the room is around 20C than summer when it's
up to 30C.
I have a second card in the Dell - a cheap RealTek. This one doesn't appear to
cause any problems but it is plugged into a separate hub.
I'll try a different switch and report back.
Reassigning this to John Linville and reverting state to NEEDINFO.
I'm going to close this based on inactivity. Please reopen if the problem
remains and include the information discussed in comment 9 through comment 11.