Red Hat Bugzilla – Bug 78232
System freezes at random intervals
Last modified: 2007-04-18 12:48:35 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.7 [de] (WinNT; I)
Description of problem:
Dell PowerEdge with 4 Xeon 1.4GHz CPUs, 4GB RAM
The system hangs. I can't ping it, telnet to it, or switch consoles. Keyboard and monitor freeze.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Wait 4 or 5 hours without any heavy load on the system
Actual Results: The system hangs
Expected Results: The system should keep running.
With kernel-smp-2.4.18-3 and kernel-smp-2.4.18-4 we have no problems. The problem occurs with kernel-bigmem-2.4.18-17.7.x
and kernel-smp-2.4.18-17.7.x.
I found lines like this in dmesg:
megaraid: channel is raid.
scsi3 : LSI Logic MegaRAID 1.72 254 commands 15 targs 5 chans 7 luns
blk: queue f71ee218, I/O limit 4095Mb (mask 0xffffffff)
scsi3: scanning virtual channel 0 for logical drives.
Vendor: MegaRAID Model: LD 0 RAID1 138G Rev: 1.72
Type: Direct-Access ANSI SCSI revision: 02
blk: queue f6fc8e18, I/O limit 4095Mb (mask 0xffffffff)
scsi3: scanning virtual channel 1 for logical drives.
scsi3: scanning virtual channel 2 for logical drives.
Is this normal (blk: ...)?
Yes, that's normal.
Do you have a tg3 network card (e.g. Broadcom 57xx)?
Maybe a Broadcom BCM 95701 Gigabit (I am not able to look at the system at the moment).
It is a Broadcom BCM5700.
Can you for now try changing "tg3" to "bcm5700" in /etc/modules.conf?
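For anyone applying the same change to several machines, the edit can be sketched as a small Python helper. This is only an illustration, not a Red Hat tool: the function name `swap_driver` is made up, and the sample assumes the NIC is aliased as eth0 and that a megaraid alias is also present. Check the actual contents of /etc/modules.conf before changing anything.

```python
import re


def swap_driver(conf_text, old="tg3", new="bcm5700"):
    """Replace the module name on 'alias <name> <old>' lines; leave other lines alone."""
    pattern = re.compile(
        r"^([ \t]*alias[ \t]+\S+[ \t]+)" + re.escape(old) + r"([ \t]*)$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), conf_text)


# Hypothetical file contents, for illustration only.
sample = "alias eth0 tg3\nalias scsi_hostadapter megaraid\n"
print(swap_driver(sample), end="")
```

The new driver is only loaded after the old module is unloaded or the machine is rebooted.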
Ethernet controller: BROADCOM Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 14)
Subsystem: Dell Computer Corporation NetXtreme 1000BaseTX
I changed "tg3" to "bcm5700" in /etc/modules.conf and the system has now been running for 15 hours. If the system doesn't hang before Monday, I think that
was the solution.
BTW: with 4GB and 4 Xeons, is it better to use the bigmem or the smp kernel?
We use Informix Dynamic Server Version 9.30.UC2 on that system. Is it advisable to enable Hyperthreading?
bigmem vs smp:
4GB is the exact border case, so the answer is "it depends". That needs explaining:
The PCI bus needs a "window" in the address range to operate, and that window
needs to be below 4GB (32 bit). On PCs with less than 4GB of RAM that is obviously
no problem since there is plenty of unused address space. If you have 4GB or more,
there wouldn't be any unused space, so the chipset forces a hole.
Some chipsets will "move" the memory that was in the hole to above the 4GB
boundary, others just make the RAM disappear. We're talking about on the order of
300MB of RAM normally. The SMP kernel (by design) will not use RAM above 4GB,
while the bigmem kernel can, at the cost of overall performance (supporting this
is more expensive in several places, since 64-bit pointers have to be used for
some things). Our installer will automatically select the bigmem kernel if it
finds RAM > 4GB, but I can also check by hand if I see the first 20 or so lines
of dmesg (i.e. the so-called "e820" table). Is the potential extra 300MB worth
it? That depends a bit on your workload, but usually yes (because it serves as a
cache for the disks, and disks are REALLY slow compared to RAM).
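The check against the e820 table can be sketched in Python. This is a rough illustration only: it assumes the "BIOS-e820: start - end (type)" line format that 2.4-era kernels print at the top of dmesg, the function name `usable_ram` is made up, and the sample map below is invented rather than taken from the machine in this report.

```python
import re

FOUR_GIB = 1 << 32

# Assumed line format: " BIOS-e820: <start hex> - <end hex> (<type>)"
E820_LINE = re.compile(
    r"BIOS-e820:\s*([0-9a-fA-F]+)\s*-\s*([0-9a-fA-F]+)\s*\(([^)]*)\)"
)


def usable_ram(dmesg_text):
    """Return (bytes_below_4g, bytes_above_4g) summed over 'usable' e820 ranges."""
    below = above = 0
    for m in E820_LINE.finditer(dmesg_text):
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        if m.group(3) != "usable":
            continue
        # Split each range at the 4GB boundary.
        below += max(0, min(end, FOUR_GIB) - min(start, FOUR_GIB))
        above += max(0, max(end, FOUR_GIB) - max(start, FOUR_GIB))
    return below, above


# Invented memory map: a hole below 4GB with 256MB remapped above it.
SAMPLE = """\
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000f0000000 (usable)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000110000000 (usable)
"""

below, above = usable_ram(SAMPLE)
print("usable below 4GB: %d MB, above 4GB: %d MB" % (below >> 20, above >> 20))
```

If the second number is non-zero, the chipset has moved RAM above the 4GB boundary, and only the bigmem kernel would be able to use it.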
As for hyperthreading: for some workloads hyperthreading is a win, for others it
is a loss. In my experience it tends to be a slight loss for databases, but I
have no experience with Informix specifically.
[root@rinnen_lin root]# uptime
7:53am up 3 days, 14:37, 1 user, load average: 0.04, 0.01, 0.00
I think changing the driver was the solution.
Thanks a lot!
PS: BTW where can I mark this Request as "resolved"?