78232 – System freezes at random intervals

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	7.3
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Arjan van de Ven
QA Contact:	Brian Brock

Reported:	2002-11-20 14:19 UTC by Michael Mehlhorn
Modified:	2007-04-18 16:48 UTC (History)
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-11-25 07:40:20 UTC
Embargoed:

Description Michael Mehlhorn 2002-11-20 14:19:26 UTC

From Bugzilla Helper:
User-Agent: Mozilla/4.7 [de] (WinNT; I)

Description of problem:
Dell PowerEdge with 4 Xeon 1.4GHz CPUs and 4GB RAM.
The system hangs: it does not respond to ping or telnet, console switching does not work, and the keyboard and monitor freeze.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
 Wait 4 or 5 hours without any heavy load on the system

	

Actual Results:  The system hangs

Expected Results:  The system should keep running.

Additional info:

With kernel-smp-2.4.18-3 and kernel-smp-2.4.18-4 we have no problems. The problem occurs with kernel-bigmem-2.4.18-17.7.x
and kernel-smp-2.4.18-17.7.x.
I found lines like this in dmesg:
-------------------
megaraid: channel[2] is raid.
scsi3 : LSI Logic MegaRAID 1.72 254 commands 15 targs 5 chans 7 luns
blk: queue f71ee218, I/O limit 4095Mb (mask 0xffffffff)
scsi3: scanning virtual channel 0 for logical drives.
  Vendor: MegaRAID  Model: LD 0 RAID1  138G  Rev: 1.72
  Type:   Direct-Access                      ANSI SCSI revision: 02
blk: queue f6fc8e18, I/O limit 4095Mb (mask 0xffffffff)
scsi3: scanning virtual channel 1 for logical drives.
scsi3: scanning virtual channel 2 for logical drives.

-------------------
Are the "blk: ..." lines normal?

Comment 1 Arjan van de Ven 2002-11-20 14:22:44 UTC

Yes, that's normal.
Do you have a tg3 network card (e.g. Broadcom 57xx)?
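
For context on why the blk: lines are harmless, here is a small sketch. It assumes the reported I/O limit is simply the address mask expressed in whole megabytes, which is an interpretation and not something stated in this report. Under that assumption a 32-bit mask gives the 4095Mb value seen in the dmesg output. The function name `limit_mb` is made up for illustration.

```python
def limit_mb(mask):
    """Convert an address mask to whole megabytes (1 MB = 2**20 bytes)."""
    return mask >> 20


# Mask value taken from the dmesg lines in the description.
print(limit_mb(0xFFFFFFFF))
```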

Comment 2 Michael Mehlhorn 2002-11-21 10:53:00 UTC

Maybe a Broadcom BCM 95701 Gigabit (I am not able to look at the system at the moment).

Comment 3 Michael Mehlhorn 2002-11-21 15:08:01 UTC

It is a Broadcom BCM5700.

Comment 4 Arjan van de Ven 2002-11-21 15:10:28 UTC

Can you, for now, try changing "tg3" to "bcm5700" in /etc/modules.conf?
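
A minimal sketch of that edit, assuming the driver is bound through a line of the form `alias eth0 tg3` (the interface name `eth0` and the sample file contents are assumptions, not taken from this report). The helper `switch_driver` is hypothetical and works on a string, so it can be tried without touching the real /etc/modules.conf.

```python
import re


def switch_driver(conf_text, old="tg3", new="bcm5700"):
    """Return conf_text with every `alias <name> <old>` line pointing at <new>."""
    pattern = re.compile(
        r"^([ \t]*alias[ \t]+\S+[ \t]+)" + re.escape(old) + r"([ \t]*)$",
        re.MULTILINE,
    )
    # Keep the original prefix and trailing whitespace, swap only the module name.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), conf_text)


# Hypothetical modules.conf contents for illustration only.
sample = "alias eth0 tg3\nalias scsi_hostadapter megaraid\n"
print(switch_driver(sample), end="")
```

Lines that do not reference the old module are left unchanged. The new driver is only used once the module is reloaded or the system is rebooted.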

Comment 5 Michael Mehlhorn 2002-11-21 15:57:21 UTC

Ethernet controller: BROADCOM Corporation NetXtreme BCM5700 Gigabit Ethernet (rev 14)
Subsystem: Dell Computer Corporation NetXtreme 1000BaseTX

Comment 6 Michael Mehlhorn 2002-11-22 07:15:21 UTC

I changed "tg3" to "bcm5700" in /etc/modules.conf and the system has now been running for 15 hours. If the system doesn't hang before Monday, I think that
was the solution.
BTW: with 4GB and 4 Xeons, is it better to use the bigmem or the smp kernel?
We use Informix Dynamic Server Version 9.30.UC2 on that system. Is it advisable to enable Hyperthreading?

Comment 7 Arjan van de Ven 2002-11-22 09:53:48 UTC

bigmem vs smp:
4GB is the exact border case, so the answer is "depends". That needs explaining:
The PCI bus needs a "window" in the address range to operate, and that window
needs to be below 4GB (32 bit). On PCs with less than 4GB of RAM that is obviously
no problem since there is plenty of unused space. If you have 4GB or more, there
wouldn't be any unused space, so the chipset forces a hole.
Some chipsets will "move" the memory that was in the hole to above the 4GB
range, others just make the RAM disappear. We're normally talking about something
on the order of 300MB of RAM. The SMP kernel (by design) will not use RAM over 4GB,
while the bigmem kernel can, at the cost of overall performance (supporting this
is more expensive in several places, since 64 bit pointers have to be used for
some things). Our installer will automatically select the bigmem kernel if it
finds RAM > 4GB, but I can also check by hand if I see roughly the first 20 lines
of dmesg (i.e. the so-called "e820" table). Is the potential extra 300MB worth
it? That depends a bit on your workload, but usually yes (because it serves as a
cache for the disks, and disks are REALLY slow compared to RAM).
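
The check by hand can be sketched roughly as follows. This assumes the e820 lines in dmesg have the form `BIOS-e820: <start> - <end> (usable)` with hexadecimal addresses. The sample table is invented for illustration and is not from the reporter's machine, and `usable_ram` is a hypothetical helper.

```python
import re

FOUR_GB = 1 << 32
E820_LINE = re.compile(
    r"BIOS-e820:\s*([0-9a-fA-F]+)\s*-\s*([0-9a-fA-F]+)\s*\((\w+)\)"
)


def usable_ram(dmesg_text):
    """Return (bytes below 4GB, bytes above 4GB) of usable RAM in an e820 table."""
    below = above = 0
    for start_hex, end_hex, kind in E820_LINE.findall(dmesg_text):
        if kind != "usable":
            continue
        start, end = int(start_hex, 16), int(end_hex, 16)
        # Split each range at the 4GB boundary.
        below += max(0, min(end, FOUR_GB) - start)
        above += max(0, end - max(start, FOUR_GB))
    return below, above


# Invented example: the chipset has moved some RAM above the 4GB boundary.
sample = """\
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 0000000000100000 - 00000000f0000000 (usable)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000110000000 (usable)
"""
below, above = usable_ram(sample)
print(f"usable below 4GB: {below >> 20} MB, above 4GB: {above >> 20} MB")
```

A non-zero amount above 4GB is RAM that only the bigmem kernel could use. If nothing shows up above 4GB, the chipset has presumably dropped the memory behind the hole and the smp kernel loses nothing.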

As for hyperthreading: for some workloads hyperthreading is a win, for others it
is a loss. In my experience, for databases it tends to be a slight loss, but I
have no experience with Informix specifically.

Comment 8 Michael Mehlhorn 2002-11-25 07:40:13 UTC

[root@rinnen_lin root]# uptime
7:53am  up 3 days, 14:37,  1 user,  load average: 0.04, 0.01, 0.00

I think changing the driver was the solution.
Thanks a lot!

Michael Mehlhorn

PS: BTW, where can I mark this request as "resolved"?
