Bug 173196 - machine hangs with lpfc intensive IO+HT
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Tom Coughlan
Brian Brock
Reported: 2005-11-14 16:21 EST by didi
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-12-15 08:37:54 EST
Description didi 2005-11-14 16:21:41 EST
Description of problem:
An HP DL380 with an Emulex LP10000 connected to an EMC Clariion CX300
hangs when doing intensive IO. Disabling HyperThreading in the BIOS solves
the problem.

Version-Release number of selected component (if applicable):
kernel version 2.4.21-37.ELsmp, with built-in lpfc 7.3.2.

How reproducible:
Not very easy to reproduce, but it does happen eventually.
Two examples:
One machine has Informix, and sometimes crashes when doing dbimport of a DB
of a few GB. When running such imports in a loop, it usually crashes after
a few hours (10-20 times).
Another machine has a tape library connected and NetBackup installed. When
running in a loop a backup of data that's on the FC storage, it crashes after
a few hours. Backup of local disks works well.
I tried to run various copies of files from/to it and did not manage to cause
a crash in a shorter time.
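
The loop approach described above could be scripted roughly as follows. This
is only a sketch, not the script actually used for this report: the function
names run_in_loop and _record, the log path, and the command passed in are
all illustrative assumptions. Because the machine freezes completely, each
log line is forced to disk with fsync, so that after a reboot the log shows
which run was in progress when the hang occurred.

```python
import os
import subprocess
import time


def _record(log, message):
    """Append a timestamped line and force it to disk."""
    log.write("%s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), message))
    log.flush()
    os.fsync(log.fileno())  # so the line survives a hard freeze


def run_in_loop(command, log_path, max_runs=None):
    """Run `command` (an argument list) repeatedly.

    Stops at the first non-zero exit status, or after `max_runs`
    successful runs if a limit is given. Returns the number of
    successful runs.
    """
    completed = 0
    with open(log_path, "a") as log:
        while max_runs is None or completed < max_runs:
            run_number = completed + 1
            _record(log, "start run %d" % run_number)
            status = subprocess.call(command)
            _record(log, "end run %d status %d" % (run_number, status))
            if status != 0:
                break
            completed += 1
    return completed
```

For example, run_in_loop(["dbimport", "mydb"], "/var/tmp/repro.log") would
repeat an import until it fails or the machine hangs; the dbimport arguments
and the log location here are hypothetical. The log file should live on a
local disk rather than on the FC storage under test.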

Steps to Reproduce:
Actual results:
The machine completely freezes. SysRq key combinations do not work.

Expected results:
The machine should continue working normally.

Additional info:
As mentioned, disabling HT prevents the hangs. I did not check if it also
happens with two real CPUs.
Comment 1 didi 2005-11-15 05:35:43 EST
A small update - I tried the same dbimport loop with the older driver
lpfc_703, which is shipped as part of kernel-smp-2.4.21-37.EL. The machine
hung after 14 hours.
Comment 2 didi 2005-12-14 23:55:14 EST
The problem was solved by upgrading EMC PowerPath (a non-free product that does
multipath) from 4.3.0 to 4.3.4.