Description of problem:
An HP DL380 with an Emulex LP10000 connected to an EMC Clariion CX300 hangs under intensive I/O. Disabling HyperThreading in the BIOS solves this.

Version-Release number of selected component (if applicable):
kernel version 2.4.21-37.ELsmp, with built-in lpfc 7.3.2.

How reproducible:
Not easy to reproduce, but it does happen eventually. Two examples:
One machine runs Informix and sometimes crashes during a dbimport of a database of a few GB. When such imports are run in a loop, it usually crashes after a few hours (10-20 iterations).
Another machine has a tape library connected and NetBackup installed. When a backup of data on the FC storage is run in a loop, it crashes after a few hours. Backup of local disks works well.
I tried running various file copies from and to the FC storage and did not manage to cause a crash in a shorter time.

Steps to Reproduce:
1.
2.
3.

Actual results:
The machine freezes completely. SysRq combinations do not work.

Expected results:
The machine should continue working normally.

Additional info:
As mentioned, disabling HT prevents the hangs. I did not check whether it also happens with two real CPUs.
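For anyone trying to reproduce this, the "run it in a loop until it hangs" approach described above could be scripted roughly as follows. This is only a sketch, not the script used in this report: `run_loop`, `_log` and the log path are hypothetical names, and the command to repeat is whatever workload triggers the hang on your system. Each log entry is flushed and fsynced so that the last recorded iteration should still be on disk after a hard freeze.

```python
import os
import subprocess
import time


def _log(log, message):
    """Append a timestamped line and force it to disk."""
    log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {message}\n")
    log.flush()
    os.fsync(log.fileno())  # so the entry survives a complete freeze


def run_loop(cmd, log_path, max_runs=None):
    """Run cmd repeatedly, logging the start and end of every run.

    Stops on the first nonzero exit status, or after max_runs runs
    (None means keep going until a failure or a hang).
    Returns the number of runs that completed successfully.
    """
    completed = 0
    with open(log_path, "a") as log:
        while max_runs is None or completed < max_runs:
            _log(log, f"start run {completed + 1}")
            status = subprocess.call(cmd)
            _log(log, f"end run {completed + 1} status {status}")
            if status != 0:
                break
            completed += 1
    return completed
```

A call might look like `run_loop(["dbimport", "mydb"], "/var/tmp/loop.log")`, where the dbimport arguments are placeholders. The log file should live on a local disk rather than on the FC storage, so that writing it does not depend on the I/O path under test.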
A small update: I tried the same dbimport loop with the older driver lpfc_703, which is shipped as part of kernel-smp-2.4.21-37.EL. The machine hung after 14 hours.
The problem was solved by upgrading EMC PowerPath (a non-free product that provides multipathing) from 4.3.0 to 4.3.4.

Didi