From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050323 Firefox/1.0.2 Fedora/1.0.2-1.3.1

Description of problem:
When the file system is put under load, the system load climbs continuously and does not recover. Sometimes the only way to recover the system is to power cycle it.

RHEL 4, kernel-smp-2.6.9-5.0.5.EL and kernel-smp-2.6.9-5.0.3.EL
Dual Xeon system with 1 GB of RAM
QLA2100 card connected to a fiber channel chassis
Software RAID5 across 12 members with 2 spares, mounted as /home

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-5.0.5.EL, kernel-smp-2.6.9-5.0.3.EL

How reproducible:
Always

Steps to Reproduce:
1. Run bonnie or postmark on the filesystem that is on the QLA2100 card and watch as the system careens out of control. Loads >150 are not unusual if the system stays responsive.

Setup/hardware:
RHEL 4, kernel-smp-2.6.9-5.0.5.EL or kernel-smp-2.6.9-5.0.3.EL
Dual Xeon system with 1 GB of RAM
QLA2100 card connected to a fiber channel chassis, 14 drives total
Software RAID5 across 12 members with 2 spares, ext3 fs, mounted as /home

Actual Results:
System load increases dramatically; the system becomes almost useless or locks up.

Expected Results:
bonnie or postmark runs to completion and reports its results.

Additional info:
This same setup is running under RH 8.0 without any problems.
This bug also appears to have shown up on the LKML: http://lkml.org/lkml/2004/9/15/351
Upon further inspection (tearing down the box), the HBA in question is actually a QLA2000. The installer detects the card as a 2100 and uses the 2100 driver. This has now been tested on 2 other installs, with 2 other cards: QLA2000s are detected as QLA2100s. Putting ANY stress on a software RAID 5 file system that sits on the QLA2000's fiber channel arrays causes the machine's load to increase dramatically. Left unattended, the machine eventually locks hard.
We have updated RHEL 4 U2 to the latest driver from QLogic (kernel version 2.6.9-11.35, or higher). This will be available in beta test soon. Will you be able to do a test to confirm that this problem is fixed during the U2 beta? Or would you be able to try an updated driver that I supply (let me know your kernel version and type)? Thanks.
Have you been able to do a test with RHEL 4 U2?
I am no longer working in an environment where I can test this. I have forwarded the bug to the people working in that environment. I hope that they will be able to test and give you an answer.
The QLogic driver has been updated several times since this report. Please re-test with U3 and re-open if the problem persists.