Bug 172793

Summary: Bad I/O throughput when stressing Dell PERC 4/im LSI Fusion
Product: Red Hat Enterprise Linux 3 Reporter: Ingvar Hagelund <ingvar>
Component: kernel Assignee: Tom Coughlan <coughlan>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0 CC: dledford, johan.lithander, petrides, trond.nordheim
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2007-10-19 18:51:21 UTC Type: ---

Description Ingvar Hagelund 2005-11-09 20:41:28 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; nb-NO; rv:1.7.12) Gecko/20050920 Firefox/1.0.7

Description of problem:
When stressing the RAID, throughput is good for a short period (some 10 minutes) and then drops dramatically. All I/O takes a very long time, so most processes using the disk go into iowait and the load rises. The system becomes totally unusable: basic file operations take several minutes, and logins time out.

We reproduced this by extracting a large tarball (24GB compressed, 86GB uncompressed, 14.5 million files and directories) onto an ext3 filesystem on a 90GB LVM volume. The underlying disk is a hardware RAID1 mirror, which is a shared resource: all local volumes are on that mirror, as it is the only disk resource on the blade.

We can reproduce this both with the driver in the U6 kernel 2.4.21-37.EL and with Dell's newer DKMS driver, mptlinux-2.05.16-1dkms. With the latest Dell driver the situation is slightly less terrible, but the system is still unusable.

The hardware is a Dell 1855 blade server with 6GB RAM. The Dell PERC 4/im is actually an LSI Fusion-MPT in disguise:

# lspci | grep LSI
04:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Add a very large compressed tarball consisting of some 760,000 files, spread out over the bottom of an 18-level-deep file tree (reverse hashtree directory structure), about 14.5 million files and directories in total.

2. Start untarring the file. Watch for some 10 minutes.

3. See throughput for the rest of the system collapse.
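
For anyone without access to the original tarball, a comparable file tree could be generated with a short script. The following is a minimal sketch, not the reporter's tooling: the function names `hashed_path` and `populate`, the file naming, and the 512-byte payload are all hypothetical, and "reverse hashtree" is interpreted here as "one directory level per hex digit of a hash of the file name, taken in reverse order", which the report does not actually specify. It assumes a modern Python 3 rather than the interpreter shipped with RHEL 3.

```python
import hashlib
import os


def hashed_path(root, name, depth=18):
    """Return a path `depth` directories below `root` for file `name`.

    Assumption: each directory level is one hex digit of a SHA-256 hash of
    the name, with the digits used in reverse order. This is only one
    plausible reading of "reverse hashtree"; the report gives no definition.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()  # 64 hex digits
    levels = list(reversed(digest))[:depth]
    return os.path.join(root, *levels, name)


def populate(root, n_files, depth=18, payload=b"x" * 512):
    """Create `n_files` small files, each at the bottom of a deep tree.

    Returns the list of created file paths.
    """
    created = []
    for i in range(n_files):
        path = hashed_path(root, "file%07d.dat" % i, depth)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(payload)
        created.append(path)
    return created
```

Calling `populate("/some/test/volume", 760000)` (a placeholder path) would approximate the file count in step 1. Because most of the lower directory levels end up holding a single file, the number of directories greatly exceeds the number of files, which is roughly consistent with the reported ratio of 14.5 million entries to 760,000 files.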

Actual Results:  All processes using the disk, except the untarring process, go into I/O wait. Terrible disk throughput. High system load (naturally). System unusable.

Expected Results:  Good throughput. Only short iowait periods.
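
To put numbers on "short iowait periods" versus the observed behaviour, the share of CPU time spent in iowait could be sampled over an interval. The sketch below is an illustration only: it assumes the aggregate `cpu` line in /proc/stat carries iowait as its fifth counter (as on 2.6 kernels; whether the RHEL 3 2.4.21 kernel exposes the field in the same position is not confirmed by this report), and the helper names `parse_cpu_line` and `iowait_fraction` are hypothetical. It also assumes a modern Python 3.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints.

    Assumed field order: user, nice, system, idle, iowait, ...
    """
    fields = line.split()
    if len(fields) < 6 or fields[0] != "cpu":
        raise ValueError("not an aggregate cpu line with an iowait field")
    return [int(v) for v in fields[1:]]


def iowait_fraction(before, after):
    """Return the fraction of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return deltas[4] / total  # index 4 is the assumed iowait counter
```

A caller would read the first line of /proc/stat twice, some seconds apart, pass each through `parse_cpu_line`, and hand the two results to `iowait_fraction`. A value that stays close to 1.0 for minutes at a time would correspond to the stalled state described under Actual Results.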

Additional info:

# uname -a
Linux some.host 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux

# lsmod | grep mpt
mptscsih               43792   3
mptbase                50472   3  [mptscsih]
diskdumplib             6548   0  [mptscsih mptbase]
scsi_mod              130124   4  [usb-storage sg mptscsih sd_mod]

# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: DELL     Model: VIRTUAL DISK  IM Rev: 1998
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: SDR      Model: GEM318P          Rev: 1
  Type:   Processor                        ANSI SCSI revision: 02

Excerpt from lspci -vvv:
04:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)
        Subsystem: Dell: Unknown device 018a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 72 (4250ns min, 4500ns max), cache line size 10
        Interrupt: pin A routed to IRQ 42
        Region 0: I/O ports at ec00 [size=256]
        Region 1: Memory at dfdf0000 (64-bit, non-prefetchable) [size=64K]
        Region 3: Memory at dfde0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at dfe00000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [68] PCI-X non-bridge device.
                Command: DPERE- ERO- RBC=0 OST=4
                Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-

We have also tried a similar blade running RHEL 4 with the latest kernel, 2.6.9-22.0.1.EL. The problem appears on this blade too, but takes much longer to trigger (some 70-80 minutes). Once the problem appears, throughput is very bad, but slightly less terrible than on the RHEL 3 installation: it is possible to log in to the system and do small, basic file operations like ls and cp without waiting several minutes for a response.

The setup was tested in our customer's lab using CentOS 4, and it worked flawlessly on simple SATA disks with the CentOS 4 2.6.9-11.ELsmp kernel.

Comment 1 RHEL Product and Program Management 2007-10-19 18:51:21 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet that criteria, it is now being closed.
For more information of the RHEL errata support policy, please visit:
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.