Bug 129545
Summary: | High iowait and system load while copying files on SATA raid drive
---|---
Product: | Red Hat Enterprise Linux 3
Reporter: | Martijn Kint <martijn>
Component: | kernel
Assignee: | Jeff Garzik <jgarzik>
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | medium
Version: | 3.0
CC: | daniel, jon, paul, peterm, petrides, richard.cunningham, riel, tnicks
Hardware: | i686
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2007-10-19 19:21:10 UTC
Description
Martijn Kint
2004-08-10 09:43:37 UTC
Just reconfigured the RAID drive to a RAID 1 config (giving me a storage capacity of 240 GB, ext3). In this case the load remains very low (< ~0.5) and the system stays responsive. This is with the latest kernel for ES 3.0 (2.4.21-18.ELsmp). I copied a 4 GB folder from my workstation to the Samba shared folder (/dev/sda9) on the server; as stated above, the load remained low. I'm going to reconfigure it back to a RAID 5 config and see if I get the same results as with the RAID 1 config.

iostat results:

[root@heinekenserver root]# iostat -k
Linux 2.4.21-18.ELsmp (heinekenserver.localdomain) 08/19/2004

avg-cpu:  %user   %nice    %sys %iowait   %idle
           4.67    0.03    2.50    8.67   84.14

Device:     tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
sda       37.65      139.00     2549.57    251329   4609821
sda1       0.02        0.05        0.01        97        17
sda2       4.12       19.84       23.32     35881     42164
sda3       5.85       45.15       55.66     81637    100644
sda4       0.00        0.00        0.00         2         0
sda5       0.02        0.08        0.02       141        36
sda6       0.01        0.05        0.00        84         0
sda7       3.63       27.82       48.83     50305     88284
sda8       0.11        0.12        1.11       221      2000
sda9      22.18        0.48     2420.62       865   4376676

Hi Martijn, we've noticed the same problem on a CERC RAID-1 config with RHEL 3 Update 2. We've filed a support ticket with Red Hat; it's ticket 354372. Also, at least one other person on Dell's forums is experiencing a very similar problem - see http://forums.us.dell.com/supportforums/board/message?board.id=pes_hardrive&message.id=1588ssage.id=15850

We've also noticed bug 92129 on bugzilla.redhat.com - different controller (PERC rather than CERC), but we're wondering if the excessive spinlock holds mentioned by one poster in that thread could coincide with this problem. Jeff, any thoughts?

- Paul

Created attachment 102908 [details]
vmstat during problem occurrence
This 'vmstat 1 600' shows two instances when the bug seems to manifest for us.
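As a quick sanity check on the iostat figures in the description, here is a minimal Python sketch that parses the per-device table and reports what share of the whole-disk write rate each partition accounts for. The device rows are copied from the description; the function names `parse_device_table` and `write_share` are made up for this illustration and are not part of any tool. Note that `iostat` run without an interval argument reports averages since boot, so these numbers describe the whole uptime rather than only the copy itself.

```python
# Device rows copied from the `iostat -k` output in the description.
IOSTAT_DEVICE_TABLE = """\
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 37.65 139.00 2549.57 251329 4609821
sda1 0.02 0.05 0.01 97 17
sda2 4.12 19.84 23.32 35881 42164
sda3 5.85 45.15 55.66 81637 100644
sda4 0.00 0.00 0.00 2 0
sda5 0.02 0.08 0.02 141 36
sda6 0.01 0.05 0.00 84 0
sda7 3.63 27.82 48.83 50305 88284
sda8 0.11 0.12 1.11 221 2000
sda9 22.18 0.48 2420.62 865 4376676
"""


def parse_device_table(text):
    """Return {device: {column_name: float}} from an iostat device table."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    columns = header[1:]  # skip the leading "Device:" label
    return {row[0]: dict(zip(columns, map(float, row[1:]))) for row in rows}


def write_share(stats, whole_disk="sda"):
    """Fraction of the whole-disk write rate attributed to each partition."""
    total = stats[whole_disk]["kB_wrtn/s"]
    return {
        dev: values["kB_wrtn/s"] / total
        for dev, values in stats.items()
        if dev != whole_disk
    }


if __name__ == "__main__":
    shares = write_share(parse_device_table(IOSTAT_DEVICE_TABLE))
    # Print partitions from the busiest writer to the least busy one.
    for dev, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{dev}: {share:.1%} of sda write throughput")
```

On these figures sda9, the Samba share, carries about 95% of the write rate on sda, which is consistent with the copy being the dominant write load on the array.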
Hi Paul, the link you provided seems broken :) About the RAID 1 configuration: that actually "solved" the problem for me. I was getting these insanely high iowait and system load figures with a RAID 5 config. I still haven't reverted it back to RAID 5, which I kind of need. About bug 92129: I did look into that one, but I'm not getting time-outs.

Sorry about that, Martijn, I'll try again: http://forums.us.dell.com/supportforums/board/message?board.id=pes_hardrive&message.id=15850 I can definitely confirm that a similar problem happens here on our RAID-1 setup - perhaps it is worse on RAID-5, or perhaps we're seeing two separate problems with similar symptoms?

We are also seeing high iowait figures on a system using Core 3 and RAID 5 on a 3ware 9500-S.

Any resolution to this issue? Does it still exist in RHEL4?

We are seeing it with a Clariion RAID 10 connected to a DL585 AMD, using qlogic. It continues to bring down the site, either via a reboot or system degradation. Running the latest Linux kernel. Anybody have anything?

I am seeing this problem with a Dell CERC 6Ch RAID controller on RHEL 4. My driver version is 1.1-5[2412]. I have a:

Dell PowerEdge 1800
3.0 GHz Dual Pentium Xeon in 64 bit mode
4 GB RAM
3x80 SATA Drives connected to a Dell CERC 6ch RAID controller in a RAID-5 configuration.
LVM is being used to manage the RAID device.
Up to date, with a non-tainted kernel.

Exactly the same cause. Copying files to the disk (in my case from a DVD) results in extremely high iowait. The system becomes almost completely unresponsive until the disk activity stops. The unresponsiveness lasts 5 minutes or so, which is long enough to cause network timeouts, and so is a reliability problem (not just a performance problem). I am more than happy to look into this, provide debugging information, try new kernels, etc.
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.