From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.3) Gecko/20040803

Description of problem:
The kernel shipping with RHEL3 does not correctly report the 'running' value for IDE whole disk devices in /proc/partitions. While on occasion a small negative value 0..-5 might make sense, it becomes consistently negative in the -50 range. This hoses up the aveq value and makes sysstat report outlandish results. See?

[root@butterbeer sysstat-5.0.5]# cat /proc/partitions
major minor #blocks name rio rmerge rsect ruse wio wmerge wsect wuse running use aveq

3 0 19551168 hda 122805 120687 1945522 560190 1500445 1907120 27316770 23839767 4294967251 2102921 16380310
3 1 102280 hda1 46 110 312 300 89 80 338 29500 0 29430 29800
3 2 18400536 hda2 117580 120369 1903626 539430 1498093 1882263 27100104 22964897 0 771510 24042207
3 3 1048320 hda3 5137 42 41168 19910 2263 24777 216328 845370 0 366590 865280

The stats for hda include an in_flight value of -45. Given the fact that all of its constituent partitions are reporting 0, I'd expect the total for the device to be 0.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. on an RHEL3 system with IDE disks
2. cat /proc/partitions
3. look at hda, hdb, etc.

Additional info:
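For anyone reading the table above: the 'running' column for hda shows 4294967251, which is what -45 looks like if a 32-bit counter is printed as unsigned. The helper below is only a sketch of that reinterpretation; the function name is made up for illustration, and the 32-bit width is an assumption based on the value shown, not something confirmed from the kernel source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: reinterpret the "running" column of
 * /proc/partitions as a signed count. Assumes the kernel printed a
 * 32-bit counter as unsigned, so values above INT32_MAX are really
 * negative numbers that wrapped around.
 */
int32_t running_as_signed(uint32_t raw)
{
    if (raw > (uint32_t)INT32_MAX) {
        /* Subtract 2^32 in a wider type to avoid relying on
         * implementation-defined unsigned-to-signed conversion. */
        return (int32_t)((int64_t)raw - 4294967296LL);
    }
    return (int32_t)raw;
}
```

Passing the hda value from the report, 4294967251, gives the -45 mentioned in the description; the partition rows, which report 0, are unchanged.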
*** Bug 128907 has been marked as a duplicate of this bug. ***
*** Bug 98542 has been marked as a duplicate of this bug. ***
Is this IDE disks only? Have we checked SCSI disks? Also, does the number seem to be constant or does it drift a lot?
In the recent cases that I've seen, the reports were against IDE disks. The only SCSI ones that I've seen issues against were using Emulex closed source drivers. I don't know if that's representative though.
Have just stumbled onto this bug in Red Hat Enterprise Linux ES release 2.1, kernel 2.4.9-e.38enterprise, for SCSI (actually fiber) disks. "iostat -x" shows 100% busy and very large queue times, and /proc/partitions shows similar negative numbers.
The problem here is that ioctls do not increment the kernel data associated with the running field of the /proc/partitions output, but the interrupt handler does decrement it. The fix is to only account for the IO if it was a READ or WRITE command. This patch fixes the problem and will be in the next RHEL3 kernel.

--- linux-2.4.21/drivers/block/ll_rw_blk.c.orig
+++ linux-2.4.21/drivers/block/ll_rw_blk.c
@@ -855,11 +855,13 @@ void req_finished_io(struct request *req
 {
 	struct hd_struct *hd1, *hd2;
 
-	locate_hd_struct(req, &hd1, &hd2);
-	if (hd1)
-		account_io_end(hd1, req);
-	if (hd2)
-		account_io_end(hd2, req);
+	if (blk_fs_request(req)) {
+		locate_hd_struct(req, &hd1, &hd2);
+		if (hd1)
+			account_io_end(hd1, req);
+		if (hd2)
+			account_io_end(hd2, req);
+	}
 }
 EXPORT_SYMBOL(req_finished_io);
 #endif /* CONFIG_BLK_STATS */

Larry Woodman
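To illustrate why the counter drifts negative, here is a small user-space sketch of the imbalance described above. It is not kernel code: the struct and function names are hypothetical stand-ins for struct request, struct hd_struct and the accounting calls, and the only behaviour it models is what the comment states (non-READ/WRITE requests are not counted on submission, but were counted on completion before the patch).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's struct request / struct hd_struct. */
struct sketch_request { bool is_fs; };      /* true for READ/WRITE, false for an ioctl-style command */
struct sketch_stats   { int ios_in_flight; };

/* Submission side: only READ/WRITE requests bump the counter. */
static void sketch_io_start(struct sketch_stats *s, const struct sketch_request *r)
{
    if (r->is_fs)
        s->ios_in_flight++;
}

/* Completion side. Unpatched: every request decrements.
 * Patched: only READ/WRITE requests decrement, mirroring the
 * blk_fs_request() check added by the patch. */
static void sketch_io_end(struct sketch_stats *s, const struct sketch_request *r, bool patched)
{
    if (!patched || r->is_fs)
        s->ios_in_flight--;
}

/* Run n_rw READ/WRITE requests and n_ioctl ioctl-style requests to
 * completion and return the resulting in-flight counter. */
int in_flight_after(int n_rw, int n_ioctl, bool patched)
{
    struct sketch_stats stats = { 0 };
    struct sketch_request rw = { true };
    struct sketch_request ioc = { false };
    int i;

    for (i = 0; i < n_rw; i++) {
        sketch_io_start(&stats, &rw);
        sketch_io_end(&stats, &rw, patched);
    }
    for (i = 0; i < n_ioctl; i++) {
        sketch_io_start(&stats, &ioc);
        sketch_io_end(&stats, &ioc, patched);
    }
    return stats.ios_in_flight;
}
```

With the unpatched completion path, every ioctl-style request leaves the counter one lower, so 45 of them end at -45 regardless of how many READ/WRITE requests ran; with the check in place the counter returns to 0.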
Please note that Larry's patch above did not make the RHEL3 U5 code freeze deadline, and thus it is unlikely to be included in the "next RHEL3 kernel" as he wrote in the prior comment.
A fix for this problem was committed to the RHEL3 U6 patch pool on 22-Apr-2005 (in kernel version 2.4.21-32.2.EL).
*** Bug 103706 has been marked as a duplicate of this bug. ***
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html
CRM closed, closing this.

Internal Status set to 'Resolved'
Status set to: Closed by Tech

This event sent from IssueTracker by pdemauro
issue 44494