From Bugzilla Helper: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7) Gecko/20011226 Description of problem: The sard patch uses the following formula for conversion from ticks to milliseconds in linux/drivers/block/genhd.c: #define MSEC(x) ((x) * 1000 / HZ) Normally, a millisecond counter would wrap over after ~1200 hours, but this formula wraps after just ~12 hours of I/O time. I easily achieve that on my home computer with less than two weeks' uptime. I propose a "smarter" definition: #if 1000 % HZ == 0 #define MSEC(x) ((x) * (1000 / HZ)) #elif HZ % 1000 == 0 #define MSEC(x) ((x) / (HZ / 1000)) #else #define MSEC(x) (((x) / HZ) * 1000 + ((x) % HZ) * 1000 / HZ) #endif Another option would be not to convert to milliseconds at all, but use hz_to_std() and leave it up to the userland. iostat currently expects ticks in /proc/partitions anyway (see bug 60029). Version-Release number of selected component (if applicable): How reproducible: Always Steps to Reproduce: 1.Create a disk I/O load. 2.Wait enough time. 3.cat /proc/partitions Actual Results: major minor #blocks name rio rmerge rsect ruse wio wmerge wsect wuse running use aveq 3 0 30015216 hda 955889 1994254 15902124 9372400 1452474 1668133 22339116 35636184 0 4976800 2835871 Expected Results: Obviously, "wuse" and "aveq" have overflowed: they don't divide by 10 (as a result of x * 1000 / 100 would), and "aveq" is even less than "use," which is impossible. Additional info:
That's not a bug. The sard stats were never designed to track total IO, but rather to support interval IO tracking. "iostat" will take two successive snapshots of these stats and will display the ongoing IO statistics from them by taking the difference, and that should still work even when the counters wrap.
btw, most of the hz_to_std routines have the same overflow property. I'll mull this over --- it may well be worth cleaning this up somewhat.
iostat works fine when counters wrap at 2^32, but these ones wrap at 2^32 / 100. Here's an output of iostat for an interval where aveq wrapped over, resulting in bogus avgqu-sz: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util hda 27.96 4.19 13.78 4.79 318.40 69.63 20.89 70921.74 2921.21 43.84 8.14
Yes, I realised that just after the previous reply. I'm currently testing a kernel with #define MSEC(x) (((x) / HZ) * 1000 + ((x) % HZ) * 1000 / HZ) defined unconditionally --- the compiler seems to create decent code for that. (gcc has a number of very nice optimisations for such expressions.)
Fix checked in.