Created attachment 479223 [details]
iostat results highlighted in screen capture

Description of problem:
We have discovered that this particular OS is returning sporadic and abnormally high disk I/O peak values with iostat. This is occurring at a customer site (CISCO) on about a third of the 170 systems, all of which have the exact same hardware and OS.

Version-Release number of selected component (if applicable):
OS: Linux
Version: Red Hat Enterprise AS Release 4
Patch Level: 2.6.9-67.0.7.ELsmp
Architecture: Intel Xeon 170
Model: ProLiant BL460c G1

How reproducible:
Occurs sporadically, multiple times during the week.

Steps to Reproduce:
1. Turn on iostat
2. Monitor disk I/O on a daily basis

Actual results:
Extremely high disk I/O values

Expected results:
Normal values

Additional info:
Could you provide some "normal" and "extremely high" values? Please also confirm that this is on RHEL 4 AS, x86_64 arch.
Yes, this is on Linux.
Version: Red Hat Enterprise AS Release 4
Patch Level: 2.6.9-67.0.7.ELsmp
Kernel Bits: 64

Normal values:
147,948,032 Bytes/s
34,601 Bytes/s
15,071,113 Bytes/s
46,653,860 Bytes/s
59,713,704 Bytes/s

Abnormal values:
54,619,309,970,930,144 Bytes/s
9,223,372,036,854,775,807 Bytes/s
1,529,216,978,344,320,256 Bytes/s
Hi there - is there any update on this ticket since my last clarification?
Hi Subhendu Ghosh, Can we please get an update on this ticket? Regards, Heather
Hi Heather, Sorry - I lost track of this ticket - pulling in our kernel team for review.
OK - please keep me posted. Cisco needs a reply on this as soon as possible.
Has the customer actually seen a difference in performance, or is this just a tool issue?
Also, is there any info on which kernel version introduced this issue?
This is a tool issue. It exists across 33% of the environment (64-bit).
It looks like a counter overflow issue to me. The counters in /proc/diskstats can overflow, and it is up to userspace (here, iostat) to deal with it. I know that the latest versions of iostat do, but I can't tell for the version provided in RHEL4. Ivana, what do you think?
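To illustrate the overflow mechanism described in the comment above, here is a minimal sketch. It is not sysstat's code. It assumes a kernel counter that wraps at 32 bits and a tool that subtracts successive samples as unsigned 64-bit values; the counter widths and the sample values are assumptions chosen for illustration, since the thread does not state them.

```python
# Illustrative only: this is NOT the sysstat implementation.
# Assumption: the kernel counter wraps at 2**32, while the tool does its
# arithmetic in unsigned 64-bit integers.
UINT32_MODULUS = 2 ** 32
UINT64_MODULUS = 2 ** 64


def naive_delta(previous, current):
    """Subtract two samples as unsigned 64-bit values, ignoring wraparound."""
    return (current - previous) % UINT64_MODULUS


def wrap_aware_delta(previous, current, modulus=UINT32_MODULUS):
    """Subtract two samples of a counter known to wrap at `modulus`."""
    return (current - previous) % modulus


# Hypothetical samples: the counter wrapped between the two readings.
previous_sample = UINT32_MODULUS - 1000
current_sample = 500

print("naive delta:     ", naive_delta(previous_sample, current_sample))
print("wrap-aware delta:", wrap_aware_delta(previous_sample, current_sample))
```

Under these assumptions the naive difference is astronomically large, while the wrap-aware difference is the 1500 units that were actually transferred. A sporadic spike of this kind would appear only in the sampling interval during which the counter wraps.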
Thanks Jerome. Just to clarify, is this a known bug fixed in later versions of RHEL?
Also, do you need me to confirm with the customer what version of iostat is installed on their machines? If so, what command do they run to verify the iostat version?
(In reply to comment #12)
> Thanks Jerome. Just to clarify, is this a known bug fixed in later versions of
> RHEL?

Yes, there is a known issue with overflowing counters being mishandled by iostat. See BZ 488181. What makes me think this is the same (or a similar) problem is that the problem is sporadic and that a ridiculously big number is what is expected in case of an overflow. To be sure, we would need to see the /proc/diskstats file when that happens.
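As a quick sanity check on the "ridiculously big number" argument, the abnormal values reported earlier in this thread can be compared against the signed 64-bit limit. This is only a sketch: it shows that the numbers are consistent with integer overflow or saturation, not where in iostat the problem lies.

```python
# Values copied from the reporter's earlier comment (Bytes/s).
INT64_MAX = 2 ** 63 - 1

reported_peaks = [
    54_619_309_970_930_144,
    9_223_372_036_854_775_807,
    1_529_216_978_344_320_256,
]
largest_normal = 147_948_032  # largest "normal" value in the same comment

for value in reported_peaks:
    # Flag values that sit exactly on the signed 64-bit maximum and show
    # how far each peak is above the largest normal reading.
    print(value, "is INT64_MAX:", value == INT64_MAX,
          "ratio to largest normal:", value // largest_normal)
```

One of the three peaks is exactly 2^63 - 1, the largest value a signed 64-bit integer can hold, which no real disk throughput would produce.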
Thanks again Jerome for your quick reply! I feel like we are getting somewhere here. I tried to access the Bug 488181 details but I am not authorized to view it. What is the resolution to the bug - is it to upgrade the iostat version or the RHEL Version?
Jerome is right. In the RHEL-4 version of sysstat, iostat uses variables that are not the same size as the kernel ones exposed in /proc/diskstats. Overflows therefore happen in some special configurations; the majority of them are fixed in later versions of the RHEL-4 sysstat package. Which version of sysstat does the customer have installed?
Ivana, can you confirm this is a sysstat issue?
It is probable, but first I would like to know the version of the sysstat package. To be absolutely sure, I would need the /proc/diskstats file from when the problem appears, but the sysstat version should be enough to tell with high probability.
Thanks Jerome and Ivana for your very quick response! The sysstat package version on all systems is 5.0.5
Created attachment 485769 [details] Data from iostat command
I also just attached the DISK I/O stats using the iostat command
Can you please paste the full NVR here (rpm -q sysstat output)? 5.0.5 is not enough.
Sorry for the delay; the customer doesn't have access to the machine to run the command yet. The full information that he had without running the command is 5.0.5 16.rhel4. Tomorrow I will send the results from the command.
Hello Ivana and Jerome, For systems where there are abnormal peaks, the package version is sysstat-5.0.5-25.el4. Systems that do not have abnormal peaks have sysstat-5.0.5-16.rhel4. Regards, Heather
Hi there, Can you please confirm if the sysstat package version sysstat-5.0.5-25.el4 is the reason for the abnormal peaks and if I need to recommend to the customer to upgrade the version to fix this issue? Regards, Heather
Hello, from my point of view this seems to be a sysstat bug, so I'm changing the component to sysstat. The problem has to be investigated further. Can you attach /proc/diskstats from an affected computer (ideally captured just after the bug appears)? Additional debug data is also necessary: I will create a test version of sysstat, which should be run on an affected system and which will show where the problem is.
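For anyone collecting the requested /proc/diskstats data, the following is a minimal sketch of how the file's contents could be parsed and the sector counters extracted. The field positions follow my understanding of the kernel's iostats documentation for 2.6 kernels (11 counters for whole disks, 4 for partitions) and should be checked against the target kernel. The function names and the sample text are hypothetical and are not real customer data.

```python
def parse_diskstats(text):
    """Return {device_name: [counters...]} from /proc/diskstats content."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # fields[0] and fields[1] are the major and minor device numbers.
        stats[fields[2]] = [int(value) for value in fields[3:]]
    return stats


def sectors_read_written(counters):
    """Return (sectors_read, sectors_written) for one device entry.

    Assumption: whole-disk lines carry 11 counters and partition lines on
    older 2.6 kernels carry 4; other layouts are rejected.
    """
    if len(counters) == 11:
        return counters[2], counters[6]
    if len(counters) == 4:
        return counters[1], counters[3]
    raise ValueError("unexpected number of counters: %d" % len(counters))


# Made-up sample in the two layouts described above.
SAMPLE_DISKSTATS = """\
   8    0 sda 120000 3000 4000000 90000 80000 5000 6000000 70000 0 60000 160000
   8    1 sda1 1000 20000 3000 40000
"""

for name, counters in sorted(parse_diskstats(SAMPLE_DISKSTATS).items()):
    print(name, sectors_read_written(counters))
```

On an affected machine, the same parser could be fed the result of reading /proc/diskstats at a fixed interval, with a timestamp saved alongside each snapshot, so that the raw counters around an abnormal peak can be compared with what iostat reported.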
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested a review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/ If you would like Red Hat to reconsider your feature request for an active release, please re-open the request via the appropriate support channels and provide additional supporting details about the importance of this issue.