Bug 678154 - High Disk IO reports with IO Stat with Linux Version Red Hat Enterprise AS Release 4
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: sysstat
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Peter Schiffer
QA Contact: BaseOS QE - Apps
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-02-16 22:15 UTC by Heather
Modified: 2012-06-20 15:53 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 15:53:44 UTC
Target Upstream Version:


Attachments
iostats results highlighted in screen capture (135.35 KB, application/octet-stream)
2011-02-16 22:15 UTC, Heather
Data from iostat command (2.56 MB, application/x-gzip)
2011-03-16 15:47 UTC, Heather

Description Heather 2011-02-16 22:15:30 UTC
Created attachment 479223
iostats results highlighted in screen capture

Description of problem:
We have discovered that this particular OS is returning sporadic and abnormally high disk I/O peak values with iostat. This is occurring at a customer site (Cisco) on about a third of the 170 systems, all of which have the exact same hardware and OS.

Version-Release number of selected component (if applicable):
OS: Linux Version Red Hat Enterprise AS Release 4 
Patch Level: 2.6.9-67.0.7.ELsmp 
Architecture: Intel Xeon 170  
Model: ProLiant BL460c G1

How reproducible:
Occurs sporadically multiple times during the week.

Steps to Reproduce:
1. Turn on iostat
2. Monitor disk I/O on a daily basis
  
Actual results:
Extremely high disk I/O values

Expected results:
Normal values

Additional info:

Comment 1 Subhendu Ghosh 2011-02-23 04:40:14 UTC
Could you provide some "normal" and "extremely high" values?

Please confirm this is on RHEL 4 AS x86_64 arch.

Comment 2 Heather 2011-02-23 14:32:55 UTC
Yes, this is on Linux Version Red Hat Enterprise AS Release 4 
Patch Level 2.6.9-67.0.7.ELsmp 
Kernel Bits 64 


Normal Values: 147,948,032 Bytes/s, 34,601 Bytes/s, 15,071,113 Bytes/s, 46,653,860 Bytes/s, 59,713,704 Bytes/s

Abnormal Values: 54,619,309,970,930,144 Bytes/s, 9,223,372,036,854,775,807 Bytes/s, 1,529,216,978,344,320,256 Bytes/s

Comment 3 Heather 2011-03-02 21:37:30 UTC
Hi there - is there any update on this ticket since my last clarification?

Comment 4 Heather 2011-03-08 20:53:00 UTC
Hi Subhendu Ghosh, can we please get an update on this ticket? Regards, Heather

Comment 6 Subhendu Ghosh 2011-03-10 19:39:42 UTC
Hi Heather 

Sorry - lost track of this ticket - pulling in our kernel team for review.

Comment 7 Heather 2011-03-14 22:55:33 UTC
OK - please keep me posted. Cisco needs a reply on this as soon as possible.

Comment 8 Ric Wheeler 2011-03-15 00:54:25 UTC
Has the customer actually seen a difference in performance, or is this just a tool issue?

Comment 9 Vivek Goyal 2011-03-15 01:50:41 UTC
Also, is there any info on which kernel version this issue was introduced in?

Comment 10 Heather 2011-03-15 13:21:20 UTC
This is a tool issue. It exists across 33% of the environment with 64-bit.

Comment 11 Jerome Marchand 2011-03-15 13:39:11 UTC
It looks like a counter overflow issue to me. The counters in /proc/diskstats
can overflow, and it is up to userspace (here iostat) to deal with it. I
know that the latest versions of iostat do, but I can't tell for the version
provided in RHEL4.
Ivana, what do you think?
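
A minimal sketch of what "dealing with it" in userspace can look like, assuming a counter that wraps at a fixed power of two. The 32-bit width, the function names, and the sample values are assumptions for illustration only; this is not the actual iostat code.

```python
SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def wrap_aware_delta(prev, curr, bits=32):
    """Difference between two samples of a counter that wraps at 2**bits.

    The 32-bit default is an assumption; the real width depends on the kernel.
    Only correct if the counter wrapped at most once between the samples.
    """
    return (curr - prev) % (1 << bits)


def bytes_per_second(prev_sectors, curr_sectors, interval_s, bits=32):
    """Throughput over one sampling interval, tolerant of a single wrap."""
    return wrap_aware_delta(prev_sectors, curr_sectors, bits) * SECTOR_BYTES / interval_s


# Hypothetical samples: the counter wrapped between the two readings.
prev_sample = 2**32 - 300
curr_sample = 700

print(wrap_aware_delta(prev_sample, curr_sample))       # 1000 sectors
print(bytes_per_second(prev_sample, curr_sample, 5.0))  # 102400.0
```

The modulo keeps the delta small and positive across a wrap, whereas a plain subtraction would yield a negative number that an unsigned type displays as a huge value.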

Comment 12 Heather 2011-03-15 15:33:49 UTC
Thanks Jerome. Just to clarify, is this a known bug fixed in later versions of RHEL?

Comment 13 Heather 2011-03-15 15:39:35 UTC
Also, do you need me to confirm with the customer what version of iostat is installed on their machines? If so, what command do they run to verify the iostat version?

Comment 14 Jerome Marchand 2011-03-15 16:26:42 UTC
(In reply to comment #12)
> Thanks Jerome. Just to clarify, is this a known bug fixed in later versions of
> RHEL?

Yes, there is a known issue with an overflowing counter being mishandled by iostat. See BZ 488181.

What makes me think this is the same (or a similar) problem is that it is sporadic, and a ridiculously big number is exactly what is expected in case of an overflow.

To be sure, we would need to see the /proc/diskstats file when that happens.
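
The "ridiculously big number" argument can be checked against the values quoted in comment 2: 9,223,372,036,854,775,807 is exactly 2^63 - 1, the largest signed 64-bit integer, which points to an arithmetic artifact and not a real transfer rate. The sketch below only performs that comparison on the three reported numbers; it makes no claim about where in the tool chain the value was produced.

```python
INT64_MAX = 2**63 - 1

# Abnormal values reported in comment 2, in bytes per second.
reported = [
    54_619_309_970_930_144,
    9_223_372_036_854_775_807,
    1_529_216_978_344_320_256,
]

# Largest "normal" value from the same comment, for scale.
largest_normal = 147_948_032

for value in reported:
    note = "== INT64_MAX" if value == INT64_MAX else ""
    ratio = value // largest_normal
    print(f"{value:>26,} B/s  ~{ratio:,}x largest normal  {note}")
```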

Comment 15 Heather 2011-03-15 17:04:40 UTC
Thanks again Jerome for your quick reply! I feel like we are getting somewhere here. I tried to access the Bug 488181 details but I am not authorized to view it. What is the resolution to the bug - is it to upgrade the iostat version or the RHEL Version?

Comment 16 Ivana Varekova 2011-03-16 07:38:17 UTC
Jerome is right: in the RHEL-4 version of sysstat, iostat uses variables that are not the same size as the kernel ones behind /proc/diskstats. As a result, overflows happen in some special configurations; the majority of them are fixed in later versions of the RHEL-4 sysstat package. Which version of sysstat does the customer have installed?
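
The size mismatch described here can be illustrated with the fixed-width integers in Python's ctypes module. The widths chosen (a 64-bit kernel counter stored in a 32-bit variable, then subtracted in 64-bit arithmetic) and the sample values are assumptions made for this sketch; the report does not state which sysstat variables are affected.

```python
import ctypes

# Hypothetical kernel counter values at two consecutive samples (64-bit, no wrap).
kernel_prev = 2**32 - 300
kernel_curr = 2**32 + 700

# A tool that stores them in 32-bit variables silently drops the high bits.
tool_prev = ctypes.c_uint32(kernel_prev).value   # 4294966996
tool_curr = ctypes.c_uint32(kernel_curr).value   # 700

# The stored counter now appears to go backwards. Subtracting in unsigned
# 64-bit arithmetic turns the negative difference into an enormous number.
widened_delta = ctypes.c_uint64(tool_curr - tool_prev).value

# Subtracting in the same 32-bit width as the storage recovers the true delta.
matched_delta = ctypes.c_uint32(tool_curr - tool_prev).value

print(widened_delta)   # 18446744069414585320
print(matched_delta)   # 1000
```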

Comment 17 Jerome Marchand 2011-03-16 09:30:20 UTC
Ivana, can you confirm this is a sysstat issue?

Comment 18 Ivana Varekova 2011-03-16 09:43:13 UTC
It is probable, but first I would like to know the version of the sysstat package (to be absolutely sure, I would need the /proc/diskstats file from when the problem appears). The sysstat version alone should be enough to establish it with high probability.

Comment 19 Heather 2011-03-16 15:38:57 UTC
Thanks Jerome and Ivana for your very quick response! The sysstat package version on all systems is 5.0.5

Comment 20 Heather 2011-03-16 15:47:31 UTC
Created attachment 485769
Data from iostat command

Comment 21 Heather 2011-03-16 15:48:28 UTC
I also just attached the DISK I/O stats using the iostat command

Comment 22 Ivana Varekova 2011-03-17 12:22:03 UTC
Could you please paste the full NVR here (the output of rpm -q sysstat)? 5.0.5 is not enough.

Comment 23 Heather 2011-03-17 22:24:24 UTC
Sorry for the delay; the customer doesn't have access to the machine to run the command yet. The full information he had without running the command is:

5.0.5 16.rhel4. 

Tomorrow I will send the results from the command.

Comment 24 Heather 2011-03-18 14:30:36 UTC
Hello Ivana and Jerome,

For systems where there are abnormal peaks, the package version is: sysstat-5.0.5-25.el4

For systems that do not have abnormal peaks, they have sysstat-5.0.5-16.rhel4


Regards,
Heather
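
Both packages share the upstream version 5.0.5 and differ only in the release field, which is why the full NVR (name-version-release) was requested. A small sketch of splitting the two strings reported above; the helper name is made up for illustration, and it assumes the plain name-version-release form without an architecture suffix.

```python
def split_nvr(nvr):
    """Split 'name-version-release' on the last two hyphens."""
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


with_peaks = split_nvr("sysstat-5.0.5-25.el4")
without_peaks = split_nvr("sysstat-5.0.5-16.rhel4")

print(with_peaks)     # ('sysstat', '5.0.5', '25.el4')
print(without_peaks)  # ('sysstat', '5.0.5', '16.rhel4')

# Same upstream version, different release: "5.0.5" alone cannot tell them apart.
same_version = with_peaks[1] == without_peaks[1]
same_release = with_peaks[2] == without_peaks[2]
print(same_version, same_release)  # True False
```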

Comment 25 Heather 2011-03-22 15:46:32 UTC
Hi there,

Can you please confirm if the sysstat package version sysstat-5.0.5-25.el4 is the reason for the abnormal peaks and if I need to recommend to the customer to upgrade the version to fix this issue?

Regards,
Heather

Comment 26 Ivana Varekova 2011-03-25 11:08:05 UTC
Hello, 
from my point of view this seems to be a sysstat bug, so I'm changing the component to sysstat. The problem has to be investigated further. Can you post /proc/diskstats from an affected computer here (ideally taken right after the bug appears)? Additional debug data will also be needed, so I will create a test version of sysstat to be run on an affected system, which should show where the problem is.
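
For capturing the requested data, here is a hedged sketch of parsing whole-disk lines from /proc/diskstats into named counters so that two snapshots can be compared. The field order follows the kernel's iostats documentation; the function names and the sample line are illustrative, and lines with fewer columns (such as partition lines on older 2.6 kernels) are simply skipped.

```python
FIELDS = (
    "reads_completed", "reads_merged", "sectors_read", "ms_reading",
    "writes_completed", "writes_merged", "sectors_written", "ms_writing",
    "ios_in_progress", "ms_doing_io", "weighted_ms_doing_io",
)


def parse_diskstats(text):
    """Return {device: {field: int}} for lines carrying all 11 classic counters."""
    stats = {}
    for line in text.splitlines():
        cols = line.split()
        if len(cols) < 3 + len(FIELDS):
            continue  # short partition line or blank line
        device = cols[2]  # columns 0 and 1 are the major and minor numbers
        values = [int(c) for c in cols[3:3 + len(FIELDS)]]
        stats[device] = dict(zip(FIELDS, values))
    return stats


def read_diskstats(path="/proc/diskstats"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_diskstats(handle.read())


# Made-up sample: one whole-disk line and one short partition line.
sample = """\
   8    0 sda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
   8    1 sda1 35486 38030 38030 38030
"""
snapshot = parse_diskstats(sample)
print(snapshot["sda"]["sectors_read"])  # 9550688
print("sda1" in snapshot)               # False
```

Saving the raw file contents alongside a timestamp whenever iostat reports a peak would give the two consecutive samples needed to see whether a counter wrapped or went backwards.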

Comment 28 Jiri Pallich 2012-06-20 15:53:44 UTC
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.

