Bug 137595

Summary: iostat information for SAN luns is wrong
Product: Red Hat Enterprise Linux 3
Reporter: Dag Wieers <dag>
Component: sysstat
Assignee: Charlie Bennett <ccb>
Status: CLOSED NEXTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium    
Version: 3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
URL: http://dag.wieers.com/home-made/dstat/
Doc Type: Bug Fix
Last Closed: 2004-10-29 20:17:34 UTC

Description Dag Wieers 2004-10-29 18:47:35 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20040922 Galeon/1.3.17

Description of problem:
When comparing the information gathered from /proc/stat and from
/proc/partitions (using iostat) there is a large difference. I've
written a tool that gets this information from both /proc/stat and
/proc/partitions (the way iostat does it) and the information from
/proc/stat conforms with the vmstat info _and_ with what we see on the
SAN switch.

So the iostat info is plainly wrong.

The setup is a Linux node connected to a SAN with 2 QLogic 2312 adapters.

I can guide you through verifying this with dstat, which can read both
/proc/stat and /proc/partitions. (The /proc/stat functionality is
currently commented out in subversion.) This is what it looks like when
both are enabled:

[root@node01 san]# /root/dstat -d -D sdb,hires 8
--------disk-i/o------- --------disk-i/o-------
____sdb____ ___hires___|____sdb____ ___hires___
   0B    0B:   0B    0B|   0B    0B:   0B    0B
   0B   15M:   0B  156M|   0B   32M:   0B  325M
   0B   17M:   0B  171M|   0B   29M:   0B  300M
   0B   17M:   0B  173M|   0B   28M:   0B  283M
   0B   16M:   0B  162M|   0B   28M:   0B  283M
   0B   18M:   0B  176M|   0B   28M:   0B  283M
   0B   17M:   0B  170M|   0B   29M:   0B  293M
Exiting on user request

The left half is /proc/stat, the right half is /proc/partitions (using
the iostat formula; iostat gives exactly the same numbers). The hires
figures are actually the sum over the 10 luns that belong to the same
filesystem (which is striped over those 10 luns).
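
To put a number on the difference, here is a minimal sketch that computes
the ratio between the two halves of the dstat output. The figures are
copied from the non-zero column of each hires pair above (the all-zero
first row is skipped); the variable and function names are only
illustrative.

```python
# Non-zero hires figures (in M, as printed by dstat) copied from the
# output above. The variable names are illustrative, not part of dstat.
proc_stat_m = [156, 171, 173, 162, 176, 170]        # left half: /proc/stat
proc_partitions_m = [325, 300, 283, 283, 283, 293]  # right half: /proc/partitions


def ratios(left, right):
    """Return right/left for each pair of samples."""
    return [r / l for l, r in zip(left, right)]


hires_ratios = ratios(proc_stat_m, proc_partitions_m)

for left, right, ratio in zip(proc_stat_m, proc_partitions_m, hires_ratios):
    print(f"/proc/stat {left:4d}M   /proc/partitions {right:4d}M   ratio {ratio:.2f}")
```

The same comparison can be made for the sdb columns by swapping in those
figures.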

The code for dstat is open, so you can experiment with it and verify
for yourself.
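
For anyone who wants to compare the two sources without dstat, the
following is a rough sketch, not dstat's or iostat's actual code. It
assumes the 2.4-style layouts: a "disk_io:" line in /proc/stat with
entries of the form (major,index):(total,rio,rsect,wio,wsect), and a
/proc/partitions with the extended columns "rio rmerge rsect ruse wio
wmerge wsect wuse running use aveq". It also assumes both count
512-byte sectors. The sample text and its numbers are made up for
illustration.

```python
import re

SECTOR_BYTES = 512  # assumption: both files count 512-byte sectors

# Assumed entry layout on the disk_io line:
#   (major,index):(total_ios,read_ios,read_sectors,write_ios,write_sectors)
DISK_IO_ENTRY = re.compile(r"\((\d+),(\d+)\):\((\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_proc_stat_disk_io(text):
    """Return {(major, index): (read_bytes, write_bytes)} from /proc/stat text."""
    result = {}
    for line in text.splitlines():
        if not line.startswith("disk_io:"):
            continue
        for match in DISK_IO_ENTRY.finditer(line):
            major, index, _total, _rio, rsect, _wio, wsect = map(int, match.groups())
            result[(major, index)] = (rsect * SECTOR_BYTES, wsect * SECTOR_BYTES)
    return result


def parse_proc_partitions(text):
    """Return {name: (read_bytes, write_bytes)} from /proc/partitions text.

    Assumed columns: major minor #blocks name rio rmerge rsect ruse
                     wio wmerge wsect wuse running use aveq
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line, blank lines, and lines without statistics.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        name = fields[3]
        rsect, wsect = int(fields[6]), int(fields[10])
        result[name] = (rsect * SECTOR_BYTES, wsect * SECTOR_BYTES)
    return result


# Hypothetical sample input; on a real system read the files instead, e.g.
# open("/proc/stat").read() and open("/proc/partitions").read().
SAMPLE_STAT = "disk_io: (8,0):(120,20,400,100,2000) (8,1):(50,10,80,40,1000)\n"
SAMPLE_PARTITIONS = """major minor  #blocks  name rio rmerge rsect ruse wio wmerge wsect wuse running use aveq

   8    16   71687372 sdb 10 0 80 5 40 3 1000 90 0 60 95
"""

stat_counters = parse_proc_stat_disk_io(SAMPLE_STAT)
partition_counters = parse_proc_partitions(SAMPLE_PARTITIONS)
print(stat_counters)
print(partition_counters)
```

Both files hold cumulative counters, so a throughput figure comes from
reading them twice and dividing the difference by the interval length.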

I consider this bug critical, since it means the iostat data cannot be
trusted, and when monitoring more than 29 luns, /proc/stat is useless.

Version-Release number of selected component (if applicable):
sysstat-4.0.7-4.EL3.3

How reproducible:
Always


Comment 1 Dag Wieers 2004-10-29 19:57:40 UTC
Let me add that this is Red Hat Advanced Server 3 U3 with kernel
2.4.21-20.EL.

Comment 2 Charlie Bennett 2004-10-29 19:58:37 UTC
Hi -

Do you have access to the RHEL3 Update 4 beta?  There's a new version
of the sysstat package (5.0.5) that uses /proc/partitions.

Otherwise, check out: 

http://people.redhat.com/ccb/sysstat/RHEL3/5.0.5-7.rhel3

Comment 3 Dag Wieers 2004-10-29 20:17:34 UTC
I only found sysstat-5.0.5-1, and it did indeed fix the problem. Now I
have to look at what has changed and fix dstat too.

Thanks for the hint.