Bug 112662

Summary: Disk /proc/stat output seems to be incomplete (2.4.22-1.2135.nptlsmp)
Product: [Fedora] Fedora Reporter: Bryce <root>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 1   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2004-09-29 19:52:47 UTC Type: ---

Description Bryce 2003-12-26 19:08:09 UTC
Description of problem:

This all relates to iostat and /proc/stat under kernel
2.4.22-1.2135.nptlsmp

I've a system that has 6 disks, as follows:

4x300GB IDE (hda-d)
1x250GB SATA (sdb)
1x37GB  SCSI (sda)

From the system bootup messages:



It detects the 4 IDE drives fine,..
-----------------------------------
ICH5: IDE controller at PCI slot 00:1f.1
ICH5: chipset revision 2
ICH5: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
hda: Maxtor 5A300J0, ATA DISK drive
hdb: Maxtor 5A300J0, ATA DISK drive
blk: queue c0496600, I/O limit 4095Mb (mask 0xffffffff)
blk: queue c049674c, I/O limit 4095Mb (mask 0xffffffff)
hdc: Maxtor 5A300J0, ATA DISK drive
hdd: Maxtor 5A300J0, ATA DISK drive
blk: queue c0496a74, I/O limit 4095Mb (mask 0xffffffff)
blk: queue c0496bc0, I/O limit 4095Mb (mask 0xffffffff)



It finds the SCSI drive fine,...
-----------------------------------
SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz,
512 SCBs

scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.10
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI 33 or 66Mhz,
512 SCBs

blk: queue c29c9a18, I/O limit 4095Mb (mask 0xffffffff)
(scsi0:A:0): 320.000MB/s transfers (160.000MHz DT|IU|QAS, 16bit)
  Vendor: SEAGATE   Model: ST336753LW        Rev: 0006
  Type:   Direct-Access                      ANSI SCSI revision: 03
blk: queue c29c9818, I/O limit 4095Mb (mask 0xffffffff)
scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 >



It finds the SATA drive fine,...
-----------------------------------
libata version 0.81 loaded.
ata_piix version 0.95
PCI: Setting latency timer of device 00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0xC000 ctl 0xC402 bmdma 0xD000 irq 18
ata2: SATA max UDMA/133 cmd 0xC800 ctl 0xCC02 bmdma 0xD008 irq 18
ata1: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4003 85:7c69 86:3e01
87:4003 88:207f
ata1: dev 0 ATA, max UDMA/133, 490234752 sectors (lba48)
ata1: dev 0 configured for UDMA/133
ata2: SATA port has no device. disabling.
ata2: thread exiting
scsi2 : ata_piix
scsi3 : ata_piix
  Vendor: ATA       Model: Maxtor 7Y250M0    Rev: 0.81
  Type:   Direct-Access                      ANSI SCSI revision: 05
Attached scsi disk sdb at scsi2, channel 0, id 0, lun 0
SCSI device sdb: 490234752 512-byte hdwr sectors (251000 MB)
 sdb: sdb1



HOWEVER,...
when I looked through the iostat output I discovered that it was
reporting stats for only 4 of the 6 drives:

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0            0.00         0.00         0.00          0          0
dev3-1            0.00         0.00         0.00          0          0
dev8-0            0.00         0.00         0.00          0          0
dev8-1            0.00         0.00         0.00          0          0

I pulled down the source for iostat (sysstat) and went hunting through
it, and found that it pulls its information directly from
/proc/stat. Looking at /proc/stat, I found that it was only reporting
the following:

[root@ZenIV sysstat-4.0.7]# cat /proc/stat
... disk_io: (3,0):(33,32,392,1,8) (3,1):(36,35,408,1,8)
(8,0):(70759,60152,836256,10607,284666)
(8,1):(289632,222,2186,289410,39552880) 

where did (3,2) and (3,3) go?
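
For anyone who wants to check these numbers without going through iostat, the disk_io line can be parsed directly. The following is a minimal sketch in C, not sysstat's code; the names disk_io_entry and parse_disk_io are made up for illustration. Calling the second and fourth values rio and wio is an assumption: the debug output in this report only ties the first, third and fifth values to dk_drive, dk_drive_rblk and dk_drive_wblk, although in every entry quoted here the first value equals the second plus the fourth.

```c
#include <stdio.h>
#include <string.h>

/* One "(major,disk):(...)" entry from the disk_io line of /proc/stat. */
struct disk_io_entry {
    unsigned major, disk;   /* index pair printed before the colon */
    unsigned total;         /* 1st value: dk_drive */
    unsigned rio;           /* 2nd value: assumed to be read requests */
    unsigned rblk;          /* 3rd value: dk_drive_rblk */
    unsigned wio;           /* 4th value: assumed to be write requests */
    unsigned wblk;          /* 5th value: dk_drive_wblk */
};

/*
 * Parse up to max entries from a buffer containing "disk_io:".
 * Returns the number of entries stored in out.
 */
static int parse_disk_io(const char *text, struct disk_io_entry *out, int max)
{
    int n = 0;
    const char *p = strstr(text, "disk_io:");

    if (!p)
        return 0;
    p += strlen("disk_io:");

    while (n < max) {
        struct disk_io_entry e;
        int used = 0;

        /* The leading blank in the format skips spaces and newlines. */
        if (sscanf(p, " (%u,%u):(%u,%u,%u,%u,%u)%n",
                   &e.major, &e.disk, &e.total, &e.rio,
                   &e.rblk, &e.wio, &e.wblk, &used) != 7 || used == 0)
            break;
        out[n++] = e;
        p += used;
    }
    return n;
}
```

Fed the four entries quoted above, parse_disk_io finds four entries, two under major 3 and two under major 8, and nothing else.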

I looked into fs/proc/proc_misc.c (around line 350), where the code
fragment that prints the values in /proc/stat resides.

         proc_sprintf(page, &off, &len, "\ndisk_io: ");
         for (major = 0; major < DK_MAX_MAJOR; major++) {
                 for (disk = 0; disk < DK_MAX_DISK; disk++) {
                         int active = kstat.dk_drive[major][disk] +
                                 kstat.dk_drive_rblk[major][disk] +
                                 kstat.dk_drive_wblk[major][disk];
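
The fragment above is cut off before the part that prints each entry. Below is a small user-space model of the loop. It assumes (this is not shown in the fragment) that a slot is only printed when active is non-zero, which would match the debug output further down, where the all-zero slots produce no (major,disk) line. The value 16 for both DK_MAX_* constants and the names kstat_model and list_active are assumptions for illustration only.

```c
#include <stdio.h>

#define DK_MAX_MAJOR 16   /* assumed value */
#define DK_MAX_DISK  16   /* assumed value */

/* Stand-in for the three kstat arrays read by the fragment. */
struct kstat_model {
    unsigned dk_drive[DK_MAX_MAJOR][DK_MAX_DISK];
    unsigned dk_drive_rblk[DK_MAX_MAJOR][DK_MAX_DISK];
    unsigned dk_drive_wblk[DK_MAX_MAJOR][DK_MAX_DISK];
};

/*
 * Walk every (major,disk) slot as the fragment does and write
 * "(major,disk) " into buf for each slot with any activity.
 * Returns the number of active slots.
 */
static int list_active(const struct kstat_model *k, char *buf, size_t size)
{
    int major, disk, count = 0;
    size_t off = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';

    for (major = 0; major < DK_MAX_MAJOR; major++) {
        for (disk = 0; disk < DK_MAX_DISK; disk++) {
            unsigned active = k->dk_drive[major][disk] +
                              k->dk_drive_rblk[major][disk] +
                              k->dk_drive_wblk[major][disk];

            if (!active)
                continue;   /* assumed: idle slots are not printed */
            count++;
            if (off < size) {
                int w = snprintf(buf + off, size - off,
                                 "(%d,%d) ", major, disk);
                if (w > 0)
                    off += (size_t)w;
            }
        }
    }
    return count;
}
```

Under that assumption, a slot whose three counters are all zero is indistinguishable in /proc/stat from a slot that does not exist at all.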

Hmm,...
I then added some proc_sprintf calls to see what values were being
passed. That wasn't very conclusive beyond confirming that the
kstat.dk_drive* arrays didn't have any stats for those devices.

....
Major = >03< Disk = >00<
kstat.dk_drive[major][disk] = 0x001f
kstat.dk_drive_rblk[major][disk] = 0x0140
kstat.dk_drive_wblk[major][disk] = 0x0008
(3,0):(31,30,320,1,8) 
Major = >03< Disk = >01<
kstat.dk_drive[major][disk] = 0x0016
kstat.dk_drive_rblk[major][disk] = 0x00d8
kstat.dk_drive_wblk[major][disk] = 0x0008
(3,1):(22,21,216,1,8) 
Major = >03< Disk = >02<
kstat.dk_drive[major][disk] = 0x0000
kstat.dk_drive_rblk[major][disk] = 0x0000
kstat.dk_drive_wblk[major][disk] = 0x0000

Major = >03< Disk = >03<
kstat.dk_drive[major][disk] = 0x0000
kstat.dk_drive_rblk[major][disk] = 0x0000
kstat.dk_drive_wblk[major][disk] = 0x0000
....
Major = >08< Disk = >00<
kstat.dk_drive[major][disk] = 0x17a3
kstat.dk_drive_rblk[major][disk] = 0x12110
kstat.dk_drive_wblk[major][disk] = 0x8f82
(8,0):(6051,3776,74000,2275,36738) 
Major = >08< Disk = >01<
kstat.dk_drive[major][disk] = 0x003e
kstat.dk_drive_rblk[major][disk] = 0x035a
kstat.dk_drive_wblk[major][disk] = 0x0020
(8,1):(62,58,858,4,32) 
....


Now, the way I read the code, I'm expecting output for 3,2 and 3,3;
however, the kernel structure doesn't seem to be filled out for these
devices.

I did think that the 3,0 and 3,1 entries might each be both disks on
the same channel being counted as one, but that's ruled out: when
catting one of the drives to /dev/null, none of the counters are
updated.
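
One possible explanation, offered as an assumption and not confirmed anywhere in this report: under the standard Linux device numbering, hdc and hdd sit on the second IDE interface, which is block major 22, not major 3. If DK_MAX_MAJOR is 16 (an assumed value), there is no kstat.dk_drive* row for major 22 at all, so those drives could never show up as (3,2) or (3,3). The sketch below maps the device names on this system to (major, minor) and checks the major against that assumed limit. lookup_dev and major_is_tracked are hypothetical helper names, and the per-major disk index is not modelled.

```c
#include <string.h>

#define DK_MAX_MAJOR 16   /* assumed table size, not taken from this report */

struct dev_numbers {
    int major;
    int minor;
};

/*
 * Map "hda".."hdd" and "sda".."sdp" to their standard block device
 * numbers. Returns 0 on success, -1 for names this sketch does not handle.
 */
static int lookup_dev(const char *name, struct dev_numbers *out)
{
    int unit;

    if (strlen(name) != 3 || name[1] != 'd')
        return -1;
    unit = name[2] - 'a';

    if (name[0] == 'h' && unit >= 0 && unit < 4) {
        out->major = (unit < 2) ? 3 : 22;   /* ide0 or ide1 */
        out->minor = (unit % 2) * 64;       /* master 0, slave 64 */
        return 0;
    }
    if (name[0] == 's' && unit >= 0 && unit < 16) {
        out->major = 8;                     /* first SCSI disk major */
        out->minor = unit * 16;             /* 16 minors per disk */
        return 0;
    }
    return -1;
}

/* Would a per-major stats table with DK_MAX_MAJOR rows cover this major? */
static int major_is_tracked(int major)
{
    return major >= 0 && major < DK_MAX_MAJOR;
}
```

With these assumptions hda, hdb, sda and sdb fall inside the table and hdc and hdd fall outside it, which would match the four entries that /proc/stat does show.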

I'd hunt this down further if I understood where and how the
kstat.dk_drive* structures are manipulated, but alas that's outside my
knowledge, as it degenerates into a mess of macros.

Actually, a grep -r turns up two possible places:
drivers/block/ll_rw_blk.c:712            
drivers/md/md.c:3386




Current results:
avg-cpu:  %user   %nice    %sys   %idle
           0.00    0.00    9.50   90.50

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0            0.00         0.00         0.00          0          0
dev3-1          726.00     92928.00         0.00      92928          0
dev8-0            0.00         0.00         0.00          0          0
dev8-1            0.00         0.00         0.00          0          0

Expected results:
avg-cpu:  %user   %nice    %sys   %idle
           0.00    0.00    9.50   90.50

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0            0.00         0.00         0.00          0          0
dev3-1          726.00     92928.00         0.00      92928          0
dev3-2            0.00         0.00         0.00          0          0
dev3-3            0.00         0.00         0.00          0          0
dev8-0            0.00         0.00         0.00          0          0
dev8-1            0.00         0.00         0.00          0          0


Phil
=--=

Comment 1 David Lawrence 2004-09-29 19:52:47 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/