From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20060601 Firefox/2.0.0.6 (Ubuntu-edgy)

Description of problem:
Random devices from the SAN with little or no I/O are shown at high %util, and avgqu-sz stays high.

(1) No reads/writes, but %util of the affected devices is > 100%:

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.12    0.00    0.25    0.00   99.62

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sdac         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00    47.05    0.00   0.00 100.10
sdcm         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.10

sde, sdac, sdba and sdby are the same SAN LUN, seen via different HBAs.

(2) Per /proc/diskstats:

   8   64 sde  210195 0 2026570 83908 2 0 16 1 0 39948 83885
  65  192 sdac 363856 0 3255410 274237 0 0 0 0 47 223095549 370697069   <=== sdac, troubled dev
  67   64 sdba 97098 0 1121016 60189 2 0 16 2 0 22840 60176
  68  192 sdby 98475 0 1139500 60545 0 0 0 0 0 23123 60528

The number of I/Os currently in progress (47, same as avgqu-sz in (1)) stays the same even when there is no I/O to the devices. This is probably a sysfs bug evident in cases where the sysfs device tree is large.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-55.0.2.EL.x86_64

How reproducible:
Always

Steps to Reproduce:
1. This machine has about 96 /dev/sdXX entries.
2. Shortly after boot up, run:
   iostat -x 1 |egrep "^sd" | awk {'print $1 "\t" $14'}|grep -v "0.00"
3.

Actual Results:

Expected Results:
%util should stay reasonable when no I/O is happening to the device.

Additional info:
Not evident on 2.6.9-42.ELsmp.
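For readers who want to check the in-progress counter directly, the sketch below parses the four /proc/diskstats rows quoted in (2) and lists the devices whose "I/Os currently in progress" field is non-zero. The field positions follow the 2.6-era whole-disk layout (11 counters after the device name); the names DiskStat and parse_diskstats_line are made up for this illustration. A non-zero value in a single snapshot only marks a candidate, since a busy device legitimately has I/O in flight.

```python
from typing import NamedTuple, Optional


class DiskStat(NamedTuple):
    """Subset of one whole-disk /proc/diskstats row (hypothetical helper type)."""
    name: str
    reads_completed: int
    writes_completed: int
    in_flight: int          # field 9: I/Os currently in progress
    io_ticks_ms: int        # field 10: ms spent doing I/O
    time_in_queue_ms: int   # field 11: weighted ms spent doing I/O


def parse_diskstats_line(line: str) -> Optional[DiskStat]:
    """Parse a whole-disk row; return None for short (partition) rows."""
    parts = line.split()
    if len(parts) < 14:
        return None
    counters = [int(x) for x in parts[3:14]]
    return DiskStat(parts[2], counters[0], counters[4],
                    counters[8], counters[9], counters[10])


# The four rows quoted in (2) of the report.
SAMPLE = """\
   8   64 sde 210195 0 2026570 83908 2 0 16 1 0 39948 83885
  65  192 sdac 363856 0 3255410 274237 0 0 0 0 47 223095549 370697069
  67   64 sdba 97098 0 1121016 60189 2 0 16 2 0 22840 60176
  68  192 sdby 98475 0 1139500 60545 0 0 0 0 0 23123 60528
"""

stats = [parse_diskstats_line(row) for row in SAMPLE.splitlines()]
stuck = [s.name for s in stats if s is not None and s.in_flight > 0]
print(stuck)
```

On a live system the same function could be fed the lines of open("/proc/diskstats"), repeated a few seconds apart, to see whether the counter ever drains.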
This problem appears to be related to a previously reported issue in which 1000+ LUNs can cause a system to take an excessive amount of time to boot. Excessive as in HOURS! An ancillary issue may be that running pvdisplay on these hosts can take as long as 24 hours to complete.
I am not sure Anu is having the kind of problem mentioned in comment 2. He only has 96 LUNs. Anu, are you seeing any long delay when booting, or when running commands like pvdisplay? Please post a sysreport, so we can try to reproduce your problem. Barry, would you be able to set this up in our lab, and check iostat?
Tom is right. The longer init on nodes with a large number of LUNs (>1000) is visible on any kernel version and is a separate issue. The problem reported here is only evident on kernel-smp-2.6.9-55.0.2.EL.x86_64, in which devices show heavy %util per /proc/diskstats when no I/O is underway. Will post sysreport shortly.
Created attachment 212731: sysreport output. Please note, sysreport was run after the kernel was downgraded to 2.6.9-42.ELsmp due to the reported problem.
Problem observed with the updated 2.6.9-55.0.9.ELsmp kernel:

# iostat -x sdcb 1
Linux 2.6.9-55.0.9.ELsmp (nyblawdev9)   10/11/2007

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.29    0.00    0.14    0.03   99.54

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await     svctm  %util
sdcb         0.00   0.00  0.01  0.00    0.19    0.00     0.09     0.00    15.44     0.57    1.95  47464.93  57.48

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await     svctm  %util
sdcb         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00      0.00 100.10

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await     svctm  %util
sdcb         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00      0.00 100.10

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await     svctm  %util
sdcb         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00      0.00 100.10

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00
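A possible reading of the numbers above, offered as a sketch rather than a confirmed diagnosis: the kernel's iostats documentation says field 10 (ms spent doing I/O) increases for as long as field 9 (I/Os in progress) is non-zero, and field 11 grows by field 9 times the elapsed milliseconds. Assuming iostat derives %util from the change in field 10 and avgqu-sz from the change in field 11 over the sampling interval, a counter that never drains would produce roughly 100% utilisation and a queue size equal to the stuck count, with no I/O at all. The function names below are hypothetical, and the exact 100.10 and 47.05 figures (which likely reflect interval rounding) are not reproduced.

```python
def advance_idle(io_ticks_ms: int, time_in_queue_ms: int,
                 in_flight: int, elapsed_ms: int) -> tuple:
    """Model how fields 10 and 11 grow over an interval with no real I/O,
    per the documented behaviour (assumed to apply to this kernel)."""
    if in_flight > 0:
        io_ticks_ms += elapsed_ms
    time_in_queue_ms += in_flight * elapsed_ms
    return io_ticks_ms, time_in_queue_ms


def iostat_rates(prev: tuple, curr: tuple, interval_ms: int) -> tuple:
    """Assumed iostat arithmetic: (%util, avgqu-sz) from two samples of
    (io_ticks_ms, time_in_queue_ms)."""
    util_pct = 100.0 * (curr[0] - prev[0]) / interval_ms
    avgqu_sz = (curr[1] - prev[1]) / interval_ms
    return util_pct, avgqu_sz


# sdac from the original report: counter stuck at 47.
sdac_prev = (223095549, 370697069)
sdac_curr = advance_idle(*sdac_prev, in_flight=47, elapsed_ms=1000)
sdac_util, sdac_queue = iostat_rates(sdac_prev, sdac_curr, 1000)

# sdcb from this comment: counter stuck at 1.
sdcb_prev = (43983592, 43984712)
sdcb_curr = advance_idle(*sdcb_prev, in_flight=1, elapsed_ms=1000)
sdcb_util, sdcb_queue = iostat_rates(sdcb_prev, sdcb_curr, 1000)

print(sdac_util, sdac_queue)
print(sdcb_util, sdcb_queue)
```

Under these assumptions the model gives about 100% / 47 for sdac and about 100% / 1 for sdcb, which matches the shape of the iostat output while rrqm/s through wkB/s stay at zero.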
# nl /proc/diskstats
  1  1 0 ram0 0 0 0 0 0 0 0 0 0 0 0
  2  1 1 ram1 0 0 0 0 0 0 0 0 0 0 0
  3  1 2 ram2 0 0 0 0 0 0 0 0 0 0 0
  4  1 3 ram3 0 0 0 0 0 0 0 0 0 0 0
  5  1 4 ram4 0 0 0 0 0 0 0 0 0 0 0
  6  1 5 ram5 0 0 0 0 0 0 0 0 0 0 0
  7  1 6 ram6 0 0 0 0 0 0 0 0 0 0 0
  8  1 7 ram7 0 0 0 0 0 0 0 0 0 0 0
  9  1 8 ram8 0 0 0 0 0 0 0 0 0 0 0
 10  1 9 ram9 0 0 0 0 0 0 0 0 0 0 0
 11  1 10 ram10 0 0 0 0 0 0 0 0 0 0 0
 12  1 11 ram11 0 0 0 0 0 0 0 0 0 0 0
 13  1 12 ram12 0 0 0 0 0 0 0 0 0 0 0
 14  1 13 ram13 0 0 0 0 0 0 0 0 0 0 0
 15  1 14 ram14 0 0 0 0 0 0 0 0 0 0 0
 16  1 15 ram15 0 0 0 0 0 0 0 0 0 0 0
 17  3 0 hda 0 0 0 0 0 0 0 0 0 0 0
 18  104 0 cciss/c0d0 60966 3927 1479246 158516 301445 644760 7570680 577670 0 188097 736205
 19  104 1 cciss/c0d0p1 80 266 11 24
 20  104 2 cciss/c0d0p2 64807 1478916 946332 7570656
 21  8 0 sda 455 0 4120 1718 0 0 0 0 0 248 1718
 22  8 16 sdb 450 0 4080 2425 0 0 0 0 0 380 2425
 23  8 32 sdc 454 0 4112 3157 0 0 0 0 0 384 3157
 24  8 48 sdd 450 0 4080 2393 0 0 0 0 0 289 2393
 25  8 64 sde 890 0 7586 2368 2 0 16 1 0 464 2369
 26  8 80 sdf 883 0 7384 2443 0 0 0 0 0 476 2441
 27  8 96 sdg 877 0 7472 2747 0 0 0 0 0 544 2747
 28  8 112 sdh 879 0 7504 4329 0 0 0 0 0 781 4329
 29  8 128 sdi 879 0 7624 3657 0 0 0 0 0 838 3656
 30  8 144 sdj 876 0 7264 6441 0 0 0 0 0 771 6441
 31  8 160 sdk 888 0 7608 4133 0 0 0 0 0 970 4133
 32  8 176 sdl 873 0 7584 3845 0 0 0 0 0 520 3845
 33  8 192 sdm 879 0 7512 2478 0 0 0 0 0 461 2478
 34  8 208 sdn 876 0 7488 2794 0 0 0 0 0 409 2794
 35  8 224 sdo 887 0 7562 2539 2 0 16 2 0 434 2541
 36  8 240 sdp 879 0 7368 4121 0 0 0 0 0 514 4121
 37  65 0 sdq 876 0 7464 2781 0 0 0 0 0 462 2781
 38  65 16 sdr 874 0 7344 4624 0 0 0 0 0 544 4624
 39  65 32 sds 878 0 7608 2589 0 0 0 0 0 498 2588
 40  65 48 sdt 879 0 7648 1963 0 0 0 0 0 396 1963
 41  65 64 sdu 880 0 7496 3298 3 0 24 0 0 480 3298
 42  65 80 sdv 870 0 7320 6767 0 0 0 0 0 691 6767
 43  65 96 sdw 873 0 7400 14736 0 0 0 0 0 1502 14736
 44  65 112 sdx 880 0 7520 3691 0 0 0 0 0 555 3691
 45  65 128 sdy 444 0 3552 957 0 0 0 0 0 229 957
 46  65 144 sdz 450 0 3600 993 0 0 0 0 0 236 993
 47  65 160 sdaa 449 0 3592 1160 0 0 0 0 0 304 1160
 48  65 176 sdab 449 0 3592 1107 0 0 0 0 0 259 1107
 49  65 192 sdac 885 0 7338 1068 0 0 0 0 0 394 1068
 50  65 208 sdad 878 0 7376 1175 0 0 0 0 0 409 1175
 51  65 224 sdae 877 0 7248 1263 0 0 0 0 0 430 1263
 52  65 240 sdaf 874 0 7320 1611 0 0 0 0 0 750 1611
 53  66 0 sdag 878 0 7288 1510 0 0 0 0 0 650 1509
 54  66 16 sdah 875 0 7248 1776 0 0 0 0 0 614 1776
 55  66 32 sdai 882 0 7424 1943 0 0 0 0 0 1093 1943
 56  66 48 sdaj 868 0 7096 1477 0 0 0 0 0 534 1477
 57  66 64 sdak 875 0 7200 1127 0 0 0 0 0 402 1127
 58  66 80 sdal 872 0 7248 1291 0 0 0 0 0 398 1290
 59  66 96 sdam 886 0 7346 1004 0 0 0 0 0 369 1004
 60  66 112 sdan 881 0 7376 1497 0 0 0 0 0 542 1496
 61  66 128 sdao 879 0 7248 1220 0 0 0 0 0 422 1220
 62  66 144 sdap 871 0 7304 1666 0 0 0 0 0 514 1666
 63  66 160 sdaq 876 0 7272 1180 0 0 0 0 0 461 1179
 64  66 176 sdar 870 0 7328 1124 0 0 0 0 0 347 1124
 65  66 192 sdas 884 0 7432 1454 0 0 0 0 0 529 1453
 66  66 208 sdat 876 0 7280 1925 0 0 0 0 0 622 1925
 67  66 224 sdau 868 0 7088 3768 0 0 0 0 0 1197 3768
 68  66 240 sdav 872 0 7368 1680 0 0 0 0 0 554 1680
 69  67 0 sdaw 451 0 3632 914 0 0 0 0 0 205 913
 70  67 16 sdax 450 0 3624 985 0 0 0 0 0 222 985
 71  67 32 sday 446 0 3592 1157 0 0 0 0 0 281 1157
 72  67 48 sdaz 448 0 3608 993 0 0 0 0 0 255 993
 73  67 64 sdba 886 0 7848 1103 2 0 16 1 0 390 1104
 74  67 80 sdbb 874 0 7720 1144 0 0 0 0 0 399 1144
 75  67 96 sdbc 880 0 7904 1204 0 0 0 0 0 432 1204
 76  67 112 sdbd 872 0 7880 1264 0 0 0 0 0 503 1264
 77  67 128 sdbe 874 0 7968 1685 0 0 0 0 0 776 1685
 78  67 144 sdbf 872 0 7872 1709 0 0 0 0 0 642 1708
 79  67 160 sdbg 877 0 7904 1804 3 0 24 2 0 868 1806
 80  67 176 sdbh 877 0 8032 1268 0 0 0 0 0 465 1268
 81  67 192 sdbi 870 0 7728 1179 0 0 0 0 0 399 1179
 82  67 208 sdbj 872 0 7992 1172 0 0 0 0 0 409 1172
 83  67 224 sdbk 884 0 7832 1219 2 0 16 1 0 407 1220
 84  67 240 sdbl 874 0 7720 1435 0 0 0 0 0 493 1435
 85  68 0 sdbm 880 0 7920 1277 0 0 0 0 0 447 1277
 86  68 16 sdbn 875 0 7928 1471 0 0 0 0 0 465 1471
 87  68 32 sdbo 878 0 8080 1192 0 0 0 0 0 434 1191
 88  68 48 sdbp 873 0 7760 1082 0 0 0 0 0 362 1082
 89  68 64 sdbq 879 0 7872 1333 1 0 8 0 0 450 1332
 90  68 80 sdbr 873 0 8120 2056 0 0 0 0 0 730 2056
 91  68 96 sdbs 875 0 7872 2775 0 0 0 0 0 1397 2775
 92  68 112 sdbt 868 0 7720 1398 0 0 0 0 0 462 1398
 93  68 128 sdbu 444 0 3552 929 0 0 0 0 0 201 929
 94  68 144 sdbv 444 0 3552 1045 0 0 0 0 0 230 1045
 95  68 160 sdbw 445 0 3560 1103 0 0 0 0 0 289 1103
 96  68 176 sdbx 447 0 3576 1078 0 0 0 0 0 257 1078
 97  68 192 sdby 868 0 13468 1178 0 0 0 0 0 414 1178
 98  68 208 sdbz 861 0 13360 1198 0 0 0 0 0 398 1198
 99  68 224 sdca 860 0 13216 1161 0 0 0 0 0 392 1160
100  68 240 sdcb 850 0 13120 1659 0 0 0 0 1 43983592 43984712
101  69 0 sdcc 858 0 13392 1464 0 0 0 0 0 625 1463
102  69 16 sdcd 852 0 13472 2104 0 0 0 0 0 787 2104
103  69 32 sdce 864 0 13360 1536 1 0 8 0 0 631 1536
104  69 48 sdcf 851 0 13104 1475 0 0 0 0 0 482 1475
105  69 64 sdcg 856 0 13184 1109 0 0 0 0 0 374 1110
106  69 80 sdch 853 0 13120 1148 0 0 0 0 0 370 1148
107  69 96 sdci 865 0 13444 1341 0 0 0 0 0 422 1341
108  69 112 sdcj 859 0 13352 1531 0 0 0 0 0 539 1532
109  69 128 sdck 856 0 13184 1320 0 0 0 0 0 439 1319
110  69 144 sdcl 854 0 13240 1617 0 0 0 0 0 518 1617
111  69 160 sdcm 853 0 13280 1270 0 0 0 0 0 453 1270
112  69 176 sdcn 851 0 13104 1133 0 0 0 0 0 362 1133
113  69 192 sdco 862 0 13448 1472 0 0 0 0 0 558 1472
114  69 208 sdcp 850 0 13096 2127 0 0 0 0 0 639 2127
115  69 224 sdcq 857 0 13208 3102 0 0 0 0 0 1209 3101
116  69 240 sdcr 849 0 13208 1558 0 0 0 0 0 525 1558
117  253 0 dm-0 30922 0 823202 95867 332261 0 2658088 910803 0 82184 1006668
118  253 1 dm-1 27229 0 561706 67126 354802 0 2838416 3090213 0 73791 3157334
119  253 2 dm-2 388 0 3098 1726 48186 0 385488 922346 0 5068 924075
120  253 3 dm-3 5928 0 89082 11160 211083 0 1688664 82367 0 34742 93521
121  253 4 dm-4 22 0 176 145 0 0 0 0 0 25 145
122  253 5 dm-5 17 0 136 166 0 0 0 0 0 16 166
123  2 0 fd0 0 0 0 0 0 0 0 0 0 0 0
124  9 0 md0 0 0 0 0 0 0 0 0 0 0 0
125  120 304 emcpowert 1755 156 22048 660 4 0 32 2 0 645 662
126  120 288 emcpowers 1722 136 21648 681 0 0 0 0 0 656 680
127  120 272 emcpowerr 1720 138 21648 780 0 0 0 0 0 682 780
128  120 256 emcpowerq 1701 155 21632 1090 0 0 0 0 0 1064 1090
129  120 240 emcpowerp 1715 197 22080 1636 0 0 0 0 0 1461 1635
130  120 224 emcpowero 1701 159 21664 935 0 0 0 0 0 898 935
131  120 208 emcpowern 1737 178 22104 2316 4 0 32 2 0 1812 2318
132  120 192 emcpowerm 1695 160 21624 671 0 0 0 0 0 655 671
133  120 176 emcpowerl 1706 125 21432 631 0 0 0 0 0 606 631
134  120 160 emcpowerk 1699 160 21656 510 0 0 0 0 0 497 509
135  120 144 emcpowerj 1748 156 21992 552 4 0 32 3 0 540 555
136  120 128 emcpoweri 1719 136 21624 633 0 0 0 0 0 594 633
137  120 112 emcpowerh 1717 138 21624 661 0 0 0 0 0 603 661
138  120 96 emcpowerg 1700 155 21624 542 0 0 0 0 0 528 542
139  120 80 emcpowerf 1711 197 22048 758 0 0 0 0 0 701 757
140  120 64 emcpowere 1699 159 21648 550 0 0 0 0 0 535 550
141  120 48 emcpowerd 1731 178 22056 802 4 0 32 0 0 742 802
142  120 32 emcpowerc 1695 160 21624 675 0 0 0 0 0 661 675
143  120 16 emcpowerb 1699 125 21376 1018 0 0 0 0 0 950 1018
144  120 0 emcpowera 1695 160 21624 674 0 0 0 0 0 622 675
145  120 368 emcpowerx 20 63 664 50 0 0 0 0 0 50 50
146  120 352 emcpowerw 20 63 664 95 0 0 0 0 0 95 95
147  120 336 emcpowerv 20 63 664 78 0 0 0 0 0 78 78
148  120 320 emcpoweru 20 63 664 52 0 0 0 0 0 52 52
149  253 6 dm-6 380 0 3034 15911 2 0 16 0 0 1751 15911
150  253 7 dm-7 343 0 2738 5185 2 0 16 1 0 428 5186
151  253 8 dm-8 355 0 2834 13852 2 0 16 1 0 787 13853
152  253 9 dm-9 355 0 2834 8684 2 0 16 1 0 462 8685
153  253 10 dm-10 355 0 2834 12223 2 0 16 1 0 784 12224
154  253 11 dm-11 355 0 2834 5576 2 0 16 0 0 336 5576
155  253 12 dm-12 355 0 2834 7579 2 0 16 2 0 414 7581
156  253 13 dm-13 355 0 2834 5382 2 0 16 1 0 349 5383
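In the dump above only sdcb stands out: one I/O in progress and an io_ticks value of 43983592 ms, against a few hundred ms for its neighbours. To separate a stuck counter from a device that is simply busy, two snapshots can be compared: a suspect device is one whose completed read and write counts did not move while io_ticks kept advancing. The sketch below is one way to do that check. The BEFORE rows are copied from the dump (without the nl line numbers); the AFTER rows are invented for illustration, as are the function names.

```python
def read_counters(diskstats_text: str) -> dict:
    """Map device name -> (completed reads + writes, io_ticks in ms).
    Rows with fewer than 14 columns (the four-counter partition rows
    seen on this kernel) are skipped."""
    counters = {}
    for line in diskstats_text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue
        completed = int(parts[3]) + int(parts[7])
        io_ticks_ms = int(parts[12])
        counters[parts[2]] = (completed, io_ticks_ms)
    return counters


def busy_while_idle(before_text: str, after_text: str) -> list:
    """Devices with no newly completed I/O whose io_ticks still advanced."""
    before = read_counters(before_text)
    after = read_counters(after_text)
    return sorted(
        name for name in before
        if name in after
        and after[name][0] == before[name][0]
        and after[name][1] > before[name][1]
    )


# Rows taken from the dump above.
BEFORE = """\
 104    1 cciss/c0d0p1 80 266 11 24
  68  224 sdca 860 0 13216 1161 0 0 0 0 0 392 1160
  68  240 sdcb 850 0 13120 1659 0 0 0 0 1 43983592 43984712
"""

# Hypothetical second snapshot one second later (invented values).
AFTER = """\
 104    1 cciss/c0d0p1 80 266 11 24
  68  224 sdca 860 0 13216 1161 0 0 0 0 0 392 1160
  68  240 sdcb 850 0 13120 1659 0 0 0 0 1 43984592 43985712
"""

suspects = busy_while_idle(BEFORE, AFTER)
print(suspects)
```

Comparing two snapshots avoids false positives from devices that are genuinely mid-request at the moment of a single read.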
Problem observed with the updated 2.6.9-55.0.9.ELsmp kernel as well..!
Just checking, have you made any changes to /etc/sysconfig/sysstat.ioconf? (It is not in sysreport, but it should be...) Would you be willing to try this without PowerPath loaded? RHEL 4.6 has been locked down for a while now, and the RC has been built, so this will have to come after 4.6, but I'd like to get to the bottom of it.
(1) /etc/sysconfig/sysstat.ioconf: I believe /etc/sysconfig/sysstat.ioconf is present in RHEL5 only at this time. I am not sure whether it will make it to RHEL4, but I am sure that it is not present in RHEL4 at this time. That's why sysreport does not mention it either. To answer: no, I've not made any changes to /etc/sysconfig/sysstat.ioconf, as this file is not present in the deployed version of RHEL4. (2) Without PowerPath: we've tried that, and the problem was found to persist even with no PowerPath and no PowerPath modules loaded.
Once we get a host system set up in the lab I can try this with the EVA storage.