Bug 102679
Summary: LTC3931-[perf][tpch][BETA] raw IO on RHEL3 B1 degrades
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: i386
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: IBM Bug Proxy <bugproxy>
Assignee: Tom Coughlan <coughlan>
QA Contact: Brian Brock <bbrock>
CC: coughlan, johnstul, petrides, riel, sct
Fixed In Version: RHEL 3 gold
Doc Type: Bug Fix
Last Closed: 2005-11-03 18:26:50 UTC
Bug Blocks: 101028, 103278

Description
IBM Bug Proxy
2003-08-19 20:05:49 UTC
Created attachment 93762: attach-oprofile_231_4G_dd.kernel
Created attachment 93763: attach-oprofile_389_4G_dd.kernel
Created attachment 93764: attach-config.as30b1_389

Why is there a config attached here --- did you create your own private build rather than using a Red Hat one? We can't support custom builds, especially of beta kernels. Please repeat the tests with a recent kernel from the RHN sushi channel --- the B1 kernels had 4g/4g enabled plus debug enabled, which slowed things down. It looks like those were off in the config you attached, but please retry with a recent -smp (not hugemem) kernel so that it's unambiguous --- there are also some other fixes in later kernels which might be relevant.

------ Additional Comments From mksully.com 2003-30-08 10:29 -------
Just an update on the reconfirmation of this problem on build 399. Because the qlogic driver doesn't complete its Inquiry correctly on a sparse LUN configuration without CONFIG_SCSI_MULTI_LUN set, it was necessary to rebuild the kernel. I used the i686-smp config from /configs as a base, changing only the minimum required to build. Unfortunately, even with this kernel, IO rates are about 120MB/sec (where we normally get > 700MB/sec). This is lower than on the build we originally reported the problem on (389). If you have a specific binary for us to try, could you make sure CONFIG_SCSI_MULTI_LUN is enabled?

Other Notes:
1. qlogics still fail (even on single CEC) during install (defect 3761 updated).
2. Needed to turn off mod versions to get the qla2300 module to build successfully.

------ Additional Comments From mksully.com 2003-30-08 11:50 -------
I added
    options scsi_mod max_scsi_luns=255
to modules.conf and rebuilt the initrd file. With this I was able to boot the binary entsmp kernel and see all of the qlogic disks. Unfortunately the dd's to the devices still had a read throughput of about 125MB/sec where we normally see > 700MB/sec. This is a severe degradation from the a4 version of the kernel.

------ Additional Comments From mksully.com 2003-04-09 23:26 -------
I updated to the 414 build and the problem still occurs.
I also noticed that I don't see the same degradation when doing raw reads to the disks attached to the aic7xxx on-board controller. I dug out an older version (6.05.60) of the qla2xxx driver and integrated it into the build tree. With this earlier version of the driver the problem no longer occurs on the qlogic 2310 attached disks. Maybe the 6.06 version of the driver provided as an addon is buggy?

We now enable IRQ mitigation on the qla2300 driver by default, and we suspect that that's the changed factor here. IRQ mitigation is a significant performance win under load, but the extra latency that it adds to single IOs means that it will show up as a performance degradation under single-client sequential raw IO testing.

You can change the controller's IRQ delay when you load the qla2300.o module, though: give it a "ql2xintrdelaytimer=<n>" module parameter to set the IRQ latency to n*100usec. The default is currently 3 (i.e. 300usec); for testing raw IO bandwidth you should be able to set it to 0 to disable IRQ mitigation. We suspect that on a more realistic performance test, you'll get better performance with the mitigation turned on, though. Can you verify that this helps, please?

------ Additional Comments From mksully.com 2003-05-09 11:52 -------
I set ql2xintrdelaytimer to zero and it actually degraded a bit more (~9%). I verified from the message in /var/log/messages that the parameter was used. I'm also trying this on a large database benchmark and it also performs poorly, with very low I/O rates similar to what we saw in the plain dds.

------ Additional Comments From mksully.com 2003-05-09 12:00 -------
The degradation occurs when using the 6.06.00b11 version of the driver shipped with the beta. We are driving eight qla2310 adapters, each attached to an IBM FastT200 enclosure with 10 physical disks, 80 disks in all. Just using dd to do raw reads exposes the problem.
As a test, I replaced the shipped driver with the 6.06.00b12 version built against the beta tree and the degradation goes away.

I am looking into why QA Contact is being removed.

------ Additional Comments From mksully.com 2003-05-09 15:45 -------
The full database run completed successfully with good performance on build 414 with the 6.06.00b12 driver substituted for the one included in the distro.

OK, good to know that this isn't a bug in Taroon. Thank you for testing.

------ Additional Comments From mksully.com 2003-05-09 17:24 -------
It appears that RH inadvertently closed this bug. Please reopen with the following response: But it is a bug. The qlogic driver shipped in taroon (6.06.0011b) appears to be broken. The use of 6.06.0012b was just to demonstrate that the earlier version of the driver was probably the culprit.

------ Additional Comments From mksully.com 2003-05-09 17:35 -------
But it is a bug. The qlogic driver shipped in taroon (6.06.0011b) appears to be broken. The use of 6.06.0012b was just to demonstrate that the earlier version of the driver was probably the culprit.

Note, RHEL 3 not RHAS30

Could you try with ql2xintrdelaytimer=1? The firmware may be interpreting "0" as a wrapped "infinity" value.

------ Additional Comments From mksully.com 2003-11-09 21:45 -------
I tried ql2xintrdelaytimer=1 on the 6.06.00b11 driver and it didn't help. I/O was still low.

But how did it compare against =3 and =0? I'm trying to work out how much of the observed performance is down to the IRQ mitigation, and how much might be another problem.

------ Additional Comments From mksully.com 2003-12-09 11:32 -------
Using "dd if=/dev/raw/raw<x> of=/dev/null bs=262144 count=4000&" against 40 raw devices, I've attached the vmstat data for ql2xintrdelaytimer=0, 1, and 3.
Looks like:
  0 = ~138MB/sec
  1 = ~153MB/sec
  3 = ~124MB/sec

[root@sambaperf tmp]# insmod qla2300 ql2xintrdelaytimer=0
Using /lib/modules/2.4.21-1.1931.2.399.entsmp/kernel/drivers/addon/qla2200/qla2300.o
[root@sambaperf db2inst1]# ./10dd_mln0.sh; ./10dd_mln1.sh
[root@sambaperf db2inst1]# vmstat 5 | tee vmstat_0.output
procs            memory          swap        io     system        cpu
 r  b swpd     free buff cache si so     bi bo   in   cs us sy id wa
 0 41    0 16378172 2204 23848  0  0     36  0    8    2  0  0 99  0
 0 40    0 16377940 2208 23852  0  0 138945  2 3200 1117  0  3 12 85
 0 40    0 16377940 2208 23852  0  0 138240  0 3199 1090  0  2 12 86
 1 40    0 16374248 2208 23852  0  0 140083  2 3226 1123  0  3 12 85
 0 40    0 16377776 2208 23852  0  0 138342  0 3201 1092  0  3 12 85
 1 39    0 16377776 2208 23852  0  0 137472  0 3175 1085  0  3 12 85
 0 40    0 16377748 2208 23852  0  0 138035 27 3205 1110  0  3 12 85
 0 40    0 16377748 2208 23852  0  0 137165  8 3164 1082  0  3 12 85
 0 40    0 16377560 2208 23852  0  0 137626  2 3185 1106  0  3 12 86
 0 40    0 16377560 2208 23852  0  0 138138  0 3202 1088  0  3  9 88

[root@sambaperf db2inst1]# insmod qla2300 ql2xintrdelaytimer=1
Using /lib/modules/2.4.21-1.1931.2.399.entsmp/kernel/drivers/addon/qla2200/qla2300.o
[root@sambaperf db2inst1]# ./10dd_mln0.sh; ./10dd_mln1.sh
[root@sambaperf db2inst1]# vmstat 5 | tee vmstat_1.output
procs            memory          swap        io     system        cpu
 r  b swpd     free buff cache si so     bi bo   in   cs us sy id wa
 0 40    0 16376248 2208 23892  0  0    135  0   11    3  0  0 98  1
 0 40    0 16376248 2212 23892  0  0 153857  0 3790 1210  0  3  0 97
 0 40    0 16374432 2212 23892  0  0 151757  2 3757 1214  0  3  0 97
 0 40    0 16374432 2212 23892  0  0 152730  0 3786 1202  0  4  0 96
 0 40    0 16374340 2212 23892  0  0 155443 53 3814 1245  0  4  0 96
 0 40    0 16374420 2212 23892  0  0 153907  0 3796 1216  0  3  0 97
 0 40    0 16374448 2212 23892  0  0 150067  2 3748 1207  0  4  0 96
 0 40    0 16374448 2212 23892  0  0 152576 26 3777 1202  0  3  0 97

[root@sambaperf db2inst1]# insmod qla2300 ql2xintrdelaytimer=3
Using /lib/modules/2.4.21-1.1931.2.399.entsmp/kernel/drivers/addon/qla2200/qla2300.o
[root@sambaperf db2inst1]# ./10dd_mln0.sh; ./10dd_mln1.sh
[root@sambaperf db2inst1]# vmstat 5 | tee vmstat_3.output
procs            memory          swap        io     system        cpu
 r  b swpd     free buff cache si so     bi bo   in   cs us sy id wa
 0 40    0 16371784 2212 23908  0  0    239  0   13    4  0  0 97  2
 0 40    0 16371784 2212 23908  0  0 122829 26 2728  970  0  2 13 85
 0 40    0 16371608 2212 23908  0  0 125491  2 2751 1009  0  2 13 85
 1 39    0 16371608 2212 23908  0  0 121651 51 2729  960  0  2 13 85
 0 40    0 16371420 2212 23908  0  0 124621  2 2747 1004  0  3  3 94
 1 39    0 16371420 2212 23908  0  0 124570  0 2739  982  0  2  0 98
 0 40    0 16369560 2212 23908  0  0 124467  2 2742 1003  0  2  0 98
 1 39    0 16369560 2212 23908  0  0 124979 23 2748  986  0  2  0 98
 0 41    0 16369560 2212 23908  0  0 123187  1 2727  973  0  2  0 98
 0 40    0 16371340 2212 23908  0  0 122573  1 2717  987  0  2  0 98

------ Additional Comments From mksully.com 2003-17-09 10:35 -------
The execution_throttle parameter is being set to an invalid value in the scsi host structure.
1. During init they did:
     nv->execution_throttle = __constant_cpu_to_le16(16);
   (0x100 value).
2. During setup of the host structure they did:
     ha->execution_throttle = le16_to_cpu(nv->execution_throttle);
   (0x100 value remains). It doesn't appear that the macro did what was intended.
3. In qla2x00_device_queue_depth() they set:
     int default_depth = max((int)64, (int)p->execution_throttle);
   (resulting in a value of 0x100 for default_depth).
4. Later, when they assign it:
     device->queue_depth = default_depth;
   it gets truncated to 0 since queue_depth is only 8 bits. This results in the slowdown.

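The truncation described in step 4 can be reproduced outside the kernel. What follows is a minimal user-space sketch, not driver code: struct fake_scsi_device and the helper names are hypothetical, and the 8-bit field is assumed here to be an unsigned char (the report only says it is "only 8 bits").

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the 8-bit queue_depth field of Scsi_Device. */
struct fake_scsi_device {
    unsigned char queue_depth;
};

/* Mirrors the unpatched expression: a floor of 64 and no ceiling. */
static int unpatched_default_depth(int execution_throttle)
{
    return execution_throttle > 64 ? execution_throttle : 64;
}

/* Stores an int depth into the 8-bit field; the value is reduced modulo 256. */
static unsigned char store_queue_depth(struct fake_scsi_device *dev, int depth)
{
    dev->queue_depth = (unsigned char)depth;
    return dev->queue_depth;
}

/* Checks the two cases from the report: throttle 16 and throttle 0x100. */
static void check_truncation(void)
{
    struct fake_scsi_device dev;

    assert(CHAR_BIT == 8);
    /* A throttle of 16 is raised to the floor of 64 and fits in 8 bits. */
    assert(store_queue_depth(&dev, unpatched_default_depth(16)) == 64);
    /* The nvram value 0x100 (256) passes the floor and wraps to 0. */
    assert(store_queue_depth(&dev, unpatched_default_depth(0x100)) == 0);
}
```

Calling check_truncation() from a small test program passes silently, which matches the behaviour reported above: a queue depth of 0 ends up in the device structure whenever the throttle value is 0x100.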
Since execution throttle doesn't seem to be a settable parameter, I suggest the following patch:

--- qla2x00.c.org 2003-09-18 11:48:30.000000000 -0500
+++ qla2x00.c 2003-09-18 11:48:42.000000000 -0500
@@ -4853,7 +4853,7 @@
 void qla2x00_device_queue_depth(scsi_qla_host_t *p, Scsi_Device *device)
 {
- int default_depth = max((int)64, (int)p->execution_throttle);
+ int default_depth = 64;
 device->queue_depth = default_depth;
 if (device->tagged_supported) {

I tested this on my setup and IO was restored to expected levels.

I don't follow the analysis:

  1. During init they did:
       nv->execution_throttle = __constant_cpu_to_le16(16);
     (0x100 value).
  2. During setup of the host structure they did:
       ha->execution_throttle = le16_to_cpu(nv->execution_throttle);
     (0x100 value remains). It doesn't appear that the macro did what was intended.

But i386 is little-endian already; "__constant_cpu_to_le16(16)" should evaluate to 16, not "0x100 value". And the file in question is already including <asm/byteorder.h>, which should set up the right endian definitions for all the cpu_to_le* macros.

------ Additional Comments From mksully.com 2003-17-09 13:04 -------
I agree that it should work but it doesn't. To verify, I rebuilt the 414 tree with printk output of the execution_throttle value and it shows it to be 0x100. I guess I can dig deeper into the endian macros, but since the max assignment in qla2x00_device_queue_depth will always be 64, should we bother?

Well, a byte-swap of 16 is 0x1000, not 0x100, so that's not what's happening. And the assignment
  int default_depth = max((int)64, (int)p->execution_throttle);
enforces a _minimum_ of 64, not a maximum, so clipping it there may harm performance. Arjan has suggested that clipping the queue depth to a maximum of 255 might be a better solution.

------ Additional Comments From mksully.com 2003-17-09 13:26 -------
Good points. Are you suggesting a direct assignment of 255, like this?
  int default_depth = 255;
or this?
  int default_depth = max((int)255, (int)p->execution_throttle);

+ int default_depth = min(max((int)64, (int)p->execution_throttle), 255);

------ Additional Comments From mksully.com 2003-17-09 14:10 -------
Yep. Works for me. I dug deeper on the original error and it appears that in qla2x00_nvram_config() the value of 0x100 is being read directly out of nvram and into the nvram22_t buffer. You're right, no endianness issues are involved (the code path that used the endianness macros would only have been executed if the nvram data was in error). The truncation that occurred when the integer default_depth was assigned into the 8-bit device->queue_depth was acting on the raw nvram value. Wrapping it in the min statement should take care of that.

+int default_depth = min(max((int)64, (int)p->execution_throttle), 255);

This fix is checked in to RHEL 3.

changed:
What      |Removed  |Added
----------------------------------------------------------------------------
QAContact |khoa.com |corryk.com

------- Additional Comments From corryk.com(prefers email via kevcorry.com) 2005-03-02 15:57 EST -------
Hi Mike, Please verify that this fix is included in the latest RHEL3 Update. If so, go ahead and close this bug. Thanks!

This bug is closed on the IBM side.

A fix for this problem was committed to the RHEL3 U3 patch pool on 17-Jun-2004 (in kernel version 2.4.21-15.13.EL). It was released in U3 with the following Errata System message: "An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-433.html"

Obviously, now it would be more appropriate to upgrade to U6, which is here: http://rhn.redhat.com/errata/RHSA-2005-663.html
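
For reference, the effect of the min(max(...)) clamp agreed on in this thread can be checked in user space. This is a sketch under assumptions, not the shipped driver code: clamped_default_depth and the two macro names are hypothetical, and plain comparisons stand in for the kernel's min()/max() macros.

```c
#include <assert.h>

/* Hypothetical names for the bounds used in the fix discussed in this bug. */
#define QUEUE_DEPTH_FLOOR   64
#define QUEUE_DEPTH_CEILING 255

/* Same result as min(max(64, execution_throttle), 255) for any int input. */
static int clamped_default_depth(int execution_throttle)
{
    int depth = execution_throttle;

    if (depth < QUEUE_DEPTH_FLOOR)
        depth = QUEUE_DEPTH_FLOOR;      /* keep the original minimum of 64 */
    if (depth > QUEUE_DEPTH_CEILING)
        depth = QUEUE_DEPTH_CEILING;    /* largest value an 8-bit field holds */
    return depth;
}

/* Checks values below, inside, and above the clamped range. */
static void check_clamp(void)
{
    assert(clamped_default_depth(16) == 64);
    assert(clamped_default_depth(128) == 128);
    assert(clamped_default_depth(0x100) == 255);
    /* The clamped result survives narrowing to 8 bits unchanged. */
    assert((unsigned char)clamped_default_depth(0x100) == 255);
}
```

With the ceiling in place, the nvram value 0x100 yields a queue depth of 255 instead of wrapping to 0, while smaller throttle values still get the floor of 64.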