| Summary: | Iozone incache testing shows me regression on all testing file systems ext4, ext3, xfs | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Kamil Kolakowski <kkolakow> |
| Component: | kernel | Assignee: | John Feeney <jfeeney> |
| Status: | CLOSED DUPLICATE | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 6.2 | CC: | anil.k.garg, borgan, esandeen, jfeeney, jvillalo, luyu, msnitzer, rwheeler, swhiteho |
| Target Milestone: | rc | Keywords: | Regression |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2011-07-14 15:00:13 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| Attachments: | |||
Created attachment 486256 [details]
IOZONE results compared between rhel6.0GA and rhel6.1-20110311.3
Created attachment 486259 [details]
POSTMARK results compared between rhel6.0GA and rhel6.1-20110311.3
Sorry, jumped the gun on the summary edit :) Were other filesystems tested?

Hi Eric, yes, I tested the ext4, ext3, xfs, and ext2 file systems. All of them show a regression in the incache results. I'm going to change the bug description and post the rest of the results here.

Created attachment 486276 [details]
IOZONE ext3 results compared between rhel6.0GA and rhel6.1-20110311.3
Created attachment 486278 [details]
IOZONE xfs results compared between rhel6.0GA and rhel6.1-20110311.3
In my testing, I think I may have narrowed down at least some of this regression to some scheduler changes between -95 and -96. I'm doing brew builds of those now, since they got garbage collected; if you could retest those & compare results it'd be great. I'll let you know when the builds are available. Thanks, -Eric

I will retest as soon as you have the builds ready. Thanks, Kamil

Hm, after a bit more careful testing here, averaging 6 iozone runs and comparing them (iozone -a -y 4k -q 16384k -n 4k -g 1g -f /mnt/test/testfile), I'm not seeing much regression other than FWRITE:
IOZONE Analysis tool V1.4
FILE 1: ./iozone-2.6.32-71.el6.x86_64-avg-ext4.txt
FILE 2: ./iozone-2.6.32-122.el6.x86_64-avg-ext4.txt
TABLE: SUMMARY of ALL FILE and RECORD SIZES
Results in MB/sec

| FILE | FILE & REC SIZES (KB) | ALL IOS | INIT WRITE | RE WRITE | READ | RE READ | RANDOM READ | RANDOM WRITE |
|---|---|---|---|---|---|---|---|---|
| 1 | ALL | 1967 | 770 | 1085 | 3237 | 3400 | 3200 | 1438 |
| 2 | ALL | 1949 | 773 | 1077 | 3187 | 3377 | 3214 | 1445 |
| | ALL | . | . | . | . | . | . | . |

| FILE | BACKWD READ | RECRE WRITE | STRIDE READ | F WRITE | FRE WRITE | F READ | FRE READ |
|---|---|---|---|---|---|---|---|
| 1 | 2931 | 2217 | 3144 | 763 | 1037 | 3000 | 3216 |
| 2 | 2941 | 2173 | 3102 | 735 | 1016 | 2959 | 3216 |
| | . | . | . | -3.7 | . | . | . |

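
For readers who want to check the -3.7 figure, here is a minimal sketch that recomputes the percentage change per column from the averaged MB/sec values in the summary table above. The 3% reporting threshold is an assumption chosen for illustration; the thread does not say what threshold the analysis tool uses.

```python
# Averaged MB/sec values copied from the SUMMARY table in this comment.
COLUMNS = [
    "ALL IOS", "INIT WRITE", "RE WRITE", "READ", "RE READ",
    "RANDOM READ", "RANDOM WRITE", "BACKWD READ", "RECRE WRITE",
    "STRIDE READ", "F WRITE", "FRE WRITE", "F READ", "FRE READ",
]
FILE_1 = [1967, 770, 1085, 3237, 3400, 3200, 1438,
          2931, 2217, 3144, 763, 1037, 3000, 3216]  # 2.6.32-71
FILE_2 = [1949, 773, 1077, 3187, 3377, 3214, 1445,
          2941, 2173, 3102, 735, 1016, 2959, 3216]  # 2.6.32-122


def percent_diff(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def flagged(threshold=3.0):
    """Return columns whose absolute change meets the threshold.

    The threshold is a hypothetical value, not taken from the tool.
    """
    result = {}
    for name, base, new in zip(COLUMNS, FILE_1, FILE_2):
        diff = percent_diff(base, new)
        if abs(diff) >= threshold:
            result[name] = round(diff, 1)
    return result


if __name__ == "__main__":
    for name, diff in flagged().items():
        print(f"{name}: {diff:+.1f}%")
```

With these inputs only the F WRITE column crosses the assumed threshold, which matches the single non-dot entry in the table.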
This was tested on an Intel SSD (INTEL SSDSA2M160, entire disk formatted) in a 2G, 2-CPU x86_64 box, with all default mount options, cfq, etc.

Hi Eric, I have results for -95 and -96 running on ibm-x3650m3-01.lab.eng.brq.redhat.com. I don't see a regression between those two kernels. I ran iozone with these parameters: iozone -U /RHTSspareLUN1 -a -f /RHTSspareLUN1/ext4 -n 4k -g 4096m. I attached the results.

Created attachment 487359 [details]
Result between -95 and -96 kernel
Can you compare -71 to -96 as well? Then at least we'll know if your regression is before or after -96. Since I can't reproduce this, I wonder if you can bisect a little and spot-check some other kernels prior to -122? Thanks, -Eric

Hi Eric, I attached a comparison between -70 and -95. It shows a significant regression. Now I'm running the -80 kernel to track down where the regression first occurred.

Created attachment 488083 [details]
Comparison -70vs-95
I have the initial 3 iozone runs on the -80 kernel. I see a regression between the -80 and -96 kernels. I will continue by running the test on -90.

Thanks for narrowing it down! Sorry, if I could reproduce it, I could help :(

Results are stable between -90 and -95. The regression must start between -80 and -90. Trying -85.

From the first run on the -82 kernel, it looks like the regression first occurred in the -83 kernel. I will confirm once I have the average of 3 results on the -82 kernel.

Thanks! Hm, only 92 patches now ;) I don't see anything terribly obvious in fs/ mm/ drivers/block/ block/ .... -Eric

Eric, two more runs confirm that the regression is between -82 and -83. Iozone result attached. Now I am going to run all file systems with both the IOZONE and POSTMARK tests. I will publish the results here as I get them. If you want other reports, e.g. iostat or "echo w > /proc/sysrq-trigger", please let me know. Thanks! -Kamil

Created attachment 489002 [details]
Comparison -82 and -83 kernel (ext4, cfq)
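
The narrowing-down in the comments above is a manual bisection over kernel build numbers. A minimal sketch of the same idea follows, assuming the regression stays present once it appears. The function name `first_bad_build` and the `is_regressed` callback are hypothetical; in practice the callback would boot the given build, run the benchmark several times, and compare the average against the baseline.

```python
def first_bad_build(good, bad, is_regressed):
    """Binary-search for the first regressed build in (good, bad].

    Assumes `good` is known good, `bad` is known bad, and every build
    after the first bad one is also bad. Returns the first bad build
    and the list of builds that had to be tested.
    """
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(mid)
        if is_regressed(mid):
            bad = mid   # regression already present at mid
        else:
            good = mid  # mid is still clean
    return bad, tested


if __name__ == "__main__":
    # Stand-in predicate reflecting the outcome reported in this thread:
    # builds from -83 onward show the regression.
    culprit, tested = first_bad_build(70, 95, lambda build: build >= 83)
    print(f"first bad build: -{culprit}, builds tested: {tested}")
```

A strict binary search picks different midpoints than the builds actually tested in this thread (-80, -90, -85, -82), but it converges on the same -82/-83 boundary in four test rounds.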
Wow, thanks for narrowing that down! That's a huge help. Poring over the 90 or so changes now ...

Kamil, is there any chance I could access the box and do a proper git bisect between -82 and -83? Or is it running tests overnight for you as well? I have a couple of suspect changes, but nothing is obvious. Thanks, -Eric

Eric, I started a "big beaker" job across all file systems, running the 2 benchmarks on those 2 kernels. Because this test takes a few days, I can stop it so you can use the box for your investigation. Please let me know if I should cancel the job. Thanks a lot, Kamil

Created attachment 489197 [details]
-82 iozone_incache_default.iozone
Matthew, bisect narrowed down the regression between -82 and -83 to: [x86] Add native Intel cpuidle driver ... we saw ondemand governor regressions too; Arjan added some sort of hook to that to make IO look not-idle ... maybe a similar issue here?

Because of comment 32 and the lack of a fix, moving this to RHEL6.2.

The intel_idle driver means that we'll be getting into deeper C states than we were previously. The "simple fix" would be to avoid dropping into deep states if we're in iowait, but that would have a strong impact on power consumption.

*** This bug has been marked as a duplicate of bug 714180 ***
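
Since the bisect points at the intel_idle driver, it can help to record which cpuidle driver a test box is using alongside each benchmark run. The sketch below reads the standard cpuidle sysfs entries. The helper name `cpuidle_info` is hypothetical, and the `sysfs_root` parameter exists only so the function can be pointed at a different directory, for example in a test.

```python
from pathlib import Path


def cpuidle_info(sysfs_root="/sys/devices/system/cpu/cpuidle"):
    """Return the active cpuidle driver and governor, or None if absent.

    Missing files (for example on a kernel without cpuidle support)
    are reported as None instead of raising.
    """
    root = Path(sysfs_root)
    info = {}
    for name in ("current_driver", "current_governor_ro"):
        entry = root / name
        info[name] = entry.read_text().strip() if entry.is_file() else None
    return info


if __name__ == "__main__":
    for key, value in cpuidle_info().items():
        print(f"{key} = {value}")
```

On a kernel where the native Intel driver is active, `current_driver` would be expected to read `intel_idle`; logging it next to the iozone results makes runs from -82 and -83 easier to compare.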
I see a regression of ~6% in iozone incache testing on the ext4 file system. Baseline: RHEL6.0GA. Tested version: RHEL6.1-20110311.3.

Testing machine:
Hostname = ibm-x3650m3-01.lab.eng.brq.redhat.com
Arch = x86_64
Distro = RHEL6.1-20110311.3
Kernel = 2.6.32-122.el6.x86_64
SElinux mode = Permissive
CPU count : speeds MHz = 12 : 12 @ 2793.000
CPU model name = Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
CPU cache size = 12288 KB
BIOS Information Version : Date = -D6E145FUS-1.07- : 04/26/2010
Total Memory = 12029 (MB)
NUMA is Enabled. # of nodes = 1 nodes (0)
Tuned profile = default
I/O scheduler on testing device = cfq
SPEED TEST: hdparm -tT /dev/sdb1
Timing buffered disk reads => 249.33 MB/sec
Timing cached reads => 8704.62 MB/sec
Free Diskspace on /RHTSspareLUN1 = 20GB
Type of HDD = SSD

I'm able to reproduce those results. I used 2 benchmarks, iozone and postmark. The system includes 2 drives: one is used for the system, the second (an SSD) for testing. There is no LVM and no RAID. Results are attached.