Bug 796312

Summary: disk access performance has deteriorated substantially
Product: [Fedora] Fedora Reporter: Kamil Páral <kparal>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: awilliam, collura, dennis, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mkrizek, tflink
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-23 10:47:43 EST Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 752650    

Description Kamil Páral 2012-02-22 11:53:28 EST
Description of problem:
In my experience (and my colleagues' experience) F17 is *horribly* slow. Installation takes brutally long. Hitting TAB in a terminal takes several seconds before path completion reacts. I suspected something was very wrong, so I ran several tests on my fairly old bare-metal machine:

F16 installation time (1228 packages): 25 min
F17 installation time (1208 packages): 58 min

F16 boot time to gdm: 50 sec
F17 boot time to gdm: 1 min 57 sec (but it seemed to be stuck on some network driver loading, probably not related)

F16 boot time to desktop: 1 min 13 sec
F17 boot time to desktop: 2 min 36 sec (above comment applies)

F16 time to create 10 000 files (using touch and sync): 15 sec
F17 time to create 10 000 files (using touch and sync): 2 min 10 sec

F16 time to create 1 GB file (using dd and sync): 40 sec
F17 time to create 1 GB file (using dd and sync): 41 sec
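
The exact dd invocation is not recorded here. A minimal sketch of such a large-file test, assuming a 1 MiB block size and a hypothetical helper named write_test, might look like this:

```shell
#!/bin/bash
# Sketch only: the block size, file name, and function name are assumptions,
# not the command actually used to produce the numbers above.
write_test() {
    local target=$1      # file to create
    local count_mb=$2    # size in MiB (1024 for roughly 1 GB)
    dd if=/dev/zero of="$target" bs=1M count="$count_mb" 2>/dev/null
    sync                 # flush dirty pages so the data reaches the disk
}

# Usage: time write_test ./ddtest.bin 1024; rm -f ./ddtest.bin
```

Including sync inside the timed command matters, because otherwise dd may return while most of the data is still in the page cache.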

F16 bonnie++ results:
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.locald 4G   391  95 26013  10  8037   4  1047  84 27110   7 104.7   5
Latency             52610us     231ms   10658ms     217ms    1046ms    8063ms
Version  1.96       ------Sequential Create------ --------Random Create--------
localhost.localdoma -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 11208  42 +++++ +++ 23873  47 14165  57 +++++ +++ 24449  47
Latency              2332us    5757us    5660us    2890us    5912us    6068us

F17 bonnie++ results:
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
localhost.locald 4G    26  99 24007  50  9200  37   420  93 32663  35  95.9  29
Latency               462ms    1156ms    4606ms   88101us   89393us    5193ms
Version  1.96       ------Sequential Create------ --------Random Create--------
localhost.localdoma -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  2699  83 18595  96  3840  72  2824  85 25868  93  4036  71
Latency               116ms   27899us   29462us    8564us    7375us    7218us


I have never used bonnie++ before, but it seems to confirm what my simple 'touch' test indicates. Direct file access is OK, but working with inodes (creating and removing files) is horribly slow (up to 800% slower than in F16). That would also explain the slow bash completion when working with paths, because it needs to list available inodes.
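
As a rough cross-check of the slowdown, the ratios can be computed directly from the figures reported above. This is only an illustrative sketch; the helper name ratio is made up for this example:

```shell
#!/bin/bash
# ratio A B: print A/B rounded to two decimals.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Figures copied from the results above.
echo "touch test, F17/F16 elapsed time: $(ratio 130 15)x"      # 2 min 10 sec vs 15 sec
echo "sequential create, F16/F17 rate:  $(ratio 11208 2699)x"
echo "sequential delete, F16/F17 rate:  $(ratio 23873 3840)x"
echo "random create, F16/F17 rate:      $(ratio 14165 2824)x"
echo "random delete, F16/F17 rate:      $(ratio 24449 4036)x"
```

By this arithmetic the touch test takes about 8.7 times as long on F17, while the bonnie++ create and delete rates drop by a factor of roughly 4 to 6.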


Version-Release number of selected component (if applicable):
Fedora 16 GOLD
Fedora 17 Alpha RC4

How reproducible:
always

Steps to Reproduce:
1. time (for i in {1.10000}; do touch $i; done; sync)
2. compare in F16 and F17
3. or compare anaconda installation times
Comment 1 Kamil Páral 2012-02-22 11:55:33 EST
We don't have any performance release criteria, but I believe we should carefully discuss this. The user experience is quite bad. Proposing as F17 Final blocker.
Comment 2 Kamil Páral 2012-02-22 11:57:19 EST
Adam, Tim, can you help me pinpoint the right component to report this against (or CC some knowledgeable people)? And maybe reproduce the performance tests on your machines?
Comment 3 Kamil Páral 2012-02-22 12:00:58 EST
> Steps to Reproduce:
> 1. time (for i in {1.10000}; do touch $i; done; sync)

time (for i in {1..10000}; do touch $i; done; sync)
                 ^^ typo, sorry
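
For anyone repeating the test, the corrected one-liner can be wrapped so that it runs in a scratch directory and cleans up afterwards. This is a sketch under assumptions: the function name create_files and the mktemp scratch directory are not part of the original command.

```shell
#!/bin/bash
# Sketch of the file-creation test. The function name and the scratch
# directory are assumptions added for illustration.
create_files() {
    local count=$1
    local dir
    # Create the scratch directory under the current directory so the test
    # exercises the filesystem you are standing on.
    dir=$(mktemp -d ./touchtest.XXXXXX) || return 1
    (
        cd "$dir" || exit 1
        for i in $(seq 1 "$count"); do
            touch "$i"
        done
        sync
    )
    ls "$dir" | wc -l    # report how many files were created
    rm -rf "$dir"
}

# Usage: time create_files 10000
```

Note that the timing here also includes listing and removing the files, so it is not directly comparable with the numbers in the description.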
Comment 4 Bill Nottingham 2012-02-22 17:55:25 EST
For any sort of disk performance question, the component is kernel-until-proven-otherwise.
Comment 5 Adam Williamson 2012-02-22 17:58:46 EST
F17 Alpha is using a debug kernel, so performance is expected to be substantially worse than a release kernel. Can you test with the same kernel but with debug options disabled - run 'make release', 'fedpkg srpm', and do a scratch build of the srpm - and see if that changes the numbers?
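
A rough sketch of how those steps might be strung together follows. Only 'make release' and 'fedpkg srpm' come from the comment above; the clone step, the branch name f17, the build target f17-candidate, and the koji invocation are assumptions and may need adjusting. By default the script only prints the commands.

```shell
#!/bin/bash
# Sketch only. The branch name "f17" and the build target "f17-candidate" are
# assumptions; check them against your own setup before a real run.
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

scratch_build_release_kernel() {
    run fedpkg clone --anonymous kernel &&
    run cd kernel &&
    run fedpkg switch-branch f17 &&
    run make release &&          # turn off the debug options in the kernel config
    run fedpkg srpm &&           # generate the source RPM
    run koji build --scratch f17-candidate kernel-*.src.rpm
}

# Usage: scratch_build_release_kernel             (dry run)
#        DRY_RUN=0 scratch_build_release_kernel   (real run, needs Koji access)
```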



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 6 Dave Jones 2012-02-22 20:17:00 EST
see bug 795050. This is probably a dupe.
Comment 7 Kamil Páral 2012-02-23 04:34:37 EST
(In reply to comment #5)
> F17 Alpha is using a debug kernel, so performance is expected to be
> substantially worse than a release kernel. Can you test with the same kernel
> but with debug options disabled - run 'make release', 'fedpkg srpm', and do a
> scratch build of the srpm - and see if that changes the numbers?

Adam (or anyone), can you do the scratch build for me? I have never built a single package in Koji (I doubt I currently even have sufficient rights to do that) and your instructions are not clear to me. Thanks.
Comment 8 Martin Krizek 2012-02-23 06:32:49 EST
Kamil, the packages, once built, will be available at http://koji.fedoraproject.org/koji/taskinfo?taskID=3812329
Comment 9 Kamil Páral 2012-02-23 10:23:30 EST
Josef, Petr and I verified that with the release kernel the performance is OK (same as in F16). So it really is caused by the debug kernel.

What is the reason for using a debug kernel? When will we switch back?

If using the debug kernel is intentional and we won't forget to switch to the normal one, this bug can be closed. Whether it is a duplicate of bug 795050 I don't know. The boot speed wasn't much different for me.
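
One way to check which kind of kernel a machine is running is to look for debugging options in its config file. The sketch below is only illustrative: the option list is a small assumed sample of common kernel debugging settings, not the exact set that Fedora toggles between debug and release builds, and the function name is made up.

```shell
#!/bin/bash
# list_debug_options CONFIG_FILE: print the enabled options from a short,
# illustrative list of kernel debugging settings (the list is an assumption).
list_debug_options() {
    local config=$1
    grep -E '^CONFIG_(PROVE_LOCKING|DEBUG_SPINLOCK|DEBUG_MUTEXES|LOCK_STAT|DEBUG_OBJECTS|DEBUG_PAGEALLOC|SLUB_DEBUG_ON)=y$' "$config"
}

# Usage: list_debug_options "/boot/config-$(uname -r)"
```

No output (and a non-zero exit status from grep) would suggest that none of the sampled options is enabled.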
Comment 10 Josh Boyer 2012-02-23 10:47:43 EST
(In reply to comment #9)
> Josef, Petr and I verified that with the release kernel the performance is OK
> (same as in F16). So it really is caused by the debug kernel.
> 
> What is the reason for using a debug kernel? When will we switch back?

To have the most debugging data possible for any issues reported during the Alpha phase.  We build release kernels from Beta through GA.

> If using the debug kernel is intentional and we won't forget to switch to the
> normal one, this bug can be closed. Whether it is a duplicate of bug 795050 I
> don't know. The boot speed wasn't much different for me.

I'm going to close this out.  Thank you for testing.  We appreciate the performance test.  As an FYI, we're looking at ways to run automated tests to catch actual regressions during the development stages of a kernel so we don't hit them during a release.
Comment 11 Adam Williamson 2012-02-23 12:44:03 EST
Going by the version of the kernel Martin built - kernel-3.3.0-0.rc4.git3.2.fc17.src.rpm - it actually includes the fix for 795050 as well (dropping x86-Avoid-invoking-RCU-when-CPU-is-idle.patch). So your experiment tells us that *either* disabling debug options *or* dropping that patch (or both, I suppose) fixes the problem for you, but it doesn't tell us which one was the problem. Of course, I guess it's not too important, as both changes are going ahead.



Comment 12 Adam Williamson 2012-02-27 19:42:16 EST

Comment 13 Adam Williamson 2012-02-28 00:39:20 EST
let's assume it was the debugging thing, and close this as a dupe of the 'debug performance is really bad' bug.




*** This bug has been marked as a duplicate of bug 735268 ***
Comment 14 Adam Williamson 2012-02-28 00:42:15 EST
