Bug 633037
Summary: | Process' user time incorrectly accounted as system time | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Michael Chapman <redhat-bugzilla> | ||||
Component: | kernel | Assignee: | Stanislaw Gruszka <sgruszka> | ||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 13 | CC: | anton, dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, sgruszka | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i686 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2011-02-05 21:03:45 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
https://bugzilla.kernel.org/show_bug.cgi?id=16559 This looks like exactly the same issue. I'm able to reproduce problem on 2.6.33 kernel using perl command. Can not reproduce on 2.6.36-rc3 kernel (but on different machine, not sure if that matters). Architecture make difference. I can reproduce bug with 2.6.36-rc3 kernel on i686, but not in x86_64. Posted proposed fix upstream: http://lkml.org/lkml/2010/9/14/180 Fixed in 2.6.35.5-29.fc14 and 2.6.34.7-57.fc13 |
Created attachment 446761 [details] lsmod Description of problem: Processes on my system that have been running for a while are incorrectly accounting their CPU time as system time rather than user time. When a process is launched, it starts off correctly accounting user and system time. However there comes a point after which *all* CPU time for the process is accounted as system time. The user time for the process never increases hereafter, no matter how much time in userspace the process subsequently consumes. The actual transition point between correct and incorrect behaviour seems to be roughly related to the CPU load generated by the process: a process that is attempting to use all CPU time will start to be mis-accounted after approximately 65 seconds, whereas one that's only consuming 50% CPU may take a few minutes. It is not simply a case of reaching a certain amount of user time before the problem occurs, and I have not been able to pin down an exact relation. For a given load, however, the problem will consistently begin at the same time. Version-Release number of selected component (if applicable): kernel-2.6.33.8-149.fc13.i686 AND kernel-2.6.34.6-47.fc13.i686 AND kernel-2.6.34.6-54.fc13.i686 How reproducible: Always Steps to Reproduce: Problem was first noticed while running "openssl speed", which uses as much CPU as it can get. For example: $ openssl speed rsa Doing 512 bit private rsa's for 10s: 18132 512 bit private RSA's in 9.90s Doing 512 bit public rsa's for 10s: 265444 512 bit public RSA's in 9.84s Doing 1024 bit private rsa's for 10s: 4319 1024 bit private RSA's in 9.91s Doing 1024 bit public rsa's for 10s: 86704 1024 bit public RSA's in 9.88s Doing 2048 bit private rsa's for 10s: 536 2048 bit private RSA's in 9.92s Doing 2048 bit public rsa's for 10s: 24839 2048 bit public RSA's in 9.95s Doing 4096 bit private rsa's for 10s: 99 4096 bit private RSA's in 0.00s Doing 4096 bit public rsa's for 10s: 5158 4096 bit public RSA's in 0.00s OpenSSL 1.0.0a-fips 1 Jun 2010 built on: Fri Jun 4 14:20:39 UTC 2010 options:bn(64,32) md2(int) r Expected results: c4(1x,char) des(ptr,risc1,16,long) aes(partial) blowfish(idx) compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -Wa,--noexecstack -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DWHIRLPOOL_ASM sign verify sign/s verify/s rsa 512 bits 0.000546s 0.000037s 1831.5 26976.0 rsa 1024 bits 0.002295s 0.000114s 435.8 8775.7 rsa 2048 bits 0.018507s 0.000401s 54.0 2496.4 rsa 4096 bits 0.000000s 0.000000s inf inf Note how the 4096-bit RSAs were completed in "0.00s", even though these took 10 seconds of wall-clock time to complete -- user time for the process did not increment during these two benchmarks. Another demonstration of the problem: $ perl -MBSD::Resource -le '$, = " "; while (1) { 1 foreach 1..10_000_000; print +(getrusage)[0,1] }' 0.734888 0.013997 1.44578 0.013997 2.160671 0.013997 2.840568 0.013997 ... 62.268533 0.033994 62.932432 0.033994 63.611329 0.033994 64.321221 0.033994 64.997118 0.033994 64.997118 0.713891 <- Problem starts here 64.997118 1.406786 64.997118 2.120677 64.997118 2.798574 64.997118 3.464473 64.997118 4.133371 Additional info: lsmod attached. My kernel is currently tainted with the NVIDIA driver, however I have verified that the problem occurs without this driver loaded.