Bug 822877
Summary: | Kernel reports high load during idle, from 3.3.5-2 on | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Bert DeKnuydt <Bert.Deknuydt> | ||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 17 | CC: | covex, gansalmon, hansvon, itamar, jonathan, kernel-maint, lacombar, madhu.chinakonda, ngaywood, webmaster | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2012-07-27 15:25:18 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Bert DeKnuydt
2012-05-18 11:30:00 UTC
Does also affect F17 and 3.3.7: 1st machine: MacBookAir4,2 (Intel Sandybridge Core i7) $ uname -a; uptime Linux karei 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux 10:26:49 up 1:00, 4 users, load average: 0.67, 0.68, 0.57 2nd machine: Desktop SNB Core i3 $ uname -a;uptime Linux buta 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux 10:40:26 up 11 min, 1 user, load average: 0.44, 0.32, 0.24 Doesn't seem to affect my 3rd machine (Sony laptop with Core i3 Arrandale) $ uname -a;uptime Linux hebi 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux 10:42:00 up 9 min, 2 users, load average: 0.02, 0.27, 0.28 Nothing seems to stand out in "perf top" or "powertop"... Same here, on several machines with F16. The 3.3.4 shows in init 3 idle load 0. With 3.3.7 I see constant load 0.5, in Gnome above 1.0. This happens on both Intel and AMD cpus I have here. I am not sure this is just load shown incorrectly. IMHO there is a performance penalty too, but I did no objective measurement. same here, here is the bug report I was about to fill: Description of problem: Kernel on a mostly idle box shows a very high load: A Fedora 15 box (2.6.43.5) showing the problem: top - 10:47:11 up 24 min, 1 user, load average: 0.95, 0.99, 0.89 Tasks: 239 total, 1 running, 238 sleeping, 0 stopped, 0 zombie Cpu0 : 1.0%us, 0.3%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st Cpu1 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st A Fedora 16 box (3.3.1) *not* showing the problem: top - 10:47:28 up 8:44, 2 users, load average: 0.00, 0.01, 0.05 Tasks: 201 total, 1 running, 200 sleeping, 0 stopped, 0 zombie Cpu0 : 2.1%us, 0.9%sy, 0.0%ni, 95.8%id, 1.2%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 2.1%us, 0.9%sy, 0.0%ni, 95.9%id, 1.1%wa, 0.0%hi, 0.0%si, 0.0%st Version-Release number of selected component (if applicable): Known bad: on Fedora 15: 2.6.43.5-2.fc15.x86_64 on Fedora 16: 3.3.5-2.fc16.x86_64 Known good: on Fedora 16: 3.3.1-3.fc16.x86_64 How reproducible: always Steps to Reproduce: 1. boot 2. start a virtual console 3. run top Actual results: load near 1.00 Expected results: load near 0.00 Additional info: The Fedora 16 box do not show the problem on 3.3.1-3.fc16.x86_64 but do show it starting 3.3.5-2.fc16.x86_64 and up. This directly impact my multimedia experience as youtube videos fail to play fluidly and I am not even able to correctly watch webm encoded video from mplayer... Switching to 3.3.1 fixes the issue. I've also tried: kernel-3.4.0-1.fc17.x86_64 kernel-3.4.1-1.fc17.x86_64 (from koji) kernel-3.4.2-1.fc17.x86_64 (from koji) and I still get an unexpectly high load. ok, I moved back my Fedora 15 machine back to 2.6.43.2-6.fc15.x86_64. After 25min of normal desktop usage (ie. thunderbird, chrome, couple of virtual terminal, plus .flv audio decoding with mplayer), here is top output: top - 13:32:24 up 21 min, 1 user, load average: 0.15, 0.14, 0.27 Tasks: 246 total, 1 running, 245 sleeping, 0 stopped, 0 zombie Cpu0 : 7.0%us, 1.7%sy, 0.0%ni, 89.7%id, 0.7%wa, 0.7%hi, 0.3%si, 0.0%st Cpu1 : 1.7%us, 1.7%sy, 0.0%ni, 95.3%id, 1.0%wa, 0.0%hi, 0.3%si, 0.0%st Cpu2 : 2.7%us, 1.0%sy, 0.0%ni, 95.7%id, 0.3%wa, 0.3%hi, 0.0%si, 0.0%st Cpu3 : 0.7%us, 0.3%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st while the machine here is actually more used than previously :/ Hi, It would seem that 3.3.1-3.fc16.x86_64 from F16 is also affected. I am unable to currently flawlessly play youtube's video at standard resolution. Load average is ~0.5 while having nothing more than XFCE a couple of terminal and Chromium opened (with this single tab opened) top currently report: top - 11:57:11 up 2 days, 9:53, 2 users, load average: 0.41, 0.50, 0.48 Tasks: 253 total, 1 running, 252 sleeping, 0 stopped, 0 zombie Cpu0 : 1.7%us, 0.7%sy, 0.0%ni, 97.4%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 2.6%us, 1.0%sy, 0.0%ni, 93.4%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 3049184k total, 1451524k used, 1597660k free, 48036k buffers Swap: 3145724k total, 254716k used, 2891008k free, 923840k cached However, `perf top' might have a little bit more insight: ... Events: 87K cycles 41.29% [kernel] [k] native_read_tsc 18.09% [kernel] [k] delay_tsc 6.58% [kernel] [k] ioread32 3.52% [unknown] [.] 0x7f8268248dc0 1.86% [kernel] [k] read_hpet ... Events: 101K cycles 20.33% [kernel] [k] native_read_tsc 8.62% [kernel] [k] delay_tsc 6.94% [unknown] [.] 0x7f8268248dc0 3.38% [kernel] [k] read_hpet 2.63% [kernel] [k] ioread32 ... Events: 118K cycles 15.01% [kernel] [k] native_read_tsc 6.87% [kernel] [k] delay_tsc 4.85% [unknown] [.] 0x7f8268248dc0 3.97% libpixman-1.so.0.24.4 [.] 0x5602d 3.19% [kernel] [k] read_hpet ... Events: 174K cycles 22.23% [unknown] [.] 0x7f8268248dc0 12.62% [kernel] [k] native_read_tsc 5.50% [kernel] [k] delay_tsc 3.19% [kernel] [k] read_hpet 2.48% chromium-browser [.] 0x4c4b05 This might be a wild guess, but activity in native_read_tsc()/delay_tsc would seem to match to higher load while idle. It looks like this really affects only 64b systems as I tried 3.3.7 on i686 and it seems OK. From kernel-devel: Michal Schmidt: If 3.3.4 works and 3.3.5 is broken, perhaps someone who's seeing the bug can bisect it. There are not many patches there. Maybe it's this patch that went into 3.3.5: commit 6b7b95e774e2e2b32631511ad7d4c2256f1b3162 Author: Peter Zijlstra <peterz> Date: Thu Mar 1 15:04:46 2012 +0100 sched: Fix nohz load accounting -- again! ---- This seems to affect only the calculation, but IMHO we see also a performance hit. With 3.4.2-1.fc16.i686 I see this on i686 too. The load is constantly around 1. @adam: out of curiosity, what does `perf top' report on the box while idle and showing the issue ? FWIW, here is my output on 3.3.1, while *not* experiencing the issue: Events: 72K cycles 17.87% libspeexdsp.so.1.5.0 [.] 0x96e8 11.93% libflashplayer.so [.] 0x33bb54 7.27% [kernel] [k] read_hpet 2.55% [kernel] [k] _raw_spin_lock_irqsave 2.46% chromium-browser [.] 0x1461d45 [...] 0.31% [kernel] [k] native_read_tsc Created attachment 593993 [details]
load change with 3.4.2
This is a server, there is still running something mainly DVB streaming, but in general I do not see anything suspicious, except the load changed with the new kernel.
17.08% [kernel] [k] acpi_idle_do_entry
3.95% getstream [.] send_ts_packet
3.86% [kernel] [k] memcpy
3.80% [kernel] [k] __copy_to_user_ll
2.53% [kernel] [k] _raw_spin_lock_irqsave
2.17% [kernel] [k] _raw_spin_unlock_irqrestore
1.98% [kernel] [k] __wake_up_common
1.83% [saa7134] [k] saa7134_irq
1.56% [dvb_core] [k] dvb_dmx_swfilter_packet
1.55% libevent-2.0.so.5.0.1.#prelink#.I54qsK (deleted) [.] 0x0000ce4a
1.43% [dvb_core] [k] dvb_ringbuffer_write
1.32% [kernel] [k] __schedule
1.24% [kernel] [k] find_busiest_group
1.20% getstream [.] dvr_read
1.09% [dvb_core] [k] dvb_dmxdev_ts_callback
0.93% [kernel] [k] ep_poll_callback
0.92% [kernel] [k] update_curr
0.80% [kernel] [k] menu_select
0.79% [kernel] [k] delay_tsc
0.79% [kernel] [k] timerqueue_add
0.77% [kernel] [k] do_timer
0.76% [kernel] [k] __hrtimer_start_range_ns
0.71% [kernel] [k] update_cfs_shares
0.70% [kernel] [k] update_cfs_load
0.65% [dvb_core] [k] dvb_dmxdev_buffer_write
0.63% [kernel] [k] irqtime_account_process_tick
0.63% [kernel] [k] native_sched_clock
0.61% [kernel] [k] rb_erase
0.60% [kernel] [k] try_to_wake_up
0.60% [kernel] [k] select_task_rq_fair
0.59% [kernel] [k] usb_hcd_irq
0.58% [kernel] [k] dequeue_entity
0.56% [kernel] [k] sysenter_past_esp
0.55% [kernel] [k] read_tsc
0.54% libslang.so.2.2.4 [.] 0x000669f0
0.53% [kernel] [k] cpuacct_charge
0.50% libc-2.14.90.so (deleted) [.] 0x000e34b4
0.50% [kernel] [k] effective_load
Here is another computer - laptop with Intel Atom Z520 (i686), same problem after update to 3.4.2 - I'd love to know if there is anybody with a computer where the problem does not exibit with 3.4.2. 21.29% [kernel] [k] delay_tsc 2.76% [kernel] [k] read_hpet 2.72% libslang.so.2.2.4 [.] 0x0006695c 2.60% perf [.] 0x000468e0 2.26% libslang.so.2.2.4 [.] SLtt_smart_puts 1.87% [kernel] [k] ioread32 1.65% libslang.so.2.2.4 [.] SLsmg_write_chars 1.49% [kernel] [k] do_timer 1.33% libdbus-1.so.3.5.6 [.] 0x000253ae 1.25% dbus-daemon [.] 0x000353b0 1.16% [kernel] [k] __schedule 1.12% [kernel] [k] menu_select 0.92% [kernel] [k] tick_nohz_idle_exit 0.88% [kernel] [k] __switch_to 0.85% [kernel] [k] find_busiest_group 0.71% [kernel] [k] get_next_timer_interrupt 0.67% [kernel] [k] tg_load_down 0.67% [kernel] [k] copy_user_highpage 0.66% [kernel] [k] _raw_spin_lock_irqsave 0.66% [kernel] [k] tick_do_update_jiffies64 0.63% perf [.] rb_next 0.58% [kernel] [k] rcu_note_context_switch 0.57% [kernel] [k] hrtimer_interrupt 0.55% [kernel] [k] sched_clock_local 0.54% libc-2.14.90.so [.] _int_malloc 0.53% perf [.] dso__find_symbol 0.53% [kernel] [k] ktime_get 0.51% [kernel] [k] tick_nohz_stop_sched_tick Sorry for the noise -forget about the Atom, this one is OK. I probably saw what is not there. The load drops to 0 at the Atom. It seems like kernel 3.4.6-2.fc17.x86_64 solves this problem, while kernel-3.4.5-2.fc17.x86_64 still shows excessive load. |