Description of problem: Load, as reported from e.g. 'uptime', on a completely idle machine is significantly higher then it was on 3.3.4 and earlier. Version-Release number of selected component (if applicable): kernel-3.3.5-2 How reproducible: Always Steps to Reproduce: 1. Boot 3.3.5-2 or 3.3.6-3 2. Leave the machine to become totally idle 3. Monitor the load with e.g. 'uptime' Actual results: On one machine, with 3.3.6-3, after xx minutes idle-ness: 13:11:30 up 10 min, 1 user, load average: 0.16, 0.17, 0.15 13:16:30 up 15 min, 1 user, load average: 0.19, 0.17, 0.14 Load remains at around 0.20. The exact number it settles on differs from machine to machine, and seems to reach 0.80 on certain machines. Expected results: Same machine, with 3.3.4-3, after xx minutes of idle-ness: 12:53:59 up 10 min, 1 user, load average: 0.05, 0.14, 0.10 12:59:00 up 15 min, 1 user, load average: 0.00, 0.05, 0.08 Load further drops and settles at 0.00 to 0.05 for this machine. Additional info: Top, dstat and powertop do not show anything obvious to me. Remembers me of the load mis-calculation of a year or so ago. 'load' is of course only an arbitrary number, but it's pretty important as unexpected values disturb e.g. condor's job scheduling algorithms. I see that ArchLinux has similar problems: https://bbs.archlinux.org/viewtopic.php?pid=1103189
Does also affect F17 and 3.3.7: 1st machine: MacBookAir4,2 (Intel Sandybridge Core i7) $ uname -a; uptime Linux karei 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux 10:26:49 up 1:00, 4 users, load average: 0.67, 0.68, 0.57 2nd machine: Desktop SNB Core i3 $ uname -a;uptime Linux buta 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux 10:40:26 up 11 min, 1 user, load average: 0.44, 0.32, 0.24 Doesn't seem to affect my 3rd machine (Sony laptop with Core i3 Arrandale) $ uname -a;uptime Linux hebi 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux 10:42:00 up 9 min, 2 users, load average: 0.02, 0.27, 0.28 Nothing seems to stand out in "perf top" or "powertop"...
Same here, on several machines with F16. The 3.3.4 shows in init 3 idle load 0. With 3.3.7 I see constant load 0.5, in Gnome above 1.0. This happens on both Intel and AMD cpus I have here. I am not sure this is just load shown incorrectly. IMHO there is a performance penalty too, but I did no objective measurement.
same here, here is the bug report I was about to fill: Description of problem: Kernel on a mostly idle box shows a very high load: A Fedora 15 box (2.6.43.5) showing the problem: top - 10:47:11 up 24 min, 1 user, load average: 0.95, 0.99, 0.89 Tasks: 239 total, 1 running, 238 sleeping, 0 stopped, 0 zombie Cpu0 : 1.0%us, 0.3%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st Cpu1 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st A Fedora 16 box (3.3.1) *not* showing the problem: top - 10:47:28 up 8:44, 2 users, load average: 0.00, 0.01, 0.05 Tasks: 201 total, 1 running, 200 sleeping, 0 stopped, 0 zombie Cpu0 : 2.1%us, 0.9%sy, 0.0%ni, 95.8%id, 1.2%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 2.1%us, 0.9%sy, 0.0%ni, 95.9%id, 1.1%wa, 0.0%hi, 0.0%si, 0.0%st Version-Release number of selected component (if applicable): Known bad: on Fedora 15: 2.6.43.5-2.fc15.x86_64 on Fedora 16: 3.3.5-2.fc16.x86_64 Known good: on Fedora 16: 3.3.1-3.fc16.x86_64 How reproducible: always Steps to Reproduce: 1. boot 2. start a virtual console 3. run top Actual results: load near 1.00 Expected results: load near 0.00 Additional info: The Fedora 16 box do not show the problem on 3.3.1-3.fc16.x86_64 but do show it starting 3.3.5-2.fc16.x86_64 and up. This directly impact my multimedia experience as youtube videos fail to play fluidly and I am not even able to correctly watch webm encoded video from mplayer... Switching to 3.3.1 fixes the issue.
I've also tried: kernel-3.4.0-1.fc17.x86_64 kernel-3.4.1-1.fc17.x86_64 (from koji) kernel-3.4.2-1.fc17.x86_64 (from koji) and I still get an unexpectly high load.
ok, I moved back my Fedora 15 machine back to 2.6.43.2-6.fc15.x86_64. After 25min of normal desktop usage (ie. thunderbird, chrome, couple of virtual terminal, plus .flv audio decoding with mplayer), here is top output: top - 13:32:24 up 21 min, 1 user, load average: 0.15, 0.14, 0.27 Tasks: 246 total, 1 running, 245 sleeping, 0 stopped, 0 zombie Cpu0 : 7.0%us, 1.7%sy, 0.0%ni, 89.7%id, 0.7%wa, 0.7%hi, 0.3%si, 0.0%st Cpu1 : 1.7%us, 1.7%sy, 0.0%ni, 95.3%id, 1.0%wa, 0.0%hi, 0.3%si, 0.0%st Cpu2 : 2.7%us, 1.0%sy, 0.0%ni, 95.7%id, 0.3%wa, 0.3%hi, 0.0%si, 0.0%st Cpu3 : 0.7%us, 0.3%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.3%hi, 0.0%si, 0.0%st while the machine here is actually more used than previously :/
Hi, It would seem that 3.3.1-3.fc16.x86_64 from F16 is also affected. I am unable to currently flawlessly play youtube's video at standard resolution. Load average is ~0.5 while having nothing more than XFCE a couple of terminal and Chromium opened (with this single tab opened) top currently report: top - 11:57:11 up 2 days, 9:53, 2 users, load average: 0.41, 0.50, 0.48 Tasks: 253 total, 1 running, 252 sleeping, 0 stopped, 0 zombie Cpu0 : 1.7%us, 0.7%sy, 0.0%ni, 97.4%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 2.6%us, 1.0%sy, 0.0%ni, 93.4%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 3049184k total, 1451524k used, 1597660k free, 48036k buffers Swap: 3145724k total, 254716k used, 2891008k free, 923840k cached However, `perf top' might have a little bit more insight: ... Events: 87K cycles 41.29% [kernel] [k] native_read_tsc 18.09% [kernel] [k] delay_tsc 6.58% [kernel] [k] ioread32 3.52% [unknown] [.] 0x7f8268248dc0 1.86% [kernel] [k] read_hpet ... Events: 101K cycles 20.33% [kernel] [k] native_read_tsc 8.62% [kernel] [k] delay_tsc 6.94% [unknown] [.] 0x7f8268248dc0 3.38% [kernel] [k] read_hpet 2.63% [kernel] [k] ioread32 ... Events: 118K cycles 15.01% [kernel] [k] native_read_tsc 6.87% [kernel] [k] delay_tsc 4.85% [unknown] [.] 0x7f8268248dc0 3.97% libpixman-1.so.0.24.4 [.] 0x5602d 3.19% [kernel] [k] read_hpet ... Events: 174K cycles 22.23% [unknown] [.] 0x7f8268248dc0 12.62% [kernel] [k] native_read_tsc 5.50% [kernel] [k] delay_tsc 3.19% [kernel] [k] read_hpet 2.48% chromium-browser [.] 0x4c4b05 This might be a wild guess, but activity in native_read_tsc()/delay_tsc would seem to match to higher load while idle.
It looks like this really affects only 64b systems as I tried 3.3.7 on i686 and it seems OK. From kernel-devel: Michal Schmidt: If 3.3.4 works and 3.3.5 is broken, perhaps someone who's seeing the bug can bisect it. There are not many patches there. Maybe it's this patch that went into 3.3.5: commit 6b7b95e774e2e2b32631511ad7d4c2256f1b3162 Author: Peter Zijlstra <peterz> Date: Thu Mar 1 15:04:46 2012 +0100 sched: Fix nohz load accounting -- again! ---- This seems to affect only the calculation, but IMHO we see also a performance hit.
With 3.4.2-1.fc16.i686 I see this on i686 too. The load is constantly around 1.
@adam: out of curiosity, what does `perf top' report on the box while idle and showing the issue ? FWIW, here is my output on 3.3.1, while *not* experiencing the issue: Events: 72K cycles 17.87% libspeexdsp.so.1.5.0 [.] 0x96e8 11.93% libflashplayer.so [.] 0x33bb54 7.27% [kernel] [k] read_hpet 2.55% [kernel] [k] _raw_spin_lock_irqsave 2.46% chromium-browser [.] 0x1461d45 [...] 0.31% [kernel] [k] native_read_tsc
Created attachment 593993 [details] load change with 3.4.2 This is a server, there is still running something mainly DVB streaming, but in general I do not see anything suspicious, except the load changed with the new kernel. 17.08% [kernel] [k] acpi_idle_do_entry 3.95% getstream [.] send_ts_packet 3.86% [kernel] [k] memcpy 3.80% [kernel] [k] __copy_to_user_ll 2.53% [kernel] [k] _raw_spin_lock_irqsave 2.17% [kernel] [k] _raw_spin_unlock_irqrestore 1.98% [kernel] [k] __wake_up_common 1.83% [saa7134] [k] saa7134_irq 1.56% [dvb_core] [k] dvb_dmx_swfilter_packet 1.55% libevent-2.0.so.5.0.1.#prelink#.I54qsK (deleted) [.] 0x0000ce4a 1.43% [dvb_core] [k] dvb_ringbuffer_write 1.32% [kernel] [k] __schedule 1.24% [kernel] [k] find_busiest_group 1.20% getstream [.] dvr_read 1.09% [dvb_core] [k] dvb_dmxdev_ts_callback 0.93% [kernel] [k] ep_poll_callback 0.92% [kernel] [k] update_curr 0.80% [kernel] [k] menu_select 0.79% [kernel] [k] delay_tsc 0.79% [kernel] [k] timerqueue_add 0.77% [kernel] [k] do_timer 0.76% [kernel] [k] __hrtimer_start_range_ns 0.71% [kernel] [k] update_cfs_shares 0.70% [kernel] [k] update_cfs_load 0.65% [dvb_core] [k] dvb_dmxdev_buffer_write 0.63% [kernel] [k] irqtime_account_process_tick 0.63% [kernel] [k] native_sched_clock 0.61% [kernel] [k] rb_erase 0.60% [kernel] [k] try_to_wake_up 0.60% [kernel] [k] select_task_rq_fair 0.59% [kernel] [k] usb_hcd_irq 0.58% [kernel] [k] dequeue_entity 0.56% [kernel] [k] sysenter_past_esp 0.55% [kernel] [k] read_tsc 0.54% libslang.so.2.2.4 [.] 0x000669f0 0.53% [kernel] [k] cpuacct_charge 0.50% libc-2.14.90.so (deleted) [.] 0x000e34b4 0.50% [kernel] [k] effective_load
Here is another computer - laptop with Intel Atom Z520 (i686), same problem after update to 3.4.2 - I'd love to know if there is anybody with a computer where the problem does not exibit with 3.4.2. 21.29% [kernel] [k] delay_tsc 2.76% [kernel] [k] read_hpet 2.72% libslang.so.2.2.4 [.] 0x0006695c 2.60% perf [.] 0x000468e0 2.26% libslang.so.2.2.4 [.] SLtt_smart_puts 1.87% [kernel] [k] ioread32 1.65% libslang.so.2.2.4 [.] SLsmg_write_chars 1.49% [kernel] [k] do_timer 1.33% libdbus-1.so.3.5.6 [.] 0x000253ae 1.25% dbus-daemon [.] 0x000353b0 1.16% [kernel] [k] __schedule 1.12% [kernel] [k] menu_select 0.92% [kernel] [k] tick_nohz_idle_exit 0.88% [kernel] [k] __switch_to 0.85% [kernel] [k] find_busiest_group 0.71% [kernel] [k] get_next_timer_interrupt 0.67% [kernel] [k] tg_load_down 0.67% [kernel] [k] copy_user_highpage 0.66% [kernel] [k] _raw_spin_lock_irqsave 0.66% [kernel] [k] tick_do_update_jiffies64 0.63% perf [.] rb_next 0.58% [kernel] [k] rcu_note_context_switch 0.57% [kernel] [k] hrtimer_interrupt 0.55% [kernel] [k] sched_clock_local 0.54% libc-2.14.90.so [.] _int_malloc 0.53% perf [.] dso__find_symbol 0.53% [kernel] [k] ktime_get 0.51% [kernel] [k] tick_nohz_stop_sched_tick
Sorry for the noise -forget about the Atom, this one is OK. I probably saw what is not there. The load drops to 0 at the Atom.
It seems like kernel 3.4.6-2.fc17.x86_64 solves this problem, while kernel-3.4.5-2.fc17.x86_64 still shows excessive load.