Hide Forgot
Description of problem: After upgrading to 2.6.35.12-88.fc14.x86_64 from the previous version from the stable repository I see strange system load values on our build machine. When the server is idle (no i/o and no cpu load) there is a constant load. When the machine starts compiling the real load is added: top - 00:51:25 up 1 day, 12:47, 1 user, load average: 12.09, 12.04, 12.05 Tasks: 483 total, 1 running, 482 sleeping, 0 stopped, 0 zombie top - 00:55:11 up 1 day, 12:51, 1 user, load average: 12.04, 12.04, 12.05 Tasks: 483 total, 1 running, 482 sleeping, 0 stopped, 0 zombie Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 2.0%sy, 0.0%ni, 98.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu4 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu5 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu8 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu16 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu17 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu18 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu19 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu20 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu21 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu22 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu23 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st I'm not sure if it is normal that half of the cpu's have neither load nor do they idle or if this is only another symptom of the issue. I remember that after F14 was released there was a similar bug where the system load was high while the machine was idle - maybe a regression? Version-Release number of selected component (if applicable): 2.6.35.12-88.fc14.x86_64 How reproducible: sporadic - it happened at least twice today Steps to Reproduce: unknown Actual results: Expected results: Additional info:
Ok, the current load issue is probably caused by this (just seen in the log): Apr 13 18:14:52 ping kernel: [108388.423067] ------------[ cut here ]------------ Apr 13 18:14:52 ping kernel: [108388.423294] kernel BUG at fs/btrfs/extent-tree.c:4852! Apr 13 18:14:52 ping kernel: [108388.423516] invalid opcode: 0000 [#1] SMP Apr 13 18:14:52 ping kernel: [108388.423737] last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map Apr 13 18:14:52 ping kernel: [108388.424162] CPU 16 Apr 13 18:14:52 ping kernel: [108388.424169] Modules linked in: nfs fscache tun btrfs zlib_deflate libcrc32c nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipv6 kvm_intel kvm serio_raw i2c_i801 i2c_core ftdi_sio iTCO_wdt igb iTCO_vendor_support i7core_edac edac_core ioatdma dca joydev pl2303 usbserial microcode [last unloaded: scsi_wait_scan] Apr 13 18:14:52 ping kernel: [108388.425323] Apr 13 18:14:52 ping kernel: [108388.425533] Pid: 10882, comm: ccache Not tainted 2.6.35.12-88.fc14.x86_64 #1 X8DT3/X8DT3 Apr 13 18:14:52 ping kernel: [108388.425970] RIP: 0010:[<ffffffffa0271d94>] [<ffffffffa0271d94>] find_free_extent.clone.52+0x6da/0x8be [btrfs] Apr 13 18:14:52 ping kernel: [108388.426445] RSP: 0018:ffff8802be401528 EFLAGS: 00010202 Apr 13 18:14:52 ping kernel: [108388.426672] RAX: 0000000000000003 RBX: ffff8802d5afc000 RCX: 0000000000498000 Apr 13 18:14:52 ping kernel: [108388.427112] RDX: 0000000000006ec2 RSI: 0000000000000001 RDI: 0000000000000008 Apr 13 18:14:52 ping kernel: [108388.427557] RBP: ffff8802be401628 R08: 0000000000000000 R09: 0000000000298000 Apr 13 18:14:52 ping kernel: [108388.427999] R10: 00000000002970cd R11: 0000000000000001 R12: ffff880331186258 Apr 13 18:14:52 ping kernel: [108388.428433] R13: 0000000000000001 R14: 00000000ffffffe4 R15: ffff88032ad2b800 Apr 13 18:14:52 ping kernel: [108388.428862] FS: 00007fc72a8e4720(0000) GS:ffff880002140000(0000) knlGS:0000000000000000 Apr 13 18:14:52 ping kernel: [108388.429290] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 13 18:14:52 ping kernel: [108388.429505] CR2: 00007fc72a8f7000 CR3: 00000005ed3a4000 CR4: 00000000000006e0 Apr 13 18:14:52 ping kernel: [108388.429934] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 13 18:14:52 ping kernel: [108388.430364] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 13 18:14:52 ping kernel: [108388.430790] Process ccache (pid: 10882, threadinfo ffff8802be400000, task ffff880072700000) Apr 13 18:14:52 ping kernel: [108388.431227] Stack: Apr 13 18:14:52 ping kernel: [108388.431439] 0000000000200000 ffff8802be401528 0000000000000000 0000000000000000 Apr 13 18:14:52 ping kernel: [108388.431668] <0> 0000000000498000 0000000000200000 ffffffffffffffff 0000000000298000 Apr 13 18:14:52 ping kernel: [108388.432108] <0> 00000001be401578 ffff880323c0a7c0 00000000be401588 0000ffff00000001 Apr 13 18:14:52 ping kernel: [108388.432754] Call Trace: Apr 13 18:14:52 ping kernel: [108388.432983] [<ffffffffa02714fa>] ? do_chunk_alloc+0x1b6/0x1c8 [btrfs] Apr 13 18:14:52 ping kernel: [108388.433213] [<ffffffffa0274308>] btrfs_reserve_extent+0xc9/0x164 [btrfs] Apr 13 18:14:52 ping kernel: [108388.433447] [<ffffffffa0283baa>] cow_file_range+0x19a/0x2e4 [btrfs] Apr 13 18:14:52 ping kernel: [108388.433675] [<ffffffffa0298650>] ? set_extent_bit+0x318/0x366 [btrfs] Apr 13 18:14:52 ping kernel: [108388.433905] [<ffffffffa028431d>] run_delalloc_range+0xb1/0x381 [btrfs] Apr 13 18:14:52 ping kernel: [108388.434138] [<ffffffffa02979b1>] ? free_extent_state+0x1f/0x4f [btrfs] Apr 13 18:14:52 ping kernel: [108388.434371] [<ffffffffa02993f9>] ? find_lock_delalloc_range.clone.25+0x1a6/0x1cb [btrfs] Apr 13 18:14:52 ping kernel: [108388.434803] [<ffffffff8121cd5d>] ? __lookup_tag+0xb9/0x123 Apr 13 18:14:52 ping kernel: [108388.435034] [<ffffffffa0299a14>] __extent_writepage+0x181/0x55f [btrfs] Apr 13 18:14:52 ping kernel: [108388.435257] [<ffffffff8121ce48>] ? radix_tree_gang_lookup_tag_slot+0x81/0xa2 Apr 13 18:14:52 ping kernel: [108388.435694] [<ffffffffa029b8e6>] ? unmap_extent_buffer+0xe/0x10 [btrfs] Apr 13 18:14:52 ping kernel: [108388.435929] [<ffffffffa0299f26>] extent_write_cache_pages.clone.18.clone.24+0x134/0x2a1 [btrfs] Apr 13 18:14:52 ping kernel: [108388.436375] [<ffffffffa029a30d>] extent_writepages+0x47/0x5c [btrfs] Apr 13 18:14:52 ping kernel: [108388.436614] [<ffffffffa0297b0f>] ? merge_state+0xa6/0x11a [btrfs] Apr 13 18:14:52 ping kernel: [108388.436846] [<ffffffffa0281ee3>] ? btrfs_get_extent+0x0/0x738 [btrfs] Apr 13 18:14:52 ping kernel: [108388.437078] [<ffffffffa02810ae>] btrfs_writepages+0x27/0x29 [btrfs] Apr 13 18:14:52 ping kernel: [108388.437301] [<ffffffff810dbc5f>] do_writepages+0x21/0x2a Apr 13 18:14:52 ping kernel: [108388.437529] [<ffffffff81132afc>] writeback_single_inode+0xa5/0x1e0 Apr 13 18:14:52 ping kernel: [108388.437751] [<ffffffff81132d06>] write_inode_now+0x94/0xef Apr 13 18:14:52 ping kernel: [108388.437983] [<ffffffffa0289414>] btrfs_start_one_delalloc_inode+0x10c/0x135 [btrfs] Apr 13 18:14:52 ping kernel: [108388.438417] [<ffffffffa026dcbc>] shrink_delalloc+0x95/0x135 [btrfs] Apr 13 18:14:52 ping kernel: [108388.438643] [<ffffffffa0272873>] btrfs_delalloc_reserve_metadata+0x13c/0x150 [btrfs] Apr 13 18:14:52 ping kernel: [108388.439074] [<ffffffffa027295c>] btrfs_delalloc_reserve_space+0x2a/0x48 [btrfs] Apr 13 18:14:52 ping kernel: [108388.439514] [<ffffffffa028a61d>] btrfs_file_aio_write+0x4d5/0x7e7 [btrfs] Apr 13 18:14:52 ping kernel: [108388.439735] [<ffffffff811124e0>] ? try_get_mem_cgroup_from_mm+0x33/0x3f Apr 13 18:14:52 ping kernel: [108388.439970] [<ffffffff8111726e>] do_sync_write+0xcb/0x108 Apr 13 18:14:52 ping kernel: [108388.440192] [<ffffffff810f214e>] ? do_mmap_pgoff+0x2ab/0x305 Apr 13 18:14:52 ping kernel: [108388.440419] [<ffffffff811dce11>] ? security_file_permission+0x16/0x18 Apr 13 18:14:52 ping kernel: [108388.445534] [<ffffffff81117948>] vfs_write+0xac/0x100 Apr 13 18:14:52 ping kernel: [108388.445759] [<ffffffff81117b51>] sys_write+0x4a/0x6e Apr 13 18:14:52 ping kernel: [108388.445984] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b Apr 13 18:14:52 ping kernel: [108388.446207] Code: 48 89 41 09 73 76 4c 89 c2 4c 89 ee 48 89 df 4c 29 ea e8 9a 92 03 00 eb 63 c6 45 a0 01 48 8b 7b 60 e8 b5 d4 ff ff 39 45 c8 74 02 <0f> 0b 48 89 df e8 fa dd ff ff 48 8b 9b d0 00 00 00 c6 85 5f ff Apr 13 18:14:52 ping kernel: [108388.447026] RIP [<ffffffffa0271d94>] find_free_extent.clone.52+0x6da/0x8be [btrfs] Apr 13 18:14:52 ping kernel: [108388.447474] RSP <ffff8802be401528> Apr 13 18:14:52 ping kernel: [108388.447945] ---[ end trace 01ebe07e296986de ]---
Got the same bug again today - and the idle load remains at the level it was at that time.
Are you still seeing this with the latest f14 or f15 kernel?
Well, in the meantime I switched away from btrfs back to ext4 because there were problems every couple of days - so I can't tell.
This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX. (Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.) Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version. Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping