Bug 696380 - kernel BUG at fs/btrfs/extent-tree.c:4852! with 2.6.35.12-88.fc14.x86_64 (secondary: idle system load wrong afterwards)
Summary: kernel BUG at fs/btrfs/extent-tree.c:4852! with 2.6.35.12-88.fc14.x86_64 (sec...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 14
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-04-13 23:07 UTC by Ronald Wahl
Modified: 2012-08-16 13:49 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-16 13:49:04 UTC
Type: ---


Attachments (Terms of Use)

Description Ronald Wahl 2011-04-13 23:07:50 UTC
Description of problem:
After upgrading to 2.6.35.12-88.fc14.x86_64 from the previous version from the stable repository I see strange system load values on our build machine. When the server is idle (no i/o and no cpu load) there is a constant load. When the machine starts compiling the real load is added:

top - 00:51:25 up 1 day, 12:47,  1 user,  load average: 12.09, 12.04, 12.05
Tasks: 483 total,   1 running, 482 sleeping,   0 stopped,   0 zombie
top - 00:55:11 up 1 day, 12:51,  1 user,  load average: 12.04, 12.04, 12.05
Tasks: 483 total,   1 running, 482 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  2.0%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

I'm not sure if it is normal that half of the cpu's have neither load nor do they idle or if this is only another symptom of the issue.
I remember that after F14 was released there was a similar bug where the system load was high while the machine was idle - maybe a regression?


Version-Release number of selected component (if applicable):
2.6.35.12-88.fc14.x86_64

How reproducible:
sporadic - it happened at least twice today

Steps to Reproduce:
unknown
  
Actual results:


Expected results:


Additional info:

Comment 1 Ronald Wahl 2011-04-13 23:27:11 UTC
Ok, the current load issue is probably caused by this (just seen in the log):

Apr 13 18:14:52 ping kernel: [108388.423067] ------------[ cut here ]------------
Apr 13 18:14:52 ping kernel: [108388.423294] kernel BUG at fs/btrfs/extent-tree.c:4852!
Apr 13 18:14:52 ping kernel: [108388.423516] invalid opcode: 0000 [#1] SMP 
Apr 13 18:14:52 ping kernel: [108388.423737] last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
Apr 13 18:14:52 ping kernel: [108388.424162] CPU 16 
Apr 13 18:14:52 ping kernel: [108388.424169] Modules linked in: nfs fscache tun btrfs zlib_deflate libcrc32c nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipv6 kvm_intel kvm serio_raw i2c_i801 i2c_core ftdi_sio iTCO_wdt igb iTCO_vendor_support i7core_edac edac_core ioatdma dca joydev pl2303 usbserial microcode [last unloaded: scsi_wait_scan]
Apr 13 18:14:52 ping kernel: [108388.425323] 
Apr 13 18:14:52 ping kernel: [108388.425533] Pid: 10882, comm: ccache Not tainted 2.6.35.12-88.fc14.x86_64 #1 X8DT3/X8DT3
Apr 13 18:14:52 ping kernel: [108388.425970] RIP: 0010:[<ffffffffa0271d94>]  [<ffffffffa0271d94>] find_free_extent.clone.52+0x6da/0x8be [btrfs]
Apr 13 18:14:52 ping kernel: [108388.426445] RSP: 0018:ffff8802be401528  EFLAGS: 00010202
Apr 13 18:14:52 ping kernel: [108388.426672] RAX: 0000000000000003 RBX: ffff8802d5afc000 RCX: 0000000000498000
Apr 13 18:14:52 ping kernel: [108388.427112] RDX: 0000000000006ec2 RSI: 0000000000000001 RDI: 0000000000000008
Apr 13 18:14:52 ping kernel: [108388.427557] RBP: ffff8802be401628 R08: 0000000000000000 R09: 0000000000298000
Apr 13 18:14:52 ping kernel: [108388.427999] R10: 00000000002970cd R11: 0000000000000001 R12: ffff880331186258
Apr 13 18:14:52 ping kernel: [108388.428433] R13: 0000000000000001 R14: 00000000ffffffe4 R15: ffff88032ad2b800
Apr 13 18:14:52 ping kernel: [108388.428862] FS:  00007fc72a8e4720(0000) GS:ffff880002140000(0000) knlGS:0000000000000000
Apr 13 18:14:52 ping kernel: [108388.429290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 13 18:14:52 ping kernel: [108388.429505] CR2: 00007fc72a8f7000 CR3: 00000005ed3a4000 CR4: 00000000000006e0
Apr 13 18:14:52 ping kernel: [108388.429934] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 13 18:14:52 ping kernel: [108388.430364] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 13 18:14:52 ping kernel: [108388.430790] Process ccache (pid: 10882, threadinfo ffff8802be400000, task ffff880072700000)
Apr 13 18:14:52 ping kernel: [108388.431227] Stack:
Apr 13 18:14:52 ping kernel: [108388.431439]  0000000000200000 ffff8802be401528 0000000000000000 0000000000000000
Apr 13 18:14:52 ping kernel: [108388.431668] <0> 0000000000498000 0000000000200000 ffffffffffffffff 0000000000298000
Apr 13 18:14:52 ping kernel: [108388.432108] <0> 00000001be401578 ffff880323c0a7c0 00000000be401588 0000ffff00000001
Apr 13 18:14:52 ping kernel: [108388.432754] Call Trace:
Apr 13 18:14:52 ping kernel: [108388.432983]  [<ffffffffa02714fa>] ? do_chunk_alloc+0x1b6/0x1c8 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.433213]  [<ffffffffa0274308>] btrfs_reserve_extent+0xc9/0x164 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.433447]  [<ffffffffa0283baa>] cow_file_range+0x19a/0x2e4 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.433675]  [<ffffffffa0298650>] ? set_extent_bit+0x318/0x366 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.433905]  [<ffffffffa028431d>] run_delalloc_range+0xb1/0x381 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.434138]  [<ffffffffa02979b1>] ? free_extent_state+0x1f/0x4f [btrfs]
Apr 13 18:14:52 ping kernel: [108388.434371]  [<ffffffffa02993f9>] ? find_lock_delalloc_range.clone.25+0x1a6/0x1cb [btrfs]
Apr 13 18:14:52 ping kernel: [108388.434803]  [<ffffffff8121cd5d>] ? __lookup_tag+0xb9/0x123
Apr 13 18:14:52 ping kernel: [108388.435034]  [<ffffffffa0299a14>] __extent_writepage+0x181/0x55f [btrfs]
Apr 13 18:14:52 ping kernel: [108388.435257]  [<ffffffff8121ce48>] ? radix_tree_gang_lookup_tag_slot+0x81/0xa2
Apr 13 18:14:52 ping kernel: [108388.435694]  [<ffffffffa029b8e6>] ? unmap_extent_buffer+0xe/0x10 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.435929]  [<ffffffffa0299f26>] extent_write_cache_pages.clone.18.clone.24+0x134/0x2a1 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.436375]  [<ffffffffa029a30d>] extent_writepages+0x47/0x5c [btrfs]
Apr 13 18:14:52 ping kernel: [108388.436614]  [<ffffffffa0297b0f>] ? merge_state+0xa6/0x11a [btrfs]
Apr 13 18:14:52 ping kernel: [108388.436846]  [<ffffffffa0281ee3>] ? btrfs_get_extent+0x0/0x738 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.437078]  [<ffffffffa02810ae>] btrfs_writepages+0x27/0x29 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.437301]  [<ffffffff810dbc5f>] do_writepages+0x21/0x2a
Apr 13 18:14:52 ping kernel: [108388.437529]  [<ffffffff81132afc>] writeback_single_inode+0xa5/0x1e0
Apr 13 18:14:52 ping kernel: [108388.437751]  [<ffffffff81132d06>] write_inode_now+0x94/0xef
Apr 13 18:14:52 ping kernel: [108388.437983]  [<ffffffffa0289414>] btrfs_start_one_delalloc_inode+0x10c/0x135 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.438417]  [<ffffffffa026dcbc>] shrink_delalloc+0x95/0x135 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.438643]  [<ffffffffa0272873>] btrfs_delalloc_reserve_metadata+0x13c/0x150 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.439074]  [<ffffffffa027295c>] btrfs_delalloc_reserve_space+0x2a/0x48 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.439514]  [<ffffffffa028a61d>] btrfs_file_aio_write+0x4d5/0x7e7 [btrfs]
Apr 13 18:14:52 ping kernel: [108388.439735]  [<ffffffff811124e0>] ? try_get_mem_cgroup_from_mm+0x33/0x3f
Apr 13 18:14:52 ping kernel: [108388.439970]  [<ffffffff8111726e>] do_sync_write+0xcb/0x108
Apr 13 18:14:52 ping kernel: [108388.440192]  [<ffffffff810f214e>] ? do_mmap_pgoff+0x2ab/0x305
Apr 13 18:14:52 ping kernel: [108388.440419]  [<ffffffff811dce11>] ? security_file_permission+0x16/0x18
Apr 13 18:14:52 ping kernel: [108388.445534]  [<ffffffff81117948>] vfs_write+0xac/0x100
Apr 13 18:14:52 ping kernel: [108388.445759]  [<ffffffff81117b51>] sys_write+0x4a/0x6e
Apr 13 18:14:52 ping kernel: [108388.445984]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
Apr 13 18:14:52 ping kernel: [108388.446207] Code: 48 89 41 09 73 76 4c 89 c2 4c 89 ee 48 89 df 4c 29 ea e8 9a 92 03 00 eb 63 c6 45 a0 01 48 8b 7b 60 e8 b5 d4 ff ff 39 45 c8 74 02 <0f> 0b 48 89 df e8 fa dd ff ff 48 8b 9b d0 00 00 00 c6 85 5f ff 
Apr 13 18:14:52 ping kernel: [108388.447026] RIP  [<ffffffffa0271d94>] find_free_extent.clone.52+0x6da/0x8be [btrfs]
Apr 13 18:14:52 ping kernel: [108388.447474]  RSP <ffff8802be401528>
Apr 13 18:14:52 ping kernel: [108388.447945] ---[ end trace 01ebe07e296986de ]---

Comment 2 Ronald Wahl 2011-04-14 18:56:24 UTC
Got the same bug again today - and the idle load remains at the level it was at that time.

Comment 3 Josh Boyer 2011-09-06 17:10:23 UTC
Are you still seeing this with the latest f14 or f15 kernel?

Comment 4 Ronald Wahl 2011-09-06 17:24:32 UTC
Well, in the meantime I switched away from btrfs back to ext4 because there were problems every couple of days - so I can't tell.

Comment 5 Fedora End Of Life 2012-08-16 13:49:07 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping


Note You need to log in before you can comment on or make changes to this bug.