Bug 1282180

Summary: Kernel panic on btrfs rebalance - kernel BUG at fs/btrfs/extent-tree.c:1833!
Product: [Fedora] Fedora
Reporter: Chris Smart <fedora>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 23
CC: gansalmon, igeorgex, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab
Hardware: x86_64
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-04-01 01:18:30 UTC

Description Chris Smart 2015-11-15 10:34:37 UTC
Description of problem:
Running btrfs rebalance on a RAID 6 btrfs volume causes the kernel to panic.

Version-Release number of selected component (if applicable):
kernel 4.2.5-300.fc23.x86_64 (as reported in the trace under "Actual results")

How reproducible:
I think always (for me at least).

Steps to Reproduce:
1. Install Fedora 23 with btrfs on /
2. Run a btrfs balance on /, e.g. btrfs balance start /

Actual results:
After a while, the kernel hits a BUG and panics with:

Nov 14 06:03:42 localhost.localdomain kernel: ------------[ cut here ]------------
Nov 14 06:03:42 localhost.localdomain kernel: kernel BUG at fs/btrfs/extent-tree.c:1833!
Nov 14 06:03:42 localhost.localdomain kernel: invalid opcode: 0000 [#1] SMP
Nov 14 06:03:42 localhost.localdomain kernel: Modules linked in: fuse joydev synaptics_usb uas usb_storage rfcomm cmac nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtab
Nov 14 06:03:42 localhost.localdomain kernel:  snd_soc_core snd_hda_codec rfkill snd_compress snd_hda_core snd_pcm_dmaengine ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm mei_me dw_dmac i2c_designware_platform snd_timer snd_soc_sst_a
Nov 14 06:03:42 localhost.localdomain kernel: CPU: 0 PID: 6274 Comm: btrfs Not tainted 4.2.5-300.fc23.x86_64 #1
Nov 14 06:03:42 localhost.localdomain kernel: Hardware name: Gigabyte Technology Co., Ltd. Z97N-WIFI/Z97N-WIFI, BIOS F5 12/08/2014
Nov 14 06:03:42 localhost.localdomain kernel: task: ffff88006fd69d80 ti: ffff88000e344000 task.ti: ffff88000e344000
Nov 14 06:03:42 localhost.localdomain kernel: RIP: 0010:[<ffffffffa0932af7>]  [<ffffffffa0932af7>] insert_inline_extent_backref+0xe7/0xf0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: RSP: 0018:ffff88000e3476a8  EFLAGS: 00010293
Nov 14 06:03:42 localhost.localdomain kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: RDX: ffff880000000000 RSI: 0000000000000001 RDI: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: RBP: ffff88000e347728 R08: 0000000000004000 R09: ffff88000e3475a0
Nov 14 06:03:42 localhost.localdomain kernel: R10: 0000000000000000 R11: 0000000000000002 R12: ffff88021522f000
Nov 14 06:03:42 localhost.localdomain kernel: R13: ffff88013f868480 R14: 0000000000000000 R15: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: FS:  00007f66268a08c0(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 06:03:42 localhost.localdomain kernel: CR2: 000055a79c7e6fd0 CR3: 00000000576ce000 CR4: 00000000001406f0
Nov 14 06:03:42 localhost.localdomain kernel: Stack:
Nov 14 06:03:42 localhost.localdomain kernel:  0000000000000000 0000000000000005 0000000000000001 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel:  0000000000000001 ffffffff81200176 0000000000270026 ffffffffa0925d4a
Nov 14 06:03:42 localhost.localdomain kernel:  0000000000002158 00000000a7c0ba4c ffff88021522d800 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: Call Trace:
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff81200176>] ? kmem_cache_alloc+0x1d6/0x210
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0925d4a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0932f99>] __btrfs_inc_extent_ref.isra.52+0xa9/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa09386b4>] __btrfs_run_delayed_refs+0xc84/0x1080 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa093b674>] btrfs_run_delayed_refs.part.73+0x74/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0925ecb>] ? btrfs_release_path+0x2b/0xa0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa093b885>] btrfs_run_delayed_refs+0x15/0x20 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa094ff26>] btrfs_commit_transaction+0x56/0xad0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa09a43be>] prepare_to_merge+0x1fe/0x210 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa09a4e5e>] relocate_block_group+0x25e/0x6b0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa09a547a>] btrfs_relocate_block_group+0x1ca/0x2c0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0978b6e>] btrfs_relocate_chunk.isra.39+0x3e/0xb0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa097a494>] btrfs_balance+0x9c4/0xf80 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0986d54>] btrfs_ioctl_balance+0x3c4/0x3d0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0988501>] btrfs_ioctl+0x541/0x2750 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff811b341c>] ? lru_cache_add+0x1c/0x50
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff811b3572>] ? lru_cache_add_active_or_unevictable+0x32/0xd0
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff811d5ffa>] ? handle_mm_fault+0xc8a/0x17d0
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff81223303>] ? cp_new_stat+0xb3/0x190
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff812313b5>] do_vfs_ioctl+0x295/0x470
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff8132944d>] ? selinux_file_ioctl+0x4d/0xc0
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff81231609>] SyS_ioctl+0x79/0x90
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff810656cf>] ? do_page_fault+0x2f/0x80
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff817791ee>] entry_SYSCALL_64_fastpath+0x12/0x71
Nov 14 06:03:42 localhost.localdomain kernel: Code: 10 49 89 d9 48 8b 55 c0 4c 89 7c 24 10 4c 89 f1 4c 89 ee 4c 89 e7 89 44 24 08 48 8b 45 20 48 89 04 24 e8 5d d5 ff ff 31 c0 eb ac <0f> 0b e8 92 b7 76 e0 66 90 0f 1f 44 00 00 55 48 89 e5
Nov 14 06:03:42 localhost.localdomain kernel: RIP  [<ffffffffa0932af7>] insert_inline_extent_backref+0xe7/0xf0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  RSP <ffff88000e3476a8>
Nov 14 06:03:42 localhost.localdomain kernel: ---[ end trace 63b75c57d2feac56 ]---
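
When comparing this trace with the upstream reports, it can help to reduce it to the BUG location and the frames that are not marked with "?". The following is only a minimal sketch: the function name summarize_oops and both regular expressions are assumptions made for illustration, and SAMPLE holds just four lines copied from the log above rather than the full journal.

```python
import re

# Four lines copied from the log above; a real run would feed in the full journal text.
SAMPLE = """\
Nov 14 06:03:42 localhost.localdomain kernel: kernel BUG at fs/btrfs/extent-tree.c:1833!
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffff81200176>] ? kmem_cache_alloc+0x1d6/0x210
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa0932f99>] __btrfs_inc_extent_ref.isra.52+0xa9/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel:  [<ffffffffa097a494>] btrfs_balance+0x9c4/0xf80 [btrfs]
"""

# "kernel BUG at <file>:<line>!" marks the BUG() site.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")

# Call-trace frames start right after the "kernel:" prefix, which keeps the
# RIP lines out. A leading "? " marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"kernel:\s+\[<[0-9a-f]+>\]\s+(?P<unsure>\? )?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(text):
    """Return ((file, line) or None, [function names of frames without '?'])."""
    bug = None
    frames = []
    for line in text.splitlines():
        match = BUG_RE.search(line)
        if match:
            bug = (match.group("file"), int(match.group("line")))
            continue
        match = FRAME_RE.search(line)
        if match and not match.group("unsure"):
            frames.append(match.group("func"))
    return bug, frames


if __name__ == "__main__":
    bug, frames = summarize_oops(SAMPLE)
    print(bug)
    print(frames)
```

Run against the full log, the frame list would start at __btrfs_inc_extent_ref and end at entry_SYSCALL_64_fastpath, which makes it easier to check whether another report follows the same balance path.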

Expected results:
The rebalance should complete successfully without a kernel panic.

Additional info:
This looks like the following upstream issue:

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg47782.html

http://thread.gmane.org/gmane.comp.file-systems.btrfs/49517

Comment 1 Chris Smart 2016-04-01 00:19:55 UTC
I don't seem to hit this bug anymore, using Fedora 23 with the 4.4.5-300.fc23.x86_64 kernel.
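
For anyone checking a machine against this observation, a rough sketch of comparing the running kernel with 4.4.5 follows. It assumes that 4.4.5 is only the release on which the bug stopped reproducing for the reporter, not a confirmed fix point; the names LAST_KNOWN_GOOD, kernel_version and at_least_known_good are hypothetical.

```python
import platform
import re

# Assumption: 4.4.5 is where the reporter stopped seeing the bug,
# not necessarily the release that introduced a fix.
LAST_KNOWN_GOOD = (4, 4, 5)


def kernel_version(release):
    """Parse the leading 'X.Y.Z' of a release string such as '4.2.5-300.fc23.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def at_least_known_good(release):
    """True if the release is 4.4.5 or newer (tuple comparison, field by field)."""
    return kernel_version(release) >= LAST_KNOWN_GOOD


if __name__ == "__main__":
    running = platform.release()
    print(running, at_least_known_good(running))
```

A result of True only means the kernel is at least as new as the one mentioned here; it does not by itself show that the underlying btrfs problem is fixed.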

Comment 2 Laura Abbott 2016-04-01 01:18:30 UTC
Thanks for letting us know.