Description of problem:
btrfsck segfaults when checking a btrfs filesystem with one device missing. The original fs is a metadata raid5, data raid1 btrfs on three devices. I detached one of the devices with losetup -d, unmounted, mounted degraded, ran some I/O tests, and unmounted again. That left me with a btrfs made of 2 out of 3 fs images.

[root@ibm-x3650m4-02-vm-06 btrfs-progs]# losetup -a
/dev/loop0: [64769]:203380888 (/root/0.img)
/dev/loop1: [64769]:203380889 (/root/2.img)
[root@ibm-x3650m4-02-vm-06 btrfs-progs]# btrfsck /dev/loop0
warning, device 2 is missing
warning devid 2 not found already
Check tree block failed, want=147914752, have=65536
Check tree block failed, want=147914752, have=65536
read block failed check_tree_block
Checking filesystem on /dev/loop0
UUID: fe7bdf36-9ff4-4169-ad34-5bab459b6950
checking extents
Segmentation fault

And dmesg shows
btrfsck[8555]: segfault at 1d3 ip 0000000000412a1a sp 00007fff6f5082b0 error 4 in btrfsck[400000+4c000]

Version-Release number of selected component (if applicable):
btrfs-progs-0.20.rc1.20130308git704a08c-1.el7
current upstream btrfs-progs has this issue too

How reproducible:
always

Steps to Reproduce:
1. download the btrfs fs images from comment 1
2. set up loop devices
3. run btrfsck on one of the devices

Actual results:
btrfsck segfaults

Expected results:
No segfault. I would also expect no fs corruption to be reported.

Additional info:
This is how I got the
Btrfs fs image http://lacrosse.corp.redhat.com/~eguan/btrfs/bz1013311-btrfsck-segfault-img.tar.bz2
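For anyone reading the dmesg line in the description ("segfault at 1d3 ... error 4"): the number after "error" is the x86 page-fault error code. Below is a small sketch of how its low three bits decode. The macro names and describe_fault() are made up for this illustration and are not taken from btrfs-progs or the kernel headers.

```c
#include <stdio.h>

/* Low bits of the x86 page-fault error code, as printed after "error"
 * in the kernel's segfault message. Names are local to this sketch. */
#define PF_PROT  0x1UL /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2UL /* 0: read access,      1: write access */
#define PF_USER  0x4UL /* 0: kernel mode,      1: user mode */

/* Format a human-readable description of the error code into buf
 * and return buf for convenience. */
static const char *describe_fault(unsigned long code, char *buf, size_t len)
{
	snprintf(buf, len, "%s-mode %s, %s",
		 (code & PF_USER) ? "user" : "kernel",
		 (code & PF_WRITE) ? "write" : "read",
		 (code & PF_PROT) ? "protection violation" : "page not present");
	return buf;
}
```

For error 4, describe_fault() yields "user-mode read, page not present". Together with the very low fault address (0x1d3), that is what one would expect from reading a struct field through a NULL pointer, though this is an inference from the log rather than something the report states.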
A simple way to reproduce:

mkfs -t btrfs -f -K /dev/loop{0..3}
losetup -d /dev/loop2
btrfs check /dev/loop0

This patch could fix the segfault; I'll send it upstream for review. After applying this patch, btrfs check still reports failure though.

diff --git a/cmds-check.c b/cmds-check.c
index a65670e..375c563 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -5514,6 +5514,10 @@ again:
 					      btrfs_root_bytenr(&ri),
 					      btrfs_level_size(root,
 					      btrfs_root_level(&ri)), 0);
+			if (!buf) {
+				ret = -EIO;
+				goto out;
+			}
 			add_root_to_pending(buf, &extent_cache,
 					    &pending, &seen, &nodes,
 					    &found_key);
 			free_extent_buffer(buf);
diff --git a/disk-io.c b/disk-io.c
index 0af3898..b37f0bd 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -644,7 +644,10 @@ out:
 	blocksize = btrfs_level_size(root, btrfs_root_level(&root->root_item));
 	root->node = read_tree_block(root, btrfs_root_bytenr(&root->root_item),
 				     blocksize, generation);
-	BUG_ON(!root->node);
+	if (!root->node) {
+		free(root);
+		return ERR_PTR(-EIO);
+	}
 insert:
 	root->ref_cows = 1;
 	return root;
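The pattern the patch applies (return an error when the tree block read fails, and have the caller check for it) can be sketched in isolation as below. This is only an illustration under simplified assumptions: struct extent_buffer, struct root, mock_read_tree_block(), read_root() and check_root() are hypothetical stand-ins, not the real btrfs-progs definitions, and ERR_PTR/PTR_ERR/IS_ERR are redefined locally in the style of the kernel helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the btrfs-progs types. */
struct extent_buffer { uint64_t bytenr; };
struct root { struct extent_buffer *node; };

/* Local error-pointer helpers in the style of the kernel's ERR_PTR family. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Mock block reader: returns NULL when the block cannot be read,
 * for example because the device holding it is missing. */
static struct extent_buffer *mock_read_tree_block(uint64_t bytenr,
						  int device_missing)
{
	struct extent_buffer *eb;

	if (device_missing)
		return NULL;
	eb = malloc(sizeof(*eb));
	if (eb)
		eb->bytenr = bytenr;
	return eb;
}

/* Mirrors the disk-io.c hunk: on a failed read, free the half-built
 * root and return an error pointer instead of hitting BUG_ON(). */
static struct root *read_root(uint64_t bytenr, int device_missing)
{
	struct root *root = calloc(1, sizeof(*root));

	if (!root)
		return ERR_PTR(-ENOMEM);
	root->node = mock_read_tree_block(bytenr, device_missing);
	if (!root->node) {
		free(root);
		return ERR_PTR(-EIO);
	}
	return root;
}

/* Mirrors the caller side: check the result before using it and
 * propagate the error code rather than dereferencing a bad pointer. */
static int check_root(uint64_t bytenr, int device_missing)
{
	struct root *root = read_root(bytenr, device_missing);

	if (IS_ERR(root))
		return (int)PTR_ERR(root);
	assert(root->node != NULL); /* guaranteed by read_root() on success */
	free(root->node);
	free(root);
	return 0;
}
```

With this sketch, check_root(147914752, 1) returns -EIO instead of crashing, which matches the behaviour described for the patched tool: the check still fails, but without a segfault.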
The mkfs command in comment 2 should also include "-m raid5 -d raid5"; only the raid5 profile triggers the segfault.
Still segfaults with btrfs-progs-3.12-4.el7
This should be fixed with:

commit b2e99e1819d967828edf149db5a203e59a40e379
Author: Eryu Guan <guaneryu>
Date:   Fri Jan 10 22:50:02 2014 +0800

    Btrfs-progs: check return value of read_tree_block() in check_chunks_and_extents()

in v3.14.2; I'll set as MODIFIED for now
Reproduced on RHEL-7.0-20140507.0 in the following job: https://beaker.engineering.redhat.com/jobs/821682

[ 196.810884] btrfsck[11595]: segfault at e0 ip 0000000000415c64 sp 00007fff9fcbfe80 error 4 in btrfsck[400000+62000]

Verified on RHEL-7.1-20141204.2 in the following job: https://beaker.engineering.redhat.com/jobs/821684
No segmentation fault, but btrfsck returns 1. (That seems correct, because one device of the raid5 is missing.)

I also ran some regression jobs for btrfs-progs-3.16.2-1:
J:803063 xfstests-btrfs: RHEL-7.1-20141113.0,s390x
J:803061 xfstests-btrfs: RHEL-7.1-20141113.0-ppc64
J:801031 xfstests-btrfs: RHEL-7.1-20141111.0
J:798051 ltp-aiodio-btrfs: RHEL-LE-7.1-20141105.n.2
J:796795 ltp-btrfs: RHEL-7.1-20141107.n.0, kernel-3.10.0-199.el7

I also ran some cases manually:
/kernel/filesystems/btrfs/degraded-mount-replace--panic for kernel reason
/kernel/filesystems/btrfs/profile-conversion--good
/kernel/filesystems/btrfs/online-resize--good
/kernel/filesystems/btrfs/regression--good
/kernel/filesystems/btrfs/mkfs--good
/kernel/filesystems/btrfs/mount--good
/kernel/filesystems/btrfs/clone--good
/kernel/filesystems/btrfs/compress--good
/kernel/filesystems/btrfs/defragment--good
/kernel/filesystems/btrfs/online-device-add-delete-balance--good

So I think I can change this bug's status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0534.html