Bug 816304

Summary: btrfs filesystem sync prints errors like 'btrfs bad tree block' and 'btrfs read error corrected'
Product: [Fedora] Fedora
Component: btrfs-progs
Version: 17
Hardware: Unspecified
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Richard W.M. Jones <rjones>
Assignee: Josef Bacik <josef>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: jbacik, josef, mmahut
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-02-13 13:22:16 UTC

Description Richard W.M. Jones 2012-04-25 18:41:25 UTC
Description of problem:

I'm using libguestfs to construct a test which tries to
abuse btrfs.  One thing I have noticed -- after not very
much abuse -- is that 'btrfs filesystem sync' creates lots
of read errors, although they don't seem to affect operation.

[    3.907545] Btrfs loaded
[    4.061485] device fsid 334388e6-aea8-49da-bcc6-8306f9656e78 devid 1 transid 4 /dev/vda1
[    4.243522] device fsid 334388e6-aea8-49da-bcc6-8306f9656e78 devid 2 transid 4 /dev/vdb1
[    4.297841] device fsid 334388e6-aea8-49da-bcc6-8306f9656e78 devid 1 transid 7 /dev/vda1
[    4.307988] btrfs bad tree block start 0 20971520
[    4.310280] btrfs read error corrected: ino 1 off 20971520 (dev /dev/vda1 sector 40960)
[    4.312063] btrfs bad tree block start 0 29364224
[    4.315691] btrfs read error corrected: ino 1 off 29364224 (dev /dev/vda1 sector 57352)
[    4.318313] btrfs bad tree block start 0 29368320
[    4.321102] btrfs read error corrected: ino 1 off 29368320 (dev /dev/vda1 sector 57360)
[    4.323097] btrfs bad tree block start 0 29372416
[    4.328297] btrfs read error corrected: ino 1 off 29372416 (dev /dev/vda1 sector 57368)
[    4.329439] btrfs bad tree block start 0 29376512
[    4.330811] btrfs read error corrected: ino 1 off 29376512 (dev /dev/vda1 sector 57376)
[    4.332861] btrfs bad tree block start 0 29380608
[    4.334349] btrfs read error corrected: ino 1 off 29380608 (dev /dev/vda1 sector 57384)
[    4.335639] btrfs bad tree block start 0 29360128
[    4.337106] btrfs read error corrected: ino 1 off 29360128 (dev /dev/vda1 sector 57344)
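To check whether every "bad tree block" message was followed by a matching "read error corrected" message, the log can be parsed mechanically. The sketch below is only an illustration: the names `summarize` and `SECTOR_SIZE` are made up here, the sample is a subset of the lines above, and the 512-byte sector size is an assumption. For these lines, each sector number multiplied by 512 equals the reported offset.

```python
import re

# A subset of the dmesg lines quoted above.
LOG = """\
[    4.307988] btrfs bad tree block start 0 20971520
[    4.310280] btrfs read error corrected: ino 1 off 20971520 (dev /dev/vda1 sector 40960)
[    4.312063] btrfs bad tree block start 0 29364224
[    4.315691] btrfs read error corrected: ino 1 off 29364224 (dev /dev/vda1 sector 57352)
[    4.335639] btrfs bad tree block start 0 29360128
[    4.337106] btrfs read error corrected: ino 1 off 29360128 (dev /dev/vda1 sector 57344)
"""

BAD = re.compile(r"btrfs bad tree block start (\d+) (\d+)")
FIXED = re.compile(
    r"btrfs read error corrected: ino (\d+) off (\d+) \(dev (\S+) sector (\d+)\)"
)

SECTOR_SIZE = 512  # assumption: sectors in the message are 512 bytes


def summarize(log):
    """Return (set of bad block offsets, {offset: (device, sector)} for corrected reads)."""
    bad = set()
    corrected = {}
    for line in log.splitlines():
        m = BAD.search(line)
        if m:
            # The second number is the offset that reappears in the "corrected" line.
            bad.add(int(m.group(2)))
            continue
        m = FIXED.search(line)
        if m:
            corrected[int(m.group(2))] = (m.group(3), int(m.group(4)))
    return bad, corrected


bad, corrected = summarize(LOG)
uncorrected = sorted(bad - set(corrected))
print("bad blocks:", len(bad), "uncorrected:", uncorrected)
for off, (dev, sector) in sorted(corrected.items()):
    print(dev, off, "offset matches sector:", off == sector * SECTOR_SIZE)
```

An empty `uncorrected` list is consistent with the observation that the errors do not seem to affect operation.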

Version-Release number of selected component (if applicable):

btrfs-progs-0.19-17.fc17.x86_64
kernel 3.3.2-1.fc17.x86_64.debug

How reproducible:

100%

Steps to Reproduce:
1. mkfs.btrfs /dev/vda1 /dev/vdb1
2. mount /dev/vda1 /sysroot
3. a small script creates about 10 MB of data on /sysroot
4. btrfs filesystem sync /sysroot
5. btrfs device add /dev/vdc1 /dev/vdd1 /sysroot
6. btrfs device delete /dev/vda1 /dev/vdb1 /sysroot

Note that steps 5 and 6 are not necessary to reproduce the
bug -- I'm just showing you what my test script does next.

The errors appear after step 4.
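For anyone wanting to script steps 1 to 4 outside libguestfs, here is a rough sketch. It is not the original test (see the link under Additional info). The helpers `repro_commands`, `write_test_data` and `run` are hypothetical names, and `write_test_data` is only a stand-in for the unspecified "small script" in step 3. By default nothing is executed; commands are only printed, because mkfs.btrfs destroys data on the named devices.

```python
import os
import shlex
import subprocess


def repro_commands(devices=("/dev/vda1", "/dev/vdb1"), mountpoint="/sysroot"):
    """Return steps 1, 2 and 4 as argv lists (step 3 is write_test_data)."""
    return [
        ["mkfs.btrfs", *devices],
        ["mount", devices[0], mountpoint],
        ["btrfs", "filesystem", "sync", mountpoint],
    ]


def write_test_data(directory, total_bytes=10 * 1024 * 1024, chunk=1024 * 1024):
    """Stand-in for step 3: write about total_bytes of random data as chunk-sized files."""
    for i in range(total_bytes // chunk):
        with open(os.path.join(directory, f"file{i}"), "wb") as f:
            f.write(os.urandom(chunk))


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print("+", shlex.join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)


def reproduce(dry_run=True, mountpoint="/sysroot"):
    mkfs, mount, sync = repro_commands(mountpoint=mountpoint)
    run(mkfs, dry_run)
    run(mount, dry_run)
    if not dry_run:
        write_test_data(mountpoint)
    run(sync, dry_run)  # the errors were reported after this step


if __name__ == "__main__":
    reproduce(dry_run=True)
```

Running with `dry_run=False` needs root and scratch devices, and the kernel messages then have to be read separately, e.g. with dmesg.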

Actual results:

Errors as shown above.

Expected results:

No errors.

Additional info:

Source for the original test is here:
https://github.com/libguestfs/libguestfs/blob/master/tests/btrfs/test-btrfs-devices.sh

Comment 1 Josef Bacik 2012-04-25 19:05:42 UTC
Well, that is kind of cool, but it may be related to some weirdness with the error handling patches that has since been fixed. Can you try running on this tree, which has the most recent fixes?

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git

I ran the commands you gave me and I didn't see the errors on this kernel.

Comment 2 Josef Bacik 2012-04-25 19:08:59 UTC
Or, as Chris mentioned, it could be a problem we used to have with xfstests: we'd run mkfs, but the pages wouldn't make it to disk before we mounted, so you'd see weird things right after mkfs. This is fixed upstream, but I don't think the fix is in Fedora kernels. Test on btrfs-next to make sure it really does go away.
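If the cause is mkfs output that has not reached the disk before the mount, one userspace precaution is to fsync each device node between mkfs and mount. The sketch below only illustrates that idea. `flush_to_disk` is a hypothetical helper, this is not the upstream fix (which this report does not describe), and whether it suppresses the errors on the affected kernel is untested.

```python
import os


def flush_to_disk(path):
    """Ask the kernel to write back cached data for path (a file or a block device)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def flush_devices(devices):
    """Flush every device in turn, e.g. after mkfs.btrfs and before mount."""
    for dev in devices:
        flush_to_disk(dev)
```

Usage would be along the lines of `flush_devices(["/dev/vda1", "/dev/vdb1"])` run as root after step 1 of the reproducer.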

Comment 3 Richard W.M. Jones 2012-04-25 20:39:00 UTC
I have confirmed that btrfs-next kernel does NOT print
any of these errors.

This kernel is also able to complete the stress test (except
if btrfs-progs is updated to Rawhide -- I filed a separate
bug 816346 to track that issue).

Comment 4 Richard W.M. Jones 2013-02-12 18:37:41 UTC
This bug is back in Rawhide (kernel-3.8.0-0.rc7.git0.1.fc19.x86_64):

mount -o  /dev/sda2 /sysroot/
[    8.474934] device label ROOT devid 1 transid 2 /dev/sda2
[    8.570619] device label ROOT devid 1 transid 2 /dev/sda2
[    8.581891] btrfs: disk space caching is enabled
[    8.594146] btrfs bad tree block start 0 4194304
[    8.595144] btrfs: failed to read tree root on sda2
[    8.605308] btrfs: open_ctree failed
mount: wrong fs type, bad option, bad superblock on /dev/sda2,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

Comment 5 Richard W.M. Jones 2013-02-13 13:22:16 UTC
Likely to be a generic caching problem, i.e. the
same as bug 863978.

*** This bug has been marked as a duplicate of bug 863978 ***