Bug 450641 - gfs2 in 2.6.26-rc2 appears busted; data corruption, wrong statfs info
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: All Linux
Priority: low, Severity: low
Assigned To: Ben Marzinski
Fedora Extras Quality Assurance
Reported: 2008-06-09 23:34 EDT by Eric Sandeen
Modified: 2008-06-30 15:47 EDT
Doc Type: Bug Fix
Last Closed: 2008-06-30 15:47:34 EDT

Attachments
patch to fix the block allocation (517 bytes, patch)
2008-06-17 17:58 EDT, Ben Marzinski

Description Eric Sandeen 2008-06-09 23:34:01 EDT
Make a 2T local / single-node gfs2 filesystem:

[root@east-10 ~]# mkfs.gfs2 -p lock_nolock /dev/sdc
This will destroy any data on /dev/sdc.
  It appears to contain a ext3 filesystem.

Are you sure you want to proceed? [y/n] y 

Device:                    /dev/sdc
Blocksize:                 4096
Device Size                2326.31 GB (609827840 blocks)
Filesystem Size:           2326.31 GB (609827839 blocks)
Journals:                  1
Resource Groups:           9306
Locking Protocol:          "lock_nolock"
Lock Table:                ""

Mount it:

[root@east-10 ~]# mount /dev/sdc /mnt/test

Write a 1G file (using xfs_io to write a pattern, 0x01 in this case):

[root@east-10 ~]# xfs_io -f -F -c "pwrite -S 1 0 1G" /mnt/test/file
wrote 1073741824/1073741824 bytes at offset 0
1 GiB, 262144 ops; 0:00:17.00 (57.244 MiB/sec and 14654.5153 ops/sec)

All done.  Check df:

[root@east-10 ~]# df -h /mnt/test
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdc              2.3T     0  2.3T   0% /mnt/test

0 blocks used?

The file claims to be using space:

[root@east-10 ~]# du -hc /mnt/test/*
1.1G	/mnt/test/file
1.1G	total

Unmount, remount:

[root@east-10 ~]# umount /mnt/test
[root@east-10 ~]# mount /dev/sdc /mnt/test

Check file contents, find large swath of 0s where our data should be:

[root@east-10 ~]# hexdump -C /mnt/test/file
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
3c057000  01 01 01 01 01 01 01 01  01 01 01 01 01 01 01 01  |................|
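
As an alternative to reading the hexdump by eye, the first non-zero byte of a file can be located by comparing it against /dev/zero with cmp. The following is only a sketch on a scratch file, not part of the original report; the scratch file, the variable names, and the awk parsing of cmp's message (which assumes a path without spaces) are all illustrative assumptions.

```sh
# Build a scratch file: three 4096-byte blocks of zeros, then 0x01 bytes.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=4096 count=3 2>/dev/null
printf '\001\001\001\001' >> "$tmp"

# cmp reports the 1-based position of the first differing byte, e.g.
# "<file> /dev/zero differ: byte N, line 1". Field 5 is "N,"; strip the
# comma and subtract 1 to get a 0-based offset like hexdump prints.
first_nonzero=$(cmp "$tmp" /dev/zero | awk '{ sub(/,/, "", $5); print $5 - 1 }')

printf 'first non-zero byte at offset %d (0x%x)\n' "$first_nonzero" "$first_nonzero"
rm -f "$tmp"
```

Run against /mnt/test/file instead of the scratch file, the same comparison should report the offset at which the 0x01 pattern begins.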

Try repairing the filesystem, find corruption:

[root@east-10 ~]# gfs2_fsck /dev/sdc
Initializing fsck
Recovering journals (this may take a while).
Journal recovery complete.
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Inode 33342 (0x823e): Ondisk block count (262664) does not match what fsck found
Fix ondisk block count? (y/n) y
Pass1 complete      
Starting pass1b


[root@east-10 ~]# rpm -q gfs2-utils
[root@east-10 ~]# uname -a
Linux east-10 2.6.26-rc2 #3 SMP Mon Jun 9 11:20:13 CDT 2008 x86_64 x86_64 x86_64

Comment 1 Steve Whitehouse 2008-06-10 06:26:48 EDT
Ben, this might be a clue to what you are looking for.
Comment 2 Ben Marzinski 2008-06-11 20:31:58 EDT
Well, the reason that there is no pattern until byte 1006989312 (0x3c057000) is
that the first 483 pointers in the indirect pointer block are all zero,
according to gfs2_edit: 483 * 509 * 4096 = 1006989312. Now the real question is
why the first 483 pointers are all zero. That I'm still looking into.
Incidentally, when I try to grow a file to this size, I get the exact same
thing: no data until byte 1006989312. It happens somewhere between growing the
file to 100 MB and 1000 MB. This should make it pretty easy to track down.
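
The arithmetic in this comment can be checked directly in a shell. This is only a sanity check of the quoted numbers; the variable names are made up for illustration, and the reading of 509 as "data blocks covered by each zeroed pointer" is inferred from the multiplication itself.

```sh
# Numbers quoted in the comment (variable names are illustrative).
zero_ptrs=483          # leading pointers found to be zero
blocks_per_ptr=509     # data blocks each of those pointers would cover
block_size=4096        # filesystem block size from the mkfs.gfs2 output

# Bytes of the file that sit behind the zeroed pointers.
offset=$((zero_ptrs * blocks_per_ptr * block_size))

# Print in decimal and in hex for comparison with the hexdump offset.
printf '%d 0x%x\n' "$offset" "$offset"
```

This prints 1006989312 0x3c057000, matching the offset at which hexdump first shows the 0x01 pattern.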
Comment 3 Steve Whitehouse 2008-06-13 11:58:45 EDT
I wonder whether the data that's left is the data from the start of the file or
the data which actually belongs in that place. If the former, then I suspect
that the order of addition of the new indirect blocks to the metadata tree might
be to blame.
Comment 4 Ben Marzinski 2008-06-17 17:58:14 EDT
Created attachment 309673
patch to fix the block allocation

This patch changes the computation for zero_metapath_length(). When you are
extending the metadata tree, the indirect blocks that point to the new data
block must diverge from the existing tree either at the inode or at the
first indirect block. They can diverge at the first indirect block because the
inode has room for 483 pointers while the indirect blocks have room for 509
pointers, so when the tree is grown, there is some free space in the first
indirect block. What zero_metapath_length now computes is the height where the
first indirect block for the new data block is located.  It can either be 1 (if
the indirect block diverges from the inode) or 2 (if it diverges from the first
indirect block).
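
To make the two cases concrete, here is a rough sketch of the reasoning above for a tree that has just grown from height 2 to height 3. It is not the kernel's zero_metapath_length() and is not taken from the patch; the function name divergence_height, the return value 0 for blocks already covered by the old tree, and the assumption that the old inode pointers end up in the first 483 slots of a new first indirect block are all illustrative.

```sh
# Illustrative only: NOT the kernel's zero_metapath_length().
inode_ptrs=483       # pointers that fit in the inode
indirect_ptrs=509    # pointers that fit in an indirect block

# For logical block number $1, print where its path leaves the old tree:
#   0 = inside the old height-2 tree (nothing diverges)
#   2 = at the first indirect block (one of its spare slots is used)
#   1 = at the inode (a new inode pointer is needed)
divergence_height() {
    blk=$1
    # Blocks covered by the old height-2 tree.
    old_capacity=$((inode_ptrs * indirect_ptrs))
    # Blocks reachable through the first indirect block at height 3.
    first_indirect_capacity=$((indirect_ptrs * indirect_ptrs))
    if [ "$blk" -lt "$old_capacity" ]; then
        echo 0
    elif [ "$blk" -lt "$first_indirect_capacity" ]; then
        echo 2
    else
        echo 1
    fi
}

divergence_height 245846   # last block of the old tree
divergence_height 245847   # first block past it (byte offset 0x3c057000)
divergence_height 259081   # first block beyond the first indirect block
```

Under these assumptions the three calls print 0, 2 and 1. Block 245847 is 483 * 509, the same block at which the surviving data in the original report begins.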
Comment 5 Steve Whitehouse 2008-06-25 09:00:56 EDT
The patch is now upstream in Linus' kernel. Can we close this bz now, or are
there other issues still left unresolved?
Comment 6 Eric Sandeen 2008-06-25 09:47:20 EDT
Perhaps I should have filed 2 bugs; does the statfs issue remain?

Comment 7 Ben Marzinski 2008-06-26 12:04:15 EDT
It works fine for me.
Comment 8 Steve Whitehouse 2008-06-30 07:38:09 EDT
So we ought to be able to close this now?
