Bug 624689 - fsck.gfs2 deletes directories if they get too big
Summary: fsck.gfs2 deletes directories if they get too big
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs2-utils
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 5.6
Assignee: Robert Peterson
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 575968 620384
Blocks: 622576 624691
Reported: 2010-08-17 13:56 UTC by Robert Peterson
Modified: 2011-01-13 23:21 UTC

Fixed In Version: gfs2-utils-0.1.62-26.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 624691 (view as bug list)
Environment:
Last Closed: 2011-01-13 23:21:11 UTC
Target Upstream Version:
Embargoed:


Attachments
RHEL56 patch (11.57 KB, patch)
2010-08-17 16:57 UTC, Robert Peterson


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0135 0 normal SHIPPED_LIVE gfs2-utils bug fix update 2011-01-12 19:26:41 UTC

Description Robert Peterson 2010-08-17 13:56:28 UTC
+++ This bug was initially created as a clone of Bug #620384 +++

Yesterday I did more rigorous testing for bug #620384 and
discovered this bug: the latest fsck.gfs2 mishandles
directories that get really big (i.e. that have lots of
entries).  This happens more easily with a small block size,
such as 512-byte blocks.

For almost all normal directories, the metadata structure looks
like this:

height  structure
------  -------------------------------------------------
0.      dinode
1.      journaled data block (hash table block pointers)
2.      directory leaf blocks

When directories get really big their metadata structure gets
more complex and ends up looking like this:

height  structure
------  -------------------------------------------------
0.      dinode
1.      indirect block (block pointers to block pointers)
2.      journaled data block (hash table block pointers)
3.      directory leaf blocks

If there are enough directory entries, the structure can
grow taller still, with height 2 being another level of
indirect blocks:

height  structure
------  -------------------------------------------------
0.      dinode
1.      indirect block (block pointers to block pointers)
2.      indirect block (block pointers to block pointers)
3.      journaled data block (hash table block pointers)
4.      directory leaf blocks
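
To make the block-size effect concrete, here is a rough Python
sketch that estimates how many levels of indirect blocks a
directory hash table of a given size would need.  It is not taken
from gfs2-utils.  The header sizes (232 bytes for the dinode, 24
bytes for the generic metadata header), the 8-byte pointer size,
and the idea that journaled data blocks carry a metadata header
are all assumptions made for illustration, and the function names
are hypothetical.

```python
# Illustrative only: these sizes are assumptions, not values taken
# from this bug report or verified against gfs2-utils.
POINTER_SIZE = 8       # assumed: 64-bit block pointers
DINODE_HEADER = 232    # assumed size of the on-disk dinode header
META_HEADER = 24       # assumed size of the generic metadata header


def pointers_per_block(block_size, header_size):
    """Number of block pointers that fit after a header of the given size."""
    return (block_size - header_size) // POINTER_SIZE


def indirect_levels_needed(hash_table_pointers, block_size):
    """Levels of indirect blocks between the dinode and the journaled
    data blocks that hold the directory hash table."""
    per_dinode = pointers_per_block(block_size, DINODE_HEADER)
    per_block = pointers_per_block(block_size, META_HEADER)
    # Journaled data blocks needed to hold the whole hash table (ceiling).
    jdata_blocks = -(-hash_table_pointers // per_block)
    levels = 0
    reachable = per_dinode  # blocks the dinode can point to directly
    while reachable < jdata_blocks:
        reachable *= per_block  # each indirect level multiplies the reach
        levels += 1
    return levels
```

Under these assumptions, a hypothetical hash table of 4096
pointers needs no indirect blocks with 4096-byte blocks but one
level with 512-byte blocks, which matches the second form above.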

Right now, fsck.gfs2 can handle only directories of the first
form.  Large directories with four different metadata types
are flagged as errors and their data is destroyed.  This is
very serious and needs to be fixed ASAP.  I've written a patch
for this issue and am testing it now.  So far the patch has
passed a simple unit test using a four-level directory.
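
The three layouts above follow one pattern: a dinode, zero or
more levels of indirect blocks, one level of journaled data
blocks, then the leaf blocks.  The Python sketch below expresses
that pattern as a check that accepts any number of indirect
levels.  It is only an illustration of the rule a checker needs
to apply, not the attached patch or any fsck.gfs2 code, and the
function names are hypothetical.

```python
def expected_layers(indirect_levels):
    """Block type expected at each height, from height 0 downward."""
    return (["dinode"]
            + ["indirect block"] * indirect_levels
            + ["journaled data block", "directory leaf blocks"])


def check_layers(found):
    """Accept a chain of block types if it matches the pattern for
    some number (zero or more) of indirect levels."""
    indirect_levels = len(found) - 3  # dinode + jdata + leaf are fixed
    return indirect_levels >= 0 and found == expected_layers(indirect_levels)
```

A checker that only accepts expected_layers(0) would reject the
second and third forms, which is the kind of failure described
in this report.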

Comment 1 Robert Peterson 2010-08-17 15:27:29 UTC
I did some testing and found that this bug does not exist in
gfs1's fsck, gfs_fsck.  I also tested gfs2-utils-0.1.60-1.el5,
which does not have the problem either, so this is in fact a
regression.

Here's how to recreate the failure and what it looks like:

[root@kool ~]# mkfs.gfs2 -O -b512 -p lock_nolock -t "kool:bob" -j1 /dev/kool_vg/kool_bob
Device:                    /dev/kool_vg/kool_bob
Blocksize:                 512
Device Size                40.00 GB (83886080 blocks)
Filesystem Size:           40.00 GB (83886078 blocks)
Journals:                  1
Resource Groups:           160
Locking Protocol:          "lock_nolock"
Lock Table:                "kool:bob"
UUID:                      05060249-E9BB-1DCF-C9C8-112EF09BD56C

You have new mail in /var/spool/mail/root
[root@kool ~]# sync
[root@kool ~]# mount -tgfs2 /dev/kool_vg/kool_bob  /mnt/bob
[root@kool ~]# mkdir /mnt/bob/bob
[root@kool ~]# for i in `seq 1 10000` ; do touch /mnt/bob/bob/file_name_$i ; done
[root@kool ~]# !umo
umount /mnt/bob
[root@kool ~]# /sbin/fsck.gfs2 -V
GFS2 fsck DEVEL.1274286054 (built May 19 2010 11:22:48)
Copyright (C) Red Hat, Inc.  2004-2006  All rights reserved.
[root@kool ~]# fsck.gfs2 /dev/kool_vg/kool_bob 
Initializing fsck
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Block 287425 (0x462c1) seems to be free space, but is marked as data in the
bitmap.
Okay to fix the bitmap? (y/n)y
The bitmap was fixed.
Block 287426 (0x462c2) seems to be free space, but is marked as data in the
bitmap.
Okay to fix the bitmap? (y/n)y
The bitmap was fixed.
Error: inode 269038 (0x41aee) has unrecoverable errors; invalidating.
Block 269038 (0x41aee) seems to be free space, but is marked as inode in the
bitmap.
Okay to fix the bitmap? (y/n)y
The bitmap was fixed.
Pass1 complete      
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Directory entry 'bob' referencing inode 269038 (0x41aee) in dir inode 398
(0x18e) block type 0: was deleted or is not an inode.
Clear directory entry to non-inode block? (y/n) 

A freshly created and populated file system like this one should
pass fsck cleanly.

Comment 2 Robert Peterson 2010-08-17 16:57:54 UTC
Created attachment 439165
RHEL56 patch

Here is the RHEL5.6 patch I'm testing for this problem.
It has been separated from the other patch, so it is in final
form.  I'll crosswrite it to RHEL6.0 and attach that shortly.

Comment 3 Robert Peterson 2010-08-17 17:17:35 UTC
Patch tested on system kool and found to fix the problem.

Comment 4 Robert Peterson 2010-08-18 14:22:35 UTC
I pushed the patch to the RHEL56 branch of the cluster git tree
for inclusion into 5.6.  Changing status to POST until we get
this built.

Comment 5 Robert Peterson 2010-09-20 14:50:29 UTC
Build 2770902 successful.  Changing status to Modified.
This fix is in gfs2-utils-0.1.62-26.el5.

Comment 7 Nate Straz 2010-11-11 22:16:51 UTC
Verified that fsck.gfs2 does not remove entries if di_height = 3.

gfs2-utils-0.1.62-28.el5

Comment 9 errata-xmlrpc 2011-01-13 23:21:11 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0135.html

