+++ This bug was initially created as a clone of Bug #620384 +++

Yesterday I did more rigorous testing for bug #620384 and discovered this bug:

Basically, the latest and greatest fsck.gfs2 doesn't like it when directories get really big (i.e. lots of entries). This happens more easily with a small block size such as 512 bytes.

For almost all normal directories, the metadata structure looks like this:

height structure
------ -------------------------------------------------
0.     dinode
1.     journaled data block (hash table block pointers)
2.     directory leaf blocks

When directories get really big, their metadata structure gets more complex and ends up looking like this:

height structure
------ -------------------------------------------------
0.     dinode
1.     indirect block (block pointers to block pointers)
2.     journaled data block (hash table block pointers)
3.     directory leaf blocks

If there are enough directory entries, the structure can grow taller still, with level 2 being another level of indirect blocks:

height structure
------ -------------------------------------------------
0.     dinode
1.     indirect block (block pointers to block pointers)
2.     indirect block (block pointers to block pointers)
3.     journaled data block (hash table block pointers)
4.     directory leaf blocks

Right now, fsck.gfs2 can only handle directories of the first form. Large directories with four different metadata types are flagged as errors and data is destroyed. This is very serious and needs to get fixed ASAP.

I've written a patch for this issue and I'm testing it now. So far the patch has passed a simple unit test using a four-level directory.
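The three layouts described above can be told apart by how many levels of blocks sit between the dinode and the leaf blocks. The sketch below estimates that number for a directory hash table of a given size. It is only an illustration, not code from fsck.gfs2: the function names are hypothetical, and the 232-byte dinode header and 24-byte metadata header sizes are assumptions about the GFS2 on-disk format that are not stated in this report.

```python
# Rough estimate of how tall the metadata tree must be to hold a
# directory hash table. All sizes are ASSUMPTIONS, not from the report.
DINODE_SIZE = 232       # assumed on-disk dinode header size, in bytes
META_HEADER_SIZE = 24   # assumed generic metadata header size, in bytes
POINTER_SIZE = 8        # block pointers are 64-bit values


def hash_table_bytes(depth):
    """Size of a hash table holding 2**depth block pointers."""
    return (1 << depth) * POINTER_SIZE


def journaled_height(data_bytes, block_size):
    """Levels of blocks below the dinode (leaf blocks not counted)."""
    if data_bytes <= block_size - DINODE_SIZE:
        return 0  # small enough to fit inside the dinode block itself
    dinode_ptrs = (block_size - DINODE_SIZE) // POINTER_SIZE
    indirect_ptrs = (block_size - META_HEADER_SIZE) // POINTER_SIZE
    payload = block_size - META_HEADER_SIZE  # usable bytes per journaled data block
    capacity = payload * dinode_ptrs         # what height 1 can address
    height = 1
    while data_bytes > capacity:
        capacity *= indirect_ptrs            # each indirect level multiplies capacity
        height += 1
    return height


if __name__ == "__main__":
    for depth in (11, 12, 17):
        size = hash_table_bytes(depth)
        print(depth, journaled_height(size, 512), journaled_height(size, 4096))
```

Under these assumptions, a 512-byte block size needs an indirect level once the hash table outgrows roughly 17 KB, while a 4096-byte block size stays at the first form for every depth tried here. That is consistent with the observation that the problem shows up more easily with small blocks.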
I did some testing and found that this bug does not exist in gfs1's fsck, gfs_fsck. I also tested gfs2-utils-0.1.60-1.el5 and it does not have this problem either, so this is, in fact, a regression.

Here's how to recreate the failure and what it looks like:

[root@kool ~]# mkfs.gfs2 -O -b512 -p lock_nolock -t "kool:bob" -j1 /dev/kool_vg/kool_bob
Device: /dev/kool_vg/kool_bob
Blocksize: 512
Device Size 40.00 GB (83886080 blocks)
Filesystem Size: 40.00 GB (83886078 blocks)
Journals: 1
Resource Groups: 160
Locking Protocol: "lock_nolock"
Lock Table: "kool:bob"
UUID: 05060249-E9BB-1DCF-C9C8-112EF09BD56C

You have new mail in /var/spool/mail/root
[root@kool ~]# sync
[root@kool ~]# mount -tgfs2 /dev/kool_vg/kool_bob /mnt/bob
[root@kool ~]# mkdir /mnt/bob/bob
[root@kool ~]# for i in `seq 1 10000` ; do touch /mnt/bob/bob/file_name_$i ; done
[root@kool ~]# !umo
umount /mnt/bob
[root@kool ~]# /sbin/fsck.gfs2 -V
GFS2 fsck DEVEL.1274286054 (built May 19 2010 11:22:48)
Copyright (C) Red Hat, Inc. 2004-2006 All rights reserved.
[root@kool ~]# fsck.gfs2 /dev/kool_vg/kool_bob
Initializing fsck
Validating Resource Group index.
Level 1 RG check.
(level 1 passed)
Starting pass1
Block 287425 (0x462c1) seems to be free space, but is marked as data in the bitmap.
Okay to fix the bitmap? (y/n)y
The bitmap was fixed.
Block 287426 (0x462c2) seems to be free space, but is marked as data in the bitmap.
Okay to fix the bitmap? (y/n)y
The bitmap was fixed.
Error: inode 269038 (0x41aee) has unrecoverable errors; invalidating.
Block 269038 (0x41aee) seems to be free space, but is marked as inode in the bitmap.
Okay to fix the bitmap? (y/n)y
The bitmap was fixed.
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Directory entry 'bob' referencing inode 269038 (0x41aee) in dir inode 398 (0x18e) block type 0: was deleted or is not an inode.
Clear directory entry to non-inode block? (y/n)

Obviously, the fsck of a freshly created and populated file system should come up clean.
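When repeating this test in a script, "comes up clean" can be checked from the exit status instead of by reading the output. The sketch below decodes the status bits listed in the generic fsck(8) man page. That fsck.gfs2 follows the same convention is an assumption here, and the helper names are hypothetical.

```python
# Exit status bits as documented in the generic fsck(8) man page.
# ASSUMPTION: fsck.gfs2 reports its result using the same bit values.
FSCK_STATUS_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "canceled by user request",
    128: "shared-library error",
}


def describe_fsck_status(status):
    """Return a list of human-readable meanings for an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_STATUS_BITS.items()) if status & bit]


def came_up_clean(status):
    """True only when fsck found nothing to fix and hit no errors."""
    return status == 0


if __name__ == "__main__":
    # Example statuses only; in a real test the value would come from
    # running fsck.gfs2 on the unmounted device.
    for status in (0, 1, 4, 5):
        print(status, describe_fsck_status(status), came_up_clean(status))
```

For the reproduction above, the expected result on a fixed build would be a status of 0 for the freshly created file system.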
Created attachment 439165
RHEL56 patch

Here is the RHEL5.6 patch I'm testing for this problem. This one is separated from the other patch, so it is in its final form. I'll crosswrite this to RHEL6.0 and attach that shortly.
Patch tested on system kool and found to fix the problem.
I pushed the patch to the RHEL56 branch of the cluster git tree for inclusion into 5.6. Changing status to POST until we get this built.
Build 2770902 successful. Changing status to Modified. This fix is in gfs2-utils-0.1.62-26.el5.
Verified with gfs2-utils-0.1.62-28.el5 that fsck.gfs2 does not remove entries if di_height = 3.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0135.html