Red Hat Bugzilla – Bug 704328
fsck.gfs can not check file system with 500,000 files
Last modified: 2012-02-21 01:40:22 EST
Description of problem:
This is a test that was written to verify bug 628013. Running it on a GFS file system and then checking that file system with fsck.gfs causes fsck.gfs to hit an operational error.

SCENARIO - [big_directory]
Creating 4G LV bigdir on dash-01
Creating directory with 500,000 entries
Creating file system on /dev/fsck/bigdir with options '-p lock_nolock -j 1 -b 512' on dash-01
Device: /dev/fsck/bigdir
Blocksize: 512
Filesystem Size: 8122120
Journals: 1
Resource Groups: 16
Locking Protocol: lock_nolock
Lock Table:
Syncing...
All Done
Mounting gfs /dev/fsck/bigdir on dash-01 with opts ''
Populating directory, this could take a while
Created 500000 files
Unmounting /mnt/fsck on dash-01
Starting fsck.gfs of /dev/fsck/bigdir on dash-01
fsck.gfs output in /tmp/gfs_fsck_stress.26807/2.big_directory/1.fsck-dash-01.log
fsck.gfs returned 8
fsck.gfs of /dev/fsck/bigdir on dash-01 failed (8)

Version-Release number of selected component (if applicable):
gfs-utils-0.1.20-10.el5

How reproducible:
Every time

Steps to Reproduce:
1. Create 500,000 small files in a directory on GFS
2. umount and run fsck.gfs

Actual results:
Initializing fsck
Clearing journals (this may take a while).
Journals cleared.
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Entries is 500002 - should be 368677 for inode block 399
Pass2 complete
Found unlinked inode at 286849
Starting pass3
Pass3 complete
Starting pass4
Adjusting free block count (507560 -> 507624).
Adjusting freemeta block count (63 -> 0).
Adjusting used dinode block count (1 -> 0).
l+f directory at 1524058
Unlinked inode added to l+f
Added inode #286849 to l+f dir
Added inode #519477 to l+f dir
Added inode #614241 to l+f dir
...
Found unlinked inode at 1105462
Added inode #1105462 to l+f dir
Unlinked inode added to l+f
Found unlinked inode at 1141813
ATTENTION -- Not doing copy_tail...

Expected results:
fsck.gfs should run cleanly.

Additional info:
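For anyone reproducing this without the test harness, step 1 of the reproduction steps can be sketched roughly as below. This is only an illustration, not the harness's actual code: the function name `populate` and the `file_` name prefix are made up for the example, and the target directory is assumed to already live on a mounted GFS file system.

```python
import os
import tempfile


def populate(directory, count, prefix="file_"):
    """Create `count` small files in `directory`; return the entry count seen.

    `prefix` is a hypothetical naming scheme chosen for this sketch.
    """
    os.makedirs(directory, exist_ok=True)
    for i in range(1, count + 1):
        # Each file holds its own number plus a newline, so it stays tiny.
        with open(os.path.join(directory, f"{prefix}{i}"), "w") as handle:
            handle.write(f"{i}\n")
    return len(os.listdir(directory))


if __name__ == "__main__":
    # Demo on a scratch directory; point it at the GFS mount for a real run.
    with tempfile.TemporaryDirectory() as scratch:
        print(populate(os.path.join(scratch, "bigdir"), 1000))
```

For the failing case the count would be 500000 and the directory would sit under the GFS mount point; after that, unmount and run fsck.gfs as in step 2.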
Created attachment 500660
Patch to fix the problem

The problem was that metawalk.c was not properly taking leaf continuation blocks into account. The fsck.gfs2 code was doing the same thing until I wrote this patch:
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=6f1e7d39673a0c8d145d923204f7daa4dd56a0aa

So this is essentially a crosswrite of that patch for GFS, but the GFS2 version was necessarily more complex because of the way gfs2-utils handles buffering.

This patch was tested on system roth-01 with the failing scenario:

[root@roth-01 ../gfs2/fsck]# gfs_mkfs -O -b512 -t bobs_roth:roth_lv -p lock_nolock -j 1 /dev/roth_vg/roth_lv3
Device: /dev/roth_vg/roth_lv3
Blocksize: 512
Filesystem Size: 8122120
Journals: 1
Resource Groups: 16
Locking Protocol: lock_nolock
Lock Table: bobs_roth:roth_lv
Syncing...
All Done
[root@roth-01 ../gfs2/fsck]# mount -tgfs /dev/roth_vg/roth_lv3 /mnt/gfs
[root@roth-01 ../gfs2/fsck]# mkdir /mnt/gfs/dir500K
[root@roth-01 ../gfs2/fsck]# time for i in `seq 1 500000` ; do echo $i > /mnt/gfs/dir500K/big_dir_file_number_$i ; done
real 3m38.930s
user 0m39.688s
sys 2m42.736s
[root@roth-01 ../gfs2/fsck]# umount /mnt/gfs
[root@roth-01 ../gfs2/fsck]# cd /home/bob/cluster.RHEL57.704328/gfs/gfs_fsck/
[root@roth-01 ../gfs/gfs_fsck]# ./gfs_fsck /dev/roth_vg/roth_lv3
Initializing fsck
Clearing journals (this may take a while).
Journals cleared.
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Converting 31 unused metadata blocks to free data blocks...
Converting 36 unused metadata blocks to free data blocks...
Pass5 complete
Writing changes to disk

------
then I ran the stock version:

[root@roth-01 ../gfs/gfs_fsck]# gfs_fsck /dev/roth_vg/roth_lv3
Initializing fsck
Clearing journals (this may take a while).
Journals cleared.
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Entries is 500002 - should be 473432 for inode block 399
Fix the entry count? (y/n) n
The entry count was not fixed.
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Found unlinked inode at 449679
Add unlinked inode to l+f? (y/n)
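To illustrate the kind of miscount described above, here is a toy model in Python. It is not the metawalk.c code and not the patch itself: the `Leaf` class, its `next_block` field, and both counting functions are assumptions made up for this sketch. The idea is only that a directory leaf block can chain to a continuation leaf, and that a walker which stops at the first leaf of each chain sees fewer entries than the directory actually holds.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Leaf:
    """Toy stand-in for a directory leaf block (hypothetical layout)."""
    entries: int                      # directory entries stored in this block
    next_block: Optional[int] = None  # continuation leaf, if any


def count_first_leaf_only(leaves: Dict[int, Leaf], heads: Iterable[int]) -> int:
    """Count entries while ignoring continuation blocks (the faulty walk)."""
    return sum(leaves[block].entries for block in heads)


def count_with_continuations(leaves: Dict[int, Leaf], heads: Iterable[int]) -> int:
    """Count entries by following every leaf chain to its end."""
    total = 0
    for block in heads:
        seen = set()  # guards against a corrupt chain that loops back
        while block is not None and block not in seen:
            seen.add(block)
            leaf = leaves[block]
            total += leaf.entries
            block = leaf.next_block
    return total


if __name__ == "__main__":
    # Block 10 chains to continuation block 11; block 20 stands alone.
    leaves = {10: Leaf(3, next_block=11), 11: Leaf(2), 20: Leaf(4)}
    print(count_first_leaf_only(leaves, [10, 20]))
    print(count_with_continuations(leaves, [10, 20]))
```

Here `heads` is assumed to be the list of distinct first-leaf blocks of the directory. In this toy model the first function undercounts, which looks consistent with the "Entries is 500002 - should be ..." messages from the stock fsck, where the recomputed count is lower than the count recorded for the directory.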
This patch was pushed to the RHEL58 branch of the cluster git tree for inclusion into 5.8. The patch was taken from fsck.gfs2 upstream, so there is no need to crosswrite to other releases. It was tested on system roth-01 using the test above. Changing status to POST until it gets built.
Build 3752738 successful. This bug is fixed in gfs-utils-0.1.20-11.el5. Changing status to MODIFIED.
Verified against gfs-utils-0.1.20-11.el5

SCENARIO - [big_directory]
Creating 4G LV bigdir on buzz-05
Creating directory with 500,000 entries
Creating file system on /dev/fsck/bigdir with options '-p lock_nolock -j 1 -b 512' on buzz-05
Device: /dev/fsck/bigdir
Blocksize: 512
Filesystem Size: 8122120
Journals: 1
Resource Groups: 16
Locking Protocol: lock_nolock
Lock Table:
Syncing...
All Done
Mounting gfs /dev/fsck/bigdir on buzz-05 with opts ''
Populating directory, this could take a while
Created 500000 files
Reclaiming unused metadata on buzz-05
Don't do this if this file system is being exported by NFS (on any machine).
Are you sure you want to proceed? [y/n]
Reclaimed: version 0 inodes 0 metadata 62
Unmounting /mnt/fsck on buzz-05
Starting fsck.gfs of /dev/fsck/bigdir on buzz-05
fsck.gfs output in /tmp/vedder.REG.buzzez.201112021311/r11/4.GFS/03.gfs_fsck_stress/4.big_directory/1.fsck-buzz-05.log
fsck.gfs of /dev/fsck/bigdir on buzz-05 took 675 seconds
Mounting gfs /dev/fsck/bigdir on buzz-05 with opts ''
Checking files after fsck on buzz-05
Unmounting /mnt/fsck on buzz-05
Removing LV bigdir on buzz-05
Tested on rhel-5.7 with gfs-utils-0.1.20-11.el5.
Tested with different file system block sizes (512, 1024, 2048); created 500500 files in two sizes (8 bytes, 1024 bytes).
All tests passed successfully.

[root@a1 ~]# gfs_mkfs -O -b512 -t a_cluster:test_lv -p lock_nolock -j 2 /dev/test_vg/lv_testing
Device: /dev/test_vg/lv_testing
Blocksize: 512
Filesystem Size: 30916960
Journals: 2
Resource Groups: 60
Locking Protocol: lock_nolock
Lock Table: a_cluster:test_lv
Syncing...
All Done
gfs_fsck /dev/test_vg/lv_testing
Initializing fsck
Clearing journals (this may take a while).
Journals cleared.
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Converting 51 unused metadata blocks to free data blocks...
Converting 43 unused metadata blocks to free data blocks...
Converting 35 unused metadata blocks to free data blocks...
Pass5 complete
Writing changes to disk
======================
gfs_mkfs -O -b1024 -t a_cluster:test_lv -p lock_nolock -j 2 /dev/test_vg/lv_testing
Device: /dev/test_vg/lv_testing
Blocksize: 1024
Filesystem Size: 15462412
Journals: 2
Resource Groups: 60
Locking Protocol: lock_nolock
Lock Table: a_cluster:test_lv
Syncing...
All Done
gfs_fsck /dev/test_vg/lv_testing
Initializing fsck
Clearing journals (this may take a while).
Journals cleared.
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Converting 25 unused metadata blocks to free data blocks...
Converting 47 unused metadata blocks to free data blocks...
Converting 7 unused metadata blocks to free data blocks...
Converting 8 unused metadata blocks to free data blocks...
Converting 51 unused metadata blocks to free data blocks...
Pass5 complete
Writing changes to disk
=======================
[root@a1 ~]# gfs_mkfs -O -b2048 -t a_cluster:test_lv -p lock_nolock -j 2 /dev/test_vg/lv_testing
Device: /dev/test_vg/lv_testing
Blocksize: 2048
Filesystem Size: 7732108
Journals: 2
Resource Groups: 60
Locking Protocol: lock_nolock
Lock Table: a_cluster:test_lv
Syncing...
All Done
[root@a1 ~]# gfs_fsck /dev/test_vg/lv_testing
Initializing fsck
Clearing journals (this may take a while).
Journals cleared.
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Converting 22 unused metadata blocks to free data blocks...
Converting 17 unused metadata blocks to free data blocks...
Converting 26 unused metadata blocks to free data blocks...
Converting 29 unused metadata blocks to free data blocks...
Converting 39 unused metadata blocks to free data blocks...
Converting 38 unused metadata blocks to free data blocks...
Pass5 complete
Writing changes to disk
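When comparing many runs like the ones above, a small script can flag logs that are not clean. The following is a rough sketch only: the function name `summarize_fsck_log` is made up, and the patterns are based solely on the message wording quoted in this report, so other gfs-utils versions may word things differently.

```python
import re

# Patterns derived from the log lines quoted in this bug report.
PASS_START = re.compile(r"Starting (pass\w+)")
PASS_DONE = re.compile(r"(Pass\w+) complete")
ENTRY_MISMATCH = re.compile(
    r"Entries is (\d+) - should be (\d+) for inode block (\d+)"
)


def summarize_fsck_log(text):
    """Return passes that never completed and any entry-count complaints."""
    started = [name.lower() for name in PASS_START.findall(text)]
    done = {name.lower() for name in PASS_DONE.findall(text)}
    unfinished = [name for name in started if name not in done]
    mismatches = [tuple(int(n) for n in m) for m in ENTRY_MISMATCH.findall(text)]
    return {"unfinished": unfinished, "mismatches": mismatches}


if __name__ == "__main__":
    sample = (
        "Starting pass2\n"
        "Entries is 500002 - should be 368677 for inode block 399\n"
        "Pass2 complete\n"
        "Starting pass3\n"
    )
    print(summarize_fsck_log(sample))
```

A log counts as clean in this sketch when both lists come back empty; the script does not replace checking the fsck exit status.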
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0276.html