Description of problem:
Today's fsck.gfs2 tool doesn't know when a quota_changeX system file is corrupt, nor does it attempt to fix it. I discovered this bug while experimenting with some very old metadata. Long ago, the "gfs2_edit savemeta" tool had a bug whereby it failed to save the quota_changeX indirect blocks. Therefore, the quota_changeX files were saved, but when they were restored to a fresh device, those indirect blocks were missing. I ran fsck.gfs2, but it failed to fix the quota_changeX files, so they were still corrupt. Attempts to mount the file system resulted in an error, even though fsck.gfs2 reported the file system was clean.

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
Restore an old GFS2 metadata set that's missing its quota_change records, but has the quota_change inode itself.

Actual results:
fsck.gfs2 neither reports nor fixes quota_changeX files whose contents are not QC records.

Expected results:
fsck.gfs2 should detect when the quota_changeX files don't contain QC records, and fix them appropriately.

Additional info:
This is best illustrated with example output:

# gfs2_edit restoremeta /home/bob/metadata/gfs2/503938.home.metadata /dev/mpathc/scratch
File system size: 5177349 (0x4f0005) blocks, aka 19.768GB
There are 536870912 blocks of 4096 bytes in the destination device.
536870912 inodes processed, 200434 blocks saved (100%) processed,
File /home/bob/metadata/gfs2/503938.home.metadata restore successful.
# gfs2_tool -O sb /dev/mpathc/scratch proto "lock_nolock"
current lock protocol name = "lock_dlm"
new lock protocol name = "lock_nolock"
Done
# fsck.gfs2 -y /dev/mpathc/scratch &> /tmp/gronk
# echo $?
1
# fsck.gfs2 /dev/mpathc/scratch &> /dev/null
# echo $?
0
# mount -tgfs2 /dev/mpathc/scratch /mnt/gfs2
error mounting /dev/dm-8 on /mnt/gfs2: Input/output error
# dmesg
GFS2: fsid=: Trying to join cluster "lock_nolock", "MUSKETEER:home"
GFS2: fsid=MUSKETEER:home.0: Now mounting FS...
GFS2: fsid=MUSKETEER:home.0: jid=0, already locked for use
GFS2: fsid=MUSKETEER:home.0: jid=0: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=0: Done
GFS2: fsid=MUSKETEER:home.0: jid=1: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=1: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=1: Done
GFS2: fsid=MUSKETEER:home.0: jid=2: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=2: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=2: Done
GFS2: fsid=MUSKETEER:home.0: jid=3: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=3: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=3: Done
GFS2: fsid=MUSKETEER:home.0: jid=4: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=4: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=4: Done
GFS2: fsid=MUSKETEER:home.0: fatal: invalid metadata block
GFS2: fsid=MUSKETEER:home.0: bh = 164212 (type: exp=14, found=5)
GFS2: fsid=MUSKETEER:home.0: function = gfs2_quota_init, file = fs/gfs2/quota.c, line = 1239
GFS2: fsid=MUSKETEER:home.0: about to withdraw this file system
GFS2: fsid=MUSKETEER:home.0: withdrawn
Pid: 40313, comm: mount.gfs2 Not tainted 2.6.32-398.el6.x86_64 #1
Call Trace:
[<ffffffffa0b36c78>] ? gfs2_lm_withdraw+0x128/0x160 [gfs2]
[<ffffffffa0b36d80>] ? gfs2_metatype_check_ii+0x50/0x60 [gfs2]
[<ffffffffa0b2cfd7>] ? gfs2_quota_init+0x357/0x3c0 [gfs2]
[<ffffffffa0b337fe>] ? gfs2_make_fs_rw+0xbe/0x160 [gfs2]
[<ffffffffa0b3378b>] ? gfs2_make_fs_rw+0x4b/0x160 [gfs2]
[<ffffffff81063db5>] ? wake_up_process+0x15/0x20
[<ffffffffa0b27e8a>] ? gfs2_get_sb+0x96a/0xa30 [gfs2]
[<ffffffffa0b1b299>] ? gfs2_glock_nq_num+0x59/0xa0 [gfs2]
[<ffffffff8116224a>] ? alloc_pages_current+0xaa/0x110
[<ffffffff81185d3b>] ? vfs_kern_mount+0x7b/0x1b0
[<ffffffff81185ee2>] ? do_kern_mount+0x52/0x130
[<ffffffff811a6954>] ? do_mount+0x2f4/0x920
[<ffffffff811a7010>] ? sys_mount+0x90/0xe0
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
GFS2: fsid=MUSKETEER:home.0: can't make FS RW: -5

As you can see, the last fsck.gfs2 came up clean, but the file system was not mountable because the system quota_changeX files had bad contents: indirect blocks rather than quota_change blocks.
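For readers decoding the "type: exp=14, found=5" line: in the kernel's gfs2_ondisk.h, metadata type 14 is GFS2_METATYPE_QC (quota change) and type 5 is GFS2_METATYPE_IN (indirect block). The following is only a minimal sketch of the kind of header check involved, not the code from the kernel or from any fsck.gfs2 patch. The helper names be32_at and block_has_metatype are hypothetical, and the sketch assumes the block begins with the standard GFS2 metadata header (big-endian magic followed by big-endian type).

```c
#include <stddef.h>
#include <stdint.h>

/* Constants mirrored from the kernel's gfs2_ondisk.h. */
#define GFS2_MAGIC       0x01161970u
#define GFS2_METATYPE_IN 5   /* indirect block */
#define GFS2_METATYPE_QC 14  /* quota change block */

/* Hypothetical helper: read a big-endian 32-bit value.
 * GFS2 stores its on-disk fields in big-endian byte order. */
static uint32_t be32_at(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: return 0 if the block starts with a GFS2
 * metadata header of the expected type, -1 otherwise (buffer too
 * short, wrong magic, or wrong type). Only the first 8 bytes of the
 * header (mh_magic, mh_type) are examined. */
static int block_has_metatype(const unsigned char *blk, size_t len,
                              uint32_t expected)
{
    if (len < 8)
        return -1;
    if (be32_at(blk) != GFS2_MAGIC)
        return -1;
    return be32_at(blk + 4) == expected ? 0 : -1;
}
```

Under these assumptions, a data block of a quota_changeX file whose header carries type 5 would fail a check for GFS2_METATYPE_QC, which matches the mismatch the kernel reported at mount time.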
Created attachment 793333
Try #1 patch

This patch gives fsck.gfs2 the ability to check and repair all the system files in the per_node directory. I tested it on system gfs-i16c-03.mpc.lab.eng.bos.redhat.com using (1) the failing scenario and (2) several mock-ups in which I used gfs2_edit to damage things.
I pushed the patch to the master and RHEL7 branches of the gfs2-utils git tree. Changing status to POST.
https://brewweb.devel.redhat.com/buildinfo?buildID=295427
Verified in gfs2-utils-3.1.6-12.el7:

[root@dash-02 ~]# gfs2_edit restoremeta 503938.home.metadata /dev/sdc1
[root@dash-02 ~]# fsck.gfs2 -y /dev/sdc1 &> /root/gronk
[root@dash-02 ~]# echo $?
1
[root@dash-02 ~]# fsck.gfs2 -y /dev/sdc1 &> /dev/null
[root@dash-02 ~]# echo $?
0
[root@dash-02 ~]# mount -t gfs2 /dev/sdc1 /mnt/gfs2/
[root@dash-02 ~]# mount |grep gfs2
/dev/sdc1 on /mnt/gfs2 type gfs2 (rw,relatime,seclabel,localflocks)
This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
*** Bug 1596757 has been marked as a duplicate of this bug. ***