Bug 1003059 - fsck.gfs2 doesn't fix corrupt quota_change system files
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: gfs2-utils
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Robert Peterson
QA Contact: Cluster QE
Depends On:
Blocks: 1062742
Reported: 2013-08-30 12:26 EDT by Robert Peterson
Modified: 2014-06-17 20:15 EDT (History)

See Also:
Fixed In Version: gfs2-utils-3.1.6-7.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1062742
Environment:
Last Closed: 2014-06-13 09:22:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Try #1 patch (7.14 KB, patch)
2013-09-03 15:01 EDT, Robert Peterson

Description Robert Peterson 2013-08-30 12:26:24 EDT
Description of problem:
Today's fsck.gfs2 tool doesn't detect when a quota_changeX system
file is corrupt, nor does it attempt to fix it. I discovered this
bug while experimenting with some very old metadata. Long ago,
the "gfs2_edit savemeta" tool had a bug whereby it failed to save
the quota_changeX indirect blocks. Therefore, the quota_changeX
files were saved, but when they were restored to a fresh device,
those indirect blocks would be missing. I ran fsck.gfs2, but
it failed to fix the quota_changeX files, and therefore they were
still corrupt. Attempts to mount the file system resulted in an
error, even though fsck.gfs2 reported the file system was clean.

Version-Release number of selected component (if applicable):
All

How reproducible:
Always

Steps to Reproduce:
Restore an old GFS2 metadata set that's missing its quota_change
records, but has the quota_change inode itself.

Actual results:
fsck.gfs2 neither reports nor fixes quota_changeX files whose
contents are not QC records.

Expected results:
fsck.gfs2 should detect when the quota_changeX files don't
contain QC records, and fix them appropriately.

Additional info:

This is best illustrated with example output:
# gfs2_edit restoremeta /home/bob/metadata/gfs2/503938.home.metadata /dev/mpathc/scratch
File system size: 5177349 (0x4f0005) blocks, aka 19.768GB
There are 536870912 blocks of 4096 bytes in the destination device.
 
536870912 inodes processed, 200434 blocks saved (100%) processed,
File /home/bob/metadata/gfs2/503938.home.metadata restore successful.
# gfs2_tool -O sb /dev/mpathc/scratch proto "lock_nolock"
current lock protocol name = "lock_dlm"
new lock protocol name = "lock_nolock"
Done
# fsck.gfs2 -y /dev/mpathc/scratch &> /tmp/gronk
# echo $?
1
# fsck.gfs2 /dev/mpathc/scratch &> /dev/null
# echo $?
0
# mount -tgfs2 /dev/mpathc/scratch /mnt/gfs2
error mounting /dev/dm-8 on /mnt/gfs2: Input/output error
# dmesg
GFS2: fsid=: Trying to join cluster "lock_nolock", "MUSKETEER:home"
GFS2: fsid=MUSKETEER:home.0: Now mounting FS...
GFS2: fsid=MUSKETEER:home.0: jid=0, already locked for use
GFS2: fsid=MUSKETEER:home.0: jid=0: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=0: Done
GFS2: fsid=MUSKETEER:home.0: jid=1: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=1: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=1: Done
GFS2: fsid=MUSKETEER:home.0: jid=2: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=2: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=2: Done
GFS2: fsid=MUSKETEER:home.0: jid=3: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=3: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=3: Done
GFS2: fsid=MUSKETEER:home.0: jid=4: Trying to acquire journal lock...
GFS2: fsid=MUSKETEER:home.0: jid=4: Looking at journal...
GFS2: fsid=MUSKETEER:home.0: jid=4: Done
GFS2: fsid=MUSKETEER:home.0: fatal: invalid metadata block
GFS2: fsid=MUSKETEER:home.0:   bh = 164212 (type: exp=14, found=5)
GFS2: fsid=MUSKETEER:home.0:   function = gfs2_quota_init, file = fs/gfs2/quota.c, line = 1239
GFS2: fsid=MUSKETEER:home.0: about to withdraw this file system
GFS2: fsid=MUSKETEER:home.0: withdrawn
Pid: 40313, comm: mount.gfs2 Not tainted 2.6.32-398.el6.x86_64 #1
Call Trace:
 [<ffffffffa0b36c78>] ? gfs2_lm_withdraw+0x128/0x160 [gfs2]
 [<ffffffffa0b36d80>] ? gfs2_metatype_check_ii+0x50/0x60 [gfs2]
 [<ffffffffa0b2cfd7>] ? gfs2_quota_init+0x357/0x3c0 [gfs2]
 [<ffffffffa0b337fe>] ? gfs2_make_fs_rw+0xbe/0x160 [gfs2]
 [<ffffffffa0b3378b>] ? gfs2_make_fs_rw+0x4b/0x160 [gfs2]
 [<ffffffff81063db5>] ? wake_up_process+0x15/0x20
 [<ffffffffa0b27e8a>] ? gfs2_get_sb+0x96a/0xa30 [gfs2]
 [<ffffffffa0b1b299>] ? gfs2_glock_nq_num+0x59/0xa0 [gfs2]
 [<ffffffff8116224a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff81185d3b>] ? vfs_kern_mount+0x7b/0x1b0
 [<ffffffff81185ee2>] ? do_kern_mount+0x52/0x130
 [<ffffffff811a6954>] ? do_mount+0x2f4/0x920
 [<ffffffff811a7010>] ? sys_mount+0x90/0xe0
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
GFS2: fsid=MUSKETEER:home.0: can't make FS RW: -5

As you can see, the last fsck.gfs2 came up clean, but the file
system was not mountable because the system quota_changeX files
had bad contents: indirect blocks rather than quota_change blocks.
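
The "type: exp=14, found=5" line in the dmesg output above can be illustrated with a minimal sketch of a metadata type check. It assumes the on-disk layout from the kernel's gfs2_ondisk.h: each metadata block starts with a big-endian 32-bit magic number (0x01161970) followed by a big-endian 32-bit type, where 5 is an indirect block and 14 is a quota_change block. The function names read_meta_type and check_qc_block are hypothetical; this is neither the kernel's nor fsck.gfs2's actual code.

```python
import struct

# Constants assumed from the kernel's gfs2_ondisk.h.
GFS2_MAGIC = 0x01161970   # magic number at the start of every metadata block
GFS2_METATYPE_IN = 5      # indirect block
GFS2_METATYPE_QC = 14     # quota_change block


def read_meta_type(block: bytes):
    """Return the header type if the block starts with the GFS2 magic, else None."""
    if len(block) < 8:
        return None
    # The header begins with two big-endian 32-bit fields: magic, then type.
    magic, mh_type = struct.unpack(">II", block[:8])
    return mh_type if magic == GFS2_MAGIC else None


def check_qc_block(block: bytes) -> str:
    """Hypothetical check: report whether a block carries the quota_change type."""
    found = read_meta_type(block)
    if found == GFS2_METATYPE_QC:
        return "ok"
    return f"invalid metadata block (type: exp={GFS2_METATYPE_QC}, found={found})"


# Synthetic 4096-byte block whose header carries the indirect-block type.
indirect = struct.pack(">II", GFS2_MAGIC, GFS2_METATYPE_IN).ljust(4096, b"\0")
print(check_qc_block(indirect))
# prints: invalid metadata block (type: exp=14, found=5)
```

A checker that walks every block of each quota_changeX file with a test of this kind would flag the condition that the kernel only discovers at mount time.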
Comment 2 Robert Peterson 2013-09-03 15:01:41 EDT
Created attachment 793333
Try #1 patch

This patch gives fsck.gfs2 the ability to check and repair all
the system files in the per_node directory. I tested it on
system gfs-i16c-03.mpc.lab.eng.bos.redhat.com using (1) the
failing scenario and (2) several mock-ups in which I used
gfs2_edit to damage the metadata.
Comment 3 Robert Peterson 2013-09-06 15:39:17 EDT
I pushed the patch to the master and RHEL7 branches of the
gfs2-utils git tree. Changing status to POST.
Comment 6 Justin Payne 2014-02-19 15:23:30 EST
Verified in gfs2-utils-3.1.6-12.el7:


[root@dash-02 ~]# gfs2_edit restoremeta 503938.home.metadata /dev/sdc1
[root@dash-02 ~]# fsck.gfs2 -y /dev/sdc1 &> /root/gronk
[root@dash-02 ~]# echo $?
1
[root@dash-02 ~]# fsck.gfs2 -y /dev/sdc1 &> /dev/null
[root@dash-02 ~]# echo $?
0
[root@dash-02 ~]# mount -t gfs2 /dev/sdc1 /mnt/gfs2/
[root@dash-02 ~]# mount |grep gfs2
/dev/sdc1 on /mnt/gfs2 type gfs2 (rw,relatime,seclabel,localflocks)
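
The two exit statuses in the transcript above (1 on the first run, 0 on the second) can be read with a small decoder. This is a sketch that assumes fsck.gfs2 follows the conventional exit-status bits documented in fsck(8); the helper name describe_fsck_status is hypothetical.

```python
from typing import List

# Conventional fsck(8) exit-status bits; assumed to apply to fsck.gfs2 as well.
FSCK_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "canceled by user request",
    128: "shared-library error",
}


def describe_fsck_status(status: int) -> List[str]:
    """Translate an fsck exit status into the list of conditions it encodes."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_BITS.items() if status & bit]


print(describe_fsck_status(1))  # first run: repairs were made
print(describe_fsck_status(0))  # second run: file system is clean
```

Under that assumption, a status of 1 followed by 0 means the first pass repaired the file system and the second pass found nothing left to fix, which matches the successful mount.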
Comment 8 Ludek Smid 2014-06-13 09:22:32 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
