Bug 803362

Summary: fsck.gfs2: check of grown GFS file system fails
Product: Red Hat Enterprise Linux 6 Reporter: Nate Straz <nstraz>
Component: cluster Assignee: Andrew Price <anprice>
Status: CLOSED NOTABUG QA Contact: Cluster QE <mspqa-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.3 CC: anprice, ccaulfie, cluster-maint, lhh, rpeterso, teigland
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-03-15 15:55:49 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Bug Depends On:    
Bug Blocks: 675723    

Description Nate Straz 2012-03-14 14:34:33 UTC
Description of problem:

fsck.gfs2 can't handle an empty but grown GFS file system.

[root@west-02 ~]# fsck.gfs2 -n /dev/fsck/aftergrow
Initializing fsck
Clearing GFS journals (this may take a while)
.
Journals cleared.

Journal recovery complete.
Validating Resource Group index.
Level 1 rgrp check: Checking if all rgrp and rindex values are good.
(level 1 failed at block 524288 (0x80000): Some damage was found; we need to take remedial measures)
Level 2 rgrp check: Checking if rindex values may be easily repaired.
L2: number of rgs expected     = 11.
Block #524288 (0x80000) (1 of 3) is not GFS2_METATYPE_RG.
Block #-1 (0xffffffffffffffff) (-560738 of 15) is not GFS2_METATYPE_RB.
Block #-1 (0xffffffffffffffff) (-804657 of 15) is not GFS2_METATYPE_RB.
(level 2 failed at block 524288 (0x80000): rindex is unevenly spaced: either gfs1-style or corrupt)
Level 3 rgrp check: Calculating where the rgrps should be if evenly spaced.
L3: number of rgs in the index = 11.
L3: number of rgs expected     = 128.
L3: They don't match; either (1) the fs was extended, (2) an odd
L3: rgrp size was used, or (3) we have a corrupt rg index.
(level 3 failed at block 0 (0x0): rindex calculations don't match: uneven rgrp boundaries)
Level 4 rgrp check: Trying to rebuild rindex assuming evenly spaced rgrps.
rgrp 2 is damaged: getting dist from index: 0xeffe
* rgrp 3 at block 0xf1bd *** DAMAGED *** [length 0x1ae]
* rgrp 4 at block 0xf36b *** DAMAGED *** [length 0x1ae]
* rgrp 5 at block 0xf519 *** DAMAGED *** [length 0x1ae]
* rgrp 6 at block 0xf6c7 *** DAMAGED *** [length 0x1ae]
Error: too many missing or damaged rgrps using this method. Time to try another method.
Error rebuilding rgrp list.
(level 4 failed at block 0 (0x0): Too many rgrp misses: rgrps must be unevenly spaced)
Level 5 rgrp check: Trying to rebuild rindex assuming unevenly spaced rgrps.
rgrp 2 is damaged: getting dist from index: 0xeffe
* rgrp 5 at block 0x3c006 *** DAMAGED *** [length 0x244a]
* rgrp 6 at block 0x3e450 *** DAMAGED *** [length 0x4cb0]
* rgrp 7 at block 0x43100 *** DAMAGED *** [length 0xb01b]
* rgrp 8 at block 0x4e11b *** DAMAGED *** [length 0x3fe5]
Error: too many missing or damaged rgrps using this method. Time to try another method.
Error rebuilding rgrp list.
(level 5 failed at block 0 (0x0): Too much damage found: we cannot rebuild this rindex)
Resource Group recovery impossible; I can't fix this file system.
[root@west-02 ~]# echo $?
8


Version-Release number of selected component (if applicable):
gfs2-utils-3.0.12.1-28.el6.x86_64

How reproducible:
Easily

Steps to Reproduce:
On RHEL5 system:
1. mkfs -t gfs -O -p lock_nolock -j 1 /dev/fsck/aftergrow
2. mount -t gfs /dev/fsck/aftergrow /mnt/fsck
3. lvextend -L +2G fsck/aftergrow
4. gfs_grow /dev/fsck/aftergrow
On RHEL6 system:
1. fsck.gfs2 /dev/fsck/aftergrow -y

  
Actual results:

Output from test case's fsck.gfs2 -y:

Initializing fsck
Clearing GFS journals (this may take a while)
.
Journals cleared.

Journal recovery complete.
Validating Resource Group index.
Level 1 rgrp check: Checking if all rgrp and rindex values are good.
Level 2 rgrp check: Checking if rindex values may be easily repaired.
(level 1 failed at block 524288 (0x80000): Some damage was found; we need to take remedial measures)
L2: number of rgs expected     = 11.
Block #524288 (0x80000) (1 of 3) is not GFS2_METATYPE_RG.
Attempting to repair the rgrp.
Block #-1 (0xffffffffffffffff) (-560738 of 15) is not GFS2_METATYPE_RB.
Attempting to repair the rgrp.
bad seek: Invalid argument from rewrite_rg_block:671: block 18446744073709551615 (0xffffffffffffffff)


Expected results:

fsck.gfs2 should find a clean GFS file system and return 0

Additional info:

Clean fsck.gfs output from RHEL5 node:
[root@west-01 ~]# fsck.gfs -n /dev/fsck/aftergrow
Initializing fsck
Starting pass1
Pass1 complete
Starting pass1b
Pass1b complete
Starting pass1c
Pass1c complete
Starting pass2
Pass2 complete
Starting pass3
Pass3 complete
Starting pass4
Pass4 complete
Starting pass5
Pass5 complete
Writing changes to disk

Comment 2 Andrew Price 2012-03-14 17:04:39 UTC
Just trying to reproduce this at the moment. What was the original size of the LV?

Comment 3 Nate Straz 2012-03-14 17:13:37 UTC
The original LV was 2G

Comment 4 Nate Straz 2012-03-15 15:55:49 UTC
There was an issue with the test case.  Since the LV was extended, it needed to be reactivated on the RHEL6 node to see the correct size.  After adding that step to the test, no errors were found.
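
For readers retracing this, the sketch below checks that the first failing block in the log is consistent with the RHEL6 node still seeing the original 2G device. It assumes a 4096-byte file system block size, which the report does not state. The reactivate_lv helper is hypothetical: the exact commands added to the test case are not given in this bug, so deactivating and reactivating the LV with lvchange is only one plausible way to do it.

```shell
#!/bin/sh
# Assumption: 4096-byte block size (not stated in the report).
block_size=4096
# From the log: "level 1 failed at block 524288 (0x80000)".
failed_block=524288

# Byte offset at which the first rejected block starts.
offset_bytes=$((failed_block * block_size))
offset_gib=$((offset_bytes / 1024 / 1024 / 1024))
echo "failed block starts at byte $offset_bytes ($offset_gib GiB)"

# Hypothetical helper, defined but not called here: reactivate the LV on
# the RHEL6 node so the active device reflects the extended size, then
# print the size in bytes that the node now sees.
reactivate_lv() {
    lv="$1"    # VG/LV name, e.g. fsck/aftergrow
    lvchange -an "$lv" &&
    lvchange -ay "$lv" &&
    blockdev --getsize64 "/dev/$lv"
}
```

Under that block size assumption, block 524288 starts exactly at the 2 GiB mark, the size of the LV before lvextend. On the RHEL6 node, running reactivate_lv fsck/aftergrow before fsck.gfs2 would correspond to the step that was missing from the test.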