Bug 732921

Summary: fsck.gfs: invalid option -- a
Product: Red Hat Enterprise Linux 5
Reporter: Mohd Fazlee <mfazlee>
Component: gfs-utils
Assignee: Robert Peterson <rpeterso>
Status: CLOSED WONTFIX
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: unspecified
Version: 5.5
CC: ayobcode, edamato, huei.wong
Target Milestone: rc
Keywords: Reopened
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-08-24 14:55:37 UTC

Description Mohd Fazlee 2011-08-24 07:36:58 UTC
Description of problem:
During boot up, the system enters maintenance mode due to "fsck.gfs: invalid option -- a". There are 3 GFS filesystems specified in fstab, all of them shared between 9 nodes.

Version-Release number of selected component (if applicable):
gfs-utils-0.1.20-8.el5

How reproducible:


Steps to Reproduce:
1. Join a node to the cluster (RHEL 5.5).
2. Specify the shared GFS filesystems in /etc/fstab (see the example entry below).
3. Reboot the machine.
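
An fstab entry of roughly this form, with a non-zero fsck pass number in the last field, reproduces the problem (the device name and mount point here are placeholders, not taken from this report):

/dev/cluster_vg/lv_gfs  /shared  gfs  defaults  1 2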
  
Actual results:
The system enters maintenance mode at boot with "fsck.gfs: invalid option -- a".

Expected results:
The system continues to boot normally.

Additional info:
# fsck.gfs -h
Usage: fsck.gfs [-hnqvVy] <device>

Red Hat Enterprise Linux Server release 5.5 (Tikanga)
gfs-utils-0.1.20-8.el5
cman-2.0.115-34.el5
kmod-gfs-0.1.34-12.el5
lvm2-cluster-2.02.56-7.el5
ricci-0.12.2-12.el5

Comment 1 Robert Peterson 2011-08-24 13:16:48 UTC
This is the same as bug #507596.  Closing as duplicate.
This was fixed in gfs2-utils-0.1.62-2.el5.

*** This bug has been marked as a duplicate of bug 507596 ***

Comment 2 Robert Peterson 2011-08-24 13:34:41 UTC
Sorry, I closed this by mistake: This is for GFS1's gfs_fsck,
not fsck.gfs2.  This was a bug discovered in GFS2's fsck.gfs2
tool, but the fix was never ported to GFS1.

Comment 3 ayobcode 2011-08-24 13:55:19 UTC
Hi Robert,

Since you mentioned fsck.gfs2, I have a doubt, as I also run a Red Hat cluster with 3 nodes.

# rpm -qf /sbin/fsck.gfs2
gfs2-utils-0.1.62-20.el5

fstab entry:
/dev/VolG/lv_doc /doc   gfs2   defaults      1 2

This entry returns the same output as mentioned by Fazlee:
the system enters maintenance mode at boot with "fsck.gfs: invalid option -- a".

Checking the options, there is a -a flag in it:
# fsck.gfs2 -h
Usage: fsck.gfs2 [-afhnpqvVy] <device>

I commented the entry out of /etc/fstab and mounted the filesystem manually.

# df /doc
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolG-lv_doc 62903680   1588600  61315080   3% /doc

Is there any missing step, or does the bug still exist in gfs2-utils-0.1.62-20.el5?

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.5 (Tikanga)

Comment 4 Robert Peterson 2011-08-24 14:55:37 UTC
Hi,

We normally don't recommend a GFS volume be checked at boot
time.  The reason is: gfs_fsck can't possibly tell at boot time
whether the GFS volume is mounted by another node in the cluster.
The gfs_fsck tool doesn't include any of the cluster infrastructure
to be able to determine that, and we can't really add it since
gfs needs to be able to run without the infrastructure.  Even if
we added enough of the infrastructure to gfs_fsck, it would only
work for clustered lvm2 volumes.  Volumes on raw devices (like
/dev/sdc1) would never tell us whether the volume is mounted on
another node.

Therefore, we don't want gfs_fsck to ever run at boot time.
We always recommend that fsck be run after the system boots,
and run manually.

To fix that, you have to change your /etc/fstab to have
"0 0" after the GFS mounts, rather than "1 2" or anything else.
When you run gfs_fsck you should always make sure the volume
is not mounted from another node, then run it manually.
The exception, of course, is when GFS is mounted as root, which
adds a whole other set of complications.  That's why we decided
to add the support to gfs2.
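
For example, an entry along these lines (the device name and mount point are placeholders; setting the last field, the fsck pass number, to 0 is what keeps the boot scripts from running fsck on the volume):

/dev/cluster_vg/lv_gfs  /shared  gfs  defaults  0 0

Then, once the volume is confirmed unmounted on every node, run the check by hand, for example:

# gfs_fsck -y /dev/cluster_vg/lv_gfs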

I've looked at the patch that gets this to work for GFS2 and
discussed this with my fellow gfs/gfs2 developers.  Adding the
capability to fsck gfs1 at boot time with "-a" would require
adding the ability for gfs_fsck to tell whether the journals
are clean or dirty.  That would require a port of a lot of
code from the kernel.  We did that for gfs2, but it would be
a big undertaking for GFS.  And since GFS is deprecated in
RHEL6 and upstream, we have much higher priorities.

The GFS2 version of the code isn't ideal either for the same
reasons.  If fsck.gfs2 determines any of the journals are
dirty, there's a risk that the file system is mounted by
another node.  So it decides running is unsafe and ends with
a good return code.  That means the volume can go unchecked.
There is special code in there for the "lock_nolock" protocol,
where it is safe to assume no other node has the file system mounted.
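
Roughly, the boot-time (-a) path behaves like this; this is only a simplified sketch of the behaviour described above, not the actual fsck.gfs2 source:

    if lock protocol is lock_nolock: run the full check (single node, safe)
    else if any journal is dirty:    exit 0 without checking (may be mounted elsewhere)
    else:                            run the full check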

If you want to push for a fix for GFS, please contact Red Hat
Support and make a business case.  Otherwise, please use it
with "0 0" in fstab and always run gfs_fsck manually as
suggested above.

Comment 5 huei.wong 2011-09-03 04:26:15 UTC
> We normally don't recommend a GFS volume be checked at boot
> time.

When was this recommendation posted? 


> you should always make sure the volume is not mounted from another node, then run it manually.
> The exception, of course, is when GFS is mounted as root, which adds a whole other set of complications.

If the volume is still mounted from another node and the GFS check is run, what will happen? Will it corrupt the GFS beyond repair?

Comment 6 Robert Peterson 2011-09-06 15:43:06 UTC
I haven't done a thorough search through our documentation, but
I agree that the documentation about this is lacking.  I have
spoken to the guy who maintains the GFS and GFS2 manuals and he
has agreed to add notes to the manuals.  Also, I have spoken to
one of our GSS guys and he has agreed to write a KBase article
about the subject as well.

There are some useful notes about it in the current fsck.gfs2
man page, but the documentation still needs work.