130300 – mounting gfs readonly can fail due to error recovering journal

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	gfs
Sub Component:
Version:	3
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Ken Preslan
QA Contact:	GFS Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-08-18 21:30 UTC by Corey Marthaler
Modified:	2010-01-12 02:56 UTC
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-08-19 16:24:39 UTC
Embargoed:

Description Corey Marthaler 2004-08-18 21:30:10 UTC

Description of problem:
I've seen this while running mount_stress on both RHEL3 and RHEL4.

The simplest case is to:

mount a gfs filesystem on the whole cluster
umount on two of the nodes 
remount on one with -o locktable=clustername:newname
remount on the other one with -o ro 
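
The steps above can be sketched as a dry-run shell script. This is only an
illustration under stated assumptions, not the mount_stress test itself: the
mount point /mnt/gfs and the helper names run, repro_node_a and repro_node_b
are hypothetical, the device and cluster name are taken from the messages
quoted in this report, and "newname" is the placeholder from the step list.
The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps; prints commands unless RUN=1.
DEV=/dev/pool/corey1          # device named in the mount error in this report
MNT=/mnt/gfs                  # hypothetical mount point (assumption)
TABLE=morph-cluster:newname   # cluster name from the console log; "newname" is a placeholder

# Echo the command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}

# First node: umount, then remount with a different lock table.
# Warning: mounting one filesystem into two lockspaces can corrupt data.
repro_node_a() {
    run umount "$MNT"
    run mount -t gfs "$DEV" "$MNT" -o "locktable=$TABLE"
}

# Second node: umount, then remount read-only.
repro_node_b() {
    run umount "$MNT"
    run mount -t gfs "$DEV" "$MNT" -o ro
}
```

Calling repro_node_a on one node and repro_node_b on the other, with the
filesystem still mounted on the rest of the cluster, mirrors the list above.
Since the failure is intermittent, several iterations may be needed.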

There must be some kind of race condition because this does not
always cause the problem. I have seen the error without the preceding
mount -o locktable=clustername:newname attempt, so that step is not
required, but it seems to make the failure more likely.

When this bug occurs, I see this on standard error:
mount: wrong fs type, bad option, bad superblock on /dev/pool/corey1,
       or too many mounted file systems

Along with this on the console:
GFS:  fsid=morph-cluster:corey1, jid=0:  Trying to acquire journal lock...
GFS:  fsid=morph-cluster:corey1, jid=0:  Looking at journal...
GFS:  fsid=morph-cluster:corey1, jid=0:  can't replay:  read-only FS
GFS:  fsid=morph-cluster:corey1, jid=0:  Failed
GFS:  error recovering my journal (-30)

Once in this state, all other mount -o ro attempts on this fs from that
node result in the same error. However, this can be fixed by
mounting the filesystem rw, umounting, and then retrying the ro mount.
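
As a rough sketch of that workaround (again a dry run that only prints the
commands unless RUN=1 is set): the mount point /mnt/gfs and the helper names
run and recover_then_ro are hypothetical, and the device is the one from the
error message above.

```shell
#!/bin/sh
# Dry-run sketch of the rw / umount / ro workaround; prints commands unless RUN=1.
DEV=/dev/pool/corey1   # device named in the mount error in this report
MNT=/mnt/gfs           # hypothetical mount point (assumption)

# Echo the command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}

recover_then_ro() {
    # A read-write mount is not blocked by the "can't replay: read-only FS" check.
    run mount -t gfs "$DEV" "$MNT"
    run umount "$MNT"
    # With the journal no longer needing replay, the read-only mount should succeed.
    run mount -t gfs "$DEV" "$MNT" -o ro
}
```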


How reproducible:
Sometimes

Comment 1 Adam "mantis" Manthei 2004-08-18 23:16:08 UTC

1)  Are you trying to mount the same filesystem into two different
lockspaces?  If so... what are you expecting to happen?  Are you
deliberately trying to cause corruption and/or panics?

For example, on node1: 
   mount -t gfs /dev/pool/pool0 /gfs -o locktable=cluster:foo
and on node 2:
   mount -t gfs /dev/pool/pool0 /gfs -o locktable=cluster:bar


2) What do you expect GFS to do when it encounters a journal in need
of repair when told to mount read-only?  If it modifies the
filesystem, it really isn't read-only at that point (and arguably a bug).

Perhaps remounting the filesystem readonly after the journal has been
replayed is what you are after?  e.g.:
    mount -o remount,ro /gfs
In which case, I don't know what would happen if a read-write node
crashes and the read-only node tries to recover.  Hopefully it would
fail to replay the journal and allow another node to retry.

Comment 2 Corey Marthaler 2004-08-19 16:24:39 UTC

I was trying to mount the same filesystem into two different 
lockspaces just to see that the locktable flag worked knowing that I 
might corrupt my data. But that's a different "I have a loaded gun 
pointed at my foot" issue. :) 
 
I guess this isn't really a bug then as that is the expected 
behavior if the journal needs to be replayed. It's just that the 
error given was a little scary. :( But if one looks in the log, 
it's clear what happened.
