Bug 425421 - gfs mount attempt hangs if no more journals available
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs-kmod
Version: 5.1
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Robert Peterson
QA Contact: GFS Bugs
URL:
Whiteboard:
Duplicates: 460233
Depends On:
Blocks: 475312
Reported: 2007-12-14 20:41 UTC by Corey Marthaler
Modified: 2010-01-12 03:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 21:18:33 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to fix the problem (473 bytes, patch)
2008-02-01 20:19 UTC, Robert Peterson


Links
System: Red Hat Product Errata
ID: RHBA-2009:0132
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: gfs-kmod bug-fix update
Last Updated: 2009-01-20 16:04:57 UTC

Description Corey Marthaler 2007-12-14 20:41:58 UTC
Description of problem:
I created a gfs filesystem on a cluster with three nodes.

[root@grant-02 ~]# gfs_mkfs -O -j 2 -p lock_dlm -t GRANT-CLUSTER:gfs2 /dev/grant/gfs2
Device:                    /dev/grant/gfs2
Blocksize:                 4096
Filesystem Size:           19593744
Journals:                  2
Resource Groups:           300
Locking Protocol:          lock_dlm
Lock Table:                GRANT-CLUSTER:gfs2

Syncing...
All Done

The filesystem has only two journals, so when I attempt to mount it on the third node, the mount hangs.

Trying to join cluster "lock_dlm", "GRANT-CLUSTER:gfs2"
Joined cluster. Now mounting FS...
GFS: fsid=GRANT-CLUSTER:gfs2.2: can't mount journal #2
GFS: fsid=GRANT-CLUSTER:gfs2.2: there are only 2 journals (0 - 1)
Dec 14 14:34:57 grant-03 kernel: GFS 0.1.19-7.el5_1.1 (built Nov 12 2007 19:27:d
Dec 14 14:34:57 grant-03 kernel: Trying to join cluster "lock_dlm", "GRANT-CLUS"
Dec 14 14:34:57 grant-03 kernel: Joined cluster. Now mounting FS...
Dec 14 14:34:57 grant-03 kernel: GFS: fsid=GRANT-CLUSTER:gfs2.2: can't mount jo2
Dec 14 14:34:57 grant-03 kernel: GFS: fsid=GRANT-CLUSTER:gfs2.2: there are only)
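
The two kernel messages above amount to a bounds check: the third node was assigned journal index 2, but a filesystem made with -j 2 only has journals 0 and 1. A minimal sketch of such a check follows. The helper name check_journal_id is hypothetical and this is not the gfs-kmod source; the -EINVAL return value is an assumption chosen to match the "Invalid argument" that mount reports in comment 7.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the gfs-kmod code): validate the journal
 * index assigned to a mounting node against the number of journals
 * that exist on the filesystem. Journal indexes are zero-based.
 */
int check_journal_id(unsigned int jid, unsigned int journals)
{
    assert(journals > 0);

    if (jid >= journals) {
        /* Wording mirrors the kernel log lines in this report. */
        fprintf(stderr, "can't mount journal #%u\n", jid);
        fprintf(stderr, "there are only %u journals (0 - %u)\n",
                journals, journals - 1);
        return -EINVAL;
    }
    return 0;
}
```

With two journals, indexes 0 and 1 pass and index 2 is rejected, which is the situation the third node hits here.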


[root@grant-03 ~]# strace mount /dev/grant/gfs2 /mnt/gfs2
execve("/bin/mount", ["mount", "/dev/grant/gfs2", "/mnt/gfs2"], [/* 21 vars */]) = 0
brk(0)                                  = 0xee55000

[...]

stat("/dev/grant/gfs2", {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 3), ...}) = 0
rt_sigprocmask(SIG_BLOCK, ~[TRAP SEGV RTMIN RT_1], NULL, 8) = 0
open("/dev/grant/gfs2", O_RDONLY)       = 3
fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 3), ...}) = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632) = 69632
close(3)                                = 0
stat("/sbin/mount.gfs", {st_mode=S_IFREG|0755, st_size=42000, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2aaaaaaba540) = 3262
wait4(-1,

Version-Release number of selected component (if applicable):
[root@grant-03 ~]# rpm -qa | grep gfs
gfs2-utils-0.1.38-1.el5
gfs-utils-0.1.12-1.el5
kmod-gfs-0.1.19-7.el5
kmod-gfs-0.1.19-7.el5_1.1

How reproducible:
every time

Comment 1 Nate Straz 2007-12-19 20:03:51 UTC
Planning on removing GFS-kernel.  Moving all bugs to gfs-kmod.

Comment 2 Robert Peterson 2008-02-01 20:19:53 UTC
Created attachment 293764 [details]
Patch to fix the problem

This was an easy one.  When Wendy did the fast statfs patch, she
forgot to deallocate the statfs (formerly license) file on errors.

Comment 3 Robert Peterson 2008-02-01 20:25:16 UTC
Setting flags for inclusion in RHEL5.  It's already fixed in RHEL4.


Comment 4 Robert Peterson 2008-02-01 20:32:08 UTC
Looks like Wendy tried to remedy this problem with a commit to the RHEL5
branch of CVS but she got it wrong.  Her comment referenced bug #231904.
Here is the diff that shows the put against the wrong inode, the quota inode
rather than the license inode.

http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/gfs-kernel/src/gfs/ops_fstype.c.diff?r1=1.28.2.4&r2=1.28.2.5&cvsroot=cluster&f=h

So I guess we should still fix this.  Again, it looks okay in RHEL4.
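
The pattern behind comments 2 and 4 can be sketched as an error path that releases every reference it took, in reverse order, against the right object. This is a simplified illustration only, not the ops_fstype.c code: mock_inode, inode_get, inode_put and mock_mount are hypothetical names, and the order in which the quota and license (statfs) inodes are acquired is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a reference-counted inode. */
struct mock_inode {
    int refs;
};

static void inode_get(struct mock_inode *ip)
{
    ip->refs++;
}

static void inode_put(struct mock_inode *ip)
{
    assert(ip->refs > 0);   /* catches a put against the wrong inode */
    ip->refs--;
}

/*
 * Take a reference on the quota inode, then on the license (statfs)
 * inode, then validate the journal index. On failure, unwind in
 * reverse order so that neither reference is leaked.
 */
int mock_mount(struct mock_inode *quota, struct mock_inode *license,
               unsigned int jid, unsigned int journals)
{
    int error;

    inode_get(quota);
    inode_get(license);

    if (jid >= journals) {
        error = -EINVAL;
        goto fail_license;
    }
    return 0;               /* success: references stay held */

fail_license:
    inode_put(license);     /* the release the error path must not skip */
    inode_put(quota);
    return error;
}
```

In this sketch, dropping the inode_put(license) call, or calling inode_put(quota) twice instead, leaves the license inode with a stray reference after a failed mount, which is the kind of mistake the comments above describe.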


Comment 5 Robert Peterson 2008-04-14 20:44:06 UTC
This patch was tested on the roth-0{1,2,3} cluster running RHEL5.
I pushed the patch to the master, STABLE2 and RHEL5 branches of
the cluster git tree.  Changing status to modified.


Comment 6 Corey Marthaler 2008-08-27 18:58:28 UTC
*** Bug 460233 has been marked as a duplicate of this bug. ***

Comment 7 Corey Marthaler 2008-08-27 19:39:49 UTC
Is there any way we could have a better error message?

[root@taft-03 tmp]# mount /dev/taft/mirror1 /mnt/taft1
/sbin/mount.gfs: error mounting /dev/mapper/taft-mirror1 on /mnt/taft1: Invalid argument

Granted, the info is in the log...
Aug 27 14:32:11 taft-03 kernel: Trying to join cluster "lock_dlm", "TAFT:1"
Aug 27 14:32:11 taft-03 kernel: Joined cluster. Now mounting FS...
Aug 27 14:32:11 taft-03 kernel: GFS: fsid=TAFT:1.3: can't mount journal #3
Aug 27 14:32:11 taft-03 kernel: GFS: fsid=TAFT:1.3: there are only 3 journals (0 - 2)


...but "Invalid argument" makes me think I misspelled something or that the device is bad or failed. If not, then this bug can be marked verified.
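
One possible way to address the request above, sketched under stated assumptions: a mount helper could append a hint to the generic EINVAL text so the user knows to look in the kernel log. The function format_mount_error and the hint wording are hypothetical; this is not how /sbin/mount.gfs is known to behave.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the user-facing mount error string and,
 * for EINVAL, add a pointer to the kernel log where the real cause
 * (for example, no free journal) is recorded.
 * Returns the number of characters written, or the snprintf result
 * unchanged if the message was truncated or formatting failed.
 */
int format_mount_error(char *buf, size_t len, const char *dev,
                       const char *dir, int err)
{
    assert(buf != NULL && len > 0);

    int n = snprintf(buf, len, "error mounting %s on %s: %s",
                     dev, dir, strerror(err));
    if (n < 0 || (size_t)n >= len)
        return n;

    if (err == EINVAL) {
        int m = snprintf(buf + n, len - (size_t)n,
                         " (see the kernel log for details)");
        if (m > 0)
            n += m;
    }
    return n;
}
```

Called with EINVAL, the message keeps the original wording and gains the trailing hint; other error codes are left unchanged.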

Comment 9 Nate Straz 2008-12-08 20:25:01 UTC
Verified with gfs-utils-0.1.18-1.el5 and kernel-2.6.18-124.el5.

Comment 11 errata-xmlrpc 2009-01-20 21:18:33 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0132.html

