Bug 231080 - Can't mount GFS: gfs_controld join error: -16
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.0
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: David Teigland
QA Contact: Cluster QE
Reported: 2007-03-05 18:26 EST by Robert Peterson
Modified: 2009-04-16 18:49 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-06-24 13:30:20 EDT

Attachments
group_tool dump gfs for node that failed to mount (1.00 MB, text/plain), 2007-03-05 18:26 EST, Robert Peterson
group_tool dump from trin-09 (12.90 KB, text/plain), 2007-03-05 18:33 EST, Robert Peterson
group_tool dump from trin-10 (11.20 KB, text/plain), 2007-03-05 18:34 EST, Robert Peterson
group_tool dump from trin-11 (12.74 KB, text/plain), 2007-03-05 18:35 EST, Robert Peterson
group_tool dump gfs from trin-09 (5.80 KB, text/plain), 2007-03-05 18:39 EST, Robert Peterson
group_tool dump gfs from trin-11 (4.51 KB, text/plain), 2007-03-05 18:40 EST, Robert Peterson
Script that contains the commands to recreate (547 bytes, text/plain), 2007-03-29 15:42 EDT, Robert Peterson
Output from mount -vvvv on trin-09 (worked) (1.26 KB, text/plain), 2007-03-29 15:43 EDT, Robert Peterson
Output from 2nd mount -vvvv on trin-09 (succ) (1.14 KB, text/plain), 2007-03-29 15:46 EDT, Robert Peterson
Output from mount -vvvv on trin-10 (expected -19) (1.26 KB, text/plain), 2007-03-29 15:47 EDT, Robert Peterson
Output from mount -vvvv on trin-10 (error -16) (519 bytes, text/plain), 2007-03-29 15:48 EDT, Robert Peterson
Output from mount -vvvv on trin-11 (expected -19) (1.26 KB, text/plain), 2007-03-29 15:49 EDT, Robert Peterson
Output from mount -vvvv on trin-11 (worked) (1.14 KB, text/plain), 2007-03-29 15:50 EDT, Robert Peterson
"group_tool dump gfs" for trin-10 after error -16 received (10.92 KB, text/plain), 2007-03-29 15:56 EDT, Robert Peterson

Description Robert Peterson 2007-03-05 18:26:22 EST
Description of problem:
I mounted my GFS file system on one node of a three-node cluster.
Then I tried to mount it simultaneously on the other two nodes while
the first node was busy doing a du on the file system.
One of the nodes mounted it correctly.  The other mount failed and
gave me the following message:
[root@trin-10 /home/devel/cluster]# mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1
/sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
/sbin/mount.gfs: error mounting lockproto lock_dlm

Version-Release number of selected component (if applicable):
RHEL5 with upstream 2.6.21 kernel from Steve Whitehouse's latest git tree.

How reproducible:
Unknown: it happened once

Steps to Reproduce:
1. service cman start on all 3 nodes (trin-09,10,11)
2. service clvmd start on all 3 nodes (trin-09,10,11)
3. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1 on one node (trin-09)
4. cd /mnt/gfs1 (from first node, trin-09)
5. du -sh (from first node, should come back with 24G but it takes time)
6. While du is working, do this simultaneously from the other
   two nodes (trin-10,11):

   modprobe gfs
   mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1 (same as step 3)

Actual results:
On trin-11 the mount was successful.  But on trin-10:
[root@trin-10 /home/devel/cluster]# modprobe gfs
[root@trin-10 /home/devel/cluster]# mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1
/sbin/mount.gfs: lock_dlm_join: gfs_controld join error: -16
/sbin/mount.gfs: error mounting lockproto lock_dlm

Expected results:
Mount should have been successful on both nodes (trin-10,11)

Additional info:
Group memberships on all 3 nodes after the failure:
[root@trin-09 /mnt/gfs1]# group_tool -v
type             level name      id       state node id local_done
fence            0     default   00010001 none        
[1 2 3]
dlm              1     clvmd     00010003 none        
[1 2 3]
dlm              1     bobs_gfs  00030001 none        
[1 3]
gfs              2     bobs_gfs  00010002 none        
[1 2 3]

[root@trin-10 /home/devel/cluster]# group_tool -v
type             level name      id       state node id local_done
fence            0     default   00010001 none        
[1 2 3]
dlm              1     clvmd     00010003 none        
[1 2 3]
gfs              2     bobs_gfs  00010002 none        
[1 2 3]

[root@trin-11 /home/devel/cluster]# group_tool -v
type             level name      id       state node id local_done
fence            0     default   00010001 none        
[1 2 3]
dlm              1     clvmd     00010003 none        
[1 2 3]
dlm              1     bobs_gfs  00030001 none        
[1 3]
gfs              2     bobs_gfs  00010002 none        
[1 2 3]

See attached gfs_controld dump for the failing node.
I'll try to collect and attach more info.
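
The listings above show the gfs mountgroup bobs_gfs with members [1 2 3] while the dlm group of the same name has only [1 3]. A minimal Python sketch for spotting that mismatch automatically follows. It assumes the `group_tool -v` layout shown above (one group line followed by a bracketed member list); `parse_groups` and `missing_dlm_members` are hypothetical helper names, not part of any cluster tool.

```python
import re

def parse_groups(text):
    """Parse `group_tool -v` output into {(type, name): set_of_node_ids}."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("type"):
            continue  # skip blank lines and the column header
        if line.startswith("["):
            # Member list for the group line just before it, e.g. "[1 2 3]".
            if current is not None:
                groups[current] = {int(n) for n in re.findall(r"\d+", line)}
            current = None
        else:
            fields = line.split()
            # Columns as printed above: type, level, name, id, state.
            current = (fields[0], fields[2])
    return groups

def missing_dlm_members(groups):
    """For each gfs group, list nodes absent from the same-named dlm group."""
    result = {}
    for (gtype, name), members in groups.items():
        if gtype == "gfs":
            dlm_members = groups.get(("dlm", name), set())
            missing = members - dlm_members
            if missing:
                result[name] = sorted(missing)
    return result

# Sample copied from the trin-09 listing above.
SAMPLE = """\
type             level name      id       state node id local_done
fence            0     default   00010001 none
[1 2 3]
dlm              1     clvmd     00010003 none
[1 2 3]
dlm              1     bobs_gfs  00030001 none
[1 3]
gfs              2     bobs_gfs  00010002 none
[1 2 3]
"""

print(missing_dlm_members(parse_groups(SAMPLE)))
```

For the trin-09 listing this reports node 2 as a member of the gfs group but not of the dlm group, which is presumably trin-10, the only node whose own listing lacks the dlm bobs_gfs entry.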

Dave, I can work on this if you're too busy with other things.
Comment 1 Robert Peterson 2007-03-05 18:26:22 EST
Created attachment 149305 [details]
group_tool dump gfs for node that failed to mount
Comment 2 Robert Peterson 2007-03-05 18:33:26 EST
Created attachment 149307 [details]
group_tool dump from trin-09
Comment 3 Robert Peterson 2007-03-05 18:34:59 EST
Created attachment 149308 [details]
group_tool dump from trin-10
Comment 4 Robert Peterson 2007-03-05 18:35:58 EST
Created attachment 149309 [details]
group_tool dump from trin-11
Comment 5 Robert Peterson 2007-03-05 18:39:36 EST
Created attachment 149310 [details]
group_tool dump gfs from trin-09
Comment 6 Robert Peterson 2007-03-05 18:40:44 EST
Created attachment 149311 [details]
group_tool dump gfs from trin-11
Comment 7 David Teigland 2007-03-05 20:57:42 EST
mount(2) returned EBUSY, you should look for a kernel error
in dmesg.
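
For reference, the numeric codes in this report can be mapped back to their symbolic names. The sketch below uses only Python's standard `errno` and `os` modules; `describe` is a hypothetical helper name. Errno numbering is platform-specific, so the mapping is meaningful only when run on Linux, where 16 is EBUSY and 19 is ENODEV.

```python
import errno
import os

def describe(code):
    """Map a negative errno-style return code to (symbolic name, message)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)

# The two codes seen in this bug: the join error and the missing-module error.
for code in (-16, -19):
    name, message = describe(code)
    print(f"{code}: {name} ({message})")
```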
Comment 8 Robert Peterson 2007-03-06 13:32:23 EST
There were absolutely no useful messages in dmesg or syslog:

DLM (built Feb 26 2007 10:46:01) installed
GFS2 (built Mar  5 2007 16:15:40) installed
Lock_DLM (built Feb 26 2007 10:46:42) installed
dlm: connecting to 3
dlm: got connection from 3
dlm: connecting to 1
dlm: got connection from 1
GFS <CVS> (built Mar  5 2007 16:48:44) installed
Comment 9 Robert Peterson 2007-03-06 14:16:13 EST
I think this might have something to do with my first mount attempt,
which had failed prior to this.  If I recall correctly, the actual 
sequence of events was something like this:

1. rebooted all nodes.
2. service cman start on all nodes
3. service clvmd start on all nodes
4. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1 from all nodes
5. All nodes gave me a mount error (-19 perhaps?) on the first attempt
   because mount couldn't find the gfs.ko module, since
   I was running on a newly compiled upstream kernel.
6. cd /home/devel/cluster
7. ./configure --kernel_src=/home/devel/gfs2-2.6.steves/
8. make
9. make install
10. depmod -a
11. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1 from trin-09 (successful)
12. cd /mnt/gfs1 from trin-09
13. du -sh from trin-09
14. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1 from trin-11 (successful)
15. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1 from trin-10 (unsuccessful)

In theory, the first (failed) mount attempt should have taken trin-10
back out of the mount group.  The second attempt's kernel mount shouldn't
have failed, or at least there should have been a dmesg as you said
to indicate why it failed.

I tried to recreate the problem by rebooting trin-10, deleting the
gfs.ko module, and trying the mount first without gfs.ko (getting the
expected error) then trying the second mount with gfs.ko in place.
Unfortunately, the problem did not recreate.  The first failed mount
successfully removed trin-10 from the mount group as expected, 
clearing the way for the second mount, which was successful.

I'll reboot all three nodes and try again to recreate the scenario
with pretty much the sequence of events listed above.
Comment 10 Robert Peterson 2007-03-06 14:24:17 EST
The good news is that the problem recreates with the sequence given
in comment #9 above.  Again, nothing in dmesg to indicate the problem.
I'll try and track down what's going on.
Comment 11 Robert Peterson 2007-03-06 14:40:37 EST
Unfortunately, I've tried two more times to recreate the scenario
in comment #9 and now it won't recreate.  I'll try again later when
I get time.  That makes me wonder if it has something to do with
recovery.  Either that, or it's timing related.  Perhaps my timing
was different on my second two attempts.
Comment 12 Robert Peterson 2007-03-12 13:20:23 EDT
I was able to recreate this problem again after a fresh reboot of
all cluster nodes.  I recreated it by simply doing this sequence of
commands on all three cluster nodes simultaneously:

1. cd /home/devel/gfs2-2.6.steves.git (kernel source)
2. make modules
3. make modules_install (which deletes references to gfs.ko)
4. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1
   (Since gfs.ko isn't there, mount gives error 19)
5. cd /home/devel/cluster
6. make install
7. depmod -a
   (Now gfs.ko is there, so it should be okay to mount).
8. mount -tgfs /dev/bob_vg/lvol0 /mnt/gfs1

This occasionally gives the error in question (error 16) on one of the 
nodes.

I did a little backtracking and this looks like it is directly related 
to comment #16 of this bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=218560
Looking at the gfs_controld log, the sequence of events went something
like this:

1. The first mount (step 4) was unsuccessful because the gfs.ko
   module was gone.  It created the mount group anyway.
2. do_unmount was called with mnterr != 0 (it was 19).  This was
   for the initial mount (first_mounter == 1).
3. As stated in comment #16 of 218560, since mnterr != 0, it did a
   goto around the code that would have otherwise deleted the mount
   group.
4. The second mount found the mount group, but not the mount point.

Note that the nodes that could successfully mount the second time
didn't have this problem, despite the fact that do_unmount was called
with mnterr == 19.  In their gfs_controld logs, they have these
additional lines after do_unmount:

1173714141 bobs_gfs receive_mount_status from 2 len 288 last_cb 3
1173714141 bobs_gfs _receive_mount_status from 2 kernel_mount_error -1 first_mounter 1 opts 9
1173714141 bobs_gfs no next mounter available yet

Since all nodes were trying to mount simultaneously, all the nodes
thought they were the first mounter.  As each failed its mount, they
thought that the next low node should be made the first mounter.  But
all of them were destined to fail because of the missing kernel module.
Perhaps these nodes cleaned up their mount group accordingly in that
section of code, but I haven't traced it back that far yet.

So these are the options I see for fixing the code:
 
(a) Fix do_unmount as I recommended in comment #16 of 218560,
(b) Fix mount.gfs2.c so that it doesn't have this behavior, possibly so 
    that dlm is only mounted after a successful kernel mount.
(c) Fix it elsewhere in dlm to clean up its mount point after the failure,
(d) Keep the mount group out there and re-use it for the next mount
    (which is my least favorite option).

I still favor option (a).  I'll defer that decision to Dave T.
Since I've only seen this problem when the gfs kernel module appears
and disappears, I consider it low priority, although there may be other
cases where it could get into the same problem.
Comment 13 David Teigland 2007-03-20 18:12:07 EDT
I'm trying to zero in on the exact problem.  From the info above it
sounds like:

1. mount.gfs calls lock_dlm_join() which joins mountgroup
2. mount.gfs calls mount(2)
3. mount(2) returns ENODEV because gfs.ko is missing
4. mount.gfs does lock_dlm_leave() to leave the mountgroup
   [missing info about the result of this]
5. mount.gfs exits
6. insmod gfs.ko
7. mount.gfs calls lock_dlm_join() which gets EBUSY when it
   tries to join the mountgroup

If this is an accurate description of the problem, then my question is:
what might be going wrong with leaving the mountgroup in step 4?  I'd
like to see the messages that are going between mount.gfs and gfs_controld
during steps 3, 4 and 7 (the -v option shows those).
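
The seven steps above can be sketched as a toy model. This is not gfs_controld or mount.gfs code: `ToyGroupDaemon`, `toy_mount` and the `leave_cleans_up` flag are hypothetical names, and the flag simply stands in for the unknown outcome of step 4. The point is only to show that a leave which does not remove the mountgroup is enough to produce -19 on the first attempt and -16 on the second.

```python
import errno

class ToyGroupDaemon:
    """Toy stand-in for mountgroup bookkeeping (not the real daemon logic)."""

    def __init__(self, leave_cleans_up=True):
        self.mountgroups = set()
        self.leave_cleans_up = leave_cleans_up

    def join(self, name):
        if name in self.mountgroups:
            return -errno.EBUSY  # stale entry left by an earlier attempt
        self.mountgroups.add(name)
        return 0

    def leave(self, name):
        # Assumption: when the flag is False, the leave silently does nothing.
        if self.leave_cleans_up:
            self.mountgroups.discard(name)

def toy_mount(daemon, name, module_loaded):
    """Join the mountgroup, attempt the kernel mount, leave on failure."""
    rv = daemon.join(name)           # steps 1 and 7
    if rv < 0:
        return rv
    if not module_loaded:            # step 3: mount(2) fails, module missing
        daemon.leave(name)           # step 4
        return -errno.ENODEV
    return 0

def two_attempts(leave_cleans_up):
    """First attempt without the module, second attempt with it loaded."""
    daemon = ToyGroupDaemon(leave_cleans_up)
    first = toy_mount(daemon, "bobs_gfs", module_loaded=False)
    second = toy_mount(daemon, "bobs_gfs", module_loaded=True)
    return first, second

print(two_attempts(leave_cleans_up=True))
print(two_attempts(leave_cleans_up=False))
```

With a working leave the second attempt succeeds; with a leave that has no effect, the second join returns the EBUSY value, matching the symptom reported on trin-10.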
Comment 14 Robert Peterson 2007-03-29 15:42:39 EDT
Created attachment 151223 [details]
Script that contains the commands to recreate

I finally got this to reproduce a couple times, but it's tricky.
It doesn't seem to recreate if I use the attached script, but it
does occasionally if I cut and paste commands from the script and 
execute them one by one simultaneously with cssh.

I captured the output from mount -vvvv from all machines, and I'll
attach the output shortly.

I also got the output from gfs_controld -D but it's not too much
different from the "group_tool dump gfs" files attached above.
The gfs_controld output I captured doesn't match the mount -vvvv
output because they were taken in different recreations.
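
Since the failure seems to depend on the nodes acting at nearly the same instant, one way to approximate the cssh behaviour from a script is to release one thread per node from a barrier. This is a sketch only: `run_simultaneously` and `fake_mount` are hypothetical names, the runner is a placeholder rather than a real remote command, and whether this tightens the timing enough to hit the problem is untested.

```python
import threading

def run_simultaneously(hosts, run_on_host):
    """Call run_on_host(host) for every host, releasing all threads at once."""
    barrier = threading.Barrier(len(hosts))
    results = {}
    lock = threading.Lock()

    def worker(host):
        barrier.wait()  # block until every thread has reached this point
        outcome = run_on_host(host)
        with lock:
            results[host] = outcome

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def fake_mount(host):
    # Placeholder runner; a real one might invoke ssh through subprocess.run().
    return f"would run mount on {host}"

print(run_simultaneously(["trin-09", "trin-10", "trin-11"], fake_mount))
```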
Comment 15 Robert Peterson 2007-03-29 15:43:51 EDT
Created attachment 151224 [details]
Output from mount -vvvv on trin-09 (worked)
Comment 16 Robert Peterson 2007-03-29 15:46:50 EDT
Created attachment 151225 [details]
Output from 2nd mount -vvvv on trin-09 (succ)

Incidentally, the .1 indicates the first mount that got expected error -19.
The .2 files are the output from the second mount that got -16 on trin-10
and was successful on all the others.
Comment 17 Robert Peterson 2007-03-29 15:47:32 EDT
Created attachment 151226 [details]
Output from mount -vvvv on trin-10 (expected -19)
Comment 18 Robert Peterson 2007-03-29 15:48:22 EDT
Created attachment 151227 [details]
Output from mount -vvvv on trin-10 (error -16)
Comment 19 Robert Peterson 2007-03-29 15:49:45 EDT
Created attachment 151228 [details]
Output from mount -vvvv on trin-11 (expected -19)
Comment 20 Robert Peterson 2007-03-29 15:50:39 EDT
Created attachment 151229 [details]
Output from mount -vvvv on trin-11 (worked)
Comment 21 Robert Peterson 2007-03-29 15:56:47 EDT
Created attachment 151230 [details]
"group_tool dump gfs" for trin-10 after error -16 received

This is the output from "group_tool dump gfs" from this latest
recreation of the problem on trin-10.
Comment 22 Kiersten (Kerri) Anderson 2007-04-23 12:51:07 EDT
Fixing Product Name.  Cluster Suite was merged into Red Hat Enterprise Linux for
5.0.  In addition, dlm, fence and ccs were merged into the cman package, so
bugzilla should reflect the package name where those utilities are located.
Comment 23 David Teigland 2007-12-04 12:57:29 EST
This has probably been fixed.
Comment 24 David Teigland 2008-06-24 13:30:20 EDT
It's been a year and a half since this was last seen, closing.

