Bug 254110

Summary: failed start of service because shared resource gfs already mounted
Product: Red Hat Enterprise Linux 5
Reporter: Herbert L. Plankl <h.plankl>
Component: rgmanager
Assignee: Lon Hohberger <lhh>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: low
Version: 5.0
CC: cluster-maint
Hardware: All   
OS: Linux   
Fixed In Version: RHBA-2008-0353
Doc Type: Bug Fix
Last Closed: 2008-05-21 14:30:31 UTC
Attachments:
  Wait + retry in case the mount is in progress.
  Replacement patch with fixed typo

Description Herbert L. Plankl 2007-08-24 07:22:54 UTC
Description of problem:
I'm using Red Hat Cluster Suite 5 in a two-node cluster. Two services use one
shared GFS partition mounted on /opt/icoserve. The configuration of the shared
resource looks like this:

                <resources>
                        <clusterfs device="/dev/icoserve/opt_icoserve"
force_unmount="0" fsid="63088" fstype="gfs" mountpoint="/opt/icoserve"
name="opt_icoserve" options=""/>
                </resources>

The services are:

                <service autostart="1" domain="n1_n2" exclusive="0"
name="testscript1" recovery="restart">
                        <ip address="192.168.111.221" monitor_link="1"/>
                        <script file="/etc/init.d/testscript1" name="testscript1"/>
                        <clusterfs ref="opt_icoserve"/>
                </service>
                <service autostart="1" domain="n1_n2" exclusive="0"
name="testscript2" recovery="restart">
                        <ip address="192.168.111.222" monitor_link="1"/>
                        <script file="/etc/init.d/testscript2" name="testscript2"/>
                        <clusterfs ref="opt_icoserve"/>
                </service>

When the cluster starts, it tries to start the services. One service starts
correctly, but the other fails to start because the GFS partition is already
mounted by the first service. Here is the log output:

Aug 24 09:04:24 rhel5n1 clurgmgrd[6337]: <notice> Resource Group Manager Starting
Aug 24 09:04:29 rhel5n1 clurgmgrd[6337]: <notice> Starting stopped service
service:testscript1
Aug 24 09:04:29 rhel5n1 clurgmgrd[6337]: <notice> Starting stopped service
service:testscript2
Aug 24 09:04:31 rhel5n1 kernel: Trying to join cluster "lock_dlm",
"forty_two:opt_icoserve"
Aug 24 09:04:31 rhel5n1 kernel: Joined cluster. Now mounting FS...
Aug 24 09:04:31 rhel5n1 gfs_controld[6233]: mount point /opt/icoserve already used
Aug 24 09:04:31 rhel5n1 clurgmgrd: [6337]: <err> 'mount -t gfs 
/dev/mapper/icoserve-opt_icoserve /opt/icoserve' failed, error=1
Aug 24 09:04:31 rhel5n1 clurgmgrd[6337]: <notice> start on clusterfs
"opt_icoserve" returned 2 (invalid argument(s))
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=0:
Trying to acquire journal lock...
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=0:
Looking at journal...
Aug 24 09:04:31 rhel5n1 clurgmgrd[6337]: <warning> #68: Failed to start
service:testscript1; return value: 1
Aug 24 09:04:31 rhel5n1 clurgmgrd[6337]: <notice> Stopping service
service:testscript1
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=0: Done
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=1:
Trying to acquire journal lock...
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=1:
Looking at journal...
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=1: Done
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=2:
Trying to acquire journal lock...
Aug 24 09:04:31 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=2:
Looking at journal...
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=2: Done
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=3:
Trying to acquire journal lock...
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=3:
Looking at journal...
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=3: Done
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=4:
Trying to acquire journal lock...
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=4:
Looking at journal...
Aug 24 09:04:32 rhel5n1 kernel: GFS: fsid=forty_two:opt_icoserve.0: jid=4: Done
Aug 24 09:04:32 rhel5n1 clurgmgrd[6337]: <notice> Service service:testscript1 is
recovering
Aug 24 09:04:32 rhel5n1 clurgmgrd[6337]: <warning> #71: Relocating failed
service service:testscript1
Aug 24 09:04:32 rhel5n1 clurgmgrd[6337]: <notice> Stopping service
service:testscript1
Aug 24 09:04:33 rhel5n1 avahi-daemon[2400]: Registering new address record for
192.168.111.222 on bond0.
Aug 24 09:04:33 rhel5n1 clurgmgrd[6337]: <notice> Service service:testscript1 is
stopped
Aug 24 09:04:34 rhel5n1 clurgmgrd[6337]: <notice> Service service:testscript2
started


A manual start using "clusvcadm -e testscript1" starts the service.

Version-Release number of selected component (if applicable):
cman-2.0.70-1.el5
rgmanager-2.0.28-1.el5
system-config-cluster-1.0.50-1.2
ricci-0.10.0-2.el5
luci-0.10.0-2.el5


How reproducible:
Two-node cluster with two services using the same shared GFS partition,
configured to initially start on the same node.

Steps to Reproduce:
1. start cluster
2. check logfile
  
Actual results:
One service fails to start; if the second cluster member is down, it stays
stopped until the admin starts it manually.

Expected results:
The service starts correctly.

Additional info: This is a big problem for us. We have more services using the
same GFS partition, so it is possible that a service does not start. Our
customers are hospitals, and they really need HA.

Comment 1 Lon Hohberger 2007-09-18 20:49:59 UTC
Rgmanager is supposed to check whether the device is already mounted at the
given location; apparently, it did not do that correctly.

That is, the mount command should never have been run in this case.
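
In the log above, the mount command spawned by rgmanager uses
/dev/mapper/icoserve-opt_icoserve, while cluster.conf references
/dev/icoserve/opt_icoserve, so any "is it already mounted?" check has to
canonicalize the device path before comparing against /proc/mounts. A minimal
sketch of such a check, written as an illustrative shell fragment rather than
the actual rgmanager code:

    # Illustrative sketch only: is device $1 already mounted at mount point $2?
    # Canonicalize both device paths so the LVM symlink
    # /dev/icoserve/opt_icoserve compares equal to the
    # /dev/mapper/icoserve-opt_icoserve name seen in the log.
    is_mounted() {
        dev=$(readlink -f "$1")
        mp="$2"
        while read -r mdev mmp _rest; do
            if [ "$(readlink -f "$mdev")" = "$dev" ] && [ "$mmp" = "$mp" ]; then
                return 0
            fi
        done < /proc/mounts
        return 1
    }

    # Usage: skip the mount if the file system is already there.
    is_mounted /dev/icoserve/opt_icoserve /opt/icoserve || \
        mount -t gfs /dev/icoserve/opt_icoserve /opt/icoserve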

Comment 2 Lon Hohberger 2007-09-20 14:03:43 UTC
Aug 24 09:04:31 rhel5n1 kernel: Joined cluster. Now mounting FS...
Aug 24 09:04:31 rhel5n1 gfs_controld[6233]: mount point /opt/icoserve already used

Ok, I figured this out: there's a race between mounting the GFS volume and
joining the cluster.  That is, we queued up a mount request already (which was
getting processed - do you have gfs volumes in /etc/fstab?).

However, since the mount had *not* completed yet, it did not show up in
/proc/mounts.  The second mount command (i.e. the one spawned by rgmanager)
failed because the first mount had not completed.  Rgmanager therefore
treated this as a start failure because:

  (a) The file system was not mounted yet, and
  (b) the file system could not be mounted on request from rgmanager

There are a number of possible ways to solve this:

  (a) Have gfs_controld buffer identical mount point / device requests
      and respond to all when the initial mount completes 
  (b) have rgmanager retry the mount (blindly), or
  (c) have rgmanager watch the service group to wait for it to finish
      transitioning when trying to mount gfs volumes.
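
Option (b) is the approach of the patch eventually attached in comment 8
("Wait + retry in case the mount is in progress").  A minimal sketch of such a
wait-and-retry loop, as an illustrative shell fragment rather than the actual
patch (the helper name, retry count, and sleep interval are assumptions):

    # Illustrative sketch only, not the actual rgmanager patch.  Retry the GFS
    # mount a few times, since a mount queued by another service on the same
    # mount point may still be in progress.
    do_mount_with_retry() {
        dev="$1"; mp="$2"; tries=0; max_tries=5

        while [ "$tries" -lt "$max_tries" ]; do
            # If the file system is already mounted at the mount point
            # (e.g. the other service's mount finished), treat it as success.
            if awk -v m="$mp" '$2 == m { found=1 } END { exit !found }' /proc/mounts; then
                return 0
            fi

            # Try the mount ourselves; success means we won the race.
            if mount -t gfs "$dev" "$mp"; then
                return 0
            fi

            # The mount failed, possibly because another mount of the same
            # file system is still in flight.  Wait briefly and re-check.
            sleep 2
            tries=$((tries + 1))
        done

        return 1
    }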

Comment 3 David Teigland 2007-09-20 16:27:48 UTC
It appears to me that gfs_controld and mount.gfs are working as they should.
The first mount should succeed and the second should fail.

It sounds like rgmanager checks whether the fs is mounted by looking in
/proc/mounts, and if it's not, tries to mount it.  So, I can believe that the
following is quite possible:
1) mount gfs begins and goes somewhat slowly
2) rgmanager looks in /proc/mounts and doesn't find gfs
3) gfs appears in /proc/mounts
4) rgmanager tries to mount and fails

Perhaps mount could return a specific error indicating that a mount
for that fs is already in progress?  (I think it already may.)
And rgmanager could check for that error?


Comment 4 Lon Hohberger 2007-09-24 20:25:24 UTC
That would be fine, David - and is probably the correct progress indicator. 
Does the mount command have such an indication already?

Comment 5 Lon Hohberger 2007-09-24 20:26:20 UTC
More clearly - Is there already a standard error code from the mount command
that indicates 'mount of this device to that point is already in progress' ?

Comment 6 Lon Hohberger 2007-09-24 20:29:13 UTC
Man page says:

       mount has the following return codes (the bits can be ORed):
       0      success
       1      incorrect invocation or permissions
       2      system error (out of memory, cannot fork, no more loop devices)
       4      internal mount bug or missing nfs support in mount
       8      user interrupt
       16     problems writing or locking /etc/mtab
       32     mount failure
       64     some mount succeeded
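
Since these codes can be ORed together, a caller has to test individual bits
rather than compare the whole value.  An illustrative shell sketch of decoding
the bits relevant here (not rgmanager code):

    # Illustrative only: decode the ORed return code bits from mount(8).
    mount -t gfs /dev/mapper/icoserve-opt_icoserve /opt/icoserve
    rc=$?

    if [ "$rc" -eq 0 ]; then
        echo "mount succeeded"
    else
        [ $((rc & 1))  -ne 0 ] && echo "incorrect invocation or permissions"
        [ $((rc & 32)) -ne 0 ] && echo "mount failure"
        [ $((rc & 64)) -ne 0 ] && echo "some mount succeeded"
    fi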




Comment 7 RHEL Program Management 2007-10-16 03:44:38 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Lon Hohberger 2007-11-14 19:31:37 UTC
Created attachment 258611 [details]
Wait + retry in case the mount is in progress.

Comment 9 Lon Hohberger 2007-11-14 19:42:46 UTC
Created attachment 258671 [details]
Replacement patch with fixed typo

Comment 12 errata-xmlrpc 2008-05-21 14:30:31 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0353.html