Bug 254110
Summary: | failed start of service because shared resource gfs already mounted | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Herbert L. Plankl <h.plankl>
Component: | rgmanager | Assignee: | Lon Hohberger <lhh>
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list>
Severity: | medium | Priority: | low
Version: | 5.0 | CC: | cluster-maint
Hardware: | All | OS: | Linux
Fixed In Version: | RHBA-2008-0353 | Doc Type: | Bug Fix
Last Closed: | 2008-05-21 14:30:31 UTC | |
Description
Herbert L. Plankl
2007-08-24 07:22:54 UTC
Rgmanager is supposed to check to see if the device is already mounted in the location given; apparently, it didn't correctly do that. That is, the mount command should never have been run in this case.

Aug 24 09:04:31 rhel5n1 kernel: Joined cluster. Now mounting FS...
Aug 24 09:04:31 rhel5n1 gfs_controld[6233]: mount point /opt/icoserve already used

Ok, I figured this out: there's a race between mounting the GFS volume and joining the cluster. That is, we queued up a mount request already (which was getting processed - do you have gfs volumes in /etc/fstab?). However, since the mount had *not* completed yet, it didn't show up in /proc/mounts yet. The second mount command (e.g. the one spawned by rgmanager) failed because the first mount command had not completed.

However, rgmanager thought this was a start failure because:
(a) The file system was not mounted yet, and
(b) the file system could not be mounted on request from rgmanager

There are a number of possible ways to solve this:
(a) Have gfs_controld buffer identical mount point / device requests and respond to all when the initial mount completes
(b) have rgmanager retry the mount (blindly), or
(c) have rgmanager watch the service group to wait for it to finish transitioning when trying to mount gfs volumes.

It appears to me that gfs_controld and mount.gfs are working as they should. The first mount should succeed and the second should fail. It sounds like rgmanager checks if the fs is mounted by looking in /proc/mounts, and if it's not will try to mount it. So, I can believe that the following is quite possible:
1) mount gfs begins and goes somewhat slowly
2) rgmanager looks in /proc/mounts and doesn't find gfs
3) gfs appears in /proc/mounts
4) rgmanager tries to mount and fails

Perhaps mount could return a specific error indicating that a mount for that fs is already in progress? (I think it already may.) And rgmanager could check for that error?
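To make the /proc/mounts check discussed above concrete, here is a minimal Python sketch. It is not rgmanager's actual code (the resource agents are not shown in this report); the function name `is_mounted` and the device path `/dev/vg0/gfs` are hypothetical, and only the mount point `/opt/icoserve` comes from the log above. It assumes the usual /proc/mounts layout of whitespace-separated fields with the device first and the mount point second, and ignores escaped characters in paths.

```python
def is_mounted(device, mountpoint, mounts_text):
    """Return True if `device` is listed as mounted on `mountpoint`.

    `mounts_text` is the content of /proc/mounts: one mount per line,
    device in the first field, mount point in the second.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        same_device = fields[0] == device
        same_point = fields[1].rstrip("/") == mountpoint.rstrip("/")
        if same_device and same_point:
            return True
    return False


# Hypothetical snapshots taken before and after the first mount completes.
before = "/dev/sda1 / ext3 rw 0 0\n"
after = before + "/dev/vg0/gfs /opt/icoserve gfs rw 0 0\n"

print(is_mounted("/dev/vg0/gfs", "/opt/icoserve", before))  # False
print(is_mounted("/dev/vg0/gfs", "/opt/icoserve", after))   # True
```

A check like this only sees completed mounts, which is why a mount that is still in progress looks the same as no mount at all in step 2 of the sequence above.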
That would be fine, David - and is probably the correct progress indicator. Does the mount command have such an indication already?

More clearly - Is there already a standard error code from the mount command that indicates 'mount of this device to that point is already in progress' ?

Man page says: mount has the following return codes (the bits can be ORed):
0 success
1 incorrect invocation or permissions
2 system error (out of memory, cannot fork, no more loop devices)
4 internal mount bug or missing nfs support in mount
8 user interrupt
16 problems writing or locking /etc/mtab
32 mount failure
64 some mount succeeded

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Created attachment 258611
Wait + retry in case the mount is in progress.
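For reference, the return codes quoted from the mount man page earlier in this thread can be decoded as a bitmask. The following Python sketch is only an illustration; the names `MOUNT_STATUS_BITS` and `decode_mount_status` are hypothetical, and the descriptions are copied from the list quoted in this report.

```python
# Bit values and descriptions as quoted from the mount man page above.
MOUNT_STATUS_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug or missing nfs support in mount",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def decode_mount_status(code):
    """Return the descriptions for every bit set in a mount exit status."""
    if code == 0:
        return ["success"]
    return [text for bit, text in sorted(MOUNT_STATUS_BITS.items()) if code & bit]


print(decode_mount_status(32))  # ['mount failure']
print(decode_mount_status(48))  # ['problems writing or locking /etc/mtab', 'mount failure']
```

None of the listed codes names an in-progress mount specifically, so going by this list alone a caller cannot tell "mount already in progress" apart from any other mount failure.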
Created attachment 258671
Replacement patch with fixed typo
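The attached patches are not reproduced in this report. As a rough illustration of the wait-and-retry idea only, and not the actual patch, the Python sketch below re-checks the mount state after a failed mount attempt instead of treating the first failure as a start failure. The function name `mount_with_retry`, the callables `try_mount` and `is_mounted`, and the `retries` and `delay` values are all hypothetical.

```python
import time


def mount_with_retry(try_mount, is_mounted, retries=3, delay=1.0, sleep=time.sleep):
    """Try to mount, tolerating a mount that is already in progress.

    try_mount()  -> int exit status of the mount attempt (0 means success)
    is_mounted() -> bool, True once the file system shows up as mounted
    """
    if is_mounted():
        return True  # nothing to do, already mounted
    for _ in range(retries):
        if try_mount() == 0:
            return True
        # The mount failed. Another mount of the same file system may still
        # be in progress, so wait and look again before giving up.
        sleep(delay)
        if is_mounted():
            return True
    return False


# Simulated race: our own mount attempts fail (status 32), but the earlier
# mount completes while we wait, so the start is still reported as a success.
states = iter([False, False, True])
result = mount_with_retry(lambda: 32, lambda: next(states), sleep=lambda s: None)
print(result)  # True
```

Passing the mount and check steps in as callables keeps the sketch independent of how the real agent invokes mount or reads /proc/mounts.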
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0353.html