Bug 1035042
| Summary: | Despite glusterd init script now starting before netfs, netfs fails to mount localhost glusterfs shares in RHS 2.1 | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | James Hartsock <hartsjc> | |
| Component: | glusterd | Assignee: | Raghavendra Talur <rtalur> | |
| Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 2.1 | CC: | agunn, amukherj, barumuga, bhubbard, dblack, fharshav, hamiller, ira, knoha, kschinck, lherbolt, ndevos, nsathyan, psriniva, rtalur, sdharane, ssamanta, vagarwal, vbellur | |
| Target Milestone: | --- | Keywords: | Patch | |
| Target Release: | RHGS 3.0.0 | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.6.0.9-1.el6rhs | Doc Type: | Bug Fix | |
| Doc Text: | Previously, entries in /etc/fstab for glusterfs mounts did not have the _netdev option. This led to some systems becoming unresponsive. With this fix, the hook scripts define the '_netdev' option for glusterFS mounts in /etc/fstab and the mount operation succeeds. | Story Points: | --- | |
| Clone Of: | ||||
| : | 1075182 1180137 (view as bug list) | Environment: | ||
| Last Closed: | 2014-09-22 19:29:47 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1061468, 1073815, 1075182 | |||
Description
James Hartsock
2013-11-26 23:04:02 UTC
From a customer: The file /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh is also wrong, as it doesn't add the _netdev option when generating the entry for /etc/fstab. And in fact none of the hook scripts do:

[root@lp-rhs-02 1]# grep -R defaults *
start/post/K29CTDBsetup.sh.rpmsave.rpmsave: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"
start/post/K29CTDBsetup.sh.rpmsave: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"
start/post/S29CTDBsetup.sh: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"
stop/pre/K30samba-stop.sh.rpmsave.rpmsave: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"
stop/pre/K29CTDB-teardown.sh.rpmsave: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"
stop/pre/S29CTDB-teardown.sh: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"
stop/pre/K29CTDB-teardown.sh.rpmsave.rpmsave: mntent="`hostname`:/$volname $mntpt glusterfs defaults,transport=tcp 0 0"

From SFDC 01050444:
Looks like when the CTDB lock volume is started, the S29CTDBsetup.sh script adds that line to /etc/fstab if it is not already there. So when you finally get the server up and check /etc/fstab, there are two entries (one modified and one original).
I modified that part of the script and it appears to be working.
Here is the modified function from the script:
function add_fstab_entry () {
        volname=$1
        mntpt=$2
        # Include _netdev so the mount is deferred until the network is up.
        mntent="`hostname`:/$volname $mntpt glusterfs _netdev,defaults,transport=tcp 0 0"
        # Append the entry only if an identical line is not already present.
        exists=`grep "^$mntent" /etc/fstab`
        if [ "$exists" == "" ]
        then
                echo "$mntent" >> /etc/fstab
        fi
}
Before, the _netdev was missing.
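For illustration, this is the shape of the /etc/fstab entry the hook script generates before and after the change; the hostname, volume name, and mount point below are placeholders, not values from this case.

```
# Generated by the original S29CTDBsetup.sh (no _netdev):
rhs-node-01:/ctdb-lock  /gluster/lock  glusterfs  defaults,transport=tcp  0 0

# Generated with the fix; _netdev tells the init scripts to defer the
# mount until the network (and the netfs service) is available:
rhs-node-01:/ctdb-lock  /gluster/lock  glusterfs  _netdev,defaults,transport=tcp  0 0
```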
For what it's worth, I've tested Harold's suggestion against a RHS 2.1.2 (glusterfs 3.4.0.59rhs) environment with a CTDB lock volume, and in my lab, adding the '_netdev' option to the 'add_fstab_entry' and 'remove_fstab_entry' functions in both scripts (S29CTDBsetup.sh and S29CTDB-teardown.sh respectively) allows you to locally mount a "glusterfs" filesystem. This was persistent through an entire reboot of a 4 node Gluster cluster.

*** Bug 1074316 has been marked as a duplicate of this bug. ***

Setting flags required to add BZs to RHS 3.0 Errata.

The change that has been committed is related to the _netdev option. This should resolve this issue for most customers. In case the network is initialized a little slowly, it may be required to add a LINKDELAY parameter in the /etc/sysconfig/network-scripts/ifcfg-* file(s), as explained here:
- http://mjanja.co.ke/2014/04/glusterfs-mounts-fail-at-boot-on-centos/

Oh, it can also be the case that glusterd is not starting the glusterfsd (brick) processes quickly enough. glusterd starts these processes in the background, after the service script has exited. It may be required to start the brick processes first and have glusterd wait before becoming a daemon. This would likely be a change that needs some more work.

Niels, w.r.t. comment 10, I think it is not necessary for the glusterfsd processes to be started for the mount to succeed. As long as glusterfs can talk to glusterd, it will return success. You are right about the network initialization in comment 9. Will it do if we create a knowledge base article for that, since it is not a code change and not every user will face the issue? If yes, then I will create a doc bug for that and let this bug be verified. Let me know what you think.

The patch URL provided in comment 5 says "Review in Progress". Is this patch merged? It would be more helpful to add comment 9 as a KB article, as suggested by Raghavendra Talur in comment 11. I am not sure how to do this. Niels, what is the procedure to add a KBase article?

Ignore comment 5, we had posted that downstream for the rhs-3.0 branch before it was decided that we will follow the upstream strategy. The corresponding upstream patch http://review.gluster.org/#/c/7221/ got merged before we pulled upstream code for rhs-3.0. The patch exists downstream.

(In reply to SATHEESARAN from comment #14)
> It would be more helpful to add comment 9 as a KB article, as suggested by
> Raghavendra Talur in comment 11.
>
> I am not sure how to do this.
>
> Niels, what is the procedure to add a KBase article?

In the 'external trackers' for the bug, we already have a knowledge base solution linked:
- https://access.redhat.com/site/solutions/747673

You should have a login for the Red Hat Customer Portal; the login would look something like rhn-qa-sasundar (mine is rhn-support-ndevos).

At the moment, the LINKDELAY option is not mentioned in the article. Do you want to add that, or shall I do that?
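For reference, a minimal sketch of the LINKDELAY workaround mentioned in comment 9; the interface name and delay value below are illustrative assumptions, not settings taken from this case.

```
# /etc/sysconfig/network-scripts/ifcfg-eth0 (illustrative)
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
# Wait an extra 10 seconds for the link to come up before continuing,
# giving glusterd time to be reachable when netfs mounts _netdev entries.
LINKDELAY=10
```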
(In reply to Niels de Vos from comment #16)
> (In reply to SATHEESARAN from comment #14)
> > It would be more helpful to add comment 9 as a KB article, as suggested by
> > Raghavendra Talur in comment 11.
> >
> > I am not sure how to do this.
> >
> > Niels, what is the procedure to add a KBase article?
>
> In the 'external trackers' for the bug, we already have a knowledge base
> solution linked:
> - https://access.redhat.com/site/solutions/747673
>
> You should have a login for the Red Hat Customer Portal; the login would
> look something like rhn-qa-sasundar (mine is rhn-support-ndevos).
>
> At the moment, the LINKDELAY option is not mentioned in the article. Do
> you want to add that, or shall I do that?

It would be good if you could take it up, as I am very new to writing a KBase article.

(In reply to Raghavendra Talur from comment #15)
> Ignore comment 5, we had posted that downstream for the rhs-3.0 branch
> before it was decided that we will follow the upstream strategy.
>
> The corresponding upstream patch http://review.gluster.org/#/c/7221/ got
> merged before we pulled upstream code for rhs-3.0. The patch exists
> downstream.

Thanks for the reply.

Tested with glusterfs-3.6.0.22-1.el6rhs. Followed the steps below:
0. Set up a 2 node cluster.
1. Created a replica volume (replica count 2).
2. Edited "/var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh" and "/var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh" on both nodes, replacing META with the volume name created in step 1.
3. Started the volume.
4. Checked /etc/fstab for the gluster mount entry. Observation: the glusterfs mount entry had the _netdev option.
5. Stopped the volume. Observation: the glusterfs mount entry was removed from /etc/fstab.

Apart from the CTDB test, a simple fstab entry for a glusterfs mount on RHEL 6.5 with the _netdev option also works well (see the sketch at the end of this report). Marking this bug as VERIFIED.

Hi Raghavendra, please review the edited doc text for technical accuracy and sign off.

Verified the doc text for technical accuracy.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html

This will be fixed in 2.1.6. The bug for that is https://bugzilla.redhat.com/show_bug.cgi?id=1180137
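As a companion to the verification above, here is a minimal, hedged sketch of the manual check on a RHEL 6 node; the hostname, volume name, and mount point are placeholders, not values from this case.

```
# After starting the volume, the hook script should have added an entry
# carrying _netdev (placeholder names shown):
grep glusterfs /etc/fstab
#   rhs-node-01:/ctdb-lock /gluster/lock glusterfs _netdev,defaults,transport=tcp 0 0

# _netdev entries are mounted by the netfs service at boot; the same
# selection can be exercised by hand:
mount -a -O _netdev

# After stopping the volume, the hook script should have removed the entry:
grep glusterfs /etc/fstab || echo "entry removed"
```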