Bug 637678
| Field | Value |
|---|---|
| Summary | service failover hangs at quotaoff in /usr/share/cluster/fs.sh |
| Product | Red Hat Enterprise Linux 5 |
| Component | rgmanager |
| Version | 5.5 |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | urgent |
| Reporter | Ryan Mitchell <rmitchel> |
| Assignee | Lon Hohberger <lhh> |
| QA Contact | Cluster QE <mspqa-list> |
| CC | amoralej, bmr, cluster-maint, djansa, edamato, jentrena, jwest, mjuricek, tao, tscherf |
| Target Milestone | rc |
| Keywords | ZStream |
| Hardware | All |
| OS | Linux |
| Fixed In Version | rgmanager-2.0.52-16.el5 |
| Doc Type | Bug Fix |
| Clone Of | 440645 |
| Bug Depends On | 440645 |
| Bug Blocks | 694731 |
| Last Closed | 2011-07-21 10:43:27 UTC |
Description
Ryan Mitchell
2010-09-27 05:48:33 UTC
Created attachment 449826
avoid running quotaoff when stopping nfs services without quotas enabled

Patch to avoid running quotaoff when stopping nfs services without quotas enabled. Running quotaoff in that case can cause hangs if the nfs server is unavailable. Ported from RHEL4 bug 440645.

I have not yet tested this against the original hang, as I can't reproduce the original issue. I have, however, tested the patch for functionality with services.

Merged

How to test

We were not able to reproduce the hang as described. However, the fix is still pertinent, since the fs.sh patch fixes the fact that we were always calling quotaoff, even when quotas were not in use. Consequently, we are testing for patch correctness rather than hang resolution.

1) Create a 2+ node cluster

2) Set up syslog so that it redirects local4 to /var/log/rgmanager:

   FOR SYSLOG:
     echo "local4.* /var/log/rgmanager" >> /etc/syslog.conf
     service syslog restart

   FOR RSYSLOG:
     echo "local4.* /var/log/rgmanager" >> /etc/rsyslog.conf
     service rsyslog restart

3) Set rgmanager's logging up so that it logs debug messages to local4 in cluster.conf:

     <rm log_facility="local4" log_level="7" >
       ...
     </rm>

4) Add a service with a file system resource. Ensure that neither usrquota nor grpquota is specified in the mount options.

     <rm log_facility="local4" log_level="7" >
       <service name="test" >
         <fs name="fs-test" device="/dev/sdb3" mountpoint="/mnt/test" />
       </service>
     </rm>

5) Enable the service.

6) Check the output of 'quota -v'. There should be no output related to the file system added in step (4).

7) Disable the service.
   * On old versions of rgmanager, quotaoff was always called when unmounting, which was the cause of this issue.
   * On the new version of rgmanager incorporating the fix, there should NOT be a log message describing quotas being disabled prior to unmounting the file system.

8) Add quota options to the file system resource. Simply add "usrquota,grpquota" to the options attribute of the file system resource:

     <fs name="fs-test" device="/dev/sdb3" options="usrquota,grpquota" mountpoint="/mnt/test" />

9) Enable the service.

10) Check the output of 'quota -v'.
   * On old and new versions of rgmanager, there should be output related to the file system added in step (4):

     [root@rhel5-1 ~]# quota -v
     Disk quotas for user root (uid 0):
          Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
           /dev/sda2   17700       0       0               6       0       0

11) Disable the service.
   * On old versions of rgmanager, quotaoff was always called when unmounting, which was the cause of this issue.
   * On the new version of rgmanager incorporating the fix, there SHOULD be a log message describing quotas being disabled prior to unmounting the file system:

     <debug> Turning off quotas for /mnt/test

Tested package rgmanager-2.0.52-19.el5; no hang was reproduced. The patch is working OK: quotaoff is called only if quotas are enabled.

service WITHOUT usrquota,grpquota fs options:

### starting service
May 13 04:07:28 a1 clurgmgrd[32686]: <notice> Starting disabled service service:test
May 13 04:07:28 a1 clurgmgrd: [32686]: <info> mounting /dev/loop0 on /mnt/test
May 13 04:07:28 a1 clurgmgrd: [32686]: <debug> mount -t ext3 /dev/loop0 /mnt/test
May 13 04:07:28 a1 clurgmgrd: [32686]: <info> quotaopts =
May 13 04:07:28 a1 clurgmgrd[32686]: <notice> Service service:test started
May 13 04:07:38 a1 clurgmgrd[32686]: <debug> 1 events processed
### stopping service
May 13 04:08:18 a1 clurgmgrd[32686]: <notice> Stopping service service:test
May 13 04:08:19 a1 clurgmgrd: [32686]: <info> unmounting /mnt/test
May 13 04:08:19 a1 clurgmgrd[32686]: <notice> Service service:test is disabled
May 13 04:08:29 a1 clurgmgrd[32686]: <debug> 1 events processed

service WITH usrquota,grpquota fs options:

### starting service
May 13 04:12:36 a1 clurgmgrd[918]: <notice> Starting disabled service service:test
May 13 04:12:36 a1 clurgmgrd: [918]: <info> mounting /dev/loop0 on /mnt/test
May 13 04:12:36 a1 clurgmgrd: [918]: <debug> mount -t ext3 -o usrquota,grpquota /dev/loop0 /mnt/test
May 13 04:12:36 a1 clurgmgrd: [918]: <info> quotaopts = gu
May 13 04:12:36 a1 clurgmgrd: [918]: <info> Enabling Quotas on /mnt/test
May 13 04:12:36 a1 clurgmgrd: [918]: <debug> quotaon -gu /mnt/test
May 13 04:12:36 a1 clurgmgrd[918]: <notice> Service service:test started
May 13 04:12:46 a1 clurgmgrd[918]: <debug> 1 events processed
### stopping service
May 13 04:13:57 a1 clurgmgrd[918]: <notice> Stopping service service:test
May 13 04:13:58 a1 clurgmgrd: [918]: <debug> Turning off quotas for /mnt/test
May 13 04:13:58 a1 clurgmgrd: [918]: <info> unmounting /mnt/test
May 13 04:13:58 a1 clurgmgrd[918]: <notice> Service service:test is disabled
May 13 04:14:08 a1 clurgmgrd[918]: <debug> 1 events processed

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1000.html
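
For reference, the behavior verified in this report (quotaoff runs at stop time only when usrquota or grpquota is among the mount options) can be sketched in shell. This is an illustrative sketch only, not the code from attachment 449826 or from fs.sh. The function names quota_flags_from_opts and stop_quota_cmd are hypothetical, only the plain usrquota and grpquota options are recognized, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper (not the actual fs.sh code): map a comma-separated
# mount options string to quota flags: "gu", "g", "u", or empty.
# The body runs in a subshell so the IFS and set -f changes stay local.
quota_flags_from_opts() (
    has_u=""
    has_g=""
    IFS=","
    set -f    # disable pathname expansion while splitting the options
    for opt in $1; do
        case "$opt" in
            usrquota) has_u="u" ;;
            grpquota) has_g="g" ;;
        esac
    done
    printf '%s' "${has_g}${has_u}"
)

# Hypothetical stop-time guard: print the quotaoff command that would be
# run, or print nothing when no quota options are present.
stop_quota_cmd() {
    mountpoint="$1"
    flags="$(quota_flags_from_opts "$2")"
    if [ -n "$flags" ]; then
        echo "quotaoff -$flags $mountpoint"
    fi
}

stop_quota_cmd /mnt/test ""                    # prints nothing
stop_quota_cmd /mnt/test "usrquota,grpquota"   # prints: quotaoff -gu /mnt/test
```

The point of the guard is that a file system mounted without quota options never reaches quotaoff at all, so a stop operation cannot block there. The flag order "gu" is chosen only to match the "quotaopts = gu" line in the logs above.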