Bug 728086 - fs-lib.sh doesn't handle mount error other than $?=1
Summary: fs-lib.sh doesn't handle mount error other than $?=1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.1
Hardware: Unspecified
OS: Linux
medium
high
Target Milestone: rc
: ---
Assignee: Fabio Massimo Di Nitto
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 756082
Reported: 2011-08-04 03:21 UTC by Etsuji Nakai
Modified: 2012-06-20 14:38 UTC (History)

Fixed In Version: resource-agents-3.9.2-11.el6
Doc Type: Bug Fix
Doc Text:
Cause: The fs-lib.sh resource agent library ignored mount errors other than '1'.
Consequence: When a mount returned an error other than 1 (as with a failed iSCSI mount), fs-lib.sh treated the mount as successful.
Fix: fs-lib.sh was changed to recognize other errors.
Result: fs-lib.sh now recognizes all errors and fails properly.
Clone Of:
Environment:
Last Closed: 2012-06-20 14:38:40 UTC


Attachments
Suggested patch for /usr/share/cluster/utils/fs-lib.sh (356 bytes, patch)
2011-08-04 03:21 UTC, Etsuji Nakai


Links
System ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHBA-2012:0947 | normal | SHIPPED_LIVE | resource-agents bug fix and enhancement update | 2012-06-19 20:59:56 UTC

Description Etsuji Nakai 2011-08-04 03:21:57 UTC
Created attachment 516605
Suggested patch for /usr/share/cluster/utils/fs-lib.sh

Description of problem:
The customer uses a High Availability Add-On cluster with an iSCSI shared disk. When iSCSI disk access fails, the active node repeats stopping and starting the service forever.

The resource definition of the filesystem on the shared disk is as below:
<fs device="/dev/sdb" fstype="ext4" mountpoint="/data01" name="data_fs"/>

The root cause of the problem is that when rgmanager tries to restart the service and mounting the filesystem fails with return code 32, /usr/share/cluster/utils/fs-lib.sh doesn't recognize it as an error.


Version-Release number of selected component (if applicable):
resource-agents-3.0.12-22.el6.x86_64
rgmanager-3.0.12-11.el6_1.1.x86_64


How reproducible:
Steps to Reproduce:
1. Configure a cluster with an iSCSI shared disk and create a filesystem resource on it. Do not use qdisk.

2. Emulate a disk path error by blocking iSCSI access with iptables on the active cluster node.
# iptables -A INPUT -m tcp -p tcp --sport 3260 -j REJECT

  
Actual results:
rgmanager repeats stopping and starting the service forever.

Expected results:
The service is relocated to the other node.

Additional info:
See the attachment for the suggested patch to /usr/share/cluster/utils/fs-lib.sh. It treats all non-zero return codes as errors when mounting the filesystem. In my lab cluster, it successfully relocated the service. However, I'm not sure whether it's good to handle ALL non-zero return codes as errors.

Comment 2 Lon Hohberger 2011-08-10 16:04:39 UTC
Nonzero return codes should be treated as errors, according to the mount man page.

Also, it appears that your patch would work.
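
For reference, mount(8) documents its exit status as a set of bits that can be ORed together: 1 incorrect invocation or permissions, 2 system error, 4 internal mount bug, 8 user interrupt, 16 problems writing or locking /etc/mtab, 32 mount failure, 64 some mount succeeded. A small sketch that decodes a status into those descriptions follows; describe_mount_status is a hypothetical helper for illustration, not part of resource-agents.

```shell
#!/bin/sh
# Hypothetical helper: decode a mount(8) exit status into its documented bits.
describe_mount_status() {
    rc=$1
    if [ "$rc" -eq 0 ]; then
        echo "success"
        return 0
    fi
    out=""
    # Each entry is "bit:description", taken from the mount(8) man page.
    for pair in \
        "1:incorrect invocation or permissions" \
        "2:system error" \
        "4:internal mount bug" \
        "8:user interrupt" \
        "16:problems writing or locking /etc/mtab" \
        "32:mount failure" \
        "64:some mount succeeded"
    do
        bit=${pair%%:*}
        text=${pair#*:}
        # Append the description when this bit is set in the status.
        if [ $(( rc & bit )) -ne 0 ]; then
            out="${out:+$out; }$text"
        fi
    done
    echo "$out"
    return 1
}
```

For the status reported in this bug, `describe_mount_status 32` prints "mount failure" and returns non-zero, which matches the point that every non-zero status should count as an error.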

Comment 3 Lon Hohberger 2011-08-10 16:05:38 UTC
Basically, if mount fails, the resource agent should return a failure -- this is for all values of failure, not just '1'.  In this case, the device is missing, and mount returned the generic '32' error code for a failed mount, which was not handled.

This should be simple to fix.

Comment 8 Fabio Massimo Di Nitto 2012-02-27 12:33:38 UTC
https://github.com/ClusterLabs/resource-agents/commit/ba09b94555d7c3b899e989b456cdbe1ee1b267ac

Available in rhel6-fixes branch upstream.

Comment 10 Fabio Massimo Di Nitto 2012-02-27 12:58:20 UTC
As for testing, I don't have a setup to trigger an error != 1 at the moment, but the patch is simple enough and has been tested in the netfs.sh code.

Comment 14 Chris Feist 2012-04-30 21:47:50 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: fs-lib.sh resource agent library was ignoring errors other than '1'

Consequence: When a mount returned an error other than 1 (such as an iSCSI mount) fs-lib.sh thought it worked properly

Fix: make fs-lib.sh recognize other errors

Result: fs-lib.sh now recognizes all errors and fails properly.

Comment 16 errata-xmlrpc 2012-06-20 14:38:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0947.html

