Red Hat Bugzilla – Bug 607358
1000+ entries in /proc/mounts results in large delays when failing over a fs.sh resource
Last modified: 2013-03-04 00:02:57 EST
Description of problem:
When the fs.sh resource script is used to mount file systems via rgmanager, the resource agent scans every entry in /proc/mounts. With 1000+ autofs entries present, the script takes over 2 minutes to parse them all. This results in a long delay before the file system is mounted on failover.
Version-Release number of selected component (if applicable):
How reproducible:
Any time hundreds to thousands of entries appear in /proc/mounts
Steps to Reproduce:
1. Have hundreds to thousands of entries in /proc/mounts
2. Have a resource group with an fs.sh resource in it
3. Fail the resource over
4. Observe a multi-minute delay on failover as /proc/mounts gets parsed
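To confirm that a node meets the precondition in step 1, the entries can be counted by filesystem type. The following is a minimal sketch; `count_mounts_of_type` is a hypothetical helper and is not part of fs.sh or rgmanager. It assumes the usual whitespace-separated layout of /proc/mounts (device, mount point, type, options, ...).

```shell
# Hypothetical helper (not part of fs.sh): count the entries of one
# filesystem type in a mounts table.
# $1 = filesystem type, e.g. autofs
# $2 = mounts file to read (defaults to /proc/mounts)
count_mounts_of_type() {
    local fstype=$1 table=${2:-/proc/mounts}
    local dev mp type rest n=0
    # Fields are: device, mount point, type, then everything else.
    while read -r dev mp type rest; do
        if [ "$type" = "$fstype" ]; then
            n=$((n + 1))
        fi
    done < "$table"
    echo "$n"
}
```

Running `count_mounts_of_type autofs` on a cluster node prints the number of autofs entries currently in /proc/mounts.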
Actual results:
Multi-minute delays on failover
Expected results:
Failover time not to be affected, or only negligibly affected, by the number of entries in /proc/mounts
I do not think there is a fix for this that can be done in a non-disruptive way. The problem isn't really the size of /proc/mounts; it's how the file is being processed.
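To illustrate why the processing matters more than the size of the table, here is a minimal sketch of a single-pass lookup. It assumes the delay comes from running external commands for each entry, which is an assumption: this report does not show the fs.sh code. The function `is_mounted` is hypothetical and is not taken from fs.sh or fsc. It uses only shell builtins, so no process is spawned per line.

```shell
# Hypothetical helper (not from fs.sh or fsc): report whether a mount
# point appears in a mounts table, in one pass, using builtins only.
# $1 = mount point to look for
# $2 = mounts file to read (defaults to /proc/mounts)
# Note: /proc/mounts writes spaces in paths as \040; this sketch
# compares the raw strings and does not decode such escapes.
is_mounted() {
    local target=$1 table=${2:-/proc/mounts}
    local dev mp rest
    while read -r dev mp rest; do
        # Plain string comparison; nothing is executed per entry.
        if [ "$mp" = "$target" ]; then
            return 0
        fi
    done < "$table"
    return 1
}
```

For example, `is_mounted /mnt/data && echo mounted` checks the live table. With this approach a table of a few thousand lines is read once, and the cost per entry is a single string comparison.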
Users who hit this issue can try this if desired:
This program, fsc, is a C replacement for fs.sh. I wrote it in response to another issue in fs.sh. It is significantly faster (hundreds or thousands of times faster, depending) than fs.sh.
However, it is not going to be included in the RHEL product, since it cannot (without an immense amount of work) perform all the checks and validation that fs.sh performs. That is, fsc implements a subset of what fs.sh does; it is *not* a direct replacement in all configurations (though it is in most). The overlap with fs.sh is too large, and the number of problems fsc solves too small, for it to be useful in the product.
To try it, simply perform the following steps on each cluster node:
tar -xzvf fsc-0.5.4.tar.gz
chmod -x /usr/share/cluster/fs.sh
cp fsc /usr/share/cluster