Description of problem:
The fs.sh resource agent, which rgmanager uses to mount file systems, scans every entry in /proc/mounts. With 1000+ autofs entries present, the script takes over 2 minutes to parse them all. This results in a long delay before the file system is mounted on failover.

Version-Release number of selected component (if applicable):
rgmanager-2.0.52-1.el5

How reproducible:
Any time hundreds to thousands of entries appear in /proc/mounts

Steps to Reproduce:
1. Have hundreds to thousands of entries in /proc/mounts
2. Have a resource group with an fs.sh resource in it
3. Fail the resource over
4. Observe a multi-minute delay on failover as /proc/mounts gets parsed

Actual results:
Multi-minute delays on failover

Expected results:
Failover time is not affected, or only negligibly affected, by the number of entries in /proc/mounts

Additional info:
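To check whether a node is in the range described above, the entries in /proc/mounts can be counted per file system type. This is only a minimal sketch: the function name count_mounts_by_type is made up for illustration and is not part of rgmanager or fs.sh. It relies on the third field of each /proc/mounts line being the file system type.

```shell
#!/bin/sh
# Hypothetical helper (not part of rgmanager): count mount entries per
# file system type, most frequent first.
# $1: optional path to a file in /proc/mounts format (default: /proc/mounts)
count_mounts_by_type() {
    # Field 3 of each line is the file system type (e.g. autofs, ext3).
    awk '{ n[$3]++ } END { for (t in n) print n[t], t }' "${1:-/proc/mounts}" |
        sort -rn
}
```

Running count_mounts_by_type with no arguments on a cluster node prints one "count type" line per file system type; a large autofs count at the top would match the situation in this report.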
I do not think there is a fix for this which can be done in a non-disruptive way. The problem isn't really the size of /proc/mounts, it's how it's being processed.

Users who hit this issue can try this if desired:

http://people.redhat.com/lhh/fsc-0.5.4.tar.gz

This program is a C replacement for fs.sh. I wrote it in response to another issue in fs.sh. It is significantly faster (hundreds or thousands of times faster, depending) than fs.sh.

However, it is not going to be included in the RHEL product, since it cannot (without an immense amount of work) perform all the checks and validation that fs.sh performs. That is, fsc does a subset of what fs.sh does; it is *not* a direct replacement in all configurations (but it is in most). The overlap with fs.sh is large enough, and the number of problems fsc solves is small enough, that including it would not be useful.

To try it, simply perform the following steps on each cluster node:

    (stop rgmanager)
    tar -xzvf fsc-0.5.4.tar.gz
    cd fsc-0.5.4
    make
    chmod -x /usr/share/cluster/fs.sh
    cp fsc /usr/share/cluster
    (start rgmanager)
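To illustrate the point that the cost comes from how /proc/mounts is processed rather than from its size, here is a rough sketch of a mount-point lookup done in a single awk pass, with no per-entry subprocesses. It is not how fs.sh or fsc is implemented, and the function name is_mounted_at is hypothetical. It also skips the device resolution and validation that fs.sh performs, so it only shows the shape of a cheap lookup.

```shell
#!/bin/sh
# Hypothetical helper (not taken from fs.sh or fsc).
# is_mounted_at MOUNTPOINT [MOUNTS_FILE]
# Returns 0 if some entry has MOUNTPOINT as its second field, else 1.
# MOUNTPOINT must be given as it appears in the file (a space in a
# path shows up as \040 in /proc/mounts).
is_mounted_at() {
    # One awk process reads the whole file and stops at the first match.
    awk -v mp="$1" '
        $2 == mp { found = 1; exit }
        END      { exit !found }
    ' "${2:-/proc/mounts}"
}
```

For example, "is_mounted_at /mnt/data && echo mounted" starts one process however many entries the file holds, whereas a shell loop that starts an external command for every line pays that process-creation cost once per entry.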