Bug 607358 - 1000+ entries in /proc/mounts results in large delays when failing over a fs.sh resource
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-06-23 21:47 UTC by Adam Drew
Modified: 2013-03-04 05:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-25 22:02:43 UTC
Target Upstream Version:
Embargoed:



Description Adam Drew 2010-06-23 21:47:40 UTC
Description of problem:

When the fs.sh resource script is used to mount file systems via rgmanager, the agent scans every entry in /proc/mounts. With 1000+ autofs entries present, the script takes over 2 minutes to parse them all. This results in a long delay before the file system is mounted on failover.

Version-Release number of selected component (if applicable):

rgmanager-2.0.52-1.el5


How reproducible:

Any time hundreds to thousands of entries appear in /proc/mounts

Steps to Reproduce:
1. Have hundreds to thousands of entries in /proc/mounts
2. Have a resource group with an fs.sh resource in it
3. Fail the resource over
4. Observe a multi-minute delay on failover as /proc/mounts gets parsed
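
To check whether a node is in the affected range before testing failover, the size and makeup of /proc/mounts can be inspected first. This is only a minimal sketch; the function name count_mounts_by_type is made up for illustration and is not part of rgmanager.

```shell
#!/bin/sh
# count_mounts_by_type [MOUNTS_FILE]
# Print the number of entries per filesystem type, largest count first.
# MOUNTS_FILE defaults to /proc/mounts; it is a parameter only so the
# function can also be run against a saved copy of the table.
count_mounts_by_type() {
    # Field 3 of each line in /proc/mounts is the filesystem type.
    awk '{ n[$3]++ } END { for (t in n) print n[t], t }' "${1:-/proc/mounts}" |
        sort -rn
}
```

Running count_mounts_by_type with no argument on a cluster node shows how many of the entries are autofs, and wc -l /proc/mounts gives the total.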
  
Actual results:
Multi-minute delays on failover

Expected results:
Failover time should not be affected, or should be only negligibly affected, by the number of entries in /proc/mounts

Additional info:

Comment 3 Lon Hohberger 2011-01-25 22:02:43 UTC
I do not think there is a fix for this which can be done in a non-disruptive way.  The problem isn't really the size of /proc/mounts, it's how it's being processed.
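
To illustrate the difference between the size of the table and the way it is processed, here is a rough sketch of a mount-point lookup that reads the whole table in one pass with a single awk process. It is not taken from fs.sh or fsc; the function name is_mounted is hypothetical, and the assumption that the delay comes from work repeated for every entry has not been checked against the fs.sh source.

```shell
#!/bin/sh
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Return 0 if MOUNTPOINT is listed as a mount point (field 2), 1 otherwise.
# A single awk process scans the file once, so no extra process is started
# per entry. MOUNTS_FILE defaults to /proc/mounts.
# Note: /proc/mounts escapes spaces in paths as \040, so a mount point
# containing spaces must be passed in that escaped form.
is_mounted() {
    # The mount point is passed through the environment rather than -v,
    # because awk would interpret backslash escapes in a -v assignment.
    MP="$1" awk '$2 == ENVIRON["MP"] { found = 1; exit }
                 END { exit !found }' "${2:-/proc/mounts}"
}
```

For example, is_mounted /mnt/data && echo mounted prints "mounted" only when /mnt/data appears in the table.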

Users who hit this issue can try this if desired:

  http://people.redhat.com/lhh/fsc-0.5.4.tar.gz

This program is a C replacement for fs.sh.  I wrote it in response to another issue in fs.sh.  It is significantly faster (hundreds or thousands of times faster, depending) than fs.sh.

However, it is not going to be included in the RHEL product since it cannot (without an immense amount of work) perform all the checks and validation that fs.sh performs.  That is, fsc is a subset of what fs.sh does; it is *not* a direct replacement in all configurations (but it is in most).  The overlap with fs.sh is large, and the number of problems fsc solves is too small for shipping it to be useful.

To try it, simply perform the following steps on each cluster node:

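   service rgmanager stop               # stop rgmanager on this node

   tar -xzvf fsc-0.5.4.tar.gz
   cd fsc-0.5.4
   make
   chmod -x /usr/share/cluster/fs.sh    # remove the execute bit from fs.sh
   cp fsc /usr/share/cluster            # install the fsc binary

   service rgmanager start              # start rgmanager again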

