Bug 784933 - exportfs agent doubles rmtab on each relocation
Summary: exportfs agent doubles rmtab on each relocation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: David Vossel
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-01-26 17:48 UTC by Jaroslav Kortus
Modified: 2013-11-21 05:17 UTC

Fixed In Version: resource-agents-3.9.2-29.el6
Doc Type: Technology Preview
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-11-21 05:17:04 UTC
Target Upstream Version:


Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHBA-2013:1541 | 0 | normal | SHIPPED_LIVE | resource-agents bug fix and enhancement update | 2013-11-20 21:40:47 UTC

Description Jaroslav Kortus 2012-01-26 17:48:49 UTC
Description of problem:
When exportfs relocates the exported share, the size of rmtab doubles.
This is probably due to the restore_rmtab call in the agent, which runs
"cat  ${rmtab_backup} >> /var/lib/nfs/rmtab".

During shutdown, rmtab is grepped and the result is saved as the rmtab backup. On the next start the backup is appended to rmtab again, so the file doubles in size on each graceful relocation. Eventually the service becomes unavailable because the grow and copy operations are too slow to complete in time.

Some sort of "sort | uniq" would probably help here, as the file was full of duplicated entries.
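
The doubling can be reproduced on scratch files without touching a real server. This is a minimal sketch, not the agent's code: the temporary files stand in for /var/lib/nfs/rmtab and the rmtab backup, the client address and the relocate_once helper are made up for illustration, and the host:directory:refcount entry layout is assumed.

```shell
#!/bin/sh
# Sketch only: scratch files stand in for /var/lib/nfs/rmtab and the agent's
# rmtab backup. The sample entry and helper name are illustrative assumptions.
workdir=$(mktemp -d)
rmtab="$workdir/rmtab"
backup="$workdir/rmtab.backup"
export_dir="/mnt/vedder0"

# One client entry (assumed host:directory:refcount layout).
printf '%s\n' "192.168.100.50:${export_dir}:0x00000001" > "$rmtab"

relocate_once() {
    # stop: save the entries that belong to this export
    grep ":${export_dir}:" "$rmtab" > "$backup"
    # start: append them again without removing the copies already present
    cat "$backup" >> "$rmtab"
}

for i in 1 2 3; do
    relocate_once
    echo "after relocation $i: $(wc -l < "$rmtab" | tr -d ' ') lines"
done

final_lines=$(wc -l < "$rmtab" | tr -d ' ')
# sort -u collapses the copies back to a single entry
unique_lines=$(sort -u "$rmtab" | wc -l | tr -d ' ')
echo "unique entries: $unique_lines"

rm -rf "$workdir"
```

Run as written, the line count goes 2, 4, 8 while the number of unique entries stays at 1.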


Version-Release number of selected component (if applicable):
pacemaker-1.1.6-3.el6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Set up an NFS server and an NFS export of a gfs2 filesystem (see below)
2. Mount the share from the client (1 entry now in /var/lib/nfs/rmtab)
3. Relocate the service (crm resource move nfsgroup)
4. See the entry doubled in /var/lib/nfs/rmtab
  
Actual results:
rmtab growing on each relocation until the service cannot be relocated any more.

Expected results:
file not growing and not containing duplicate entries
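
One way to check the expected result is to count repeated lines in the file. The helper below is a hypothetical sketch (count_duplicates is not part of resource-agents); it only reads the file it is given.

```shell
#!/bin/sh
# Sketch only: count_duplicates is a hypothetical helper, not part of the agent.
# It prints how many lines in the given file repeat an earlier line.
count_duplicates() {
    file=$1
    total=$(wc -l < "$file" | tr -d ' ')
    unique=$(sort -u "$file" | wc -l | tr -d ' ')
    echo $((total - unique))
}

# Usage on a live server (read-only):
#   count_duplicates /var/lib/nfs/rmtab
```

A result of 0 after a relocation means no entry was duplicated.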

Additional info:
crm configure show
node node01
node node02
node node03
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.100.11" cidr_netmask="32" \
        op monitor interval="30s"
primitive datadir ocf:heartbeat:exportfs \
        params clientspec="*" directory="/mnt/vedder0" fsid="4" options="all_squash,rw"
primitive gfs2 ocf:heartbeat:Filesystem \
        params device="/dev/rhts_cluster/vedder0" directory="/mnt/vedder0" fstype="gfs2" options="noatime"
primitive nfsserver ocf:heartbeat:nfsserver \
        params nfs_init_script="/etc/init.d/nfs" nfs_shared_infodir="/mnt/vedder0/nfs" nfs_ip="192.168.100.11" nfs_notify_cmd="/usr/sbin/sm-notify"
group nfsgroup nfsserver datadir ClusterIP \
        meta target-role="Started"
clone gfs2clone gfs2 \
        meta target-role="Started"

Comment 3 Andrew Beekhof 2012-01-29 10:00:31 UTC
Looks like an issue with the agent.  Re-assigning.

Comment 4 Chris Feist 2012-02-25 00:15:57 UTC
Jaroslav,

Do you know what version of resource-agents you had installed?  Was it just what was included in 6.2?  I'm having trouble getting things into /var/lib/nfs/rmtab. Can you also let me know what kernel you're using?

Thanks!
Chris

Comment 5 Jaroslav Kortus 2012-02-27 10:54:35 UTC
If I remember correctly, it was 6.2 GA, same applies for the kernel.

Comment 10 gh05t.7id37 2012-08-24 21:19:33 UTC
I was facing the same problem in a failover cluster implementation based on RHEL 6.2 and using the following packages:

pacemaker-libs-1.1.6-3.el6.x86_64
pacemaker-cluster-libs-1.1.6-3.el6.x86_64
pacemaker-1.1.6-3.el6.x86_64
pacemaker-cli-1.1.6-3.el6.x86_64

corosync-1.4.1-4.el6_2.2.x86_64
corosynclib-1.4.1-4.el6_2.2.x86_64

heartbeat-libs-3.0.4-1.el6.x86_64
heartbeat-3.0.4-1.el6.x86_64

I fixed the problem by changing the script /usr/lib/ocf/resource.d/heartbeat/exportfs and inserting the following two lines of code:

grep -v ":${OCF_RESKEY_directory}:" /var/lib/nfs/rmtab > /var/lib/nfs/rmtab.tmp
mv   -f /var/lib/nfs/rmtab.tmp /var/lib/nfs/rmtab

The two lines above are inserted before the following line:

cat  ${rmtab_backup} >> /var/lib/nfs/rmtab
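
The effect of this change can be tried on scratch copies. The function below is a hypothetical stand-in for the edited part of the script, taking file paths as arguments instead of using the agent's variables; it is a sketch, not the shipped code.

```shell
#!/bin/sh
# Sketch only: restore_without_duplicates is a hypothetical stand-in for the
# edited part of restore_rmtab, parameterised so it can run on scratch files.
restore_without_duplicates() {
    export_dir=$1   # plays the role of OCF_RESKEY_directory
    backup=$2       # plays the role of rmtab_backup
    rmtab=$3        # plays the role of /var/lib/nfs/rmtab

    # Drop the entries for this export. grep exits with 1 when no lines are
    # left, which is not an error here; any other failure aborts.
    grep -v ":${export_dir}:" "$rmtab" > "${rmtab}.tmp" || [ "$?" -eq 1 ] || return 1
    mv -f "${rmtab}.tmp" "$rmtab"

    # Appending the backup now cannot create a second copy of those entries.
    cat "$backup" >> "$rmtab"
}
```

Entries for other exports are left alone, and repeated calls leave the file unchanged. Duplicates that are already inside the backup file itself would still be copied over.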

Comment 13 David Vossel 2013-08-06 18:27:04 UTC
This issue has been resolved in the latest resource-agents build.  The upstream patch related to this issue can be found here.

https://github.com/ClusterLabs/resource-agents/commit/bbc90e9de8636609842fb01219e8d9c789d8a623

Comment 14 David Vossel 2013-08-06 19:25:13 UTC
This has been fixed as a result of the heartbeat agent refresh.

Comment 18 michal novacek 2013-10-15 11:03:44 UTC
I have verified that the /var/lib/nfs/rmtab file size does not grow
exponentially with the patched version of resource-agents-3.9.2-40.el6.x86_64
after moving the nfs server 10 times.


setup of the cluster and resources is as follows:

---------
virt-021# pcs status
Cluster name: STSRHTS11429
Last updated: Tue Oct 15 11:42:28 2013
Last change: Tue Oct 15 11:36:21 2013 via cibadmin on virt-022
Stack: cman
Current DC: virt-022 - partition with quorum
Version: 1.1.10-14.el6-368c726
3 Nodes configured
7 Resources configured

Online: [ virt-020 virt-021 virt-022 ]

Full list of resources:
 virt-fencing   (stonith:fence_xvm):    Started virt-020 
 Resource Group: ha-nfsserver
     vip        (ocf::heartbeat:IPaddr2):       Started virt-021 
     nfs-server (ocf::heartbeat:nfsserver):     Started virt-021 
     nfs-export (ocf::heartbeat:exportfs):      Started virt-021 
 Clone Set: nfs-shared-fs-clone [nfs-shared-fs]
     Started: [ virt-020 virt-021 virt-022 ]
---------
virt-021# pcs resource show vip nfs-server nfs-export nfs-shared-fs-clone
 Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.34.70.217 cidr_netmask=23 
  Operations: monitor interval=30s (vip-monitor-interval-30s)

 Resource: nfs-server (class=ocf provider=heartbeat type=nfsserver)
  Attributes: nfs_ip=10.34.70.217 nfs_init_script=/etc/init.d/nfs \
nfs_shared_infodir=/mnt/nfs nfs_notify_cmd=/usr/sbin/sm-notify 
  Operations: monitor interval=30s (nfs-server-monitor-interval-30s)

 Resource: nfs-export (class=ocf provider=heartbeat type=exportfs)
  Attributes: directory=/mnt clientspec=* options=rw,async,no_all_squash fsid=238 
  Operations: monitor interval=60s (nfs-export-monitor-interval-60s)
 Clone: nfs-shared-fs-clone

 Resource: nfs-shared-fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sda directory=/mnt fstype=gfs2 options= 
   Operations: monitor interval=30s (nfs-shared-fs-monitor-interval-30s)
---------
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 0 Oct 15 11:42 /var/lib/nfs/rmtab

mounted nfs share from outside of the cluster with this command:
# mount 10.34.70.217:/mnt /exports -o nfsvers=3

---------
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 27 Oct 15 11:45 /var/lib/nfs/rmtab


WITHOUT A PATCH (resource-agents-3.9.2-22.el6.x86_64):
======================================================
virt-021# grep -A 12 'restore_rmtab()' \
    /usr/lib/ocf/resource.d/heartbeat/exportfs 
restore_rmtab() {
    local rmtab_backup
    if [ ${OCF_RESKEY_rmtab_backup} != "none" ]; then
        rmtab_backup="${OCF_RESKEY_directory}/${OCF_RESKEY_rmtab_backup}"
        if [ -r ${rmtab_backup} ]; then
            cat  ${rmtab_backup} >> /var/lib/nfs/rmtab
            ocf_log debug "Restored `wc -l ${rmtab_backup}` rmtab entries from ${rmtab_backup}."
        else
            ocf_log warn "rmtab backup ${rmtab_backup} not found or not readable."
        fi
    fi
}

virt-021# for a in $(seq 1 5); do \
pcs resource move ha-nfsserver; sleep 5; \
pcs resource move ha-nfsserver; sleep 5; \
pcs constraint remove $(pcs constraint ref ha-nfsserver | grep cli); echo $a;\
done

virt-021# ls -l /var/lib/nfs/rmtab 
-rw-r-----. 1 root root 182655 Oct 15 12:42 /var/lib/nfs/rmtab


PATCHED VERSION (resource-agents-3.9.2-40.el6.x86_64)
=====================================================
virt-021# grep -A 12 'restore_rmtab()' \
    /usr/lib/ocf/resource.d/heartbeat/exportfs 
restore_rmtab() {
    local rmtab_backup
    if [ ${OCF_RESKEY_rmtab_backup} != "none" ]; then
        rmtab_backup="${OCF_RESKEY_directory}/${OCF_RESKEY_rmtab_backup}"
        if [ -r ${rmtab_backup} ]; then
            local tmpf=`mktemp`
            sort -u ${rmtab_backup} /var/lib/nfs/rmtab > $tmpf &&
                install -o root -m 644 $tmpf /var/lib/nfs/rmtab
            rm -f $tmpf
            ocf_log debug "Restored `wc -l ${rmtab_backup}` rmtab entries from ${rmtab_backup}."
        else
            ocf_log warn "rmtab backup ${rmtab_backup} not found or not readable."
        fi

virt-021# for a in $(seq 1 5); do \
pcs resource move ha-nfsserver; sleep 5; \
pcs resource move ha-nfsserver; sleep 5; \
pcs constraint remove $(pcs constraint ref ha-nfsserver | grep cli); echo $a;\
done

virt-021# ls -l /var/lib/nfs/rmtab
-rw-r--r--. 1 root root 27 Oct 15 12:58 /var/lib/nfs/rmtab
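
The size stays at 27 bytes because merging with sort -u is idempotent. A minimal sketch of that step on scratch files follows; merge_rmtab is a hypothetical helper, and it copies the result back with cat rather than "install -o root" so that it runs without root.

```shell
#!/bin/sh
# Sketch only: merge_rmtab is a hypothetical helper that mirrors the sort -u
# step of the patched function, with file paths passed as arguments.
merge_rmtab() {
    backup=$1
    rmtab=$2
    tmpf=$(mktemp) || return 1
    # Build the de-duplicated union in a temp file first, then replace the
    # target only if sort succeeded.
    if sort -u "$backup" "$rmtab" > "$tmpf"; then
        cat "$tmpf" > "$rmtab"
    fi
    rm -f "$tmpf"
}
```

Calling merge_rmtab any number of times with the same backup leaves one copy of each entry. The mode change from -rw-r----- to -rw-r--r-- in the listings above is consistent with the patched code's "install -m 644".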

Comment 20 errata-xmlrpc 2013-11-21 05:17:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1541.html

