Red Hat Bugzilla – Bug 784933
exportfs agent doubles rmtab on each relocation
Last modified: 2013-11-21 00:17:04 EST
Description of problem:
When exportfs relocates the exported share, the size of rmtab doubles. This is probably due to the restore_rmtab call in the agent, where it does "cat ${rmtab_backup} >> /var/lib/nfs/rmtab". During shutdown, /var/lib/nfs/rmtab is grepped and the result is written to the rmtab backup. This way the file doubles in size on each graceful relocation, eventually making the service unavailable once the grow and copy operations become too slow to complete in time. Probably some sort of sort | uniq would help here, as the file was full of duplicated entries.

Version-Release number of selected component (if applicable):
pacemaker-1.1.6-3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up an NFS server and an NFS export of a gfs2 filesystem (see below)
2. Mount the share from the client (1 entry now in /var/lib/nfs/rmtab)
3. Relocate the service (crm resource move nfsgroup)
4. See the entry doubled in /var/lib/nfs/rmtab

Actual results:
rmtab grows on each relocation until the service cannot be relocated any more.

Expected results:
The file does not grow and does not contain duplicate entries.

Additional info:
crm configure show

node node01
node node02
node node03
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.100.11" cidr_netmask="32" \
        op monitor interval="30s"
primitive datadir ocf:heartbeat:exportfs \
        params clientspec="*" directory="/mnt/vedder0" fsid="4" options="all_squash,rw"
primitive gfs2 ocf:heartbeat:Filesystem \
        params device="/dev/rhts_cluster/vedder0" directory="/mnt/vedder0" fstype="gfs2" options="noatime"
primitive nfsserver ocf:heartbeat:nfsserver \
        params nfs_init_script="/etc/init.d/nfs" nfs_shared_infodir="/mnt/vedder0/nfs" nfs_ip="192.168.100.11" nfs_notify_cmd="/usr/sbin/sm-notify"
group nfsgroup nfsserver datadir ClusterIP \
        meta target-role="Started"
clone gfs2clone gfs2 \
        meta target-role="Started"
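The growth described above can be reproduced without a cluster. The following is a simplified simulation, not the agent's code: it works on temporary files only, assumes that the same rmtab is seen before and after the move (as it would be when /var/lib/nfs lives on shared storage), and uses a made-up client address. The function name relocate and the file names are hypothetical.

```shell
#!/bin/sh
# Simulation only: works on temp files, never touches /var/lib/nfs/rmtab.
workdir=$(mktemp -d)
rmtab="$workdir/rmtab"        # stand-in for /var/lib/nfs/rmtab
backup="$workdir/.rmtab"      # stand-in for the backup kept in the export
export_dir="/mnt/vedder0"

# One client (hypothetical address) has mounted the share.
printf '192.168.100.50:%s:0x00000001\n' "$export_dir" > "$rmtab"

relocate() {
    # stop: save this export's entries to the backup file
    grep ":${export_dir}:" "$rmtab" > "$backup"
    # start: the unpatched restore appends the backup to rmtab
    cat "$backup" >> "$rmtab"
}

for i in 1 2 3 4 5; do
    relocate
    echo "after relocation $i: $(grep -c '' "$rmtab") line(s)"
done

final_lines=$(grep -c '' "$rmtab")
rm -rf "$workdir"
```

In this model the line count doubles on every pass (2, 4, 8, 16, 32), because each backup already contains everything the previous restore appended.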
Looks like an issue with the agent. Re-assigning.
Jaroslav, do you know what version of resource-agents you had installed? Was it just what was included in 6.2? I'm having trouble getting things into /var/lib/nfs/rmtab. Can you also let me know what kernel you're using? Thanks! Chris
If I remember correctly, it was 6.2 GA, same applies for the kernel.
I was facing the same problem in a failover cluster implementation based on RHEL 6.2 and using the following packages:

pacemaker-libs-1.1.6-3.el6.x86_64
pacemaker-cluster-libs-1.1.6-3.el6.x86_64
pacemaker-1.1.6-3.el6.x86_64
pacemaker-cli-1.1.6-3.el6.x86_64
corosync-1.4.1-4.el6_2.2.x86_64
corosynclib-1.4.1-4.el6_2.2.x86_64
heartbeat-libs-3.0.4-1.el6.x86_64
heartbeat-3.0.4-1.el6.x86_64

I've fixed the problem by changing the script /usr/lib/ocf/resource.d/heartbeat/exportfs and inserting the following two lines of code:

    grep -v ":${OCF_RESKEY_directory}:" /var/lib/nfs/rmtab > /var/lib/nfs/rmtab.tmp
    mv -f /var/lib/nfs/rmtab.tmp /var/lib/nfs/rmtab

The two lines above are inserted before the following line:

    cat ${rmtab_backup} >> /var/lib/nfs/rmtab
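A minimal sketch of how this workaround behaves over several stop/start cycles, run against temporary files rather than the real /var/lib/nfs/rmtab. The client addresses, the second export path /srv/other, and the function name restore_with_workaround are made up for illustration; the "|| true" is an addition here so that an rmtab containing only this export's entries does not abort the sketch.

```shell
#!/bin/sh
# Sketch only: temp files stand in for /var/lib/nfs/rmtab and the backup.
workdir=$(mktemp -d)
rmtab="$workdir/rmtab"
rmtab_backup="$workdir/.rmtab"
OCF_RESKEY_directory="/mnt/vedder0"

# Two hypothetical entries: one for this export, one for an unrelated export.
printf '192.168.100.50:%s:0x00000001\n' "$OCF_RESKEY_directory" > "$rmtab"
printf '192.168.100.51:/srv/other:0x00000001\n' >> "$rmtab"

restore_with_workaround() {
    # Drop this export's entries first, so the append cannot duplicate them.
    # grep -v exits non-zero when no lines remain; that is tolerated here.
    grep -v ":${OCF_RESKEY_directory}:" "$rmtab" > "$rmtab.tmp" || true
    mv -f "$rmtab.tmp" "$rmtab"
    cat "$rmtab_backup" >> "$rmtab"
}

for i in 1 2 3; do
    # stop: back up this export's entries
    grep ":${OCF_RESKEY_directory}:" "$rmtab" > "$rmtab_backup"
    # start: restore with the workaround applied
    restore_with_workaround
done

total_lines=$(grep -c '' "$rmtab")
export_lines=$(grep -c ":${OCF_RESKEY_directory}:" "$rmtab")
echo "total: $total_lines, for this export: $export_lines"
rm -rf "$workdir"
```

Entries for other exports are left alone, and the entry for this export is present exactly once after each cycle.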
This issue has been resolved in the latest resource-agents build. The upstream patch related to this issue can be found here. https://github.com/ClusterLabs/resource-agents/commit/bbc90e9de8636609842fb01219e8d9c789d8a623
This has been fixed as a result of the heartbeat agent refresh.
I have verified that the /var/lib/nfs/rmtab file size does not grow exponentially with the patched version of resource-agents-3.9.2-40.el6.x86_64 after moving the nfs server 10 times.

setup of the cluster and resources is as follows:

---------
virt-021# pcs status
Cluster name: STSRHTS11429
Last updated: Tue Oct 15 11:42:28 2013
Last change: Tue Oct 15 11:36:21 2013 via cibadmin on virt-022
Stack: cman
Current DC: virt-022 - partition with quorum
Version: 1.1.10-14.el6-368c726
3 Nodes configured
7 Resources configured

Online: [ virt-020 virt-021 virt-022 ]

Full list of resources:

 virt-fencing (stonith:fence_xvm): Started virt-020
 Resource Group: ha-nfsserver
     vip (ocf::heartbeat:IPaddr2): Started virt-021
     nfs-server (ocf::heartbeat:nfsserver): Started virt-021
     nfs-export (ocf::heartbeat:exportfs): Started virt-021
 Clone Set: nfs-shared-fs-clone [nfs-shared-fs]
     Started: [ virt-020 virt-021 virt-022 ]

---------
virt-021# pcs resource show vip nfs-server nfs-export nfs-shared-fs-clone
 Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.34.70.217 cidr_netmask=23
  Operations: monitor interval=30s (vip-monitor-interval-30s)
 Resource: nfs-server (class=ocf provider=heartbeat type=nfsserver)
  Attributes: nfs_ip=10.34.70.217 nfs_init_script=/etc/init.d/nfs \
      nfs_shared_infodir=/mnt/nfs nfs_notify_cmd=/usr/sbin/sm-notify
  Operations: monitor interval=30s (nfs-server-monitor-interval-30s)
 Resource: nfs-export (class=ocf provider=heartbeat type=exportfs)
  Attributes: directory=/mnt clientspec=* options=rw,async,no_all_squash fsid=238
  Operations: monitor interval=60s (nfs-export-monitor-interval-60s)
 Clone: nfs-shared-fs-clone
  Resource: nfs-shared-fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sda directory=/mnt fstype=gfs2 options=
   Operations: monitor interval=30s (nfs-shared-fs-monitor-interval-30s)

---------
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 0 Oct 15 11:42 /var/lib/nfs/rmtab

mounted nfs share from outside of the cluster with this command:
# mount 10.34.70.217:/mnt /exports -o nfsvers=3

---------
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 27 Oct 15 11:45 /var/lib/nfs/rmtab

WITHOUT A PATCH (resource-agents-3.9.2-22.el6.x86_64):
======================================================
virt-021# grep -A 12 'restore_rmtab()' \
/usr/lib/ocf/resource.d/heartbeat/exportfs
restore_rmtab() {
    local rmtab_backup
    if [ ${OCF_RESKEY_rmtab_backup} != "none" ]; then
        rmtab_backup="${OCF_RESKEY_directory}/${OCF_RESKEY_rmtab_backup}"
        if [ -r ${rmtab_backup} ]; then
            cat ${rmtab_backup} >> /var/lib/nfs/rmtab
            ocf_log debug "Restored `wc -l ${rmtab_backup}` rmtab entries from ${rmtab_backup}."
        else
            ocf_log warn "rmtab backup ${rmtab_backup} not found or not readable."
        fi
    fi
}

virt-021# for a in $(seq 1 5); do \
pcs resource move ha-nfsserver; sleep 5; \
pcs resource move ha-nfsserver; sleep 5; \
pcs constraint remove $(pcs constraint ref ha-nfsserver | grep cli); echo $a;\
done

virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 182655 Oct 15 12:42 /var/lib/nfs/rmtab

PATCHED VERSION (resource-agents-3.9.2-40.el6.x86_64)
=====================================================
virt-021# grep -A 12 'restore_rmtab()' \
> /usr/lib/ocf/resource.d/heartbeat/exportfs
restore_rmtab() {
    local rmtab_backup
    if [ ${OCF_RESKEY_rmtab_backup} != "none" ]; then
        rmtab_backup="${OCF_RESKEY_directory}/${OCF_RESKEY_rmtab_backup}"
        if [ -r ${rmtab_backup} ]; then
            local tmpf=`mktemp`
            sort -u ${rmtab_backup} /var/lib/nfs/rmtab > $tmpf &&
                install -o root -m 644 $tmpf /var/lib/nfs/rmtab
            rm -f $tmpf
            ocf_log debug "Restored `wc -l ${rmtab_backup}` rmtab entries from ${rmtab_backup}."
        else
            ocf_log warn "rmtab backup ${rmtab_backup} not found or not readable."
        fi

virt-021# for a in $(seq 1 5); do \
pcs resource move ha-nfsserver; sleep 5; \
pcs resource move ha-nfsserver; sleep 5; \
pcs constraint remove $(pcs constraint ref ha-nfsserver | grep cli); echo $a;\
done

virt-021# ls -l /var/lib/nfs/rmtab
-rw-r--r--. 1 root root 27 Oct 15 12:58 /var/lib/nfs/rmtab
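The sort -u merge used by the patched restore_rmtab can be tried in isolation. The sketch below is an illustration on temporary files, not the agent's code: it replaces "install -o root -m 644" with a plain cp so it runs without root, and the client address, the export path, and the function name merge_rmtab are made up. It starts from an rmtab that already holds duplicates to show that the merge also collapses existing ones.

```shell
#!/bin/sh
# Sketch only: temp files stand in for /var/lib/nfs/rmtab and the backup.
workdir=$(mktemp -d)
rmtab="$workdir/rmtab"
backup="$workdir/.rmtab"
export_dir="/mnt"

# Start with four identical entries, as left behind by the unpatched agent.
for n in 1 2 3 4; do
    printf '10.34.70.1:%s:0x00000001\n' "$export_dir"
done > "$rmtab"

merge_rmtab() {
    tmpf="$workdir/merge.tmp"
    # sort -u emits each distinct line once across both inputs.
    # cp is used here in place of the agent's install(1) call.
    sort -u "$backup" "$rmtab" > "$tmpf" && cp "$tmpf" "$rmtab"
    rm -f "$tmpf"
}

for i in 1 2 3; do
    # stop: back up this export's entries
    grep ":${export_dir}:" "$rmtab" > "$backup"
    # start: merge instead of append
    merge_rmtab
    echo "after relocation $i: $(grep -c '' "$rmtab") line(s)"
done

merged_lines=$(grep -c '' "$rmtab")
rm -rf "$workdir"
```

Because the merge is idempotent, repeating the stop/start cycle leaves the file unchanged once duplicates are gone, which matches the constant 27-byte size reported above for the patched package.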
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1541.html