Bug 784933
| Summary: | exportfs agent doubles rmtab on each relocation | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Jaroslav Kortus <jkortus> |
| Component: | resource-agents | Assignee: | David Vossel <dvossel> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.2 | CC: | agk, cluster-maint, ddumas, dvossel, fdinitto, gh05t.7id37, lhh, mnovacek |
| Target Milestone: | rc | Keywords: | TechPreview |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | resource-agents-3.9.2-29.el6 | Doc Type: | Technology Preview |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2013-11-21 05:17:04 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Looks like an issue with the agent. Re-assigning.

Jaroslav, do you know what version of resource-agents you had installed? Was it just what was included in 6.2? I'm having trouble getting things into /var/lib/nfs/rmtab. Can you also let me know which kernel you're using? Thanks! Chris

If I remember correctly, it was 6.2 GA; the same applies to the kernel.

I was facing the same problem in a failover cluster implementation based on RHEL 6.2, using the following packages:
pacemaker-libs-1.1.6-3.el6.x86_64
pacemaker-cluster-libs-1.1.6-3.el6.x86_64
pacemaker-1.1.6-3.el6.x86_64
pacemaker-cli-1.1.6-3.el6.x86_64
corosync-1.4.1-4.el6_2.2.x86_64
corosynclib-1.4.1-4.el6_2.2.x86_64
heartbeat-libs-3.0.4-1.el6.x86_64
heartbeat-3.0.4-1.el6.x86_64
I fixed the problem by changing the script /usr/lib/ocf/resource.d/heartbeat/exportfs and inserting the following two lines of code:
grep -v ":${OCF_RESKEY_directory}:" /var/lib/nfs/rmtab > /var/lib/nfs/rmtab.tmp
mv -f /var/lib/nfs/rmtab.tmp /var/lib/nfs/rmtab
The two lines above are inserted before the following line:
cat ${rmtab_backup} >> /var/lib/nfs/rmtab
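To make the placement concrete, here is a sketch (not the shipped agent, just the reporter's two lines spliced into the unpatched restore_rmtab() body that is quoted later in this report) of how the affected block would end up looking:

if [ -r ${rmtab_backup} ]; then
    # reporter's addition: drop this export's existing entries before re-adding the backup
    grep -v ":${OCF_RESKEY_directory}:" /var/lib/nfs/rmtab > /var/lib/nfs/rmtab.tmp
    mv -f /var/lib/nfs/rmtab.tmp /var/lib/nfs/rmtab
    # original line: append the backed-up entries
    cat ${rmtab_backup} >> /var/lib/nfs/rmtab
    ...
fi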
This issue has been resolved in the latest resource-agents build, as a result of the heartbeat agent refresh. The related upstream patch is here: https://github.com/ClusterLabs/resource-agents/commit/bbc90e9de8636609842fb01219e8d9c789d8a623
I have verified that, with the patched resource-agents-3.9.2-40.el6.x86_64, the /var/lib/nfs/rmtab file no longer doubles in size; it stays the same size after moving the NFS server 10 times.

The setup of the cluster and resources is as follows:
---------
virt-021# pcs status
Cluster name: STSRHTS11429
Last updated: Tue Oct 15 11:42:28 2013
Last change: Tue Oct 15 11:36:21 2013 via cibadmin on virt-022
Stack: cman
Current DC: virt-022 - partition with quorum
Version: 1.1.10-14.el6-368c726
3 Nodes configured
7 Resources configured
Online: [ virt-020 virt-021 virt-022 ]
Full list of resources:
virt-fencing (stonith:fence_xvm): Started virt-020
Resource Group: ha-nfsserver
vip (ocf::heartbeat:IPaddr2): Started virt-021
nfs-server (ocf::heartbeat:nfsserver): Started virt-021
nfs-export (ocf::heartbeat:exportfs): Started virt-021
Clone Set: nfs-shared-fs-clone [nfs-shared-fs]
Started: [ virt-020 virt-021 virt-022 ]
---------
virt-021# pcs resource show vip nfs-server nfs-export nfs-shared-fs-clone
Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
Attributes: ip=10.34.70.217 cidr_netmask=23
Operations: monitor interval=30s (vip-monitor-interval-30s)
Resource: nfs-server (class=ocf provider=heartbeat type=nfsserver)
Attributes: nfs_ip=10.34.70.217 nfs_init_script=/etc/init.d/nfs \
nfs_shared_infodir=/mnt/nfs nfs_notify_cmd=/usr/sbin/sm-notify
Operations: monitor interval=30s (nfs-server-monitor-interval-30s)
Resource: nfs-export (class=ocf provider=heartbeat type=exportfs)
Attributes: directory=/mnt clientspec=* options=rw,async,no_all_squash fsid=238
Operations: monitor interval=60s (nfs-export-monitor-interval-60s)
Clone: nfs-shared-fs-clone
Resource: nfs-shared-fs (class=ocf provider=heartbeat type=Filesystem)
Attributes: device=/dev/sda directory=/mnt fstype=gfs2 options=
Operations: monitor interval=30s (nfs-shared-fs-monitor-interval-30s)
---------
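For reference, a rough reconstruction of the pcs commands that would produce a configuration like the one above (this is an approximation written for illustration, not taken from the test log; option values are copied from the listing, and exact pcs syntax may differ slightly between versions):

pcs resource create vip ocf:heartbeat:IPaddr2 ip=10.34.70.217 cidr_netmask=23 op monitor interval=30s
pcs resource create nfs-server ocf:heartbeat:nfsserver nfs_ip=10.34.70.217 nfs_init_script=/etc/init.d/nfs nfs_shared_infodir=/mnt/nfs nfs_notify_cmd=/usr/sbin/sm-notify op monitor interval=30s
pcs resource create nfs-export ocf:heartbeat:exportfs directory=/mnt clientspec='*' options=rw,async,no_all_squash fsid=238 op monitor interval=60s
pcs resource create nfs-shared-fs ocf:heartbeat:Filesystem device=/dev/sda directory=/mnt fstype=gfs2 op monitor interval=30s
pcs resource clone nfs-shared-fs
pcs resource group add ha-nfsserver vip nfs-server nfs-export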
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 0 Oct 15 11:42 /var/lib/nfs/rmtab
Mounted the NFS share from outside the cluster with this command:
# mount 10.34.70.217:/mnt /exports -o nfsvers=3
---------
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 27 Oct 15 11:45 /var/lib/nfs/rmtab
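(An aside, not part of the captured output: entries in /var/lib/nfs/rmtab take the form <client>:<export>:<hex use count>, so the 27 bytes above correspond to a single line roughly like the hypothetical one below, with the client address made up for illustration.)

10.34.70.1:/mnt:0x00000001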
WITHOUT A PATCH (resource-agents-3.9.2-22.el6.x86_64):
======================================================
virt-021# grep -A 12 'restore_rmtab()' \
/usr/lib/ocf/resource.d/heartbeat/exportfs
restore_rmtab() {
local rmtab_backup
if [ ${OCF_RESKEY_rmtab_backup} != "none" ]; then
rmtab_backup="${OCF_RESKEY_directory}/${OCF_RESKEY_rmtab_backup}"
if [ -r ${rmtab_backup} ]; then
cat ${rmtab_backup} >> /var/lib/nfs/rmtab
ocf_log debug "Restored `wc -l ${rmtab_backup}` rmtab entries from ${rmtab_backup}."
else
ocf_log warn "rmtab backup ${rmtab_backup} not found or not readable."
fi
fi
}
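To spell out the failure mode (my reading of the agent, not part of the captured output): on stop, the agent greps this export's entries out of /var/lib/nfs/rmtab into ${rmtab_backup}, and on start restore_rmtab() appends that backup onto an rmtab that, with a shared NFS infodir, still holds the same entries, so every relocation re-adds every existing line. A standalone shell illustration of the effect, using made-up /tmp paths rather than the agent itself:

# simulate one live rmtab entry (hypothetical client and paths)
echo "10.34.70.1:/mnt:0x00000001" > /tmp/rmtab
for i in 1 2 3; do                                   # three simulated relocations
    grep ":/mnt:" /tmp/rmtab > /tmp/rmtab.backup     # stop: back up this export's entries
    cat /tmp/rmtab.backup >> /tmp/rmtab              # start: append the backup (the bug)
done
wc -l /tmp/rmtab                                     # reports 8 lines: the file doubles each cycle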
virt-021# for a in $(seq 1 5); do \
pcs resource move ha-nfsserver; sleep 5; \
pcs resource move ha-nfsserver; sleep 5; \
pcs constraint remove $(pcs constraint ref ha-nfsserver | grep cli); echo $a;\
done
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r-----. 1 root root 182655 Oct 15 12:42 /var/lib/nfs/rmtab
PATCHED VERSION (resource-agents-3.9.2-40.el6.x86_64)
=====================================================
virt-021# grep -A 12 'restore_rmtab()' \
> /usr/lib/ocf/resource.d/heartbeat/exportfs
restore_rmtab() {
local rmtab_backup
if [ ${OCF_RESKEY_rmtab_backup} != "none" ]; then
rmtab_backup="${OCF_RESKEY_directory}/${OCF_RESKEY_rmtab_backup}"
if [ -r ${rmtab_backup} ]; then
local tmpf=`mktemp`
sort -u ${rmtab_backup} /var/lib/nfs/rmtab > $tmpf &&
install -o root -m 644 $tmpf /var/lib/nfs/rmtab
rm -f $tmpf
ocf_log debug "Restored `wc -l ${rmtab_backup}` rmtab entries from ${rmtab_backup}."
else
ocf_log warn "rmtab backup ${rmtab_backup} not found or not readable."
fi
virt-021# for a in $(seq 1 5); do \
pcs resource move ha-nfsserver; sleep 5; \
pcs resource move ha-nfsserver; sleep 5; \
pcs constraint remove $(pcs constraint ref ha-nfsserver | grep cli); echo $a;\
done
virt-021# ls -l /var/lib/nfs/rmtab
-rw-r--r--. 1 root root 27 Oct 15 12:58 /var/lib/nfs/rmtab
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1541.html
Description of problem:
When exportfs relocates the exported share, the rmtab size is doubled. This is probably due to the restore_rmtab call in the agent, where it does "cat ${rmtab_backup} >> /var/lib/nfs/rmtab". During shutdown the rmtab is grepped and the matching entries are written to the backup; on the next start the backup is appended onto the rmtab again, so the file grows to twice its size on each graceful relocation, eventually leading to unavailability of the service when the grow-and-copy operations become too slow to complete in time. Some sort of sort | uniq would probably help here, as the file was full of duplicated entries.

Version-Release number of selected component (if applicable):
pacemaker-1.1.6-3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up an NFS server + NFS export of a GFS2 filesystem (see below).
2. Mount the share from the client (1 entry now in /var/lib/nfs/rmtab).
3. Relocate the service (crm resource move nfsgroup).
4. See the entry doubled in /var/lib/nfs/rmtab.

Actual results:
rmtab grows on each relocation until the service cannot be relocated any more.

Expected results:
The file does not grow and does not contain duplicate entries.

Additional info:
crm configure show
node node01
node node02
node node03
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.100.11" cidr_netmask="32" \
        op monitor interval="30s"
primitive datadir ocf:heartbeat:exportfs \
        params clientspec="*" directory="/mnt/vedder0" fsid="4" options="all_squash,rw"
primitive gfs2 ocf:heartbeat:Filesystem \
        params device="/dev/rhts_cluster/vedder0" directory="/mnt/vedder0" fstype="gfs2" options="noatime"
primitive nfsserver ocf:heartbeat:nfsserver \
        params nfs_init_script="/etc/init.d/nfs" nfs_shared_infodir="/mnt/vedder0/nfs" nfs_ip="192.168.100.11" nfs_notify_cmd="/usr/sbin/sm-notify"
group nfsgroup nfsserver datadir ClusterIP \
        meta target-role="Started"
clone gfs2clone gfs2 \
        meta target-role="Started"
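A minimal sketch of the dedup the reporter suggests, usable as a one-off cleanup of an already-bloated rmtab (illustration only; the .dedup temp filename is made up here, and the fix that later shipped uses sort -u into a mktemp file followed by install, as shown earlier in this report):

# one-off cleanup sketch; .dedup is a hypothetical temp name
sort -u /var/lib/nfs/rmtab > /var/lib/nfs/rmtab.dedup &&
    mv -f /var/lib/nfs/rmtab.dedup /var/lib/nfs/rmtab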