Red Hat Bugzilla – Bug 1384955
when nfsserver resource stops rpcbind, dependent non-clustered services stop too
Last modified: 2017-08-01 10:55:11 EDT
Description of problem:
The nfsserver resource agent stops the rpcbind daemon as part of its stop sequence. This also stops other non-clustered services that depend on rpcbind (e.g. NIS / ypbind). This behaviour is not desired because it creates downtime.

# systemctl is-enabled ypbind
enabled

One of ypbind's dependencies is rpcbind:

# systemctl list-dependencies ypbind | grep rpcbind
* |-rpcbind.service
* | |-rpcbind.socket

The resource agent (ocf::heartbeat:nfsserver) explicitly starts (line 736) and stops (line 900) rpcbind:

# grep -n -E "(start|stop) rpcbind" /usr/lib/ocf/resource.d/heartbeat/nfsserver
736: nfs_exec start rpcbind
900: nfs_exec stop rpcbind > /dev/null 2>&1

Version-Release number of selected component (if applicable):
resource-agents-3.9.5-54.el7_2.17.x86_64

How reproducible:
always

Steps to Reproduce:
1. pacemaker based cluster with an nfsserver resource
2. NIS/ypbind running on a cluster node (not configured as a cluster service)
3. stop the nfsserver resource

Actual results:
NIS gets stopped because rpcbind is stopped

Expected results:
Stopping the nfsserver resource does not affect services outside the cluster

Additional info:
Node1: clustered nfs service (active node), non-clustered NIS service running
Node2: clustered nfs service (passive node)
When the clustered nfs service moves from node1 to node2, the NIS service gets stopped on node1.
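The same grep from above can be run on any node to check whether the installed agent still manipulates rpcbind directly; rpm -q just records which resource-agents build is in place:

# grep -n -E "(start|stop) rpcbind" /usr/lib/ocf/resource.d/heartbeat/nfsserver
# rpm -q resource-agents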
https://github.com/ClusterLabs/resource-agents/pull/869
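A minimal sketch of one way an agent can avoid tearing down a shared daemon (the function names and state-file path here are hypothetical, and this is not necessarily how the linked pull request implements it): record whether rpcbind was already active before the agent started it, and skip the stop in that case.

#!/bin/sh
# Hypothetical sketch only -- not the actual nfsserver agent code.
# Idea: remember whether rpcbind was already active before the agent
# started it, and leave it running on stop if someone else needed it.

STATEFILE=/run/nfsserver-ra.rpcbind-was-active   # assumed path

start_rpcbind_if_needed() {
    if systemctl is-active --quiet rpcbind; then
        # rpcbind already up (e.g. as a dependency of ypbind)
        touch "$STATEFILE"
    else
        systemctl start rpcbind
    fi
}

stop_rpcbind_if_we_started_it() {
    if [ -e "$STATEFILE" ]; then
        rm -f "$STATEFILE"     # leave rpcbind running for ypbind etc.
    else
        systemctl stop rpcbind
    fi
}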
I have verified that the ypbind service will not be stopped (as a result of nfsserver killing rpcbind) when nfsserver is stopped with resource-agents-3.9.5-104.el7.

---

Common setup:
* set up a local NIS server on one of the nodes using this howto: https://access.redhat.com/solutions/7247
* check that the ypbind service is running
  > $ systemctl is-active ypbind
  > active
* configure the cluster with active/passive NFS [1], [2]

before the patch (resource-agents-3.9.5-80.el7)
===============================================
[root@host-134 ~]# pcs resource disable nfs-daemon
[root@host-134 ~]# systemctl is-active ypbind
inactive
[root@host-134 ~]# ypcat passwd
No such map passwd.byname. Reason: Can't bind to server which serves this domain

after the patch (resource-agents-3.9.5-104.el7)
===============================================
[root@host-134 ~]# pcs resource disable nfs-daemon
[root@host-134 ~]# systemctl is-active ypbind
active
[root@host-134 ~]# ypcat passwd
test:x:1000:1000::/home/test:/bin/bash
testmonkey:x:1001:1001::/home/testmonkey:/bin/bash

----
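The failover scenario from the description can be exercised the same way: move the NFS group off the active node and confirm ypbind stays up. A sketch, using the resource and node names of the cluster shown below and the RHEL 7 pcs move/clear syntax:

pcs resource move hanfs-ap host-142      # push the group off host-134
sleep 30                                 # give the group time to relocate
systemctl is-active ypbind               # expected: active with the fixed agent
ypcat passwd                             # NIS maps should still resolve
pcs resource clear hanfs-ap              # drop the temporary location constraint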
> (2) pcs-status
[root@host-134 ~]# pcs status
Cluster name: STSRHTS10447
Stack: corosync
Current DC: host-143 (version 1.1.16-9.el7-94ff4df) - partition with quorum
Last updated: Tue Jun 6 07:10:16 2017
Last change: Tue Jun 6 07:10:05 2017 by hacluster via crmd on host-143

3 nodes configured
17 resources configured

Online: [ host-134 host-142 host-143 ]

Full list of resources:

 fence-host-134 (stonith:fence_xvm): Started host-142
 fence-host-142 (stonith:fence_xvm): Started host-143
 fence-host-143 (stonith:fence_xvm): Started host-134
 Clone Set: dlm-clone [dlm]
     Started: [ host-134 host-142 host-143 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ host-134 host-142 host-143 ]
 Resource Group: hanfs-ap
     havg (ocf::heartbeat:LVM): Started host-134
     mnt-shared (ocf::heartbeat:Filesystem): Started host-134
     nfs-daemon (ocf::heartbeat:nfsserver): Started host-134
     export-root (ocf::heartbeat:exportfs): Started host-134
     export--mnt-shared-0 (ocf::heartbeat:exportfs): Started host-134
     export--mnt-shared-1 (ocf::heartbeat:exportfs): Started host-134
     vip (ocf::heartbeat:IPaddr2): Started host-134
     nfs-notify (ocf::heartbeat:nfsnotify): Started host-134

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

> (1) pcs-config
[root@host-134 ~]# pcs config
Cluster Name: STSRHTS10447
Corosync Nodes:
 host-134 host-142 host-143
Pacemaker Nodes:
 host-134 host-142 host-143

Resources:
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
               start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Attributes: with_cmirrord=1
   Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
               start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
 Group: hanfs-ap
  Resource: havg (class=ocf provider=heartbeat type=LVM)
   Attributes: exclusive=true partial_activation=false volgrpname=shared
   Operations: monitor interval=10 timeout=30 (havg-monitor-interval-10)
               start interval=0s timeout=30 (havg-start-interval-0s)
               stop interval=0s timeout=30 (havg-stop-interval-0s)
  Resource: mnt-shared (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/shared/shared0 directory=/mnt/shared fstype=ext4 options=
   Operations: monitor interval=30s (mnt-shared-monitor-interval-30s)
               start interval=0s timeout=60 (mnt-shared-start-interval-0s)
               stop interval=0s timeout=60 (mnt-shared-stop-interval-0s)
  Resource: nfs-daemon (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_no_notify=true nfs_shared_infodir=/mnt/shared/nfs
   Operations: monitor interval=30s (nfs-daemon-monitor-interval-30s)
               start interval=0s timeout=90s (nfs-daemon-start-interval-0s)
               stop interval=0s timeout=20s (nfs-daemon-stop-interval-0s)
  Resource: export-root (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=* directory=/mnt/shared fsid=354 options=rw
   Operations: monitor interval=10 timeout=20 (export-root-monitor-interval-10)
               start interval=0s timeout=40 (export-root-start-interval-0s)
               stop interval=0s timeout=120 (export-root-stop-interval-0s)
  Resource: export--mnt-shared-0 (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=* directory=/mnt/shared/0 fsid=1 options=rw
   Operations: monitor interval=10 timeout=20 (export--mnt-shared-0-monitor-interval-10)
               start interval=0s timeout=40 (export--mnt-shared-0-start-interval-0s)
               stop interval=0s timeout=120 (export--mnt-shared-0-stop-interval-0s)
  Resource: export--mnt-shared-1 (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=* directory=/mnt/shared/1 fsid=2 options=rw
   Operations: monitor interval=10 timeout=20 (export--mnt-shared-1-monitor-interval-10)
               start interval=0s timeout=40 (export--mnt-shared-1-start-interval-0s)
               stop interval=0s timeout=120 (export--mnt-shared-1-stop-interval-0s)
  Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=22 ip=10.15.107.148
   Operations: monitor interval=30s (vip-monitor-interval-30s)
               start interval=0s timeout=20s (vip-start-interval-0s)
               stop interval=0s timeout=20s (vip-stop-interval-0s)
  Resource: nfs-notify (class=ocf provider=heartbeat type=nfsnotify)
   Attributes: source_host=dhcp-107-148.lab.msp.redhat.com
   Operations: monitor interval=30 timeout=90 (nfs-notify-monitor-interval-30)
               start interval=0s timeout=90 (nfs-notify-start-interval-0s)
               stop interval=0s timeout=90 (nfs-notify-stop-interval-0s)

Stonith Devices:
 Resource: fence-host-134 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=host-134 pcmk_host_map=host-134:host-134.virt.lab.msp.redhat.com
  Operations: monitor interval=60s (fence-host-134-monitor-interval-60s)
 Resource: fence-host-142 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=host-142 pcmk_host_map=host-142:host-142.virt.lab.msp.redhat.com
  Operations: monitor interval=60s (fence-host-142-monitor-interval-60s)
 Resource: fence-host-143 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=host-143 pcmk_host_map=host-143:host-143.virt.lab.msp.redhat.com
  Operations: monitor interval=60s (fence-host-143-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory)
  start clvmd-clone then start hanfs-ap (kind:Mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY)
  hanfs-ap with clvmd-clone (score:INFINITY)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: STSRHTS10447
 dc-version: 1.1.16-9.el7-94ff4df
 have-watchdog: false
 last-lrm-refresh: 1496751005
 no-quorum-policy: freeze

Quorum:
  Options:
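For reference, a sketch of the pcs commands that would build a group like hanfs-ap above (attribute values are taken from the config shown; operation timeouts and the two extra exportfs resources are omitted, and the syntax is the RHEL 7 pcs style):

pcs resource create havg ocf:heartbeat:LVM volgrpname=shared exclusive=true --group hanfs-ap
pcs resource create mnt-shared ocf:heartbeat:Filesystem device=/dev/shared/shared0 \
    directory=/mnt/shared fstype=ext4 --group hanfs-ap
pcs resource create nfs-daemon ocf:heartbeat:nfsserver nfs_shared_infodir=/mnt/shared/nfs \
    nfs_no_notify=true --group hanfs-ap
pcs resource create export-root ocf:heartbeat:exportfs clientspec='*' directory=/mnt/shared \
    fsid=354 options=rw --group hanfs-ap
pcs resource create vip ocf:heartbeat:IPaddr2 ip=10.15.107.148 cidr_netmask=22 --group hanfs-ap
pcs resource create nfs-notify ocf:heartbeat:nfsnotify \
    source_host=dhcp-107-148.lab.msp.redhat.com --group hanfs-ap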
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1844