Created attachment 1482401 [details]
pcs cluster report

An anonymous clone of clvmd causes a restart of all dependent services when a node re-joins the cluster.

We have a two-node cluster (1) (2) that basically has the following dependency tree:

dlm-clone -> clvmd-clone --------> mysql-group (vip, clustered vg, gfs2, db in docker container)
             docker-clone ---/

where --> means both order and colocation.

When node1 is shut down, resources are correctly moved to node2. When node1 rejoins the cluster, all resources dependent on clvmd get restarted. This is a serious problem, because a node joining the cluster must not cause peacefully running resources to restart.

Steps to reproduce:
1/ Configure a cluster with the dependencies above (1) (2)
2/ pcs cluster stop node1
3/ pcs cluster start node1

Actual result: restart of all peacefully running resources on node2
Expected result: resources kept running on node2
Reproducibility: always

Additional information:

# rpm -q pacemaker resource-agents
pacemaker-1.1.19-7.el7.x86_64
resource-agents-4.1.1-10.el7.x86_64

We use 'pcs resource defaults resource-stickiness=200' so that resources do not relocate when a node joins the cluster (step 3).

We see the same behaviour in RHEL 7.5.

The problem is shown by this log excerpt:

Sep 11 18:13:54 virt-246 corosync[31438]: [TOTEM ] A new membership (10.37.167.116:772) was formed. Members joined: 1
Sep 11 18:13:54 virt-246 corosync[31438]: [QUORUM] Members[2]: 1 2
Sep 11 18:13:54 virt-246 corosync[31438]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 11 18:13:54 virt-246 crmd[31454]: notice: Node virt-245 state is now member
Sep 11 18:13:54 virt-246 pacemakerd[31448]: notice: Node virt-245 state is now member
Sep 11 18:13:56 virt-246 crmd[31454]: notice: High CPU load detected: 1.160000
Sep 11 18:13:56 virt-246 stonith-ng[31450]: notice: Node virt-245 state is now member
Sep 11 18:13:56 virt-246 attrd[31452]: notice: Node virt-245 state is now member
Sep 11 18:13:56 virt-246 cib[31449]: notice: Node virt-245 state is now member
Sep 11 18:13:57 virt-246 crmd[31454]: notice: State transition S_IDLE -> S_INTEGRATION
Sep 11 18:14:01 virt-246 pengine[31453]: warning: Processing failed monitor of clvmd:0 on virt-246: not running
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Start dlm:1 ( virt-245 )
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Restart clvmd:0 ( virt-246 ) due to required dlm-clone running
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Start clvmd:1 ( virt-245 )
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Start dockerd:1 ( virt-245 )
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Restart db-stage-vip ( virt-246 ) due to required clvmd-clone running
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Restart db-stage-lvm ( virt-246 ) due to required db-stage-vip start
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Restart db-stage-fs ( virt-246 ) due to required db-stage-lvm start
Sep 11 18:14:01 virt-246 pengine[31453]: notice: * Restart mysql-stage ( virt-246 ) due to required db-stage-fs start
Sep 11 18:14:01 virt-246 pengine[31453]: notice: Calculated transition 27, saving inputs in /var/lib/pacemaker/pengine/pe-input-386.bz2

It seems that dlm:1 is started on the rejoining node, which causes clvmd to be restarted, and this leads to a chain restart of all its dependencies.
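For reference, the ordering and colocation constraints shown in the configuration (2) below were created with ordinary pcs commands, roughly along these lines (a sketch of the relevant subset only, not the exact command history):

# pcs constraint order start dlm-clone then start clvmd-clone
# pcs constraint colocation add clvmd-clone with dlm-clone
# pcs constraint order start clvmd-clone then start mysql-g-stage
# pcs constraint order start dockerd-clone then start mysql-g-stage
# pcs constraint colocation add mysql-g-stage with clvmd-clone
# pcs constraint colocation add mysql-g-stage with dockerd-clone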
(1)

[root@virt-246 ~]# pcs status
Cluster name: el-cluster
Stack: corosync
Current DC: virt-246 (version 1.1.19-7.el7-c3c624ea3d) - partition with quorum
Last updated: Tue Sep 11 18:20:39 2018
Last change: Tue Sep 11 18:13:32 2018 by root via cibadmin on virt-246

2 nodes configured
24 resources configured (10 DISABLED)

Online: [ virt-245 virt-246 ]

Full list of resources:

 fence-virt-245 (stonith:fence_xvm): Started virt-246
 fence-virt-246 (stonith:fence_xvm): Started virt-246
 Clone Set: dlm-clone [dlm]
     Started: [ virt-245 virt-246 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ virt-245 virt-246 ]
 Clone Set: dockerd-clone [dockerd]
     Started: [ virt-245 virt-246 ]
 Resource Group: mysql-g-stage
     db-stage-vip (ocf::heartbeat:IPaddr): Started virt-246
     db-stage-lvm (ocf::heartbeat:LVM): Started virt-246
     db-stage-fs (ocf::heartbeat:Filesystem): Started virt-246
     mysql-stage (ocf::heartbeat:docker): Started virt-246
 Resource Group: mysql-g-live
     db-live-vip (ocf::heartbeat:IPaddr): Stopped (disabled)
     db-live-lvm (ocf::heartbeat:LVM): Stopped (disabled)
     db-live-fs (ocf::heartbeat:Filesystem): Stopped (disabled)
     mysql-live (ocf::heartbeat:docker): Stopped (disabled)
 Clone Set: container-shared-fs-clone [container-shared-fs]
     Stopped (disabled): [ virt-245 virt-246 ]
 Resource Group: cqe-frontend-stage
     frontend-stage-vip (ocf::heartbeat:IPaddr): Stopped
     frontend-stage (ocf::heartbeat:docker): Stopped
 Resource Group: cqe-frontend-live
     frontend-live-vip (ocf::heartbeat:IPaddr): Stopped
     frontend-live (ocf::heartbeat:docker): Stopped
 cqe-dispatcher-live (ocf::heartbeat:docker): Stopped
 cqe-dispatcher-stage (ocf::heartbeat:docker): Stopped

Failed Actions:
* clvmd:1_monitor_30000 on virt-246 'not running' (7): call=111, status=complete, exitreason='',
    last-rc-change='Tue Sep 11 17:46:15 2018', queued=0ms, exec=386ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

(2)

[root@virt-246 ~]# pcs config
Cluster Name: el-cluster

Corosync Nodes:
 virt-245 virt-246
Pacemaker Nodes:
 virt-245 virt-246

Resources:
 Clone: dlm-clone
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=10 start-delay=0 timeout=20 (dlm-monitor-interval-10)
               start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: with_cmirrord=1
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Operations: monitor interval=30 timeout=90 (clvmd-monitor-interval-30)
               start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
 Clone: dockerd-clone
  Resource: dockerd (class=systemd type=docker)
   Operations: monitor interval=60 timeout=100 (dockerd-monitor-interval-60)
               start interval=0s timeout=100 (dockerd-start-interval-0s)
               stop interval=0s timeout=100 (dockerd-stop-interval-0s)
 Group: mysql-g-stage
  Resource: db-stage-vip (class=ocf provider=heartbeat type=IPaddr)
   Attributes: cidr_netmask=22 ip=10.37.165.142
   Operations: monitor interval=10s timeout=20s (db-stage-vip-monitor-interval-10s)
               start interval=0s timeout=20s (db-stage-vip-start-interval-0s)
               stop interval=0s timeout=20s (db-stage-vip-stop-interval-0s)
  Resource: db-stage-lvm (class=ocf provider=heartbeat type=LVM)
   Attributes: volgrpname=storage-db-stage
   Operations: methods interval=0s timeout=5s (db-stage-lvm-methods-interval-0s)
               monitor interval=10s timeout=30s (db-stage-lvm-monitor-interval-10s)
               start interval=0s timeout=30s (db-stage-lvm-start-interval-0s)
               stop interval=0s timeout=30s (db-stage-lvm-stop-interval-0s)
  Resource: db-stage-fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/storage-db-stage/db-stage directory=/var/lib/mysql-stage fstype=gfs2
   Operations: monitor interval=20s timeout=40s (db-stage-fs-monitor-interval-20s)
               notify interval=0s timeout=60s (db-stage-fs-notify-interval-0s)
               start interval=0s timeout=60s (db-stage-fs-start-interval-0s)
               stop interval=0s timeout=60s (db-stage-fs-stop-interval-0s)
  Resource: mysql-stage (class=ocf provider=heartbeat type=docker)
   Attributes: image=docker.io/mariadb:10.3 run_opts="--user 5010:5010 --volume /var/lib/mysql-stage:/var/lib/mysql --volume /shared/containers/configs/mysql-stage:/etc/mysql/conf.d --volume /shared/containers/logs/mysql-stage:/var/log/mysql --publish 10.37.165.142:3306:3306"
   Operations: monitor interval=30s timeout=30s (mysql-stage-monitor-interval-30s)
               start interval=0s timeout=90s (mysql-stage-start-interval-0s)
               stop interval=0s timeout=90s (mysql-stage-stop-interval-0s)
 Group: mysql-g-live
  Meta Attrs: target-role=Stopped
  Resource: db-live-vip (class=ocf provider=heartbeat type=IPaddr)
   Attributes: cidr_netmask=22 ip=10.37.165.133
   Operations: monitor interval=10s timeout=20s (db-live-vip-monitor-interval-10s)
               start interval=0s timeout=20s (db-live-vip-start-interval-0s)
               stop interval=0s timeout=20s (db-live-vip-stop-interval-0s)
  Resource: db-live-lvm (class=ocf provider=heartbeat type=LVM)
   Attributes: volgrpname=storage-db-live
   Operations: methods interval=0s timeout=5s (db-live-lvm-methods-interval-0s)
               monitor interval=10s timeout=30s (db-live-lvm-monitor-interval-10s)
               start interval=0s timeout=30s (db-live-lvm-start-interval-0s)
               stop interval=0s timeout=30s (db-live-lvm-stop-interval-0s)
  Resource: db-live-fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/storage-db-live/db-live directory=/var/lib/mysql-live fstype=gfs2
   Operations: monitor interval=20s timeout=40s (db-live-fs-monitor-interval-20s)
               notify interval=0s timeout=60s (db-live-fs-notify-interval-0s)
               start interval=0s timeout=60s (db-live-fs-start-interval-0s)
               stop interval=0s timeout=60s (db-live-fs-stop-interval-0s)
  Resource: mysql-live (class=ocf provider=heartbeat type=docker)
   Attributes: image=docker.io/mariadb:10.3 run_opts="--user 5020:5020 --volume /var/lib/mysql-live:/var/lib/mysql --volume /shared/containers/configs/mysql-live:/etc/mysql/conf.d --volume /shared/containers/logs/mysql-live:/var/log/mysql --publish 10.37.165.133:3306:3306"
   Operations: monitor interval=30s timeout=30s (mysql-live-monitor-interval-30s)
               start interval=0s timeout=90s (mysql-live-start-interval-0s)
               stop interval=0s timeout=90s (mysql-live-stop-interval-0s)
 Clone: container-shared-fs-clone
  Meta Attrs: interleave=true target-role=Stopped
  Resource: container-shared-fs (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/mapper/storage--data-container--logs directory=/shared/containers fstype=gfs2
   Operations: monitor interval=20s timeout=40s (container-shared-fs-monitor-interval-20s)
               notify interval=0s timeout=60s (container-shared-fs-notify-interval-0s)
               start interval=0s timeout=60s (container-shared-fs-start-interval-0s)
               stop interval=0s timeout=60s (container-shared-fs-stop-interval-0s)
 Group: cqe-frontend-stage
  Resource: frontend-stage-vip (class=ocf provider=heartbeat type=IPaddr)
   Attributes: cidr_netmask=22 ip=10.37.165.165
   Operations: monitor interval=10s timeout=20s (frontend-stage-vip-monitor-interval-10s)
               start interval=0s timeout=20s (frontend-stage-vip-start-interval-0s)
               stop interval=0s timeout=20s (frontend-stage-vip-stop-interval-0s)
  Resource: frontend-stage (class=ocf provider=heartbeat type=docker)
   Attributes: image=docker-registry.engineering.redhat.com/cqe/frontend:latest-stage monitor_cmd="curl http://localhost:8080/clusterqe/" run_opts="--env-file=/shared/containers/configs/container-variables-stage --volume /shared/containers/logs/frontend-stage:/var/log --volume /shared/containers/configs/cqe:/etc/cluster-django --publish 10.37.165.165:80:8080"
   Operations: monitor interval=30s timeout=30s (frontend-stage-monitor-interval-30s)
               start interval=0s timeout=240s (frontend-stage-start-interval-0s)
               stop interval=0s timeout=90s (frontend-stage-stop-interval-0s)
 Group: cqe-frontend-live
  Resource: frontend-live-vip (class=ocf provider=heartbeat type=IPaddr)
   Attributes: cidr_netmask=22 ip=10.37.165.220
   Operations: monitor interval=10s timeout=20s (frontend-live-vip-monitor-interval-10s)
               start interval=0s timeout=20s (frontend-live-vip-start-interval-0s)
               stop interval=0s timeout=20s (frontend-live-vip-stop-interval-0s)
  Resource: frontend-live (class=ocf provider=heartbeat type=docker)
   Attributes: image=docker-registry.engineering.redhat.com/cqe/frontend:latest-live monitor_cmd="curl http://localhost:8080/clusterqe/" run_opts="--env-file=/shared/containers/configs/container-variables-live --volume /shared/containers/logs/frontend-live:/var/log --volume /shared/containers/configs/cqe:/etc/cluster-django --publish 10.37.165.220:80:8080"
   Operations: monitor interval=30s timeout=30s (frontend-live-monitor-interval-30s)
               start interval=0s timeout=240s (frontend-live-start-interval-0s)
               stop interval=0s timeout=90s (frontend-live-stop-interval-0s)
 Resource: cqe-dispatcher-live (class=ocf provider=heartbeat type=docker)
  Attributes: image=docker-registry.engineering.redhat.com/cqe/dispatcher:latest-live run_opts="--env-file=/shared/containers/configs/container-variables-live --volume /shared/containers/logs/frontend-live:/var/log --volume /shared/containers/configs/cqe:/etc/cluster-django"
  Operations: monitor interval=30s timeout=30s (cqe-dispatcher-live-monitor-interval-30s)
              start interval=0s timeout=90s (cqe-dispatcher-live-start-interval-0s)
              stop interval=0s timeout=90s (cqe-dispatcher-live-stop-interval-0s)
 Resource: cqe-dispatcher-stage (class=ocf provider=heartbeat type=docker)
  Attributes: image=docker-registry.engineering.redhat.com/cqe/dispatcher:latest-stage run_opts="--env-file=/shared/containers/configs/container-variables-stage --volume /shared/containers/logs/frontend-stage:/var/log --volume /shared/containers/configs/cqe:/etc/cluster-django"
  Operations: monitor interval=30s timeout=30s (cqe-dispatcher-stage-monitor-interval-30s)
              start interval=0s timeout=90s (cqe-dispatcher-stage-start-interval-0s)
              stop interval=0s timeout=90s (cqe-dispatcher-stage-stop-interval-0s)

Stonith Devices:
 Resource: fence-virt-245 (class=stonith type=fence_xvm)
  Attributes: delay=5 pcmk_host_check=static-list pcmk_host_list=virt-245 pcmk_host_map=virt-245:virt-245.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-245-monitor-interval-60s)
 Resource: fence-virt-246 (class=stonith type=fence_xvm)
  Attributes: delay=5 pcmk_host_check=static-list pcmk_host_list=virt-246 pcmk_host_map=virt-246:virt-246.cluster-qe.lab.eng.brq.redhat.com
  Operations: monitor interval=60s (fence-virt-246-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory)
  start clvmd-clone then start mysql-g-stage (kind:Mandatory)
  start dockerd-clone then start mysql-g-stage (kind:Mandatory)
  start clvmd-clone then start mysql-g-live (kind:Mandatory)
  start dockerd-clone then start mysql-g-live (kind:Mandatory)
  start clvmd-clone then start container-shared-fs-clone (kind:Mandatory)
  start container-shared-fs-clone then start cqe-frontend-stage (kind:Mandatory)
  start dockerd-clone then start cqe-frontend-stage (kind:Mandatory)
  start mysql-g-stage then start cqe-frontend-stage (kind:Mandatory)
  start container-shared-fs-clone then start cqe-frontend-live (kind:Mandatory)
  start dockerd-clone then start cqe-frontend-live (kind:Mandatory)
  start mysql-g-live then start cqe-frontend-live (kind:Mandatory)
  start cqe-frontend-live then start cqe-dispatcher-live (kind:Mandatory)
  start cqe-frontend-stage then start cqe-dispatcher-stage (kind:Mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY)
  mysql-g-stage with clvmd-clone (score:INFINITY)
  mysql-g-stage with dockerd-clone (score:INFINITY)
  mysql-g-live with clvmd-clone (score:INFINITY)
  mysql-g-live with dockerd-clone (score:INFINITY)
  container-shared-fs-clone with clvmd-clone (score:INFINITY)
  cqe-frontend-stage with container-shared-fs-clone (score:INFINITY)
  cqe-frontend-stage with dockerd-clone (score:INFINITY)
  cqe-frontend-stage with mysql-g-stage (score:100)
  cqe-frontend-live with container-shared-fs-clone (score:INFINITY)
  cqe-frontend-live with dockerd-clone (score:INFINITY)
  cqe-frontend-live with mysql-g-live (score:100)
  cqe-dispatcher-live with container-shared-fs-clone (score:INFINITY)
  cqe-dispatcher-live with dockerd-clone (score:INFINITY)
  cqe-dispatcher-live with mysql-g-live (score:100)
  cqe-dispatcher-stage with container-shared-fs-clone (score:INFINITY)
  cqe-dispatcher-stage with dockerd-clone (score:INFINITY)
  cqe-dispatcher-stage with mysql-g-stage (score:100)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 resource-stickiness: 200
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: el-cluster
 dc-version: 1.1.19-7.el7-c3c624ea3d
 have-watchdog: false

Quorum:
  Options:
This is the expected behavior when the "interleave" clone option is not set:

"When this clone is ordered relative to another clone, if this option is false (the default), the ordering is relative to all instances of the other clone, whereas if this option is true, the ordering is relative only to instances on the same node. Allowed values: false, true"

Let me know if that doesn't solve the issue.
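For this configuration, that would mean setting interleave=true on the clones that other resources are ordered after, roughly like this (untested against your cluster; note that container-shared-fs-clone already has interleave=true):

# pcs resource meta dlm-clone interleave=true
# pcs resource meta clvmd-clone interleave=true
# pcs resource meta dockerd-clone interleave=true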
Yes, this seems to solve the issue.

Given the events that led me to file this bug, the opposite behaviour seems like the more sensible default. Is there a reason why the default for clones is interleave=false?
(In reply to michal novacek from comment #3)
> Yes, this seems to solve the issue.
>
> Given the events that led me to file this bug, the opposite behaviour seems
> like the more sensible default. Is there a reason why the default for
> clones is interleave=false?

interleave=false is considered safer, since Pacemaker can't know whether the applications involved can function properly with interleave=true. I.e. if false is the default, the worst that happens is unnecessary restarts for applications that would benefit from true; whereas if true is the default, resource failure is guaranteed for applications that don't support it.

At this point, it is also important for backward compatibility.