Bug 1241511 - dlm_controld waits for fencing which will never occur causing hang
Summary: dlm_controld waits for fencing which will never occur causing hang
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dlm
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: David Teigland
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-09 11:26 UTC by michal novacek
Modified: 2021-09-03 12:09 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-14 15:30:22 UTC
Target Upstream Version:
Embargoed:


Attachments
pcs cluster report output (710.58 KB, application/x-bzip)
2015-07-09 11:26 UTC, michal novacek


Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla 1268313 | 0 | unspecified | CLOSED | clvmd/dlm resource agent monitor action should recognize it is hung | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1321711 | 1 | None | None | None | 2021-03-11 14:31:48 UTC

Internal Links: 1268313 1321711

Description michal novacek 2015-07-09 11:26:12 UTC
Created attachment 1050232 [details]
pcs cluster report output

Description of problem:
Have a quorate pacemaker cluster running clvmd and dlm clones. Disable network
communication between all cluster nodes at the same time; this leads to all
nodes turning inquorate. Then re-enable communication between all nodes at the
same time. Most of the time (but not always) this leads back to a quorate
cluster without any fencing. In that case, dlm_controld expects fencing to
occur and hangs until it does; the fencing will never occur because the
cluster is quorate with all nodes.

Version-Release number of selected component (if applicable):
dlm-4.0.2-5.el7.x86_64
lvm2-cluster-2.02.115-3.el7.x86_64
pacemaker-1.1.12-22.el7.x86_64
corosync-2.3.4-4.el7.x86_64

How reproducible: very frequent

Steps to Reproduce:
1. have a quorate pacemaker cluster
2. check node uptimes
3. disable network communication between all nodes with iptables and wait for
   all nodes to turn inquorate
4. re-enable network communication between all nodes at the same time
5. check whether fencing occurred and, if it has not, check dlm status and
   logs (see the sketch below)
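For step 5, a minimal check sketch (assuming the standard corosync-quorumtool
and dlm_tool utilities; exact output wording may differ by version):

---
corosync-quorumtool -s                      # confirm the cluster is quorate again
dlm_tool ls                                 # a lockspace stuck waiting on fencing indicates the hang
grep 'wait for fencing' /var/log/messages   # matches the dlm_controld messages shown below
---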

Actual results: dlm hanging

Expected results: dlm happily working

Additional info:
# tail /var/log/messages
...
Jul  9 12:18:33 virt-020 pengine[2287]: warning: custom_action: Action dlm:2_stop_0 on virt-019 is unrunnable (offline)
Jul  9 12:18:33 virt-020 pengine[2287]: warning: custom_action: Action dlm:2_stop_0 on virt-019 is unrunnable (offline)
Jul  9 12:18:33 virt-020 pengine[2287]: notice: LogActions: Stop    dlm:1       (virt-018 - blocked)
Jul  9 12:18:33 virt-020 pengine[2287]: notice: LogActions: Stop    dlm:2       (virt-019 - blocked)
Jul  9 12:18:41 virt-020 dlm_controld[2438]: 151 daemon joined 2 needs fencing
Jul  9 12:18:41 virt-020 dlm_controld[2438]: 151 daemon joined 1 needs fencing
Jul  9 12:18:41 virt-020 dlm_controld[2438]: 151 daemon node 1 stateful merge
Jul  9 12:18:41 virt-020 dlm_controld[2438]: 151 daemon node 1 stateful merge
Jul  9 12:18:41 virt-020 dlm_controld[2438]: 151 daemon node 2 stateful merge
Jul  9 12:18:41 virt-020 dlm_controld[2438]: 151 daemon node 2 stateful merge
Jul  9 12:19:12 virt-020 dlm_controld[2438]: 183 fence work wait to clear merge 2 clean 1 part 0 gone 0
Jul  9 12:19:39 virt-020 dlm_controld[2438]: 210 clvmd wait for fencing

Comment 4 David Teigland 2015-07-21 15:37:27 UTC
"Fencing will never occur because the cluster is quorate with all nodes"

If stateful cluster nodes fail, they need to be fenced.

If dlm is in charge of fencing, then stateful cluster merges are a situation where you might need to manually intervene (e.g. if no partition maintained quorum).  When pacemaker does fencing, I don't know what's supposed to happen.

Comment 5 David Teigland 2015-07-28 14:26:21 UTC
If you reproduce this with dlm by itself (get rid of pacemaker) then I could explain the behavior.  Please either reproduce that way, or reassign to pacemaker.

Comment 7 Andrew Beekhof 2015-08-13 22:01:27 UTC
The cluster doesn't require fencing in this situation.
If the dlm requires it, then it is up to the dlm to initiate it.

Comment 8 David Teigland 2015-08-14 15:30:22 UTC
The expected dlm behavior here remains the same as it's been in the past (since partition/merge handling was added), and there does not appear to be anything to fix.

In the case of a cluster partition that merges, if one partition maintained quorum, then it will kill merged nodes.  Otherwise, as in this case, user intervention is required to select and kill merged nodes.
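For reference, a sketch of what that user intervention typically looks like
(not spelled out in this bug; assumes dlm 4.x's dlm_tool, and that the merged
node is first reset out of band):

---
dlm_tool status              # identify which nodes dlm is still waiting to see fenced
# power off / reset the merged node out of band, then acknowledge it:
dlm_tool fence_ack <nodeid>  # tells dlm_controld the node has been safely killed
---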

Comment 9 Eric Ren 2016-04-26 02:48:43 UTC
Hello Michal,

> How reproducible: very frequent
> 
> Steps to Reproduce:
> 1. have quorate pacemaker cluster
> . check nodes uptime
> . disable network communication between all nodes with iptables and wait for
>     all nodes turning inquorate
> . enable at the same time network communication between nodes 
> . check whether fencing occured and if it has not check dlm status and logs

With a 3-node cluster, unfortunately I cannot reproduce it (fencing quickly happens) when applying iptables manually :-/ Looking at the cluster report you attached, I think you may be using some automatic method to make a truly transient disconnection all of a sudden. If so, could you please share your method/scripts to help reproduce?

The reason I'm here is that this patch (https://github.com/ClusterLabs/pacemaker/pull/839) has a problem which will unnecessarily cause both nodes to be fenced in a two-node cluster in the following case:

1. Bring both nodes up in the cluster with all resources started.
2. Fence one node by issuing "pkill -9 corosync".
3. Watch the logs: the surviving node fences the other node and then ends up self-fencing.

This decreases availability in the two-node scenario. IMHO, the patch shouldn't let the "controld" RA rely on "dlm_tool ls" to detect "wait fencing", because that message only means there is some node in the cluster that needs fencing. The command gives the RA the same answer on every node, so every node will die (see the sketch below). IOW, we need dlm to tell the RA whether this particular node needs fencing; then that patch should work better.
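Roughly, the check that patch adds to the RA monitor looks like this (a
simplified sketch, not the exact RA code; OCF_ERR_GENERIC is the standard OCF
monitor failure code):

---
# "wait fencing" is cluster-wide state, so this check fails on *every* node:
if dlm_tool ls | grep -q "wait fencing"; then
    return $OCF_ERR_GENERIC   # monitor failure -> pacemaker fences this node too
fi
---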

Thanks for your time;-)

Comment 10 michal novacek 2016-05-11 08:48:51 UTC
What I do is create /root/iptables.sh on each of the cluster nodes and then run the following from a node outside of the cluster:

for i in 1 2 3; do ssh node$i /root/iptables.sh & done; wait

This way I was able to trigger the problem described here in fewer than ten attempts on a three-node cluster.

The important thing is the '&' instead of ';' in the for loop, which runs the commands in parallel (see the contrast below).
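For clarity, the contrast between the two:

---
for i in 1 2 3; do ssh node$i /root/iptables.sh; done          # ';' cuts nodes off one after another
for i in 1 2 3; do ssh node$i /root/iptables.sh & done; wait   # '&' cuts all nodes off at once
---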

Hope this helps.

Comment 11 Eric Ren 2016-05-11 08:59:44 UTC
(In reply to michal novacek from comment #10)
> What I do is create /root/iptables.sh on each of the cluster nodes and then
> run the following from a node outside of the cluster:
> 
> for i in 1 2 3; do ssh node$i /root/iptables.sh & done; wait
> 
> This way I was able to trigger the problem described here in fewer than ten
> attempts on a three-node cluster.
> 
> The important thing is the '&' instead of ';' in the for loop, which runs
> the commands in parallel.
> 
> Hope this helps.

Hi Michal,

Thanks a lot for your info! I've reproduced this problem now. In case you're interested:
1. set up NTP (optional);
2. put this script on every node:
---
#!/bin/sh

PATH=$PATH:/usr/sbin/
has_quorum=
hosts="ocfs2test2,ocfs2test3"   # the other 2 nodes

# cut this node off from the other nodes
iptables -A INPUT -s $hosts -j DROP
echo "iptables: add rules" > /tmp/cron.log

# poll until corosync reports that quorum has been lost
while true; do
	has_quorum=`corosync-quorumtool | awk '{if($1=="Quorate:") print $2;}'`
	if [ "$has_quorum" = "No" ]; then
		echo "Quorum lost now" >> /tmp/cron.log
		break
	fi
done

# restore communication right away
iptables -D INPUT -s $hosts -j DROP
echo "iptables: remove rules" >> /tmp/cron.log
---
3. trigger the script to run concurrently on all nodes via crontab;

Thanks again.

