Bug 239856

Summary: failed server election due to suspended mirror log causes clvm deadlock
Product: [Retired] Red Hat Cluster Suite Reporter: Corey Marthaler <cmarthal>
Component: cmirror-kernelAssignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: high    
Version: 4CC: agk, dwysocha, jbrassow, mbroz, prockai
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0991 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-11-21 21:15:19 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Debugging information on link-02
none
Debugging information on link-04
none
Debugging information on link-07
none
Debugging information on link-08
none
Relevent messages output from link-07
none
More from messages on link-07 none

Description Corey Marthaler 2007-05-11 18:46:05 UTC
Description of problem:
I had been running 'cmirror_lock_stress' which basically just does multiple
instances of cmirror operations (create, down convert, up convert, deactivate,
reactivate, delete) on all the nodes in a cluster at the same time. Eventually,
two mirror logs failed to activate and got stuck in the SUSPEND state and thus
caused the cmirror server election failures. 

The devices were only in the suspend state on 3 of the nodes in the cluster
(link-0[247]) but were active on link-08.

Here is what was happen on all the nodes just before this took place:

LINK-02:
ay 10 16:18:13 link-02 qarshd[27150]: Running cmdline: lvs
lock_stress/link-02.12483 --noheadings -o mirror_log | awk {'print $1'}
May 10 16:18:18 link-02 lvm[6466]: Monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:18:27 link-02 lvm[6466]: Monitoring mirror device
lock_stress-link--04.12481 for events
May 10 16:18:32 link-02 qarshd[27355]: Talking to peer 10.15.80.47:49734
May 10 16:18:32 link-02 qarshd[27355]: Running cmdline: lvconvert -m 0
lock_stress/link-02.12483
May 10 16:18:33 link-02 lvm[6466]: No longer monitoring mirror device
lock_stress-link--02.12483 for events
May 10 16:18:40 link-02 qarshd[27396]: Talking to peer 10.15.80.47:49738
May 10 16:18:40 link-02 qarshd[27396]: Running cmdline: lvs
lock_stress/link-02.12483 --noheadings -o mirror_log | awk {'print $1'}
May 10 16:18:41 link-02 lvm[6466]: No longer monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:19:01 link-02 kernel: dm-cmirror: Failed to receive election results
from server: (3u0694BM,-110)
May 10 16:19:41 link-02 last message repeated 2 times



LINK-04:
May 10 16:17:35 link-04 qarshd[27126]: Running cmdline: lvcreate -m 2 -n
link-04.12481 -L 500M lock_stress
May 10 16:17:57 link-04 [6834]: Monitoring mirror device
lock_stress-link--07.12479 for events
May 10 16:18:02 link-04 lvm[6834]: Monitoring mirror device
lock_stress-link--02.12483 for events
May 10 16:18:08 link-04 lvm[6834]: Monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:18:17 link-04 lvm[6834]: Monitoring mirror device
lock_stress-link--04.12481 for events
May 10 16:18:17 link-04 qarshd[27509]: Talking to peer 10.15.80.47:33482
May 10 16:18:17 link-04 qarshd[27509]: Running cmdline: lvs
lock_stress/link-04.12481 --noheadings -o mirror_log | awk {'print $1'}
May 10 16:18:22 link-04 qarshd[27519]: Talking to peer 10.15.80.47:33486
May 10 16:18:22 link-04 qarshd[27519]: Running cmdline: lvconvert -m 0
lock_stress/link-04.12481
May 10 16:18:23 link-04 lvm[6834]: No longer monitoring mirror device
lock_stress-link--02.12483 for events
May 10 16:18:30 link-04 lvm[6834]: No longer monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:18:50 link-04 kernel: dm-cmirror: Failed to receive election results
from server: (3u0694BM,-110)
May 10 16:19:30 link-04 last message repeated 2 times




LINK-07:
May 10 16:17:58 link-07 qarshd[27065]: Running cmdline: lvcreate -m 3 -n
link-07.12479 -L 500M lock_stress
May 10 16:18:08 link-07 [6442]: Monitoring mirror device
lock_stress-link--07.12479 for events
May 10 16:18:18 link-07 hald[3552]: Timed out waiting for hotplug event 3615.
Rebasing to 3600
May 10 16:18:20 link-07 qarshd[27180]: Talking to peer 10.15.80.47:47611
May 10 16:18:20 link-07 qarshd[27180]: Running cmdline: lvs
lock_stress/link-07.12479 --noheadings -o mirror_log | awk {'print $1'}
May 10 16:18:22 link-07 hald[3552]: Timed out waiting for hotplug event 3601.
Rebasing to 3615
May 10 16:18:25 link-07 lvm[6442]: Monitoring mirror device
lock_stress-link--02.12483 for events
May 10 16:18:31 link-07 lvm[6442]: Monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:18:40 link-07 lvm[6442]: Monitoring mirror device
lock_stress-link--04.12481 for events
May 10 16:18:44 link-07 qarshd[27458]: Talking to peer 10.15.80.47:47621
May 10 16:18:44 link-07 qarshd[27458]: Running cmdline: lvconvert -m 1
lock_stress/link-07.12479
May 10 16:18:45 link-07 lvm[6442]: No longer monitoring mirror device
lock_stress-link--02.12483 for events
May 10 16:18:53 link-07 lvm[6442]: No longer monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:19:13 link-07 kernel: dm-cmirror: Failed to receive election results
from server: (3u0694BM,-110)
May 10 16:19:28 link-07 kernel: dm-cmirror: Error listening for server(2)
response for 3u0694BM: -110
May 10 16:19:43 link-07 kernel: dm-cmirror: Error listening for server(2)
response for 3u0694BM: -110
May 10 16:19:53 link-07 lvm[6442]: No longer monitoring mirror device
lock_stress-link--07.12479 for events
May 10 16:19:58 link-07 kernel: dm-cmirror: Error listening for server(2)
response for 3u0694BM: -110
May 10 16:20:43 link-07 last message repeated 3 times
May 10 16:21:58 link-07 last message repeated 5 times
May 10 16:22:09 link-07 qarshd[27458]: Sending child 27459 signal 2
May 10 16:22:13 link-07 kernel: dm-cmirror: Error listening for server(2)
response for 3u0694BM: -110
May 10 16:22:58 link-07 last message repeated 3 times
May 10 16:24:13 link-07 last message repeated 5 times
May 10 16:25:28 link-07 last message repeated 5 times
May 10 16:26:43 link-07 last message repeated 5 times
May 10 16:27:13 link-07 last message repeated 2 times
May 10 16:27:13 link-07 kernel: dm-cmirror: Clustered mirror retried requests ::
32 of 39442 (1%)
May 10 16:27:13 link-07 kernel: dm-cmirror: Too many retries, attempting to
re-establish server connection.
May 10 16:27:33 link-07 kernel: dm-cmirror: Failed to receive election results
from server: (3u0694BM,-110)



LINK-08:
May 10 16:18:31 link-08 qarshd[31709]: Running cmdline: lvs
lock_stress/link-08.12485 --noheadings -o mirror_log | awk {'print $1'}
May 10 16:18:40 link-08 lvm[7254]: Monitoring mirror device
lock_stress-link--04.12481 for events
May 10 16:18:44 link-08 qarshd[31803]: Talking to peer 10.15.80.47:41860
May 10 16:18:44 link-08 qarshd[31803]: Running cmdline: lvconvert -m 2
lock_stress/link-08.12485
May 10 16:18:45 link-08 lvm[7254]: No longer monitoring mirror device
lock_stress-link--02.12483 for events
May 10 16:18:53 link-08 lvm[7254]: No longer monitoring mirror device
lock_stress-link--08.12485 for events
May 10 16:19:13 link-08 kernel: dm-cmirror: Failed to receive election results
from server: (3u0694BM,-110)
May 10 16:19:53 link-08 last message repeated 2 times



Version-Release number of selected component (if applicable):
2.6.9-55.ELlargesmp
cmirror-kernel-2.6.9-32.0
lvm2-cluster-2.02.21-7.el4

Comment 2 Corey Marthaler 2007-05-14 16:40:57 UTC
Appear to have reproduced this bug of the weekend.

Link-02:
[...]
May 13 04:02:09 link-02 kernel: dm-cmirror: Failed to receive election results
from server: (yDf7i4D1,-110)
May 13 04:02:09 link-02 kernel: dm-cmirror: server_id=dead, server_valid=1, yDf7i4D1
May 13 04:02:09 link-02 kernel: dm-cmirror: trigger = LRT_GET_SYNC_COUNT
May 13 04:02:09 link-02 kernel: dm-cmirror: LRT_ELECTION(10): (yDf7i4D1)
May 13 04:02:09 link-02 kernel: dm-cmirror:   starter     : 3
May 13 04:02:09 link-02 kernel: dm-cmirror:   co-ordinator: 3
May 13 04:02:09 link-02 kernel: dm-cmirror:   node_count  : 0
[...]


Link-04:
[...]
May 13 04:02:36 link-04 kernel: dm-cmirror: Failed to receive election results
from server: (1P3Mw1Yx,-110)
May 13 04:02:36 link-04 kernel: dm-cmirror: server_id=dead, server_valid=1, 1P3Mw1Yx
May 13 04:02:36 link-04 kernel: dm-cmirror: trigger = LRT_COMPLETE_RESYNC_WORK
May 13 04:02:36 link-04 kernel: dm-cmirror: LRT_ELECTION(10): (1P3Mw1Yx)
May 13 04:02:36 link-04 kernel: dm-cmirror:   starter     : 4
May 13 04:02:36 link-04 kernel: dm-cmirror:   co-ordinator: 4
May 13 04:02:36 link-04 kernel: dm-cmirror:   node_count  : 0
May 13 04:02:49 link-04 kernel: dm-cmirror: LRT_ELECTION(10): (yDf7i4D1)
May 13 04:02:49 link-04 kernel: dm-cmirror:   starter     : 3
May 13 04:02:49 link-04 kernel: dm-cmirror:   co-ordinator: 3
May 13 04:02:49 link-04 kernel: dm-cmirror:   node_count  : 1
[...]


LINK-07:
[...]
May 13 04:02:57 link-07 kernel: dm-cmirror: Failed to receive election results
from server: (yDf7i4D1,-110)
May 13 04:02:57 link-07 kernel: dm-cmirror: server_id=dead, server_valid=1, yDf7i4D1
May 13 04:02:57 link-07 kernel: dm-cmirror: trigger = LRT_GET_SYNC_COUNT
[...]


LINK-08:
[...]
May 13 04:02:13 link-08 kernel: dm-cmirror: Failed to receive election results
from server: (yDf7i4D1,-110)
May 13 04:02:13 link-08 kernel: dm-cmirror: server_id=dead, server_valid=1, yDf7i4D1
May 13 04:02:13 link-08 kernel: dm-cmirror: trigger = LRT_GET_SYNC_COUNT
May 13 04:02:13 link-08 kernel: dm-cmirror: LRT_ELECTION(10): (yDf7i4D1)
May 13 04:02:13 link-08 kernel: dm-cmirror:   starter     : 1
May 13 04:02:13 link-08 kernel: dm-cmirror:   co-ordinator: 1
May 13 04:02:13 link-08 kernel: dm-cmirror:   node_count  : 0
May 13 04:02:18 link-08 kernel: dm-cmirror: LRT_ELECTION(10): (1P3Mw1Yx)
May 13 04:02:18 link-08 kernel: dm-cmirror:   starter     : 4
May 13 04:02:18 link-08 kernel: dm-cmirror:   co-ordinator: 4
May 13 04:02:18 link-08 kernel: dm-cmirror:   node_count  : 1
May 13 04:02:31 link-08 kernel: dm-cmirror: LRT_ELECTION(10): (yDf7i4D1)
May 13 04:02:31 link-08 kernel: dm-cmirror:   starter     : 3
May 13 04:02:31 link-08 kernel: dm-cmirror:   co-ordinator: 3
May 13 04:02:31 link-08 kernel: dm-cmirror:   node_count  : 2
[...]



[root@link-02 log]# cman_tool nodes
Node  Votes Exp Sts  Name
   1    1    4   M   link-08
   2    1    4   M   link-07
   3    1    4   M   link-02
   4    1    4   M   link-04
[root@link-02 log]# cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           3   2 run       -
[2 1 3 4]

DLM Lock Space:  "clvmd"                            96  74 run       -
[3 4 2 1]

DLM Lock Space:  "clustered_log"                   142 117 run       -
[2 1 3 4]



[root@link-02 log]# dmsetup table
lock_stress-link--04.7689_mimage_0: 0 1024000 linear 8:33 1032576
lock_stress-link--08.7687_mimage_2: 0 1024000 linear 8:33 384
lock_stress-link--08.7687: 0 1024000 mirror clustered_disk 4 253:2 1024
LVM-vFSXsH7SJ789IvPT9Eq2sLLq2KabOXaAOdiWIhfSW3ZTVejln0c2djc2k8y0WVxJ
block_on_error 3 253:3 0 253:4 0 253:5 0
lock_stress-link--02.7691: 0 1024000 mirror clustered_disk 4 253:13 1024
LVM-vFSXsH7SJ789IvPT9Eq2sLLq2KabOXaADG9t1bHbmkh8v2eXxFKnLlURyDf7i4D1
block_on_error 5 253:14 0 253:15 0 253:16 0 253:17 0 253:18 0
lock_stress-link--08.7687_mimage_1: 0 1024000 linear 8:17 384
lock_stress-link--07.7693_mimage_3: 0 1024000 linear 8:97 8576
lock_stress-link--08.7687_mimage_0: 0 1024000 linear 8:1 384
lock_stress-link--08.7687_mlog: 0 8192 linear 8:97 384
lock_stress-link--07.7693_mimage_2: 0 1024000 linear 8:81 384
lock_stress-link--04.7689_mlog: 0 8192 linear 8:81 2048384
lock_stress-link--02.7691_mimage_4: 0 1024000 linear 8:81 1024384
lock_stress-link--07.7693_mimage_1: 0 1024000 linear 8:65 384
lock_stress-link--04.7689: 0 1024000 mirror clustered_disk 4 253:20 1024
LVM-vFSXsH7SJ789IvPT9Eq2sLLq2KabOXaAHL5efyw6EHAacfcdSpz3NLqj1P3Mw1Yx
block_on_error 2 253:21 0 253:22 0
lock_stress-link--02.7691_mlog: 0 8192 linear 8:97 1032576
lock_stress-link--02.7691_mimage_3: 0 1024000 linear 8:65 1024384
lock_stress-link--07.7693_mimage_0: 0 1024000 linear 8:49 384
lock_stress-link--07.7693_mlog: 0 8192 linear 8:33 1024384
lock_stress-link--02.7691_mimage_2: 0 1024000 linear 8:49 1024384
VolGroup00-LogVol01: 0 4063232 linear 3:2 73793920
lock_stress-link--02.7691_mimage_1: 0 1024000 linear 8:17 1024384
VolGroup00-LogVol00: 0 73793536 linear 3:2 384
lock_stress-link--02.7691_mimage_0: 0 1024000 linear 8:1 1024384
lock_stress-link--07.7693: 0 1024000 mirror clustered_disk 4 253:7 1024
LVM-vFSXsH7SJ789IvPT9Eq2sLLq2KabOXaAfE57gu7FYrcC89vEF3wBfVnjUFKeuzZ2
block_on_error 4 253:8 0 253:9 0 253:10 0 253:11 0
lock_stress-link--04.7689_mimage_1: 0 1024000 linear 8:97 1040768




Comment 3 Corey Marthaler 2007-05-15 16:10:25 UTC
Bumping the priority as this appears to be easily reproduced, though a customer
will most likely never do this many locking operations inorder to cause this.

Comment 4 Jonathan Earl Brassow 2007-05-16 20:48:22 UTC
Created attachment 154871 [details]
Debugging information on link-02

Comment 5 Jonathan Earl Brassow 2007-05-16 20:48:48 UTC
Created attachment 154872 [details]
Debugging information on link-04

Comment 6 Jonathan Earl Brassow 2007-05-16 20:49:20 UTC
Created attachment 154873 [details]
Debugging information on link-07

Comment 7 Jonathan Earl Brassow 2007-05-16 20:49:46 UTC
Created attachment 154874 [details]
Debugging information on link-08

Comment 8 Jonathan Earl Brassow 2007-05-16 20:52:44 UTC
Created attachment 154875 [details]
Relevent messages output from link-07

Shows that there are two copies of the log in play.  Could be getting
conflicting 'lc->suspend' readings.

Comment 9 Jonathan Earl Brassow 2007-05-16 20:53:55 UTC
Created attachment 154876 [details]
More from messages on link-07

Comment 10 Jonathan Earl Brassow 2007-05-17 19:14:26 UTC
I can reproduce this on the rhel4.6 codebase as well.

If I simply do:
while true; do
  lvchange -an vg1/`hostname -s`
  lvremove vg1/`hostname -s`
  lvcreate -m3 -L 1G -n `hostname -s` vg1
  for i in 2 1 0; do
    lvconvert -m $i vg1/`hostname -s`
  done
done
on each machine in the cluster, I can reproduce in a matter of minutes.


Comment 11 Jonathan Earl Brassow 2007-05-17 19:24:11 UTC
This bug only affects mirrors >= 3-way.

It requires enough load (either by changing mirror configurations or via I/O) to
force a down convert (mirror -> mirror) to choose a different server from the
original.

The cause of the bug is an uninitialized variable.

Comment 12 Corey Marthaler 2007-08-14 21:31:06 UTC
Just a note that this bug was seen today while running on the 4.5.z packages.

2.6.9-55.0.3.ELsmp
cmirror-kernel-2.6.9-32.0.0.2

Comment 13 Jonathan Earl Brassow 2007-08-16 17:09:19 UTC
I am no longer able to reproduce this with the latest rhel4.6 code.  The test in
comment 10 used to trigger the problem very quickly.  I've now been unsuccessful
for days.

Please try to reproduce.

Comment 14 Jonathan Earl Brassow 2007-08-16 17:13:12 UTC
I'm using 2.6.9-55.16.ELsmp and the latest rhel4.6 code from CVS.  I'm sorry,
I'm not sure if it has been built yet.  Will check.


Comment 15 Jonathan Earl Brassow 2007-08-16 17:15:33 UTC
Looks like the lvm2 package was built on July 24th, 2007

Comment 16 Jonathan Earl Brassow 2007-08-23 16:48:43 UTC
I've had a patch in my tree that I was unaware I had not checked in...  Probably
explains why I couldn't reproduce.


Comment 17 Jonathan Earl Brassow 2007-08-23 16:52:17 UTC
assigned -> post.


Comment 18 Corey Marthaler 2007-08-24 13:16:05 UTC
This bug does appear to be fixed with the latest packages. I did however
eventually hit bz 245799, so I didn't get to let the tests run as long as I had
hoped.

Comment 20 errata-xmlrpc 2007-11-21 21:15:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0991.html