Bug 670935

Summary: cmirror sync issues - 'Told to clear recovery on wrong region'
Product: [Retired] Red Hat Cluster Suite
Reporter: Corey Marthaler <cmarthal>
Component: cmirror
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED WONTFIX
QA Contact: Cluster QE <mspqa-list>
Severity: high
Docs Contact:
Priority: high
Version: 4
CC: agk, ccaulfie, dwysocha, edamato, heinzm, jbrassow, joe.thornber, prajnoha, pvrabec
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2019-03-19 18:34:48 UTC
Attachments:
log from taft-04 (flags: none)

Description Corey Marthaler 2011-01-19 17:18:34 UTC
Description of problem:
I saw this issue again while doing RHEL 4.9 regression testing and wanted to document it.

The second cmirror create deadlocked and, as in bug 452121, there were messages about clearing recovery on the wrong region.

I'll post the kernel dump from the hung node as well.

TAFT-04:
Jan 19 02:03:08 taft-04 qarshd[5034]: Running cmdline: lvcreate -m 2 -n nonsyncd_secondary_3legs_1 -L 600M helter_skelter /dev/sdd1:0-1000 /dev/sdf1:0-1000 /dev/sdh1:0-1000 /dev/sde1:0-150
Jan 19 02:03:08 taft-04 udevd[1294]: udev done!
Jan 19 02:03:27 taft-04 [13965]: Monitoring mirror device helter_skelter-nonsyncd_secondary_3legs_1 for events
Jan 19 02:03:27 taft-04 qarshd[5143]: Talking to peer 10.15.89.99:56743
Jan 19 02:03:27 taft-04 qarshd[5143]: Running cmdline: lvcreate -m 2 -n nonsyncd_secondary_3legs_2 -L 600M helter_skelter /dev/sdd1:0-1000 /dev/sdf1:0-1000 /dev/sdh1:0-1000 /dev/sde1:0-150
Jan 19 02:03:27 taft-04 udevd[1294]: udev done!
Jan 19 02:03:30 taft-04 udevd[1294]: udev done!
Jan 19 02:03:31 taft-04 lvm[13965]: Monitoring mirror device helter_skelter-nonsyncd_secondary_3legs_2 for events


TAFT-01:
Jan 19 02:03:28 taft-01 [15109]: Monitoring mirror device helter_skelter-nonsyncd_secondary_3legs_1 for events
Jan 19 02:03:36 taft-01 udevd[1288]: udev done!
Jan 19 02:03:47 taft-01 last message repeated 2 times
Jan 19 02:03:47 taft-01 lvm[15109]: Monitoring mirror device helter_skelter-nonsyncd_secondary_3legs_2 for events
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Told to clear recovery on wrong region 0/18446744073709551615
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Unable to notify server of sync state change
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Told to clear recovery on wrong region 0/18446744073709551615
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Unable to notify server of sync state change
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Told to clear recovery on wrong region 0/18446744073709551615
Jan 19 02:03:47 taft-01 udevd[1288]: udev done!
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Unable to notify server of sync state change
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Told to clear recovery on wrong region 0/18446744073709551615
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Unable to notify server of sync state change
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Told to clear recovery on wrong region 0/18446744073709551615
Jan 19 02:03:47 taft-01 kernel: dm-cmirror: Unable to notify server of sync state change
Jan 19 02:03:57 taft-01 udevd[1288]: udev done!
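
A note on the region numbers in the taft-01 messages above: the second value, 18446744073709551615, equals 2^64 - 1, which is how -1 appears when printed as an unsigned 64-bit integer. The log does not say which of the two numbers is the region the node was told to clear and which is the region it was tracking, so reading the large value as an unset or sentinel region is an assumption, not something the log confirms. A minimal sketch for pulling the two numbers out of such a line and checking for that value (the helper name parse_wrong_region is hypothetical, not part of cmirror):

```python
import re

# 2**64 - 1: what -1 looks like when printed as an unsigned 64-bit value.
UINT64_MAX = 2**64 - 1

# Matches the message format seen in the taft-01 log in this report.
WRONG_REGION = re.compile(
    r"dm-cmirror: Told to clear recovery on wrong region (\d+)/(\d+)"
)


def parse_wrong_region(line):
    """Return the two region numbers from a matching log line, or None."""
    match = WRONG_REGION.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "Jan 19 02:03:47 taft-01 kernel: dm-cmirror: "
    "Told to clear recovery on wrong region 0/18446744073709551615"
)

regions = parse_wrong_region(sample)
print(regions)
# True when the second number is the all-ones 64-bit value.
print(regions[1] == UINT64_MAX)
```

Running the same helper over a full syslog file would give a quick count of how many of these messages involve the all-ones value versus an ordinary region number.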


Version-Release number of selected component (if applicable):
2.6.9-94.ELsmp

lvm2-2.02.42-9.el4    BUILT: Thu Oct 21 15:49:57 CDT 2010
lvm2-cluster-2.02.42-10.el4    BUILT: Tue Jan 18 06:17:17 CST 2011
device-mapper-1.02.28-3.el4    BUILT: Thu Mar  4 14:48:16 CST 2010
cmirror-1.0.2-1.el4    BUILT: Thu Feb 26 15:29:27 CST 2009
cmirror-kernel-2.6.9-43.14.el4    BUILT: Wed Dec 22 16:24:19 CST 2010


How reproducible:
A few times during regression testing.

Comment 1 Corey Marthaler 2011-01-19 17:35:20 UTC
Created attachment 474326
log from taft-04