Bug 214517 - hung cmirror operations due to looping mirror region requests
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: cmirror
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Jonathan Earl Brassow
QA Contact: Cluster QE
Reported: 2006-11-07 17:36 EST by Corey Marthaler
Modified: 2010-01-11 21:01 EST (History)
Doc Type: Bug Fix
Last Closed: 2008-08-05 17:40:19 EDT
Description Corey Marthaler 2006-11-07 17:36:07 EST
Description of problem:
I was testing cmirror up and down converts on one mirror while simultaneously
writing to a GFS filesystem on top of another mirror (though the latter probably
has nothing to do with this). LVM then hung, looping on region requests while
attempting to clean up the mirror. The looping appears to be because the mirror
server wasn't allowing the request.




[root@link-02 ~]# while true; do
    lvcreate -m 1 -n mirror3 -L 50M corey;
    lvconvert -m 0 /dev/corey/mirror3;
    lvconvert -m 1 /dev/corey/mirror3;
    lvconvert -m 0 /dev/corey/mirror3;
    lvconvert -m 1 /dev/corey/mirror3;
    lvconvert -m 0 /dev/corey/mirror3;
    lvchange -an /dev/corey/mirror3;
    lvremove -f /dev/corey/mirror3;
    sleep 1;
done
  Logical volume "mirror3" already exists in volume group "corey"
  Logical volume mirror3 is already not mirrored.
  Logical volume mirror3 converted.
  Logical volume mirror3 converted.
  Logical volume mirror3 converted.
  Logical volume mirror3 converted.
  Logical volume "mirror3" successfully removed
  Rounding up size to full physical extent 52.00 MB
  Logical volume "mirror3" created
  Logical volume mirror3 converted.
  Logical volume mirror3 converted.
  Error locking on node link-07: Command timed out
  Problem reactivating mirror3
  Internal error: Duplicate LV name mirror3_mlog detected in corey.
  Failed to create mirror log.
  Logical volume mirror3 is already not mirrored.
  Error locking on node link-07: Command timed out
  Error locking on node link-02: Volume is busy on another node
  Can't get exclusive access to volume "mirror3"
  Logical volume "mirror3" already exists in volume group "corey"
  Logical volume mirror3 is already not mirrored.
  Internal error: Duplicate LV name mirror3_mlog detected in corey.
  Failed to create mirror log.
  Logical volume mirror3 is already not mirrored.
  Internal error: Duplicate LV name mirror3_mlog detected in corey.
  Failed to create mirror log.
  Logical volume mirror3 is already not mirrored.


Nov  7 16:47:29 link-07 kernel: dm-cmirror: Continuing request:: LRT_GET_SYNC_COUNT
Nov  7 16:47:30 link-07 kernel: dm-cmirror: Clean-up required due to new server
Nov  7 16:47:30 link-07 kernel: dm-cmirror:  - Wiping clear region list
Nov  7 16:47:30 link-07 kernel: dm-cmirror:  - 0 clear region requests wiped
Nov  7 16:47:30 link-07 kernel: dm-cmirror:  - Resending all mark region requests
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113192
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113191
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113190
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113189
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113188
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113187
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113186
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113185
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113122
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113121
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 112870
Nov  7 16:47:30 link-07 kernel: dm-cmirror: Clean-up complete
Nov  7 16:47:30 link-07 kernel: dm-cmirror: Continuing request:: LRT_GET_SYNC_COUNT
Nov  7 16:47:30 link-07 kernel: dm-cmirror: Clean-up required due to new server
Nov  7 16:47:30 link-07 kernel: dm-cmirror:  - Wiping clear region list
Nov  7 16:47:30 link-07 kernel: dm-cmirror:  - 0 clear region requests wiped
Nov  7 16:47:30 link-07 kernel: dm-cmirror:  - Resending all mark region requests
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113192
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113191
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113190
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113189
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113188
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113187
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113186
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113185
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113122
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 113121
Nov  7 16:47:30 link-07 kernel: dm-cmirror:    - 112870
Nov  7 16:47:30 link-07 kernel: dm-cmirror: Clean-up complete
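
The repeating clean-up cycle in the log above can be counted per host with a small helper. This is only a sketch: the function name count_cleanup_cycles is hypothetical, and it assumes the syslog layout shown above, where the fourth whitespace-separated field is the host name.

```shell
#!/bin/sh
# Count dm-cmirror "Clean-up required" events per host in syslog text
# read from stdin. Hypothetical helper; assumes field 4 is the host name,
# as in the log excerpt above.
count_cleanup_cycles() {
    grep 'dm-cmirror: Clean-up required due to new server' |
        awk '{ n[$4]++ } END { for (h in n) print h, n[h] }' |
        sort
}
```

It could be run as `count_cleanup_cycles < /var/log/messages` (the log path is an assumption). A count that keeps climbing for one host while no new server is being elected would match the looping symptom reported here.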



mirror server:
Nov  7 11:52:42 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113122)
Nov  7 11:52:42 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113121)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,112870)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113192)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113191)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113190)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113189)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113188)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113187)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113186)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113185)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113122)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113121)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,112870)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113192)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113191)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113190)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113189)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113188)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113187)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113186)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113185)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113122)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113121)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,112870)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113192)
Nov  7 11:52:43 link-04 kernel: dm-cmirror: Attempt to mark a already marked region (4,113191)


From another node in the cluster:
[root@link-02 ~]# lvs
  LV               VG    Attr   LSize  Origin Snap%  Move Log          Copy%
  mirror           corey mwi-a- 70.00G                    mirror_mlog  100.00
  mirror2          corey mwi-a- 50.00G                    mirror2_mlog 100.00
  mirror3          corey -wi--- 52.00M
  mirror3_mimage_1 corey -wi-a- 52.00M
  mirror3_mlog     corey -wi---  4.00M
[root@link-02 ~]# dmsetup ls
corey-mirror2_mlog      (253, 6)
corey-mirror3_mimage_1  (253, 12)
corey-mirror    (253, 5)
corey-mirror2   (253, 9)
corey-mirror3_mimage_0  (253, 11)
corey-mirror2_mimage_1  (253, 8)
corey-mirror2_mimage_0  (253, 7)
corey-mirror_mimage_1   (253, 4)
corey-mirror_mimage_0   (253, 3)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
corey-mirror_mlog       (253, 2)


Version-Release number of selected component (if applicable):
[root@link-02 ~]# rpm -qa | grep cmirror
cmirror-kernel-debuginfo-2.6.9-13.0
cmirror-kernel-smp-2.6.9-15.5
cmirror-kernel-largesmp-2.6.9-13.0
cmirror-kernel-2.6.9-13.0
cmirror-1.0.1-0
cmirror-debuginfo-1.0.1-0
[root@link-02 ~]# rpm -qa | grep lvm2
lvm2-cluster-2.02.13-1
lvm2-cluster-debuginfo-2.02.06-7.0.RHEL4
lvm2-2.02.13-1
Comment 1 Jonathan Earl Brassow 2006-11-27 17:39:43 EST
A variable that triggered the client to keep sending the server its marked
region list was not being reset.
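
The failure mode described in this comment can be illustrated with a toy model. This is not the dm-cmirror code: the names simulate_client and resend_pending are hypothetical, and the loop only mimics the idea of a "resend" flag that is either cleared after use or left set.

```shell
#!/bin/sh
# Toy model only; NOT the dm-cmirror implementation.
# simulate_client MODE REQUESTS
#   MODE     "reset" clears the flag after resending; "noreset" leaves it set.
#   REQUESTS number of requests the simulated client handles.
# Prints how many times the marked-region list was resent.
simulate_client() {
    mode=$1
    requests=$2
    resend_pending=1    # hypothetical flag, set when a new server is detected
    resends=0
    i=0
    while [ "$i" -lt "$requests" ]; do
        if [ "$resend_pending" -eq 1 ]; then
            resends=$((resends + 1))    # resend all mark region requests
            if [ "$mode" = "reset" ]; then
                resend_pending=0        # the reset this comment says was missing
            fi
        fi
        i=$((i + 1))
    done
    echo "$resends"
}
```

With the flag left set, every request triggers another resend, which is consistent with the repeated "Resending all mark region requests" blocks in the description. With the reset in place, the list is resent once.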
Comment 2 Corey Marthaler 2007-05-24 11:23:28 EDT
I ran cmirror locking operations all night and wasn't able to trip this
deadlock. Marking verified in cmirror-kernel-2.6.9-32.0.
Comment 3 Chris Feist 2008-08-05 17:40:19 EDT
Fixed in current release (4.7).
