Red Hat Bugzilla – Bug 453672
system appears to deadlock (OOM) during 3-way cmirror I/O plus failure
Last modified: 2010-05-14 15:59:27 EDT
Description of problem:
I created three 3-way mirrors and then started I/O to all three mirrors from all 4 nodes
(taft-0). I noticed that taft-01 started to slow way down right away.
Then, after I failed /dev/sdh, taft-01 became almost unresponsive. I then killed
taft-02 (in an attempt to test bz 233034). That caused taft-01 to just about
completely lock up. All the other nodes' recovery is stuck waiting for taft-01 to
mirror1              taft Mwi-ao 15.00G mirror1_mlog
  [mirror1_mimage_0] taft iwi-ao 15.00G
  [mirror1_mimage_1] taft iwi-ao 15.00G
  [mirror1_mimage_2] taft iwi-ao 15.00G
  [mirror1_mlog]     taft lwi-ao  4.00M
mirror2              taft Mwi-ao 15.00G mirror2_mlog
  [mirror2_mimage_0] taft iwi-ao 15.00G
  [mirror2_mimage_1] taft iwi-ao 15.00G
  [mirror2_mimage_2] taft iwi-ao 15.00G
  [mirror2_mlog]     taft lwi-ao  4.00M
mirror3              taft Mwi-ao 15.00G mirror3_mlog
  [mirror3_mimage_0] taft iwi-ao 15.00G
  [mirror3_mimage_1] taft iwi-ao 15.00G
  [mirror3_mimage_2] taft iwi-ao 15.00G
  [mirror3_mlog]     taft lwi-ao  4.00M
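For anyone who wants to check a layout like the one above by script, here is a minimal Python sketch that counts the mirror images and log volume per mirror from the listing text. The function name `summarize_mirrors` and the `LVS_OUTPUT` constant are made up for illustration and are not part of LVM; the sketch assumes the `_mimage_N` and `_mlog` naming seen in this listing.

```python
import re
from collections import defaultdict

# Listing copied from this report (hypothetical constant name).
LVS_OUTPUT = """\
mirror1 taft Mwi-ao 15.00G mirror1_mlog
[mirror1_mimage_0] taft iwi-ao 15.00G
[mirror1_mimage_1] taft iwi-ao 15.00G
[mirror1_mimage_2] taft iwi-ao 15.00G
[mirror1_mlog] taft lwi-ao 4.00M
mirror2 taft Mwi-ao 15.00G mirror2_mlog
[mirror2_mimage_0] taft iwi-ao 15.00G
[mirror2_mimage_1] taft iwi-ao 15.00G
[mirror2_mimage_2] taft iwi-ao 15.00G
[mirror2_mlog] taft lwi-ao 4.00M
mirror3 taft Mwi-ao 15.00G mirror3_mlog
[mirror3_mimage_0] taft iwi-ao 15.00G
[mirror3_mimage_1] taft iwi-ao 15.00G
[mirror3_mimage_2] taft iwi-ao 15.00G
[mirror3_mlog] taft lwi-ao 4.00M
"""


def summarize_mirrors(text):
    """Return {mirror: {"images": count, "log": bool}} from lvs-style lines."""
    summary = defaultdict(lambda: {"images": 0, "log": False})
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Hidden sub-volumes are shown in square brackets; strip them.
        name = fields[0].strip("[]")
        image = re.fullmatch(r"(.+)_mimage_\d+", name)
        if image:
            summary[image.group(1)]["images"] += 1
        elif name.endswith("_mlog"):
            summary[name[: -len("_mlog")]]["log"] = True
        else:
            summary[name]  # top-level mirror: make sure it has an entry
    return dict(summary)


if __name__ == "__main__":
    for mirror, info in sorted(summarize_mirrors(LVS_OUTPUT).items()):
        print(mirror, info["images"], "images, log present:", info["log"])
```

Each of the three mirrors in this report should come back with three images and a log volume, which matches the 3-way setup described.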
I'll attach a kern dump from taft-01.
Version-Release number of selected component (if applicable):
lvm2-2.02.37-3.el4 BUILT: Thu Jun 12 10:09:19 CDT 2008
lvm2-cluster-2.02.37-3.el4 BUILT: Thu Jun 12 10:22:07 CDT 2008
device-mapper-1.02.25-2.el4 BUILT: Mon Jun 9 09:28:41 CDT 2008
cmirror-1.0.1-1 BUILT: Tue Jan 30 17:28:02 CST 2007
cmirror-kernel-2.6.9-41.4 BUILT: Tue Jun 3 13:54:29 CDT 2008
Looks like this is some kind of memory leak.
Created attachment 310717
log and kern dump from taft-01
There are no 3-way cluster mirrors on RHEL4.
If this bug is present in the rewrite in later releases, please open new bug(s).