Bug 453672

Summary: system appears to deadlock (OOM) during 3-way cmirror I/O plus failure
Product: [Retired] Red Hat Cluster Suite
Reporter: Corey Marthaler <cmarthal>
Component: cmirror-kernel
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED WONTFIX
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 4
CC: iannis
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-05-14 19:59:27 UTC
Attachments:
Description Flags
log and kern dump from taft-01 none

Description Corey Marthaler 2008-07-01 19:57:36 UTC
Description of problem:
I created 3 3-way mirrors and then started I/O to all 3 mirrors from all 4 nodes
(taft-0[1234]). I noticed that taft-01 started to slow way down right away.
Then, after failing /dev/sdh it became almost unresponsive. I then killed
taft-02 (in an attempt to test bz 233034). That caused taft-01 to just about
completely lock up. All the other nodes' recovery is stuck waiting for taft-01 to
fence taft-02.

  mirror1            taft       Mwi-ao 15.00G                    mirror1_mlog 100.00         mirror1_mimage_0(0),mirror1_mimage_1(0),mirror1_mimage_2(0)
  [mirror1_mimage_0] taft       iwi-ao 15.00G                                                /dev/sdb1(0)
  [mirror1_mimage_1] taft       iwi-ao 15.00G                                                /dev/sdc1(0)
  [mirror1_mimage_2] taft       iwi-ao 15.00G                                                /dev/sdd1(0)
  [mirror1_mlog]     taft       lwi-ao  4.00M                                                /dev/sdh1(0)

  mirror2            taft       Mwi-ao 15.00G                    mirror2_mlog 100.00         mirror2_mimage_0(0),mirror2_mimage_1(0),mirror2_mimage_2(0)
  [mirror2_mimage_0] taft       iwi-ao 15.00G                                                /dev/sde1(0)
  [mirror2_mimage_1] taft       iwi-ao 15.00G                                                /dev/sdf1(0)
  [mirror2_mimage_2] taft       iwi-ao 15.00G                                                /dev/sdg1(0)
  [mirror2_mlog]     taft       lwi-ao  4.00M                                                /dev/sdd1(3840)

  mirror3            taft       Mwi-ao 15.00G                    mirror3_mlog 100.00         mirror3_mimage_0(0),mir
age_2(0)
  [mirror3_mimage_0] taft       iwi-ao 15.00G                                                /dev/sdh1(1)
  [mirror3_mimage_1] taft       iwi-ao 15.00G                                                /dev/sdb1(3840)
  [mirror3_mimage_2] taft       iwi-ao 15.00G                                                /dev/sdc1(3840)
  [mirror3_mlog]     taft       lwi-ao  4.00M                                                /dev/sdd1(3841)
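
To make it easier to see what the /dev/sdh failure actually hit, here is a small sketch that maps each sub-LV to its backing partition, transcribed from the listing above. The `LAYOUT` table and the `affected_by` helper are illustrative names only, not part of LVM or cmirror.

```python
# Sub-LV -> backing partition, transcribed by hand from the lvs listing
# in this report (start extents omitted). Names here are illustrative.
LAYOUT = {
    "mirror1_mimage_0": "/dev/sdb1",
    "mirror1_mimage_1": "/dev/sdc1",
    "mirror1_mimage_2": "/dev/sdd1",
    "mirror1_mlog": "/dev/sdh1",
    "mirror2_mimage_0": "/dev/sde1",
    "mirror2_mimage_1": "/dev/sdf1",
    "mirror2_mimage_2": "/dev/sdg1",
    "mirror2_mlog": "/dev/sdd1",
    "mirror3_mimage_0": "/dev/sdh1",
    "mirror3_mimage_1": "/dev/sdb1",
    "mirror3_mimage_2": "/dev/sdc1",
    "mirror3_mlog": "/dev/sdd1",
}


def affected_by(disk):
    """Return the sub-LVs that have a segment on any partition of `disk`."""
    return sorted(lv for lv, dev in LAYOUT.items() if dev.startswith(disk))


print(affected_by("/dev/sdh"))  # ['mirror1_mlog', 'mirror3_mimage_0']
```

So failing /dev/sdh removes the log of mirror1 and one leg of mirror3, while mirror2 has nothing on that disk.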

I'll attach a kern dump from taft-01.


Version-Release number of selected component (if applicable):
2.6.9-71.ELsmp

lvm2-2.02.37-3.el4    BUILT: Thu Jun 12 10:09:19 CDT 2008
lvm2-cluster-2.02.37-3.el4    BUILT: Thu Jun 12 10:22:07 CDT 2008
device-mapper-1.02.25-2.el4    BUILT: Mon Jun  9 09:28:41 CDT 2008
cmirror-1.0.1-1    BUILT: Tue Jan 30 17:28:02 CST 2007
cmirror-kernel-2.6.9-41.4    BUILT: Tue Jun  3 13:54:29 CDT 2008

Comment 1 Corey Marthaler 2008-07-01 20:38:38 UTC
Looks like this is some kind of memory leak.

Comment 2 Corey Marthaler 2008-07-01 20:40:05 UTC
Created attachment 310717
log and kern dump from taft-01

Comment 4 Jonathan Earl Brassow 2010-05-14 19:59:27 UTC
3-way cluster mirrors are not supported on RHEL4.

If the bug is present in the rewrite in later releases, please open new bug(s).