Bug 429920

Summary: RHEL5 cmirror tracker: clogd gets stuck in saCkptInitialize
Product: Red Hat Enterprise Linux 5 Reporter: Corey Marthaler <cmarthal>
Component: openais Assignee: Steven Dake <sdake>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: high    
Version: 5.2 CC: agk, ccaulfie, cluster-maint, dwysocha, heinzm, mbroz, sdake
Target Milestone: rc Keywords: TestBlocker
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2008-0411 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-05-21 14:31:19 UTC Type: ---
Bug Depends On:    
Bug Blocks: 430797    

Description Corey Marthaler 2008-01-23 19:52:55 UTC
Description of problem:
Neither running the init script nor starting clogd by hand ever returns. The
process is stuck waiting:
wait4(3960, 0x7fff91b1a38c, WNOHANG, NULL) = 0


Version-Release number of selected component (if applicable):
kmod-cmirror-0.1.3-5.el5
cmirror-1.1.6-2.el5
2.6.18-62.el5

How reproducible:
Every time

Comment 1 Corey Marthaler 2008-01-23 23:46:59 UTC
This is on the latest build as well. 

[root@hayes-02 ~]# service cmirror start
Loading clustered mirror log module:                       [  OK  ]
Starting clustered mirror log server:                      [  OK  ]
[root@hayes-02 ~]# service cmirror stop
Stopping clustered mirror log server:                      [  OK  ]
Unloading clustered mirror log module:                     [  OK  ]
[root@hayes-02 ~]# service cmirror start
Loading clustered mirror log module:                       [  OK  ]
Starting clustered mirror log server:      
[HANG]

2.6.18-71.el5
cmirror-1.1.7-1.el5
kmod-cmirror-0.1.4-1.el5

Comment 2 Jonathan Earl Brassow 2008-01-24 17:33:40 UTC
I've seen this too.  I think this is because I'm not properly exiting the AIS
ckpt service.  A fix for that should be in the latest build now, though.

Either way, if the log server dies or gets killed by -9, we should be able to
restart it... and AIS should clean up.

What you've done above is the same as doing:
1) clogd
2) killall clogd
3) clogd

If this now works, then try:
1) clogd
2) killall -9 clogd
3) clogd
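
The two sequences above can be scripted so that a hung restart is reported
instead of blocking the shell. This is only a rough sketch: the helper name
returns_within and the 30-second limit are made up for illustration, and it
assumes bash and that clogd is on the PATH.

```bash
#!/bin/bash
# returns_within SECONDS CMD [ARGS...]
# Hypothetical helper: run CMD in the background and poll once per second.
# Returns CMD's exit status if it finishes within SECONDS, or 124 if it
# had to be killed because it was still running after the limit.
returns_within() {
    local limit=$1
    shift
    "$@" &
    local pid=$!
    local waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            kill -9 "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null || true
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
}

# Intended use on a cluster node (not run here; 30 is an arbitrary limit):
#   clogd
#   killall clogd          # or: killall -9 clogd
#   returns_within 30 clogd || echo "restart of clogd hung or failed"
```

Polling with kill -0 avoids depending on a separate timeout utility, which may
not be present on older installs.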


Comment 5 Corey Marthaler 2008-02-04 14:56:18 UTC
This issue is marked ON_QA but isn't fixed in the latest build. Should the fix
be in the following?

cmirror-1.1.8-1.el5
kmod-cmirror-0.1.5-2.el5
lvm2-2.02.32-1.el5
lvm2-cluster-2.02.32-1.el5

Comment 6 Jonathan Earl Brassow 2008-02-04 17:14:59 UTC
My guess is that this is an OpenAIS issue and won't be fixed by any of the
above packages.

Steve,
I have a cluster you can use to test this.


Comment 7 Steven Dake 2008-02-04 17:24:11 UTC
Sigh. I believed whoever changed it to MODIFIED had fixed the problem.
Apparently not.

I'll talk to you, Jon, and reassign this bug from you to me.

Comment 8 Corey Marthaler 2008-02-26 20:10:18 UTC
Fix verified in openais-0.80.3-12.el5.

Comment 10 errata-xmlrpc 2008-05-21 14:31:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0411.html