Bug 429920 - RHEL5 cmirror tracker: clogd gets stuck in saCkptInitialize
Summary: RHEL5 cmirror tracker: clogd gets stuck in saCkptInitialize
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openais
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Steven Dake
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 430797
Reported: 2008-01-23 19:52 UTC by Corey Marthaler
Modified: 2016-04-26 15:39 UTC

Fixed In Version: RHBA-2008-0411
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 14:31:19 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2008:0411
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: openais bug fix update
Last Updated: 2008-05-19 22:35:21 UTC

Description Corey Marthaler 2008-01-23 19:52:55 UTC
Description of problem:
Running the init script or starting clogd by hand never returns. The process
is stuck waiting:
wait4(3960, 0x7fff91b1a38c, WNOHANG, NULL) = 0
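
For readers unfamiliar with the trace line above: a wait4 call with WNOHANG that returns 0 means the child being waited on has not exited yet. The following is a minimal Python sketch of that polling pattern, not code from the init script or clogd. The function name `poll_child` and its timeout parameters are assumptions made for illustration.

```python
import os
import time


def poll_child(pid, timeout_s, interval_s=0.05):
    """Poll a child with WNOHANG; return its wait status, or None on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        # Under WNOHANG, wait4 reports pid 0 while the child is still running.
        # That is the `= 0` result seen in the trace.
        done_pid, status, _rusage = os.wait4(pid, os.WNOHANG)
        if done_pid == pid:
            return status
        time.sleep(interval_s)
    # The child never exited within the allowed time.
    return None
```

A caller that polls like this without a deadline loops forever when the child never exits, which matches the reported symptom.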


Version-Release number of selected component (if applicable):
kmod-cmirror-0.1.3-5.el5
cmirror-1.1.6-2.el5
2.6.18-62.el5

How reproducible:
Every time

Comment 1 Corey Marthaler 2008-01-23 23:46:59 UTC
This is on the latest build as well. 

[root@hayes-02 ~]# service cmirror start
Loading clustered mirror log module:                       [  OK  ]
Starting clustered mirror log server:                      [  OK  ]
[root@hayes-02 ~]# service cmirror stop
Stopping clustered mirror log server:                      [  OK  ]
Unloading clustered mirror log module:                     [  OK  ]
[root@hayes-02 ~]# service cmirror start
Loading clustered mirror log module:                       [  OK  ]
Starting clustered mirror log server:      
[HANG]

2.6.18-71.el5
cmirror-1.1.7-1.el5
kmod-cmirror-0.1.4-1.el5

Comment 2 Jonathan Earl Brassow 2008-01-24 17:33:40 UTC
I've seen this too.  I think this is because I'm not properly exiting the AIS
ckpt service.  A fix for that should be in the latest build now, though.

Either way, if the log server dies or gets killed by -9, we should be able to
restart... and AIS should clean up.
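
The two cases above differ in what the daemon can do for itself. A plain kill sends SIGTERM, which a process can catch to release its service handle. SIGKILL (-9) cannot be caught, so cleanup in that case has to come from the service side. Below is a minimal Python sketch of the catchable case only. It is not clogd's code, and `CheckpointSession`, `finalize`, and `install_term_cleanup` are hypothetical stand-ins rather than the OpenAIS API.

```python
import signal


class CheckpointSession:
    """Hypothetical stand-in for a checkpoint-service handle (not the OpenAIS API)."""

    def __init__(self):
        self.is_open = True

    def finalize(self):
        # A real daemon would release its service connection here.
        self.is_open = False


def install_term_cleanup(session):
    """Finalize the session on SIGTERM, the signal a plain `killall` sends.

    Returns the previously installed handler so callers can restore it.
    """

    def on_term(signum, frame):
        # A real daemon would exit after this; the sketch only finalizes
        # so the handler can be exercised in-process.
        session.finalize()

    return signal.signal(signal.SIGTERM, on_term)
```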

What you've done above is the same as doing:
1) clogd
2) killall clogd
3) clogd

If this now works, then try:
1) clogd
2) killall -9 clogd
3) clogd
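
The two sequences above could be scripted so that a hang is reported instead of blocking the terminal. This is a rough Python sketch under stated assumptions: `first_hanging_step` is a hypothetical helper, the timeout value is arbitrary, and the command lists simply mirror the steps as written (they need a cluster node with clogd installed to be meaningful).

```python
import subprocess


def first_hanging_step(steps, timeout_s):
    """Run each argv in order; return the index of the first that times out, else None."""
    for index, argv in enumerate(steps):
        try:
            # check=False: killall exits nonzero when nothing matched, which is not a hang.
            subprocess.run(argv, timeout=timeout_s, check=False)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the timed-out child before raising.
            return index
    return None


# Assumed command lists mirroring the two sequences above.
SIGTERM_CASE = [["clogd"], ["killall", "clogd"], ["clogd"]]
SIGKILL_CASE = [["clogd"], ["killall", "-9", "clogd"], ["clogd"]]
```

With the behaviour described in this report, `first_hanging_step(SIGTERM_CASE, 30.0)` would be expected to return 2, the second start of clogd.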


Comment 5 Corey Marthaler 2008-02-04 14:56:18 UTC
This issue is marked ON_QA but isn't fixed in the latest build. Should the fix
be in the following?

cmirror-1.1.8-1.el5
kmod-cmirror-0.1.5-2.el5
lvm2-2.02.32-1.el5
lvm2-cluster-2.02.32-1.el5

Comment 6 Jonathan Earl Brassow 2008-02-04 17:14:59 UTC
My guess is that this is an OpenAIS issue and won't be fixed by any of the above
packages.

Steve,
I have a cluster you can use to test this.


Comment 7 Steven Dake 2008-02-04 17:24:11 UTC
Sigh. I believed whoever changed it to MODIFIED had fixed the problem.
Apparently not.

I'll talk to you, Jon, and reassign this bug from you to me.

Comment 8 Corey Marthaler 2008-02-26 20:10:18 UTC
Fix verified in openais-0.80.3-12.el5.

Comment 10 errata-xmlrpc 2008-05-21 14:31:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0411.html


