Bug 447799
| Summary: | clvmd init script hangs during lock_gulm startup | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2-cluster | Assignee: | Christine Caulfield <ccaulfie> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4 | CC: | agk, ccaulfie, dwysocha, edamato, jbrassow, mbroz, prockai |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-04-24 14:49:55 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Corey Marthaler
2008-05-21 20:50:42 UTC
Created attachment 306317 [details]
log from the startup hang
Can you get a log from clvmd started up as "clvmd -d", please? I did try to reproduce this on my cluster, but it seems to work fine for me. The dump we have seems to show clvmd waiting for gulm, but more than that I can't tell.

I've reproduced this with the requested info. The hang is during the 'vgscan'. I'll attach the clvmd -d log as well as an strace of the vgscan.

Created attachment 308383 [details]
clvmd -d log
Created attachment 308384 [details]
hung vgscan strace
*** Bug 444600 has been marked as a duplicate of this bug. ***

That's a really bizarre place for the log to end. It ends in the middle of a loop around the nodes in the cluster. For clvmd to hang there, I think it would have to be stuck in a dm_hash_* function, which seems VERY odd. How easy is this for you to reproduce? I've tried quite hard on my 3-node roth cluster with no luck.

The key to reproducing this is to not have clvmd running on the other nodes in the cluster, just lock_gulmd. So when the clvmd init hangs, it's the only node attempting to join that service.

Yes, I'd guessed that much from the logs; it still works for me, though. I'll try repeating it ad nauseam.

Ah, I think I see what's happening. clvmd sees that the other nodes are down but still waits for the command timeout to trigger. If you waited for a couple of minutes, I suspect that vgscan would return. This patch fixes that so it's consistent with cman in returning "clvmd node running" errors immediately.

Checking in daemons/clvmd/clvmd-gulm.c;
/cvs/lvm2/LVM2/daemons/clvmd/clvmd-gulm.c,v <-- clvmd-gulm.c
new revision: 1.23; previous revision: 1.22
done
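For anyone reading this later, here is a rough sketch of the behaviour change described above. It is not the actual patch to clvmd-gulm.c: every name in it (`node_state`, `struct node`, `queue_command`, the `REPLY_*` codes) is hypothetical and chosen only for illustration. The idea is that a node known not to be running clvmd gets an error reply recorded right away, so the caller only waits on nodes that can actually answer instead of sitting until the command timeout fires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node states; these are NOT the real clvmd-gulm.c definitions. */
enum node_state { NODE_DOWN, NODE_UP_NO_CLVMD, NODE_CLVMD };

/* Hypothetical reply codes used only by this sketch. */
enum { REPLY_PENDING = 0, REPLY_NOT_RUNNING = -1 };

struct node {
    const char *name;
    enum node_state state;
    int reply_status;
};

/*
 * Walk the node list before sending a command. Nodes running clvmd are
 * marked as pending and counted; every other node gets an immediate
 * error reply. The return value is the number of replies the caller
 * still has to wait for. If it is 0, the caller can answer the client
 * at once rather than waiting for a timeout.
 */
size_t queue_command(struct node *nodes, size_t count)
{
    size_t pending = 0;

    assert(nodes != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (nodes[i].state == NODE_CLVMD) {
            nodes[i].reply_status = REPLY_PENDING;
            pending++;
        } else {
            /* Node is down or only running lock_gulmd: fail it now. */
            nodes[i].reply_status = REPLY_NOT_RUNNING;
        }
    }
    return pending;
}
```

In the reproduction scenario from this bug (clvmd on one node only, lock_gulmd on the rest), a check like this would leave no remote replies pending, so a vgscan would get its errors back immediately.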