Bug 490863
Summary: | Failback fails, kills rgmanager | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Janne Peltonen <janne.peltonen> |
Component: | rgmanager | Assignee: | Lon Hohberger <lhh> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
Severity: | high | Docs Contact: | |
Priority: | low | ||
Version: | 5.2 | CC: | cluster-maint, edamato |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2009-03-18 14:42:52 UTC | Type: | --- |
Description
Janne Peltonen
2009-03-18 11:36:22 UTC
Well, that sounds like a bug, so it really doesn't matter whether you're using Debian, Fedora, RHEL, CentOS, or something else.

What I think you hit is related to rgmanager running out of descriptors:
https://bugzilla.redhat.com/show_bug.cgi?id=461956

The patch for it was included in the RHEL5.3 release:
http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=50dc172c12f728ebb5916e2059b01404d94dd066

Basically, after an event (joining or leaving the cluster, starting or stopping rgmanager, etc.), rgmanager could run itself out of connection descriptors because it would go to sleep while holding a lock when it shouldn't have. This produced a wide range of strange behaviors and could easily have produced the problem you're seeing.

I would check the 5.3 release of rgmanager, 2.0.46(?); I'm pretty sure this is fixed there.
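
For readers who want to see the failure mode in isolation, here is a minimal Python sketch of the general pattern described above: a handler that sleeps while holding a lock, so that pending requests pile up until a bounded pool of connection slots is exhausted. This is not rgmanager's code (rgmanager is written in C), and the slot limit, timeout, and all names below are hypothetical, chosen only for illustration.

```python
import threading

MAX_SLOTS = 4          # hypothetical descriptor limit
ACQUIRE_TIMEOUT = 0.2  # seconds a new request waits for a free slot


def rejected_requests(sleep_with_lock: bool, requests: int = 8) -> int:
    """Simulate `requests` incoming connections and return how many were
    rejected because no connection slot was free."""
    state_lock = threading.Lock()
    slots = threading.BoundedSemaphore(MAX_SLOTS)
    asleep = threading.Event()  # set once the event handler is "sleeping"
    wake = threading.Event()    # set to end the handler's sleep

    def event_handler() -> None:
        if sleep_with_lock:
            # Buggy pattern: the lock stays held for the whole sleep.
            with state_lock:
                asleep.set()
                wake.wait()
        else:
            # Fixed pattern: finish the locked work first, then sleep.
            with state_lock:
                pass
            asleep.set()
            wake.wait()

    def serve_request() -> None:
        # Each request needs the lock briefly, then gives its slot back.
        with state_lock:
            pass
        slots.release()

    handler = threading.Thread(target=event_handler)
    handler.start()
    asleep.wait()  # start sending requests only once the handler sleeps

    rejected = 0
    workers = []
    for _ in range(requests):
        # A request is rejected if no slot frees up within the timeout.
        if not slots.acquire(timeout=ACQUIRE_TIMEOUT):
            rejected += 1
            continue
        worker = threading.Thread(target=serve_request)
        worker.start()
        workers.append(worker)

    wake.set()  # let the handler finish so every thread can exit
    handler.join()
    for worker in workers:
        worker.join()
    return rejected


if __name__ == "__main__":
    print("sleeping with lock held:", rejected_requests(True))
    print("lock released before sleep:", rejected_requests(False))
```

With the lock held across the sleep, every accepted request blocks on the lock while still occupying a slot, so later requests find the pool empty. Releasing the lock before sleeping lets each request finish and return its slot.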