Bug 159628
Summary: | sm join/leave bug | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | David Teigland <teigland> |
Component: | cman | Assignee: | David Teigland <teigland> |
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4 | CC: | cluster-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHBA-2005-734 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-10-07 16:46:34 UTC | Type: | --- |
Description
David Teigland
2005-06-06 09:26:54 UTC
This is a bug with SM's internal event IDs. The event ID is a uint32, but it is cast to a uint16 when it is passed between machines in an SM message. Once the ID goes past 65535, the event ID in the messages no longer matches the ID used within SM, so things stall. When SM stalls, all the nodes need to be reset.

This bug takes a while to see because you need to reach 65535 SM events in your cluster. Everyone will eventually hit it, though, given enough events in a long-running cluster. (Starting/stopping clvmd, mounting/unmounting gfs, and joining/leaving the fence domain are all SM events.)

There is a simple fix for this that changes the event IDs in messages to uint32s. I'll check this in after some basic tests.

An advisory has been issued which should address the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-734.html
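The truncation described in the report can be illustrated with a minimal sketch. This is not the actual SM code: the struct and function names (`msg_old`, `msg_new`, `pack_old`, `pack_new`, `matches_old`, `matches_new`) are hypothetical and only model a 32-bit local event ID being carried in a 16-bit versus a 32-bit message field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wire formats; names are illustrative, not taken from SM. */
struct msg_old { uint16_t event_id; };  /* 16-bit field, as before the fix */
struct msg_new { uint32_t event_id; };  /* 32-bit field, as after the fix  */

/* Pack a 32-bit local event ID into the 16-bit field.
 * The cast keeps only the low 16 bits, so 65536 becomes 0. */
static inline struct msg_old pack_old(uint32_t local_id)
{
    struct msg_old m = { .event_id = (uint16_t)local_id };
    return m;
}

/* Pack a 32-bit local event ID into the 32-bit field; no bits are lost. */
static inline struct msg_new pack_new(uint32_t local_id)
{
    struct msg_new m = { .event_id = local_id };
    return m;
}

/* A receiver compares the ID from the message with its own 32-bit ID. */
static inline bool matches_old(struct msg_old m, uint32_t local_id)
{
    return m.event_id == local_id;
}

static inline bool matches_new(struct msg_new m, uint32_t local_id)
{
    return m.event_id == local_id;
}
```

Under these assumptions, `matches_old(pack_old(65535), 65535)` holds, but `matches_old(pack_old(65536), 65536)` does not, because the message carries 0 instead of 65536. With the 32-bit field, `matches_new(pack_new(65536), 65536)` holds, which is the effect the fix aims for.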