Bug 303981
Summary: | clurgmgrd segfaults upon startup after cluster is stopped
---|---
Product: | Red Hat Enterprise Linux 5
Component: | rgmanager
Version: | 5.1
Hardware: | x86_64
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | low
Reporter: | Chris Harms <chris>
Assignee: | Lon Hohberger <lhh>
QA Contact: | Cluster QE <mspqa-list>
CC: | cluster-maint
Fixed In Version: | RHBA-2008-0353
Doc Type: | Bug Fix
Last Closed: | 2008-05-21 14:30:36 UTC

Description
Chris Harms
2007-09-24 20:39:51 UTC
Which node (node ID 1 or 2)?

Actually, the easiest thing to do is create /etc/sysconfig/cluster with the following contents:

    DAEMON_COREFILE_LIMIT="unlimited"
    RGMGR_OPTS="-w"

This will cause clurgmgrd to produce a core file in the root directory. Could you attach the core and your cluster configuration?

Fixing product.

Created attachment 204601: core dump of rgmanager (core dump of clurgmgrd on cluster startup)

(In reply to comment #1)
> Which node (node ID 1 or 2)?

Node 2

Wow... thanks for the core. :)

Ok, so... We received a VF_VIEW_FORMED message for a transaction we did not have recorded. The transaction was allegedly from node 1, transaction ID 1, and came immediately after node 2 had received the PORTOPENED status from node 1.

What normally happens is that nodes request the current state of distributed data when they access it. This means that it's safe to just throw away messages for pieces of data we don't have.

This bug is restricted to RHEL5 because RHEL4 doesn't use CMAN's excellent multicast capabilities. This means that in the same situation on RHEL4, the socket with the unwanted data would not have been opened at this point.

This is rather easy to fix.

Created attachment 210861: Patch

All the other parts of vf_process_msg() seem to correctly ignore messages for which there is no key node associated.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0353.html
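
The approach discussed in the comments above (throw away a VF_VIEW_FORMED message when no key node is recorded for it) can be illustrated with a minimal sketch. This is not the attached patch and not rgmanager's code: every identifier prefixed with sketch_ or SKETCH_ is hypothetical, the structure layouts are invented for illustration, and the assumption that the crash came from using a missing key node is inferred from the thread rather than confirmed by it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the VF_VIEW_FORMED message type. */
#define SKETCH_VIEW_FORMED 1

/* Hypothetical record for one piece of distributed data ("key node"). */
struct sketch_key_node {
    const char *keyid;   /* name of the distributed data item */
    int pending_trans;   /* transaction ID awaiting commit, 0 if none */
    int committed;       /* count of transactions committed so far */
};

/* Hypothetical incoming message. */
struct sketch_msg {
    int type;
    const char *keyid;
    int trans;
};

/* Linear lookup; returns NULL when this node has no record of the key. */
struct sketch_key_node *
sketch_find_key(struct sketch_key_node *table, size_t n, const char *keyid)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].keyid, keyid) == 0)
            return &table[i];
    }
    return NULL;
}

/* Returns 1 if the message was applied, 0 if it was discarded. */
int
sketch_process_msg(struct sketch_key_node *table, size_t n,
                   const struct sketch_msg *msg)
{
    struct sketch_key_node *kn;

    if (msg->type != SKETCH_VIEW_FORMED)
        return 0;                 /* other message types are out of scope */

    kn = sketch_find_key(table, n, msg->keyid);
    if (kn == NULL)
        return 0;                 /* unknown key: drop, never dereference */

    if (kn->pending_trans != msg->trans)
        return 0;                 /* no matching transaction recorded: drop */

    /* Commit the pending transaction for this key. */
    kn->pending_trans = 0;
    kn->committed++;
    return 1;
}
```

Discarding is safe under the reasoning given in the thread: a node requests the current state of distributed data when it first accesses it, so a dropped message for data the node does not hold loses nothing. A caller would pass each received message to sketch_process_msg() and treat a return value of 0 as "ignored".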