Bug 599654
Summary: killing process gives CPG_REASON_LEAVE instead of CPG_REASON_PROCDOWN
Product: Red Hat Enterprise Linux 5
Component: openais
Version: 5.5
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: urgent
Target Milestone: rc
Keywords: ZStream
Reporter: David Teigland <teigland>
Assignee: Jan Friesse <jfriesse>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, edamato, jkortus, sdake
Fixed In Version: openais-0.80.6-21.el5
Doc Type: Bug Fix
Doc Text: Previously, the Closed Process Group (CPG) interface returned the wrong result, which could have led to incorrect behavior in some situations. With this update, the CPG interface now behaves as expected.
Last Closed: 2011-01-13 23:57:02 UTC
Bug Blocks: 604078, 604242, 618765, 618766, 624488

Description
David Teigland
2010-06-03 17:20:08 UTC
Created attachment 424119
Proposed patch - sent to ML
Send CPG_REASON_PROCDOWN on process left
Our manual pages are clear:
CPG_REASON_PROCDOWN - the process left a group without calling
cpg_leave().
Currently, we are sending CPG_REASON_LEAVE in such a situation.
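
The distinction the manual page draws can be modelled in a few lines. What follows is only an illustrative Python sketch, not openais code: the numeric values are assumed to mirror the cpg_reason_t enum in cpg.h, and reason_for_departure is a hypothetical helper introduced here to show which reason the other group members would be expected to see.

```python
from enum import IntEnum


class CpgReason(IntEnum):
    # Assumption: values mirror cpg_reason_t in the openais/corosync cpg.h header.
    JOIN = 1
    LEAVE = 2
    NODEDOWN = 3
    NODEUP = 4
    PROCDOWN = 5


def reason_for_departure(called_cpg_leave: bool, node_failed: bool = False) -> CpgReason:
    """Hypothetical helper: reason peers should be told when a member goes away."""
    if node_failed:
        # The whole node left the cluster.
        return CpgReason.NODEDOWN
    if called_cpg_leave:
        # The process left the group cleanly via cpg_leave().
        return CpgReason.LEAVE
    # The process went away (for example, it was killed) without calling cpg_leave().
    return CpgReason.PROCDOWN
```

Under this model, a process killed with "killall -9" never calls cpg_leave(), so the expected reason is PROCDOWN; the bug is that LEAVE was delivered instead.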

*** Bug 604242 has been marked as a duplicate of this bug. ***

I tested openais-0.80.6-21.el5.x86_64.rpm and it works correctly.

on nodes 1,2,3: service cman start; service clvmd start; mount /dev/shared/x /gfs
on node 1: killall -9 groupd
on nodes 2,3: group_tool -v

    type   level  name     id        state              node  id         local_done
    fence  0      default  00010001  FAIL_START_WAIT    1     100020003  0
    [2 3]
    dlm    1      clvmd    00020001  FAIL_ALL_STOPPED   1     100020003  -1
    [1 2 3]
    dlm    1      x        00040001  FAIL_ALL_STOPPED   1     100020003  -1
    [1 2 3]
    gfs    2      x        00030001  FAIL_ALL_STOPPED   1     100020003  -1
    [1 2 3]

They are correctly processing a node 1 failure. Also, 'group_tool dump' data shows the node_leave event before the fix and the node_down event after it.

before:
    1276726614 groupd confchg total 2 left 1 joined 0
    1276726614 0:default confchg left 1 joined 0 total 2
    1276726614 0:default confchg removed node 1 reason 2
    1276726614 0:default process_node_leave 1
    1276726614 0:default cpg del node 1 total 2
    1276726614 0:default make_event_id 100020002 nodeid 1 memb_count 2 type 2
    1276726614 0:default queue leave event for nodeid 1

after:
    1278010518 groupd confchg total 2 left 1 joined 0
    1278010518 0:default add to recovery set 1
    1278010518 0:default process_node_down 1
    1278010518 0:default cpg del node 1 total 2 - down
    1278010518 0:default make_event_id 100020003 nodeid 1 memb_count 2 type 3
    1278010518 0:default queue recover event for nodeid 1
    1278010518 0:default confchg left 1 joined 0 total 2
    1278010518 0:default confchg removed node 1 reason 5

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, the Closed Process Group (CPG) interface returned the wrong result, which could have led to incorrect behavior in some situations. With this update, the CPG interface now behaves as expected.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0100.html
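
For anyone re-checking the fix from saved 'group_tool dump' output, the telling detail in the excerpts quoted in this report is the reason value on the "confchg removed node" line (2 before the fix, 5 after). Below is a minimal Python sketch of extracting it. The function name removal_reasons and the name table are hypothetical, and the mapping of 2 and 5 to CPG_REASON_LEAVE and CPG_REASON_PROCDOWN is assumed from cpg.h.

```python
import re

# Assumption: numeric values follow cpg_reason_t in cpg.h; only the two
# values seen in this report are named here.
REASON_NAMES = {2: "CPG_REASON_LEAVE", 5: "CPG_REASON_PROCDOWN"}

# Matches lines such as:
#   1278010518 0:default confchg removed node 1 reason 5
LINE_RE = re.compile(r"^(\d+)\s+(\S+)\s+confchg removed node (\d+) reason (\d+)\s*$")


def removal_reasons(dump_text):
    """Return (group, nodeid, reason_name) for each 'removed node' line."""
    results = []
    for line in dump_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a 'confchg removed node' line
        group = match.group(2)
        nodeid = int(match.group(3))
        reason = int(match.group(4))
        # Unknown values are reported numerically rather than guessed.
        results.append((group, nodeid, REASON_NAMES.get(reason, f"reason {reason}")))
    return results


if __name__ == "__main__":
    sample = (
        "1278010518 0:default process_node_down 1\n"
        "1278010518 0:default confchg removed node 1 reason 5\n"
    )
    for entry in removal_reasons(sample):
        print(entry)
```

Feeding it the "before" excerpt should report CPG_REASON_LEAVE for node 1, and the "after" excerpt CPG_REASON_PROCDOWN.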