Bug 599654 - killing process gives CPG_REASON_LEAVE instead of CPG_REASON_PROCDOWN
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: openais
Version: 5.5
Hardware: All
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Jan Friesse
QA Contact: Cluster QE
Keywords: ZStream
Duplicates: 604242
Depends On:
Blocks: 604078 604242 618765 618766 624488
 
Reported: 2010-06-03 13:20 EDT by David Teigland
Modified: 2016-04-26 09:57 EDT (History)

See Also:
Fixed In Version: openais-0.80.6-21.el5
Doc Type: Bug Fix
Doc Text:
Previously, the Closed Process Group (CPG) interface returned the wrong result, which could have led to incorrect behavior in some situations. With this update, the CPG interface now behaves as expected.
Story Points: ---
Clone Of:
Clones: 604078
Environment:
Last Closed: 2011-01-13 18:57:02 EST


Attachments
Proposed patch - sent to ML (829 bytes, patch)
2010-06-15 07:24 EDT, Jan Friesse


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2011:0100
Priority: normal
Status: SHIPPED_LIVE
Summary: openais bug fix update
Last Updated: 2011-01-12 12:21:13 EST

Description David Teigland 2010-06-03 13:20:08 EDT
Description of problem:

node1: killall -9 groupd
The other nodes get confchgs with CPG_REASON_LEAVE for every cpg that node1 was in.

CPG_REASON_LEAVE is supposed to indicate that the process called cpg_leave().
CPG_REASON_PROCDOWN is supposed to indicate that the process exited without calling cpg_leave().
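
To make the distinction concrete, here is a minimal sketch of how a receiver might separate a clean leave from a failure when it walks the left list of a confchg. It does not use the real cpg.h: the `ex_` names, the struct layout, and the numeric reason values are assumptions for illustration (values 2 and 5 are consistent with the "reason 2" and "reason 5" lines in the group_tool dump later in this report; the others are assumed).

```c
#include <stddef.h>

/* Assumed reason codes, prefixed with EX_ so they cannot clash with the
 * real header. Only 2 (leave) and 5 (procdown) are backed by the dump
 * output in this report; the rest are assumptions. */
enum ex_cpg_reason {
    EX_CPG_REASON_JOIN = 1,
    EX_CPG_REASON_LEAVE = 2,
    EX_CPG_REASON_NODEDOWN = 3,
    EX_CPG_REASON_NODEUP = 4,
    EX_CPG_REASON_PROCDOWN = 5
};

/* Hypothetical stand-in for one entry of a confchg left list. */
struct ex_cpg_address {
    unsigned int nodeid;
    unsigned int pid;
    unsigned int reason;
};

/* Returns 1 if the departure should be handled as a failure (the process
 * or node went away without calling cpg_leave()), 0 otherwise. */
int ex_needs_recovery(unsigned int reason)
{
    return reason == EX_CPG_REASON_PROCDOWN ||
           reason == EX_CPG_REASON_NODEDOWN;
}

/* Counts how many entries in a left list need recovery. */
size_t ex_count_failed(const struct ex_cpg_address *left_list, size_t n)
{
    size_t failed = 0;
    for (size_t i = 0; i < n; i++) {
        if (ex_needs_recovery(left_list[i].reason))
            failed++;
    }
    return failed;
}
```

With the bug present, a killed groupd arrives with the leave reason, so a check like `ex_needs_recovery()` would report no failure and recovery would never be queued.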

Comment 2 Jan Friesse 2010-06-15 07:24:58 EDT
Created attachment 424119 [details]
Proposed patch - sent to ML

Send CPG_REASON_PROCDOWN on process left

Our manual pages are clear:

CPG_REASON_PROCDOWN - the process left a group without calling
cpg_leave().

Currently, we are sending CPG_REASON_LEAVE in such a situation.
Comment 3 Steven Dake 2010-06-27 17:37:24 EDT
*** Bug 604242 has been marked as a duplicate of this bug. ***
Comment 4 David Teigland 2010-07-01 15:24:35 EDT
I tested openais-0.80.6-21.el5.x86_64.rpm and it works correctly.

on nodes 1,2,3:
service cman start; service clvmd start; mount /dev/shared/x /gfs

on node 1:
killall -9 groupd

on nodes 2,3:
group_tool -v
type             level name     id       state node id local_done
fence            0     default  00010001 FAIL_START_WAIT 1 100020003 0
[2 3]
dlm              1     clvmd    00020001 FAIL_ALL_STOPPED 1 100020003 -1
[1 2 3]
dlm              1     x        00040001 FAIL_ALL_STOPPED 1 100020003 -1
[1 2 3]
gfs              2     x        00030001 FAIL_ALL_STOPPED 1 100020003 -1
[1 2 3]

They are correctly processing a node 1 failure.

Also, 'group_tool dump' data shows the node_leave event before the fix and the node_down event after it.

before:
1276726614 groupd confchg total 2 left 1 joined 0
1276726614 0:default confchg left 1 joined 0 total 2
1276726614 0:default confchg removed node 1 reason 2
1276726614 0:default process_node_leave 1
1276726614 0:default cpg del node 1 total 2
1276726614 0:default make_event_id 100020002 nodeid 1 memb_count 2 type 2
1276726614 0:default queue leave event for nodeid 1

after:
1278010518 groupd confchg total 2 left 1 joined 0
1278010518 0:default add to recovery set 1
1278010518 0:default process_node_down 1
1278010518 0:default cpg del node 1 total 2 - down
1278010518 0:default make_event_id 100020003 nodeid 1 memb_count 2 type 3
1278010518 0:default queue recover event for nodeid 1
1278010518 0:default confchg left 1 joined 0 total 2
1278010518 0:default confchg removed node 1 reason 5
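
For checking a dump like the ones above without reading it by eye, a small parser can pull the node id and reason out of the "confchg removed node" lines. This is only a sketch: `ex_parse_removed` is a hypothetical helper, and the line layout it expects is inferred from the samples in this comment, not from any documented group_tool format.

```c
#include <stdio.h>

/* Parses a line of the form
 *   "<timestamp> <level>:<name> confchg removed node <id> reason <r>"
 * as seen in the dump samples above. Returns 0 and fills *nodeid and
 * *reason on a match, or -1 if the line has a different shape. */
int ex_parse_removed(const char *line, int *nodeid, int *reason)
{
    long timestamp;
    int level;
    char name[64];
    int matched = sscanf(line,
                         "%ld %d:%63s confchg removed node %d reason %d",
                         &timestamp, &level, name, nodeid, reason);
    return matched == 5 ? 0 : -1;
}
```

Run against the two samples, the "before" line yields reason 2 and the "after" line yields reason 5 for node 1; other dump lines are rejected.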
Comment 12 Douglas Silas 2011-01-11 18:14:37 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, the Closed Process Group (CPG) interface returned the wrong result, which could have led to incorrect behavior in some situations. With this update, the CPG interface now behaves as expected.
Comment 14 errata-xmlrpc 2011-01-13 18:57:02 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0100.html
