Bug 820357

Summary: cmannotifyd does not issue initial quorum state
Product: Red Hat Enterprise Linux 6
Reporter: RHEL Program Management <pm-rhel>
Component: cluster
Assignee: Fabio Massimo Di Nitto <fdinitto>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Priority: urgent
Version: 6.3
CC: ccaulfie, cluster-maint, djansa, fdinitto, jpayne, jwest, lhh, mgoulish, pm-eus, rpeterso, syeghiay, teigland, tross
Target Milestone: rc
Keywords: ZStream
Hardware: Unspecified
OS: Linux
Fixed In Version: cluster-3.0.12.1-23.el6_2.1
Doc Type: Bug Fix
Last Closed: 2012-05-14 09:33:59 UTC
Bug Depends On: 819787

Description RHEL Program Management 2012-05-09 18:25:45 UTC
This bug has been copied from bug #819787 and has been proposed
to be backported to 6.2 z-stream (EUS).

Comment 4 Fabio Massimo Di Nitto 2012-05-09 18:32:02 UTC
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=188a980e506b14cfdf94de9129360dd10f5a53d6

Building is temporarily blocked pending creation/fix of the cluster rhel-6.2 branch in distgit.

Comment 7 Justin Payne 2012-05-10 19:02:55 UTC
Verified in cman-3.0.12.1-23.el6_2.1

[root@dash-03 zstream]# rpm -q cman
cman-3.0.12.1-23.el6_2.1.x86_64

[root@dash-01 zstream]# rpm -q cman
cman-3.0.12.1-23.el6_2.1.x86_64

[root@dash-03 zstream]# ll /etc/cluster/cman-notify.d/
total 4
-rwxr-xr-x. 1 root root 1578 May  8 18:01 cman_notify_template.sh

[root@dash-01 zstream]# ll /etc/cluster/cman-notify.d/
total 4
-rwxr-xr-x. 1 root root 1578 May  8 18:14 cman_notify_template.sh

[root@dash-01 zstream]# cman_tool status
<----------------- CUT OUT ----------------------->
Version: 6.2.0
Config Version: 1
Cluster Name: dash
Cluster Id: 57228
Cluster Member: Yes
Cluster Generation: 96
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
May 10 13:50:17 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:50:17 corosync [CMAN  ] daemon: client command is 90
May 10 13:50:17 corosync [CMAN  ] daemon: About to process command
May 10 13:50:17 corosync [CMAN  ] memb: command to process is 90
May 10 13:50:17 corosync [CMAN  ] memb: command return code is -2
May 10 13:50:17 corosync [CMAN  ] daemon: Returning command data. length = 0
May 10 13:50:17 corosync [CMAN  ] daemon: sending reply 40000090 to fd 17
Total votes: 2
Node votes: 1
Quorum: 2  
May 10 13:50:17 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:50:17 corosync [CMAN  ] daemon: client command is d
May 10 13:50:17 corosync [CMAN  ] daemon: About to process command
May 10 13:50:17 corosync [CMAN  ] memb: command to process is d
May 10 13:50:17 corosync [CMAN  ] memb: command return code is 1
May 10 13:50:17 corosync [CMAN  ] daemon: Returning command data. length = 0
May 10 13:50:17 corosync [CMAN  ] daemon: sending reply 4000000d to fd 17
Active subsystems: 1
Flags: 
Ports Bound: 0  
May 10 13:50:17 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:50:17 corosync [CMAN  ] daemon: client command is 90
May 10 13:50:17 corosync [CMAN  ] daemon: About to process command
May 10 13:50:17 corosync [CMAN  ] memb: command to process is 90
May 10 13:50:17 corosync [CMAN  ] memb: command return code is 0
May 10 13:50:17 corosync [CMAN  ] daemon: Returning command data. length = 440
May 10 13:50:17 corosync [CMAN  ] daemon: sending reply 40000090 to fd 17
Node name: dash-01
Node ID: 1

[root@dash-03 zstream]# cman_tool status
<--------------- CUT OUT --------------------------->
Version: 6.2.0
Config Version: 1
Cluster Name: dash
Cluster Id: 57228
Cluster Member: Yes
Cluster Generation: 96
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
May 10 13:52:25 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:52:25 corosync [CMAN  ] daemon: client command is 90
May 10 13:52:25 corosync [CMAN  ] daemon: About to process command
May 10 13:52:25 corosync [CMAN  ] memb: command to process is 90
May 10 13:52:25 corosync [CMAN  ] memb: command return code is -2
May 10 13:52:25 corosync [CMAN  ] daemon: Returning command data. length = 0
May 10 13:52:25 corosync [CMAN  ] daemon: sending reply 40000090 to fd 17
Total votes: 2
Node votes: 1
Quorum: 2  
May 10 13:52:25 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:52:25 corosync [CMAN  ] daemon: client command is d
May 10 13:52:25 corosync [CMAN  ] daemon: About to process command
May 10 13:52:25 corosync [CMAN  ] memb: command to process is d
May 10 13:52:25 corosync [CMAN  ] memb: command return code is 1
May 10 13:52:25 corosync [CMAN  ] daemon: Returning command data. length = 0
May 10 13:52:25 corosync [CMAN  ] daemon: sending reply 4000000d to fd 17
Active subsystems: 1
Flags: 
Ports Bound: 0  
May 10 13:52:25 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:52:25 corosync [CMAN  ] daemon: client command is 90
May 10 13:52:25 corosync [CMAN  ] daemon: About to process command
May 10 13:52:25 corosync [CMAN  ] memb: command to process is 90
May 10 13:52:25 corosync [CMAN  ] memb: command return code is 0
May 10 13:52:25 corosync [CMAN  ] daemon: Returning command data. length = 440
May 10 13:52:25 corosync [CMAN  ] daemon: sending reply 40000090 to fd 17
Node name: dash-03
Node ID: 3

[1]+  Done                    /usr/sbin/cmannotifyd
[root@dash-01 zstream]# cat /var/log/cluster/file.log
debugging is enabled
replace me with something to do
debugging is enabled
replace me with something to do
we still have quorum

[1]+  Done                    /usr/sbin/cmannotifyd
[root@dash-03 zstream]# cat /var/log/cluster/file.log
debugging is enabled
replace me with something to do
debugging is enabled
replace me with something to do
we still have quorum
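
The file.log lines above are written by the hook in /etc/cluster/cman-notify.d/. As a rough sketch of how such a hook could react to a notification, assuming cmannotifyd exports CMAN_NOTIFICATION, CMAN_NOTIFICATION_QUORUM and CMAN_NOTIFICATION_DEBUG to each hook (the function name handle_notification, the mapping of messages to events, and every message other than the two seen in file.log are illustrative and not copied from the shipped template):

```shell
#!/bin/sh
# Hypothetical hook for /etc/cluster/cman-notify.d/ (illustrative only).
# Assumption: cmannotifyd exports CMAN_NOTIFICATION, CMAN_NOTIFICATION_QUORUM
# and CMAN_NOTIFICATION_DEBUG before running each hook.

handle_notification() {
    # Optional debug marker, matching the first line seen in file.log.
    if [ "${CMAN_NOTIFICATION_DEBUG:-0}" = "1" ]; then
        echo "debugging is enabled"
    fi

    case "${CMAN_NOTIFICATION:-}" in
        CMAN_REASON_STATECHANGE)
            # Quorum flag: 1 means the cluster is quorate.
            if [ "${CMAN_NOTIFICATION_QUORUM:-0}" = "1" ]; then
                echo "we still have quorum"
            else
                echo "quorum lost"
            fi
            ;;
        CMAN_REASON_CONFIG_UPDATE)
            echo "configuration updated"
            ;;
        CMAN_REASON_TRY_SHUTDOWN)
            echo "cman shutdown requested"
            ;;
        *)
            echo "unhandled notification: ${CMAN_NOTIFICATION:-none}"
            return 1
            ;;
    esac
}

# Act only when a notification is present, so the file can also be sourced.
if [ -n "${CMAN_NOTIFICATION:-}" ]; then
    handle_notification >> "${LOGFILE:-/var/log/cluster/file.log}"
fi
```

Keeping the logic in a function makes it possible to exercise the hook by hand, by setting the variables in a subshell, without waiting for a real cluster event.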

[root@dash-03 zstream]# cman_tool leave
May 10 13:55:01 corosync [CMAN  ] daemon: read 20 bytes from fd 18
May 10 13:55:01 corosync [CMAN  ] daemon: client command is 800000bb
May 10 13:55:01 corosync [CMAN  ] daemon: About to process command
May 10 13:55:01 corosync [CMAN  ] memb: command to process is 800000bb
May 10 13:55:01 corosync [CMAN  ] daemon: sending reply 102 to fd 17
May 10 13:55:01 corosync [CMAN  ] memb: command return code is -11
May 10 13:55:01 corosync [CMAN  ] daemon: read 20 bytes from fd 17
May 10 13:55:01 corosync [CMAN  ] daemon: client command is bc
May 10 13:55:01 corosync [CMAN  ] daemon: About to process command
May 10 13:55:01 corosync [CMAN  ] memb: command to process is bc
May 10 13:55:01 corosync [CMAN  ] memb: Shutdown reply is 1
May 10 13:55:01 corosync [CMAN  ] memb: Sending LEAVE, reason 0
May 10 13:55:01 corosync [CMAN  ] ais: comms send message 0x7fff832defe0 len = 4
May 10 13:55:01 corosync [CMAN  ] memb: shutdown decision is: 0 (yes=1, no=0) flags=0
May 10 13:55:01 corosync [CMAN  ] memb: command return code is -11
May 10 13:55:01 corosync [TOTEM ] mcasted message added to pending queue
May 10 13:55:01 corosync [TOTEM ] Delivering 1a to 1b
May 10 13:55:01 corosync [TOTEM ] Delivering MCAST message with seq 1b to pending delivery queue
May 10 13:55:01 corosync [CMAN  ] ais: deliver_fn source nodeid = 3, len=20, endian_conv=0
May 10 13:55:01 corosync [CMAN  ] memb: Message on port 0 is 7
May 10 13:55:01 corosync [CMAN  ] memb: got LEAVE from node 3, reason = 0
May 10 13:55:01 corosync [CMAN  ] daemon: send status return: 0
May 10 13:55:01 corosync [CMAN  ] daemon: sending reply c00000bb to fd 18
May 10 13:55:01 corosync [TOTEM ] Received ringid(10.15.89.168:96) seq 1b
May 10 13:55:01 corosync [SERV  ] Unloading all Corosync service engines.
May 10 13:55:01 corosync [TOTEM ] releasing messages up to and including 1b
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync extended virtual synchrony service
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync configuration service
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync cluster config database access v1.01
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync profile loading service
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: openais checkpoint service B.01.01
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync CMAN membership service 2.90
May 10 13:55:01 corosync [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1
May 10 13:55:01 corosync [TOTEM ] sending join/leave message
May 10 13:55:01 corosync [MAIN  ] Corosync Cluster Engine exiting with status 0 at main.c:1864.

[root@dash-01 zstream]# cat /var/log/cluster/cmannotifyd.log
May 08 18:20:19 cmannotifyd shutting down...
May 08 18:27:01 cmannotifyd Dispatching first cluster status
May 08 18:29:01 cmannotifyd Received a cman shutdown request
May 08 18:29:01 cmannotifyd waiting for cman to reappear..
May 08 18:34:01 cmannotifyd Dispatching first cluster status
May 08 18:34:01 cmannotifyd cman is back..
May 08 18:40:01 cmannotifyd shutting down...
May 10 13:52:52 cmannotifyd Dispatching first cluster status
May 10 13:54:41 cmannotifyd Received a cman statechange notification

[root@dash-03 zstream]# cman_tool join

[root@dash-03 zstream]# cman_tool status
<--------------- CUT OUT --------------------------->
Version: 6.2.0
Config Version: 1
Cluster Name: dash
Cluster Id: 57228
Cluster Member: Yes
Cluster Generation: 104
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
May 10 13:57:10 corosync [CMAN  ] daemon: read 20 bytes from fd 18
May 10 13:57:10 corosync [CMAN  ] daemon: client command is 90
May 10 13:57:10 corosync [CMAN  ] daemon: About to process command
May 10 13:57:10 corosync [CMAN  ] memb: command to process is 90
May 10 13:57:10 corosync [CMAN  ] memb: command return code is -2
May 10 13:57:10 corosync [CMAN  ] daemon: Returning command data. length = 0
May 10 13:57:10 corosync [CMAN  ] daemon: sending reply 40000090 to fd 18
Total votes: 2
Node votes: 1
Quorum: 2  
May 10 13:57:10 corosync [CMAN  ] daemon: read 20 bytes from fd 18
May 10 13:57:10 corosync [CMAN  ] daemon: client command is d
May 10 13:57:10 corosync [CMAN  ] daemon: About to process command
May 10 13:57:10 corosync [CMAN  ] memb: command to process is d
May 10 13:57:10 corosync [CMAN  ] memb: command return code is 2
May 10 13:57:10 corosync [CMAN  ] daemon: Returning command data. length = 0
May 10 13:57:10 corosync [CMAN  ] daemon: sending reply 4000000d to fd 18
Active subsystems: 2
Flags: 
Ports Bound: 0  
May 10 13:57:10 corosync [CMAN  ] daemon: read 20 bytes from fd 18
May 10 13:57:10 corosync [CMAN  ] daemon: client command is 90
May 10 13:57:10 corosync [CMAN  ] daemon: About to process command
May 10 13:57:10 corosync [CMAN  ] memb: command to process is 90
May 10 13:57:10 corosync [CMAN  ] memb: command return code is 0
May 10 13:57:10 corosync [CMAN  ] daemon: Returning command data. length = 440
May 10 13:57:10 corosync [CMAN  ] daemon: sending reply 40000090 to fd 18
Node name: dash-03
Node ID: 3

[root@dash-01 zstream]# cat /var/log/cluster/cmannotifyd.log 
May 08 18:20:19 cmannotifyd shutting down...
May 08 18:27:01 cmannotifyd Dispatching first cluster status
May 08 18:29:01 cmannotifyd Received a cman shutdown request
May 08 18:29:01 cmannotifyd waiting for cman to reappear..
May 08 18:34:01 cmannotifyd Dispatching first cluster status
May 08 18:34:01 cmannotifyd cman is back..
May 08 18:40:01 cmannotifyd shutting down...
May 10 13:52:52 cmannotifyd Dispatching first cluster status
May 10 13:54:41 cmannotifyd Received a cman statechange notification
May 10 13:56:35 cmannotifyd Received a cman statechange notification
May 10 13:56:35 cmannotifyd Received a cman statechange notification
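
The check above is done by reading cmannotifyd.log by eye. A minimal sketch of automating it, assuming the "Dispatching first cluster status" line is the marker for the initial state dispatch this bug is about (count_events is a hypothetical helper name, not part of the cluster package):

```shell
#!/bin/sh
# Hypothetical helper: count log lines that contain a given fixed string.
# Usage: count_events LOGFILE PATTERN
count_events() {
    # -F: treat the pattern as a fixed string; -c: print the number of matching lines.
    # grep exits non-zero when the count is 0, so mask that to keep callers simple.
    grep -F -c -- "$2" "$1" || true
}

# Example (log path taken from the transcript above):
# count_events /var/log/cluster/cmannotifyd.log "Dispatching first cluster status"
```

Comparing the count before and after restarting cmannotifyd would show whether a new initial dispatch was logged.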

Comment 9 errata-xmlrpc 2012-05-14 09:33:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0575.html