Bug 696301 - cman service fails to stop cleanly on shutdown/reboot of node
Summary: cman service fails to stop cleanly on shutdown/reboot of node
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: cluster
Version: 14
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Andrew Beekhof
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-04-13 19:31 UTC by Gregory Lee Bartholomew
Modified: 2011-04-28 06:21 UTC (History)
CC List: 10 users

Fixed In Version: pacemaker-1.1.5-1.fc14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-28 06:21:54 UTC
Type: ---


Attachments
screenshot of cman-shutdown bug in action (64.19 KB, image/png)
2011-04-13 19:31 UTC, Gregory Lee Bartholomew
node eb2024-58 was up continually (187.76 KB, application/octet-stream)
2011-04-14 14:29 UTC, Gregory Lee Bartholomew
node eb2024-59 was rebooted twice (193.69 KB, application/octet-stream)
2011-04-14 14:30 UTC, Gregory Lee Bartholomew

Description Gregory Lee Bartholomew 2011-04-13 19:31:43 UTC
Created attachment 491880 [details]
screenshot of cman-shutdown bug in action

Description of problem:

The cman service fails to stop cleanly on shutdown/reboot of a node when using a pacemaker+cman configuration with gfs2 handled by pacemaker.

Version-Release number of selected component (if applicable):

cman-3.1.1-1.fc14.x86_64

How reproducible:

It happens every time I do a shutdown/reboot of any one of my cluster nodes.

Steps to Reproduce:
1. Press Ctrl-Alt-Del
  
Actual results:

"[FAILED]"

Expected results:

"[  OK  ]"

Additional info:

Pacemaker version = Pacemaker-1-1-c6a01b02950b

[root@eb2024-58 ~]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Wed Apr 13 14:14:34 2011
Stack: cman
Current DC: eb2024-58.cs.siue.edu - partition with quorum
Version: 1.1.5-c6a01b02950bd17ab21a8042fa9ab0e4f9606923
2 Nodes configured, unknown expected votes
7 Resources configured.
============

Online: [ eb2024-58.cs.siue.edu eb2024-59.cs.siue.edu ]

 Clone Set: dual-gfs [gfs]
     Started: [ eb2024-58.cs.siue.edu eb2024-59.cs.siue.edu ]
 Clone Set: dual-ip [ip] (unique)
     ip:0	(ocf::heartbeat:IPaddr2):	Started eb2024-58.cs.siue.edu
     ip:1	(ocf::heartbeat:IPaddr2):	Started eb2024-59.cs.siue.edu
 Clone Set: dual-apache [apache]
     Started: [ eb2024-58.cs.siue.edu eb2024-59.cs.siue.edu ]
 stonith	(stonith:external/sbd):	Started eb2024-58.cs.siue.edu
[root@eb2024-58 ~]# cat /etc/cluster/cluster.conf 
<?xml version="1.0"?>
<cluster config_version="2" name="siue-cs">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="eb2024-58.cs.siue.edu" nodeid="1">
      <fence/>
    </clusternode>
    <clusternode name="eb2024-59.cs.siue.edu" nodeid="2">
      <fence/>
    </clusternode>
  </clusternodes>
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <fencedevices/>
  <rm/>
</cluster>
[root@eb2024-58 ~]# chkconfig --list | grep ":on"
cman           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
crond          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
gfs2-cluster   	0:off	1:off	2:on	3:on	4:on	5:on	6:off
heartbeat      	0:off	1:off	2:on	3:on	4:on	5:on	6:off
iptables       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
iscsi          	0:off	1:off	2:off	3:on	4:on	5:on	6:off
iscsid         	0:off	1:off	2:off	3:on	4:on	5:on	6:off
libvirt-guests 	0:off	1:off	2:off	3:on	4:on	5:on	6:off
lvm2-monitor   	0:off	1:on	2:on	3:on	4:on	5:on	6:off
messagebus     	0:off	1:off	2:on	3:on	4:on	5:on	6:off
netfs          	0:off	1:off	2:off	3:on	4:on	5:on	6:off
network        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
nfslock        	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ntpd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
pacemaker      	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rpcbind        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rpcgssd        	0:off	1:off	2:off	3:on	4:on	5:on	6:off
rpcidmapd      	0:off	1:off	2:off	3:on	4:on	5:on	6:off
rsyslog        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
sendmail       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
sshd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
udev-post      	0:off	1:on	2:on	3:on	4:on	5:on	6:off
[root@eb2024-58 ~]# cat /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
	version: 2
	secauth: off
	threads: 0
	interface {
		ringnumber: 0
		bindnetaddr: 146.163.150.0
		mcastaddr: 239.255.0.0
		mcastport: 4000
	}
}

logging {
	fileline: off
	to_stderr: no
	to_logfile: yes
	to_syslog: yes
	logfile: /var/log/cluster/corosync.log
	debug: off
	timestamp: on
	logger_subsys {
		subsys: AMF
		debug: off
	}
}

amf {
	mode: disabled
}
[root@eb2024-58 ~]# 
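
The dual-gfs clone shown by crm_mon above is the pacemaker-managed GFS2 mount.  For reference, that kind of cloned filesystem resource is typically created with something along these lines (assuming ocf:heartbeat:Filesystem; the device and mount point below are illustrative placeholders, not the actual values from this cluster):

  # cloned GFS2 mount, started on both nodes (placeholder device/directory)
  crm configure primitive gfs ocf:heartbeat:Filesystem \
        params device="/dev/sdb1" directory="/mnt/gfs2" fstype="gfs2" \
        op monitor interval="30s"
  crm configure clone dual-gfs gfs meta interleave="true"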

Screenshot Attached.

Comment 1 Gregory Lee Bartholomew 2011-04-13 19:52:39 UTC
P.S. -- A side effect of this bug seems to be that, 9 times out of 10, when the node comes back up it does not automatically come back online.

A second reboot will hang for one minute while shutting down the cman service, but then everything comes back online properly when the node boots back up.

gb

Comment 2 Andrew Beekhof 2011-04-14 09:40:10 UTC
The screenshot isn't very useful; we need the logs (from all nodes) and the output of:
  fence_tool ls
  fence_tool dump

Comment 3 Gregory Lee Bartholomew 2011-04-14 14:29:26 UTC
Created attachment 492124 [details]
node eb2024-58 was up continually

Comment 4 Gregory Lee Bartholomew 2011-04-14 14:30:20 UTC
Created attachment 492125 [details]
node eb2024-59 was rebooted twice

Comment 5 Gregory Lee Bartholomew 2011-04-14 14:49:58 UTC
No problem.  Here is the fence_tool output from both nodes; the log files are attached in the previous posts.

[root@eb2024-58 ~]# fence_tool ls
fence domain
member count  2
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 2 

[root@eb2024-58 ~]# fence_tool dump
1302790941 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
1302790941 fenced 3.1.1 started
1302790941 failed to get dbus connection
1302790941 cluster node 1 added seq 412
1302790941 cluster node 2 added seq 412
1302790941 our_nodeid 1 our_name eb2024-58.cs.siue.edu
1302790941 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
1302790941 cpg_join fenced:daemon ...
1302790941 setup_cpg_daemon 10
1302790941 group_mode 3 compat 0
1302790941 fenced:daemon ring 1:412 2 memb 1 2
1302790941 fenced:daemon conf 1 1 0 memb 1 join 1 left
1302790941 receive_protocol from 1 max 1.1.1.0 run 0.0.0.0
1302790941 daemon node 1 max 0.0.0.0 run 0.0.0.0
1302790941 daemon node 1 join 1302790941 left 0 local quorum 1302790941
1302790941 set_protocol member_count 1 propose daemon 1.1.1
1302790941 receive_protocol from 1 max 1.1.1.0 run 1.1.1.0
1302790941 daemon node 1 max 1.1.1.0 run 0.0.0.0
1302790941 daemon node 1 join 1302790941 left 0 local quorum 1302790941
1302790941 run protocol from nodeid 1
1302790941 daemon run 1.1.1 max 1.1.1
1302790941 receive_protocol from 1 max 1.1.1.0 run 1.1.1.0
1302790941 daemon node 1 max 1.1.1.0 run 1.1.1.0
1302790941 daemon node 1 join 1302790941 left 0 local quorum 1302790941
1302790941 client connection 3 fd 11
1302790941 /cluster/fence_daemon/@clean_start is 0
1302790941 /cluster/fence_daemon/@post_join_delay is 3
1302790941 /cluster/fence_daemon/@post_fail_delay is 0
1302790941 added 2 nodes from ccs
1302790941 cpg_join fenced:default ...
1302790941 fenced:default conf 1 1 0 memb 1 join 1 left
1302790941 add_change cg 1 joined nodeid 1
1302790941 add_change cg 1 m 1 j 1 r 0 f 0
1302790941 check_ringid cluster 412 cpg 0:0
1302790941 fenced:default ring 1:412 2 memb 1 2
1302790941 check_ringid done cluster 412 cpg 1:412
1302790941 check_quorum done
1302790941 send_start 1:1 flags 1 started 0 m 1 j 1 r 0 f 0
1302790941 receive_start 1:1 len 152
1302790941 match_change 1:1 matches cg 1
1302790941 wait_messages cg 1 got all 1
1302790941 set_master from 0 to low node 1
1302790941 send_complete 1:1 flags 1 started 0 m 1 j 1 r 0 f 0
1302790941 receive_complete 1:1 len 152
1302790944 fenced:daemon conf 2 1 0 memb 1 2 join 2 left
1302790944 receive_protocol from 2 max 1.1.1.0 run 0.0.0.0
1302790944 daemon node 2 max 0.0.0.0 run 0.0.0.0
1302790944 daemon node 2 join 1302790944 left 0 local quorum 1302790941
1302790944 receive_protocol from 1 max 1.1.1.0 run 1.1.1.1
1302790944 daemon node 1 max 1.1.1.0 run 1.1.1.0
1302790944 daemon node 1 join 1302790941 left 0 local quorum 1302790941
1302790944 receive_protocol from 2 max 1.1.1.0 run 1.1.1.0
1302790944 daemon node 2 max 1.1.1.0 run 0.0.0.0
1302790944 daemon node 2 join 1302790944 left 0 local quorum 1302790941
1302790944 fenced:default conf 2 1 0 memb 1 2 join 2 left
1302790944 add_change cg 2 joined nodeid 2
1302790944 add_change cg 2 m 2 j 1 r 0 f 0
1302790944 check_ringid done cluster 412 cpg 1:412
1302790944 check_quorum done
1302790944 send_start 1:2 flags 2 started 1 m 2 j 1 r 0 f 0
1302790944 receive_start 2:1 len 152
1302790944 match_change 2:1 matches cg 2
1302790944 wait_messages cg 2 need 1 of 2
1302790944 receive_start 1:2 len 152
1302790944 match_change 1:2 matches cg 2
1302790944 wait_messages cg 2 got all 2
1302790944 set_master from 1 to complete node 1
1302790944 send_complete 1:2 flags 2 started 1 m 2 j 1 r 0 f 0
1302790944 receive_complete 1:2 len 152
1302791048 cluster node 2 removed seq 416
1302791048 fenced:daemon conf 1 0 1 memb 1 join left 2
1302791048 fenced:daemon ring 1:416 1 memb 1
1302791048 fenced:default conf 1 0 1 memb 1 join left 2
1302791048 add_change cg 3 remove nodeid 2 reason 3
1302791048 add_change cg 3 m 1 j 0 r 1 f 1
1302791048 add_victims node 2
1302791048 check_ringid cluster 416 cpg 1:412
1302791048 fenced:default ring 1:416 1 memb 1
1302791048 check_ringid done cluster 416 cpg 1:416
1302791048 check_quorum done
1302791048 send_start 1:3 flags 2 started 2 m 1 j 0 r 1 f 1
1302791048 receive_start 1:3 len 152
1302791048 match_change 1:3 matches cg 3
1302791048 wait_messages cg 3 got all 1
1302791048 set_master from 1 to complete node 1
1302791048 delay post_fail_delay 0 quorate_from_last_update 0
1302791048 eb2024-59.cs.siue.edu not a cluster member after 0 sec post_fail_delay
1302791048 fencing node eb2024-59.cs.siue.edu
1302791048 fence eb2024-59.cs.siue.edu dev 0.0 agent none result: error no method
1302791048 fence eb2024-59.cs.siue.edu failed
1302791048 connected to dbus :1.1
1302791051 fencing node eb2024-59.cs.siue.edu
1302791051 fence eb2024-59.cs.siue.edu dev 0.0 agent none result: error no method
1302791051 fence eb2024-59.cs.siue.edu failed
1302791054 fencing node eb2024-59.cs.siue.edu
1302791054 fence eb2024-59.cs.siue.edu dev 0.0 agent none result: error no method
1302791054 fence eb2024-59.cs.siue.edu failed
1302791061 cluster node 2 added seq 420
1302791061 fenced:daemon ring 1:420 2 memb 1 2
1302791063 fence eb2024-59.cs.siue.edu overridden by administrator intervention
1302791063 send_victim_done cg 3 flags 2 victim nodeid 2
1302791063 send_complete 1:3 flags 2 started 2 m 1 j 0 r 1 f 1
1302791063 fenced:default ring 1:420 2 memb 1 2
1302791063 client connection 3 fd 16
1302791063 fenced:daemon conf 2 1 0 memb 1 2 join 2 left
1302791063 receive_protocol from 2 max 1.1.1.0 run 0.0.0.0
1302791063 daemon node 2 max 0.0.0.0 run 0.0.0.0
1302791063 daemon node 2 join 1302791063 left 1302791048 local quorum 1302790941
1302791063 send_external victim nodeid 2
1302791063 receive_protocol from 1 max 1.1.1.0 run 1.1.1.1
1302791063 daemon node 1 max 1.1.1.0 run 1.1.1.1
1302791063 daemon node 1 join 1302790941 left 0 local quorum 1302790941
1302791063 receive_victim_done 1:3 flags 2 len 80
1302791063 receive_victim_done 1:3 remove victim 2 time 1302791063 how 3
1302791063 receive_complete 1:3 len 152
1302791063 receive_external from 1 len 40 victim nodeid 2
1302791063 receive_protocol from 2 max 1.1.1.0 run 1.1.1.0
1302791063 daemon node 2 max 1.1.1.0 run 0.0.0.0
1302791063 daemon node 2 join 1302791063 left 1302791048 local quorum 1302790941
1302791063 fenced:default conf 2 1 0 memb 1 2 join 2 left
1302791063 add_change cg 4 joined nodeid 2
1302791063 add_change cg 4 m 2 j 1 r 0 f 0
1302791063 check_ringid done cluster 420 cpg 1:420
1302791063 check_quorum done
1302791063 send_start 1:4 flags 2 started 3 m 2 j 1 r 0 f 0
1302791063 receive_start 1:4 len 152
1302791063 match_change 1:4 matches cg 4
1302791063 wait_messages cg 4 need 1 of 2
1302791063 receive_start 2:1 len 152
1302791063 match_change 2:1 matches cg 4
1302791063 wait_messages cg 4 got all 2
1302791063 set_master from 1 to complete node 1
1302791063 send_complete 1:4 flags 2 started 3 m 2 j 1 r 0 f 0
1302791063 receive_complete 1:4 len 152
1302791136 fenced:default conf 1 0 1 memb 1 join left 2
1302791136 add_change cg 5 remove nodeid 2 reason 2
1302791136 add_change cg 5 m 1 j 0 r 1 f 0
1302791136 check_ringid done cluster 420 cpg 1:420
1302791136 check_quorum done
1302791136 send_start 1:5 flags 2 started 4 m 1 j 0 r 1 f 0
1302791136 receive_start 1:5 len 152
1302791136 match_change 1:5 matches cg 5
1302791136 wait_messages cg 5 got all 1
1302791136 set_master from 1 to complete node 1
1302791136 send_complete 1:5 flags 2 started 4 m 1 j 0 r 1 f 0
1302791136 receive_complete 1:5 len 152
1302791137 fenced:daemon conf 1 0 1 memb 1 join left 2
1302791137 fenced:daemon conf 1 0 1 memb 1 join left 2
1302790932 cluster node 2 removed seq 424
1302790932 fenced:daemon ring 1:424 1 memb 1
1302790932 fenced:default ring 1:424 1 memb 1
1302790941 cluster node 2 added seq 428
1302790941 fenced:daemon ring 1:428 2 memb 1 2
1302790941 fenced:default ring 1:428 2 memb 1 2
1302790945 fenced:daemon conf 2 1 0 memb 1 2 join 2 left
1302790945 receive_protocol from 1 max 1.1.1.0 run 1.1.1.1
1302790945 daemon node 1 max 1.1.1.0 run 1.1.1.1
1302790945 daemon node 1 join 1302790941 left 0 local quorum 1302790941
1302790945 receive_protocol from 2 max 1.1.1.0 run 0.0.0.0
1302790945 daemon node 2 max 0.0.0.0 run 0.0.0.0
1302790945 daemon node 2 join 1302790945 left 1302791137 local quorum 1302790941
1302790945 receive_protocol from 2 max 1.1.1.0 run 1.1.1.0
1302790945 daemon node 2 max 1.1.1.0 run 0.0.0.0
1302790945 daemon node 2 join 1302790945 left 1302791137 local quorum 1302790941
1302790945 fenced:default conf 2 1 0 memb 1 2 join 2 left
1302790945 add_change cg 6 joined nodeid 2
1302790945 add_change cg 6 m 2 j 1 r 0 f 0
1302790945 check_ringid done cluster 428 cpg 1:428
1302790945 check_quorum done
1302790945 send_start 1:6 flags 2 started 5 m 2 j 1 r 0 f 0
1302790945 receive_start 1:6 len 152
1302790945 match_change 1:6 matches cg 6
1302790945 wait_messages cg 6 need 1 of 2
1302790945 receive_start 2:1 len 152
1302790945 match_change 2:1 matches cg 6
1302790945 wait_messages cg 6 got all 2
1302790945 set_master from 1 to complete node 1
1302790945 send_complete 1:6 flags 2 started 5 m 2 j 1 r 0 f 0
1302790945 receive_complete 1:6 len 152
[root@eb2024-58 ~]# 




[root@eb2024-59 ~]# fence_tool ls
fence domain
member count  2
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 2 

[root@eb2024-59 ~]# fence_tool dump
1302791223 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
1302791223 fenced 3.1.1 started
1302791223 failed to get dbus connection
1302791223 cluster node 1 added seq 428
1302791223 cluster node 2 added seq 428
1302791223 our_nodeid 2 our_name eb2024-59.cs.siue.edu
1302791223 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
1302791223 cpg_join fenced:daemon ...
1302791223 setup_cpg_daemon 10
1302791223 group_mode 3 compat 0
1302791223 fenced:daemon conf 2 1 0 memb 1 2 join 2 left
1302791223 fenced:daemon ring 1:428 2 memb 1 2
1302791223 receive_protocol from 1 max 1.1.1.0 run 1.1.1.1
1302791223 daemon node 1 max 0.0.0.0 run 0.0.0.0
1302791223 daemon node 1 join 1302791223 left 0 local quorum 1302791223
1302791223 run protocol from nodeid 1
1302791223 daemon run 1.1.1 max 1.1.1
1302791223 receive_protocol from 2 max 1.1.1.0 run 0.0.0.0
1302791223 daemon node 2 max 0.0.0.0 run 0.0.0.0
1302791223 daemon node 2 join 1302791223 left 0 local quorum 1302791223
1302791223 receive_protocol from 2 max 1.1.1.0 run 1.1.1.0
1302791223 daemon node 2 max 1.1.1.0 run 0.0.0.0
1302791223 daemon node 2 join 1302791223 left 0 local quorum 1302791223
1302791223 client connection 3 fd 11
1302791223 /cluster/fence_daemon/@clean_start is 0
1302791223 /cluster/fence_daemon/@post_join_delay is 3
1302791223 /cluster/fence_daemon/@post_fail_delay is 0
1302791224 added 2 nodes from ccs
1302791224 cpg_join fenced:default ...
1302791224 fenced:default conf 2 1 0 memb 1 2 join 2 left
1302791224 add_change cg 1 joined nodeid 2
1302791224 add_change cg 1 m 2 j 1 r 0 f 0
1302791224 check_ringid cluster 428 cpg 0:0
1302791224 fenced:default ring 1:428 2 memb 1 2
1302791224 check_ringid done cluster 428 cpg 1:428
1302791224 check_quorum done
1302791224 send_start 2:1 flags 1 started 0 m 2 j 1 r 0 f 0
1302791224 receive_start 1:6 len 152
1302791224 match_change 1:6 matches cg 1
1302791224 save_history 2 master 1 time 1302791063 how 3
1302791224 save_history 2 ext node 1 ext time 1302791063
1302791224 wait_messages cg 1 need 1 of 2
1302791224 receive_start 2:1 len 152
1302791224 match_change 2:1 matches cg 1
1302791224 wait_messages cg 1 got all 2
1302791224 set_master from 0 to complete node 1
1302791224 receive_complete 1:6 len 152
[root@eb2024-59 ~]#

Comment 6 Andrew Beekhof 2011-04-19 10:06:36 UTC
Apr 14 09:25:37 eb2024-59 crmd: [1508]: info: cman_event_callback: CMAN wants to shut down: optional

This message should never occur.
Are you shutting down pacemaker before you attempt to stop cman?

Comment 7 Gregory Lee Bartholomew 2011-04-21 16:56:42 UTC
It looks like the pacemaker init script that comes with the Fedora RPMs is set up to stop pacemaker before cman, yes.

[user@eb2024-01 ~]$ head Desktop/pacemaker
#!/bin/bash

# Authors:
#  Andrew Beekhof <abeekhof>
#  Fabio M. Di Nitto <fdinitto>
#
# License: Revised BSD

# chkconfig: - 99 01
# description: Pacemaker Cluster Manager

[user@eb2024-01 ~]$ ssh root.siue.edu
root.siue.edu's password: 
Last login: Thu Apr 21 11:50:20 2011
[root@eb2024-58 ~]# runlevel
N 3
[root@eb2024-58 ~]# chkconfig --list | grep "pacemaker\|cman"
cman           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
pacemaker      	0:off	1:off	2:on	3:on	4:on	5:on	6:off
[root@eb2024-58 ~]# ls -1 /etc/rc.d/rc3.d | grep "pacemaker\|cman"
S21cman
S99pacemaker
[root@eb2024-58 ~]# ls -1 /etc/rc.d/rc6.d | grep "pacemaker\|cman"
K01pacemaker
K79cman
[root@eb2024-58 ~]#

Comment 8 Andrew Beekhof 2011-04-22 09:17:39 UTC
How did you shut down though?

I'm almost positive "service cman stop" won't try to stop pacemaker first.
You have to run "service pacemaker stop" first.
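
In other words, a clean manual shutdown on this stack should look something like this (assuming the stock init scripts):

  service pacemaker stop   # stop pacemaker first so it can stop its resources and disconnect from cman
  service cman stop        # then cman can leave the fence domain and stop cleanly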

Comment 9 Gregory Lee Bartholomew 2011-04-26 19:41:53 UTC
I'm just pressing Ctrl-Alt-Delete and expecting things to stop in the correct order on their own.  I'm not entering any commands prior to sending the VM the reboot signal.  I may have run "chkconfig pacemaker on" and/or "chkconfig --add pacemaker" when I set up these VM's though -- I don't remember for sure.

Also, when I ran "make install" for Pacemaker-1-1-c6a01b02950b I ended up with inconsistent results on my two VM's.  On one VM, I found an /etc/rc.d/init.d/pacemaker file with a size of 0 bytes.  On the other VM the file still contained the bash-script code.  My guess as to what happened is that I "touched" the pacemaker script on the one VM at some point before upgrading to Pacemaker-1-1-c6a01b02950b, and the file was left over from the previous version because it had a newer timestamp.  I copied the pacemaker init script from one system to the other to reconcile the inconsistency.

P.S. This is how I did the upgrade on the VM's:

yum -y install autoconf automake make libtool
?? yum -y install intltool

yum -y install glib2-devel libxml2-devel libxslt-devel bzip2-devel libtool-ltdl-devel
yum -y install cluster-glue-libs-devel corosynclib-devel cman-devel

wget -O pacemaker.tar.gz http://hg.clusterlabs.org/pacemaker/1.1/archive/c6a01b02950b.tar.gz
tar -zxf pacemaker.tar.gz
cd pacemaker

export PREFIX=/usr
export LCRSODIR=$PREFIX/libexec/lcrso
export CLUSTER_USER=hacluster
export CLUSTER_GROUP=haclient

./autogen.sh && ./configure --prefix=$PREFIX --with-lcrso-dir=$LCRSODIR --with-cman
make

rpm -e --nodeps pacemaker

make install
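
If "make install" doesn't register the init script with chkconfig on its own, it presumably also needs, on each VM:

  chkconfig --add pacemaker   # create the rc.d symlinks from the script's "chkconfig: - 99 01" header
  chkconfig pacemaker on      # enable it for the default runlevels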

Comment 10 Andrew Beekhof 2011-04-27 07:08:37 UTC
(In reply to comment #9)
> I'm just pressing Ctrl-Alt-Delete and expecting things to stop in the correct
> order on their own.  I'm not entering any commands prior to sending the VM the
> reboot signal.  I may have run "chkconfig pacemaker on" and/or "chkconfig --add
> pacemaker" when I set up these VM's though -- I don't remember for sure. 

That's an important detail if you're not using RPM to do the install.

> Also,
> when I ran "make install" for Pacemaker-1-1-c6a01b02950b I ended up with
> inconsistent results on my two VM's.  On one VM, I found an
> /etc/rc.d/init.d/pacemaker file with a size of 0 bytes.

That's... odd.  How did pacemaker even start if that file was empty?

[snip]

> P.S. This is how I did the upgrade on the VM's:

It's time I refreshed Fedora 14 anyway; how about I prepare an update and you try again with that?

Comment 11 Andrew Beekhof 2011-04-27 11:20:25 UTC
Update submitted:
   https://admin.fedoraproject.org/updates/pacemaker-1.1.5-1.fc14

I believe it should be possible to test by running:
   yum --enablerepo=updates-testing update pacemaker

Comment 12 Gregory Lee Bartholomew 2011-04-27 18:50:46 UTC
YES!!  Issue resolved.  Everything appears to be working great with the pacemaker-1.1.5-1.fc14 rpm.

Thanks!

P.S. Am I supposed to mark this bug closed or is that something the package maintainer is supposed to do?

Comment 13 Andrew Beekhof 2011-04-28 06:21:54 UTC
I'll close it.  Thanks for the feedback.
Did you add karma to the build at:
   https://admin.fedoraproject.org/updates/pacemaker-1.1.5-1.fc14 ?

