Bug 1448646

Summary: Deployment of composable rabbitmq role on pacemaker remote nodes fails with "Could not evaluate: backup_cib"
Product: Red Hat OpenStack Reporter: Marian Krcmarik <mkrcmari>
Component: puppet-tripleo Assignee: Michele Baldessari <michele>
Status: CLOSED ERRATA QA Contact: Udi Shkalim <ushkalim>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 11.0 (Ocata) CC: chjones, fdinitto, jjoyce, jschluet, michele, slinaber, tvignaud, ushkalim
Target Milestone: z2 Keywords: Triaged, ZStream
Target Release: 11.0 (Ocata)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: puppet-tripleo-6.5.0-7.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-09-13 21:43:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1435982    

Description Marian Krcmarik 2017-05-06 16:44:25 UTC
Description of problem:
The overcloud deployment fails with the following error:
/Stage[main]/Tripleo::Profile::Pacemaker::Rabbitmq/Pacemaker::Property[rabbitmq-role-node-property]/Pcmk_property[property-messaging-0-rabbitmq-role] (err): Could not evaluate: backup_cib: Running: /usr/sbin/pcs cluster cib /var/lib/pacemaker/cib/puppet-cib-backup20170506-15994-1swnk1i failed with code: 1 ->

This error appears on the first pacemaker remote node with the rabbitmq role. The other two remote nodes are fine: no such error appears there, and rabbitmq-role is added to both of them.

Version-Release number of selected component (if applicable):
puppet-pacemaker-0.5.0-4.el7ost.noarch
puppet-tripleo-6.3.0-12.el7ost.noarch

How reproducible:
Often

Steps to Reproduce:
1. Deploy overcloud with rabbitmq composable role on pacemaker remote nodes

Actual results:
Overcloud deploy fails

Expected results:
Successful overcloud deploy

Additional info:

Comment 2 Michele Baldessari 2017-05-19 06:15:09 UTC
Master review merged. Added the ocata stable review for the puppet-tripleo change. puppet-pacemaker has no branches so I just linked the review for it.
The rdo review to bring the puppet-pacemaker fix in Ocata/Newton is here: https://review.rdoproject.org/r/#/c/6669/

Will move to POST once the ocata puppet-tripleo one merges.

Comment 6 Udi Shkalim 2017-08-30 16:30:38 UTC
puppet-pacemaker-0.6.0-1.el7ost.noarch
puppet-tripleo-6.5.0-8.el7ost.noarch

Deployment of composable roles with rabbitmq on pacemaker remote nodes succeeded:
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Wed Aug 30 16:30:07 2017
Last change: Wed Aug 30 16:24:51 2017 by root via cibadmin on controller-0

6 nodes configured
34 resources configured

Online: [ controller-0 controller-1 controller-2 ]
RemoteOnline: [ messaging-0 messaging-1 messaging-2 ]

Full list of resources:

 messaging-0    (ocf::pacemaker:remote):        Started controller-0
 messaging-1    (ocf::pacemaker:remote):        Started controller-1
 messaging-2    (ocf::pacemaker:remote):        Started controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ messaging-0 messaging-1 messaging-2 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-2 ]
     Slaves: [ controller-0 controller-1 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 ip-192.168.24.12       (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-10.0.0.107  (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.1.19 (ocf::heartbeat:IPaddr2):       Started controller-2
 ip-172.17.1.14 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-172.17.3.18 (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
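
For anyone who wants to repeat this check without reading the output by eye, here is a minimal sketch that parses `pcs status` text and compares the nodes where a clone set is started against the RemoteOnline list. It assumes the plain-text layout shown in this comment (other pcs versions may print it differently), and the function names `node_list`, `remote_online` and `clone_started` are hypothetical helpers, not part of pcs or puppet-pacemaker.

```python
import re

# Excerpt copied from the `pcs status` output in this comment.
SAMPLE = """\
Online: [ controller-0 controller-1 controller-2 ]
RemoteOnline: [ messaging-0 messaging-1 messaging-2 ]

 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ messaging-0 messaging-1 messaging-2 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
"""

# Matches lines such as "Started: [ node-a node-b ]".
NODE_LIST = re.compile(r"^(\w+): \[(.*)\]$")


def node_list(line):
    """Return (label, set_of_nodes) for a 'Label: [ ... ]' line, else None."""
    match = NODE_LIST.match(line.strip())
    if match is None:
        return None
    return match.group(1), set(match.group(2).split())


def remote_online(status_text):
    """Return the nodes listed on the RemoteOnline line (empty set if absent)."""
    for line in status_text.splitlines():
        parsed = node_list(line)
        if parsed and parsed[0] == "RemoteOnline":
            return parsed[1]
    return set()


def clone_started(status_text, clone_name):
    """Return the nodes listed as Started under 'Clone Set: <clone_name>'."""
    in_clone = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Clone Set: "):
            # Third token is the clone name, e.g. "rabbitmq-clone".
            in_clone = stripped.split()[2] == clone_name
            continue
        parsed = node_list(stripped)
        if parsed is None:
            # Any other kind of line ends the current clone set block.
            in_clone = False
        elif in_clone and parsed[0] == "Started":
            return parsed[1]
    return set()


if __name__ == "__main__":
    started = clone_started(SAMPLE, "rabbitmq-clone")
    remotes = remote_online(SAMPLE)
    print("rabbitmq started on all remote nodes:", started == remotes)
```

On a live cluster the text could come from running `pcs status` on a controller and passing its output to these functions instead of `SAMPLE`.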

Comment 8 errata-xmlrpc 2017-09-13 21:43:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2721