Bug 1528632

Summary: stack update operation fails in rabbitmq config generation
Product: Red Hat OpenStack Reporter: Damien Ciabrini <dciabrin>
Component: puppet-tripleo Assignee: Damien Ciabrini <dciabrin>
Status: CLOSED ERRATA QA Contact: pkomarov
Severity: high Docs Contact:
Priority: high    
Version: 12.0 (Pike) CC: dciabrin, jamsmith, jjoyce, jschluet, pkomarov, slinaber, tvignaud, uemit.seren
Target Milestone: z3 Keywords: Triaged, ZStream
Target Release: 12.0 (Pike)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: puppet-tripleo-7.4.12-4.el7ost Doc Type: Bug Fix
Doc Text:
Prior to this update, running a "stack update" operation on an existing stack to reassess the state of Heat resources caused a failure in container docker-puppet-rabbitmq. This failure prevented users from running stack update operations. This update fixes the issue by changing the way puppet configuration is done in the rabbitmq container docker-puppet-rabbitmq.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-08-20 12:58:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Damien Ciabrini 2017-12-22 13:14:37 UTC
Description of problem:

After an HA overcloud has been successfully deployed, subsequent stack update operations fail in the docker-puppet-rabbitmq container.

This is because the tripleo puppet module [1] that configures rabbitmq creates a rabbitmq_user resource by calling an explicit puppet provider:

    if $stack_action == 'UPDATE' {
      # Required for changing password on update scenario. Password will be changed only when
      # called explicitly, if the rabbitmq service is already running.
      rabbitmq_user { $rabbitmq_user :
        password => $rabbitmq_pass,
        provider => 'rabbitmqctl',
        admin => true,
      }
    }

By doing so, the usual noop_resource override cannot be used, and puppet triggers a call to /usr/bin/rabbitmqctl. This unwanted call then fails because the container is not configured to allow calls to rabbitmq or access to the Erlang VM.

[1] tripleo/manifests/profile/base/rabbitmq.pp
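
To illustrate why the explicit provider defeats the override, here is a toy Python model (not Puppet code, and not how Puppet is implemented). It assumes the noop_resource override works by registering a do-nothing provider and making it the default for the resource type; the class and method names are hypothetical and exist only for this sketch.

```python
class ResourceType:
    """Toy stand-in for a Puppet resource type with pluggable providers."""

    def __init__(self, name, providers, default):
        self.name = name
        self.providers = dict(providers)
        self.default = default

    def override_default_with_noop(self):
        # Assumption: the override registers a do-nothing provider
        # and makes it the default for this type.
        self.providers["noop"] = lambda action: f"noop: skipped {action}"
        self.default = "noop"

    def apply(self, action, provider=None):
        # An explicitly requested provider wins over the type's default.
        chosen = provider if provider is not None else self.default
        return self.providers[chosen](action)


def rabbitmqctl(action):
    # Stands in for a call to /usr/bin/rabbitmqctl, which cannot work
    # inside the config-generation container.
    raise RuntimeError(f"rabbitmqctl cannot reach the broker for: {action}")


rabbitmq_user = ResourceType(
    "rabbitmq_user", {"rabbitmqctl": rabbitmqctl}, default="rabbitmqctl"
)
rabbitmq_user.override_default_with_noop()

# No explicit provider: the noop default is used and nothing is executed.
default_result = rabbitmq_user.apply("change_password")

# Explicit provider: the noop default is bypassed and the real call fails.
try:
    rabbitmq_user.apply("change_password", provider="rabbitmqctl")
    explicit_result = "ok"
except RuntimeError as err:
    explicit_result = f"failed: {err}"

print(default_result)
print(explicit_result)
```

In this model the first call is skipped by the noop default, while the second reaches the real provider and fails, which mirrors the behaviour described above for the UPDATE code path.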


How reproducible:
Always

Steps to Reproduce:
1. deploy a stack
2. re-deploy on the same stack (UPDATE action)

Actual results:
The redeploy errors out during the config generation (docker-puppet-rabbitmq)

Expected results:
The redeploy should finish

Additional info:

Comment 2 Damien Ciabrini 2017-12-23 10:39:55 UTC
Fixed in Pike upstream https://review.openstack.org/#/c/529856/

Comment 8 pkomarov 2018-07-17 08:02:49 UTC
Verified.

OC stack update finished successfully; all cluster resources are in good health.

Details:

(undercloud) [stack@undercloud-0 ~]$ rhos-release -L
Installed repositories (rhel-7.5):
  12
  ceph-2
  ceph-osd-2
  rhel-7.5


[stack@undercloud-0 ~]$ cat core_puddle_version
2018-07-13.1
[stack@undercloud-0 ~]$ rpm -qa|grep puppet-tripleo
puppet-tripleo-7.4.12-4.el7ost.noarch


(undercloud) [stack@undercloud-0 ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| 21941980-681d-44d2-972d-44ad1aeb5563 | overcloud  | b53bc16fd86a483ba31f252093bb7ea6 | CREATE_COMPLETE | 2018-07-17T04:48:36Z | None         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+

Minor update procedure:

cat > custom_params.yaml <<EOF
parameter_defaults:
  ExtraConfig:
    tripleo::haproxy::haproxy_globals_override:
      'maxconn': 1111
EOF

echo "-e /home/stack/custom_params.yaml" >> overcloud_deploy.sh

./overcloud_deploy.sh
...



2018-07-17 07:54:23Z [overcloud-AllNodesDeploySteps-jcmuxrksrhe5.ObjectStorageExtraConfigPost]: UPDATE_COMPLETE  state changed
2018-07-17 07:54:23Z [overcloud-AllNodesDeploySteps-jcmuxrksrhe5.CephStorageExtraConfigPost]: UPDATE_COMPLETE  state changed
2018-07-17 07:54:23Z [overcloud-AllNodesDeploySteps-jcmuxrksrhe5.ComputeExtraConfigPost]: UPDATE_COMPLETE  state changed
2018-07-17 07:54:34Z [overcloud-AllNodesDeploySteps-jcmuxrksrhe5]: UPDATE_COMPLETE  Stack UPDATE completed successfully
2018-07-17 07:54:35Z [AllNodesDeploySteps]: UPDATE_COMPLETE  state changed
2018-07-17 07:54:51Z [overcloud]: UPDATE_COMPLETE  Stack UPDATE completed successfully

 Stack overcloud UPDATE_COMPLETE

Overcloud Endpoint: http://10.0.0.104:5000/v2.0
Overcloud Deployed


...


(undercloud) [stack@undercloud-0 ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+----------------------+
| ID                                   | Stack Name | Project                          | Stack Status       | Creation Time        | Updated Time         |
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+----------------------+
| 21941980-681d-44d2-972d-44ad1aeb5563 | overcloud  | b53bc16fd86a483ba31f252093bb7ea6 | UPDATE_IN_PROGRESS | 2018-07-17T04:48:36Z | 2018-07-17T07:17:53Z |
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+----------------------+
(undercloud) [stack@undercloud-0 ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| 21941980-681d-44d2-972d-44ad1aeb5563 | overcloud  | b53bc16fd86a483ba31f252093bb7ea6 | UPDATE_COMPLETE | 2018-07-17T04:48:36Z | 2018-07-17T07:17:53Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
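
For repeating this verification without reading the table by eye, the status check can be scripted. The following is a minimal Python sketch that parses the table text shown above; the function name parse_stack_table is hypothetical, and the sample is copied from the final listing. Where the client's JSON output (openstack stack list -f json) is available, reading that is likely simpler than parsing the table.

```python
def parse_stack_table(text):
    """Parse the ASCII table printed by `openstack stack list` into dicts."""

    def split_row(line):
        # Drop the outer pipes, then split the remaining cells on "|".
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    # Border lines start with "+", so only header and data rows are kept.
    rows = [line for line in text.splitlines() if line.startswith("|")]
    if not rows:
        return []
    header = split_row(rows[0])
    return [dict(zip(header, split_row(row))) for row in rows[1:]]


sample = """\
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
| 21941980-681d-44d2-972d-44ad1aeb5563 | overcloud  | b53bc16fd86a483ba31f252093bb7ea6 | UPDATE_COMPLETE | 2018-07-17T04:48:36Z | 2018-07-17T07:17:53Z |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+----------------------+
"""

stacks = parse_stack_table(sample)
status = {stack["Stack Name"]: stack["Stack Status"] for stack in stacks}
print(status)
```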


[root@controller-0 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Tue Jul 17 08:00:58 2018
Last change: Tue Jul 17 05:47:48 2018 by root via cibadmin on controller-0

12 nodes configured
37 resources configured

Online: [ controller-0 controller-1 controller-2 ]
GuestOnline: [ galera-bundle-0@controller-0 galera-bundle-1@controller-1 galera-bundle-2@controller-2 rabbitmq-bundle-0@controller-0 rabbitmq-bundle-1@controller-1 rabbitmq-bundle-2@controller-2 redis-bundle-0@controller-0 redis-bundle-1@controller-1 redis-bundle-2@controller-2 ]

Full list of resources:

 Docker container set: rabbitmq-bundle [192.168.24.1:8787/rhosp12/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0    (ocf::heartbeat:rabbitmq-cluster):      Started controller-0
   rabbitmq-bundle-1    (ocf::heartbeat:rabbitmq-cluster):      Started controller-1
   rabbitmq-bundle-2    (ocf::heartbeat:rabbitmq-cluster):      Started controller-2
 Docker container set: galera-bundle [192.168.24.1:8787/rhosp12/openstack-mariadb:pcmklatest]
   galera-bundle-0      (ocf::heartbeat:galera):        Master controller-0
   galera-bundle-1      (ocf::heartbeat:galera):        Master controller-1
   galera-bundle-2      (ocf::heartbeat:galera):        Master controller-2
 Docker container set: redis-bundle [192.168.24.1:8787/rhosp12/openstack-redis:pcmklatest]
   redis-bundle-0       (ocf::heartbeat:redis): Master controller-0
   redis-bundle-1       (ocf::heartbeat:redis): Slave controller-1
   redis-bundle-2       (ocf::heartbeat:redis): Slave controller-2
 ip-192.168.24.12       (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-10.0.0.104  (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.1.10 (ocf::heartbeat:IPaddr2):       Started controller-2
 ip-172.17.1.16 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-172.17.3.16 (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.4.14 (ocf::heartbeat:IPaddr2):       Started controller-2
 Docker container set: haproxy-bundle [192.168.24.1:8787/rhosp12/openstack-haproxy:pcmklatest]
   haproxy-bundle-docker-0      (ocf::heartbeat:docker):        Started controller-0
   haproxy-bundle-docker-1      (ocf::heartbeat:docker):        Started controller-1
   haproxy-bundle-docker-2      (ocf::heartbeat:docker):        Started controller-2
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Comment 11 errata-xmlrpc 2018-08-20 12:58:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2331