Bug 1579514

Summary: Unable to enable barbican on a deployed OSP13 overcloud.
Product: Red Hat OpenStack Reporter: Gregory Charot <gcharot>
Component: openstack-tripleo-heat-templates Assignee: Harry Rybacki <hrybacki>
Status: CLOSED ERRATA QA Contact: Joe H. Rahme <jhakimra>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 13.0 (Queens) CC: abishop, acanan, alee, asimonel, dbecker, dcadzow, hrybacki, jamsmith, jcoufal, jhakimra, jschluet, kbasil, lbopf, mburns, michele, morazi, pkesavar
Target Milestone: rc Keywords: Triaged
Target Release: 13.0 (Queens)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.0.2-28.el7ost Doc Type: If docs needed, set a value
Doc Text:
If this bug requires documentation, please select an appropriate Doc Type value.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-06-27 13:57:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1559105    
Bug Blocks:    

Description Gregory Charot 2018-05-17 20:13:55 UTC
Description of problem:

Enabling Barbican on an already deployed OSP13 overcloud fails.

Version-Release number of selected component (if applicable):

13 (rhos-release)

How reproducible:

Always

Steps to Reproduce:
1. Deploy OSP13 (without Barbican)
2. Add relevant templates / parameters
3. Redeploy (re-run openstack overcloud deploy)

Added:
  BarbicanSimpleCryptoGlobalDefault: true (as parameter_defaults)

-e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \

to the openstack overcloud deploy command. It is worth noting that the same configuration works on a fresh deployment.

Actual results:

2018-05-17 15:02:36Z [overcloud-AllNodesDeploySteps-pbzpmmxe5mkn-ControllerDeployment_Step3-yk43l37uiu4j.0]: UPDATE_FAILED  Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2018-05-17 15:02:36Z [overcloud-AllNodesDeploySteps-pbzpmmxe5mkn-ControllerDeployment_Step3-yk43l37uiu4j]: UPDATE_FAILED  Resource UPDATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2018-05-17 15:02:36Z [overcloud-AllNodesDeploySteps-pbzpmmxe5mkn.ControllerDeployment_Step3]: UPDATE_FAILED  Error: resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-05-17 15:02:36Z [overcloud-AllNodesDeploySteps-pbzpmmxe5mkn]: UPDATE_FAILED  Resource UPDATE failed: Error: resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-05-17 15:02:37Z [AllNodesDeploySteps]: UPDATE_FAILED  Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-05-17 15:02:37Z [overcloud]: UPDATE_FAILED  Resource UPDATE failed: Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step3.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2

 Stack overcloud UPDATE_FAILED

overcloud.AllNodesDeploySteps.ControllerDeployment_Step3.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 8fc0a4aa-e4db-458f-98a9-884236fa673c
  status: UPDATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
            "stdout: ee8cbd0a718db0118ca438a4ad01902b4a99494ac79a8b024b9af593d7c9de20",
            "stdout: 7af0f45b608cec27552065a90fe9365ee164051b39e2ec0c8deb621a1c048918",
            "stdout: 832254d6c85d4564f28b7430fcc20f30b20db5863736ff298a2ed632d10cb30e"
        ]
    }
        to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/ccc5aa45-fac0-41c8-8c3c-8d6781397c8b_playbook.retry

    PLAY RECAP *********************************************************************
    localhost                  : ok=6    changed=2    unreachable=0    failed=1

    (truncated, view all with --long)
  deploy_stderr: |
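
The failing step can be pulled out of a long deploy log mechanically. The following is a minimal sketch, assuming the log lines have the format shown above; the function name failed_steps and the log file name in the usage comment are hypothetical and not part of any OpenStack tool.

```shell
# Hypothetical helper: read a deploy log on stdin and print the unique
# "<Role>Deployment_Step<N>" names that appear on UPDATE_FAILED lines.
failed_steps() {
    grep 'UPDATE_FAILED' \
        | grep -oE '[A-Za-z]+Deployment_Step[0-9]+' \
        | sort -u
}

# Example (log file name is an assumption):
#   failed_steps < overcloud_deployment.log
```

The "(truncated, view all with --long)" hint in the output refers to the --long option of the stack failures listing, which should print the full deploy_stdout for the failed resource.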


Expected results:

Barbican enabled.

Additional info:

Looking at the logs on the controllers, we can see that Barbican cannot connect to the DB:
/var/log/containers/barbican/main.log:

2018-05-17 20:04:40.732 18 ERROR barbican.model.repositories OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'barbican'@'172.17.1.201' (using password: YES)") (Background on this error at: http://sqlalche.me/e/e3q8)

More output at
http://pastebin.test.redhat.com/592939
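
To spot this kind of failure quickly across service logs, the "Access denied" errors can be reduced to a list of user@host pairs. This is only a sketch, assuming the pymysql message format shown above; the function name denied_db_users is hypothetical.

```shell
# Hypothetical helper: print each distinct user@host that was refused by
# the database. Reads the files given as arguments, or stdin if none.
denied_db_users() {
    grep -ohE "Access denied for user '[^']+'@'[^']+'" "$@" \
        | sed -E "s/.*'([^']+)'@'([^']+)'/\1@\2/" \
        | sort -u
}

# Example, using the log path quoted above:
#   denied_db_users /var/log/containers/barbican/main.log
```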

Looking at the DB, I can confirm there is no MySQL user with the login "barbican":

MariaDB [mysql]> select * from user where user = "barbican" \G;
Empty set (0.00 sec)

ERROR: No query specified
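
Note that the "ERROR: No query specified" line is harmless: \G already terminates the statement, so the trailing ";" is read as a second, empty query. The same check can be scripted non-interactively. The sketch below only builds the query string; the helper name db_user_count_query is hypothetical, and how the mysql client reaches the overcloud database (credentials, container) is left as an assumption.

```shell
# Hypothetical helper: build a COUNT(*) query for a given MySQL account name.
# The name is interpolated as-is, so pass only trusted input.
db_user_count_query() {
    printf "SELECT COUNT(*) FROM mysql.user WHERE User = '%s';\n" "$1"
}

# Example (assumes a mysql client that can already log in to the database):
#   mysql -N -B -e "$(db_user_count_query barbican)"
```

A count of 0 corresponds to the "Empty set" result above.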

Comment 2 Ade Lee 2018-05-18 14:11:05 UTC
This is a general problem with updates - not Barbican specific.  It just happens to be manifested here.

There are two parts to this fix.  The first is the mysql init fix, which has landed in master but has not yet been backported.  The gerrit review for the upstream backport is linked above.

The second part is being able to update HAProxy, which, as I understand it, is something that PIDONE is working on.

Comment 3 Harry Rybacki 2018-05-18 14:18:13 UTC
Re-assigning to DFG:PIDONE for awareness/triaging

Comment 4 Michele Baldessari 2018-05-18 15:22:25 UTC
The haproxy restart on config changes is tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1559105. I have some ideas but have not had time to try them out yet.

Comment 5 Harry Rybacki 2018-05-18 15:34:26 UTC
Ack, thanks Michele. Noting the depends on and moving back to our DFG.

Comment 7 Gregory Charot 2018-05-18 15:51:21 UTC
Manually patching mysql.yaml with [1] worked for me.

Had to manually restart Haproxy after the stack update:
pcs resource restart haproxy-bundle

[1] https://review.openstack.org/#/c/567816/

Comment 8 Gregory Charot 2018-05-22 11:43:52 UTC
FYI, cinder-volume needs to be restarted as well, since it is managed by pacemaker. Failing to do so prevents the use of Cinder encrypted volumes:

Exception during message handling: KeyError: '3277be34-4d4c-4a88-91c0-721170cb443a !=
00000000-0000-0000-0000-000000000000'

Current workaround:
pcs resource restart openstack-cinder-volume
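
The two manual restarts from comments 7 and 8 can be wrapped in a small loop. This is a sketch of the workaround only, not a fix; the function name restart_pcmk_resources and the PCS override variable are hypothetical, introduced here so the loop can be dry-run without pacemaker.

```shell
# Hypothetical helper: restart each named pacemaker resource in order and
# stop at the first failure. Set PCS=echo for a dry run; by default the
# real pcs binary is used.
restart_pcmk_resources() {
    for res in "$@"; do
        "${PCS:-pcs}" resource restart "$res" || return 1
    done
}

# Example, run on a controller after the stack update:
#   restart_pcmk_resources haproxy-bundle openstack-cinder-volume
```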

Comment 9 Alan Bishop 2018-05-22 12:00:39 UTC
The need to restart pacemaker/cinder-volume might be due to the issue described in bug #1559105 comment #9.

Comment 10 Gregory Charot 2018-05-22 12:11:07 UTC
Yes, most likely, as the problem seems to affect all pacemaker-managed services. My comment is only an FYI; this should be solved when bug 1559105 is fixed.

Comment 12 Harry Rybacki 2018-05-29 15:38:43 UTC
Downstream build complete. Bug fixed-in: openstack-tripleo-heat-templates-8.0.2-28.el7ost

Moving bug to MODIFIED state.

Comment 21 Joe H. Rahme 2018-06-01 15:24:51 UTC
Deploying Barbican failed with the following error:


	2018-06-01 14:46:03Z [overcloud-AllNodesDeploySteps-5vhs2kdhcatp-ControllerDeployment_Step1-hsp2tkea6q4l.0]: SIGNAL_IN_PROGRESS  Signal: deployment 2f2da265-1966-4380-a5da-5b3675ad443d failed (2)
    2018-06-01 14:46:03Z [overcloud-AllNodesDeploySteps-5vhs2kdhcatp-ControllerDeployment_Step1-hsp2tkea6q4l.0]: UPDATE_FAILED  Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
    2018-06-01 14:46:03Z [overcloud-AllNodesDeploySteps-5vhs2kdhcatp-ControllerDeployment_Step1-hsp2tkea6q4l]: UPDATE_FAILED  Resource UPDATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
    2018-06-01 14:46:03Z [overcloud-AllNodesDeploySteps-5vhs2kdhcatp.ControllerDeployment_Step1]: UPDATE_FAILED  Error: resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
    2018-06-01 14:46:03Z [overcloud-AllNodesDeploySteps-5vhs2kdhcatp]: UPDATE_FAILED  Resource UPDATE failed: Error: resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
    2018-06-01 14:46:04Z [AllNodesDeploySteps]: UPDATE_FAILED  Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
    2018-06-01 14:46:04Z [overcloud]: UPDATE_FAILED  Resource UPDATE failed: Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
    
     Stack overcloud UPDATE_FAILED 
    
    overcloud.AllNodesDeploySteps.ControllerDeployment_Step1.0:
      resource_type: OS::Heat::StructuredDeployment
      physical_resource_id: 2f2da265-1966-4380-a5da-5b3675ad443d
      status: UPDATE_FAILED
      status_reason: |
        Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
      deploy_stdout: |
        ...
                "2018-06-01 14:46:01,545 INFO: 464886 -- Removing container: docker-puppet-horizon", 
                "2018-06-01 14:46:01,595 INFO: 464886 -- Finished processing puppet configs for horizon", 
                "2018-06-01 14:46:01,595 ERROR: 464883 -- ERROR configuring barbican"
            ]
        }
        	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/1e9eb85c-0ed3-47d2-ad9f-ed279eec5784_playbook.retry
        
        PLAY RECAP *********************************************************************
        localhost                  : ok=27   changed=12   unreachable=0    failed=1   
        
        (truncated, view all with --long)
      deploy_stderr: |
    
    Heat Stack update failed.
    Heat Stack update failed.



Steps to reproduce
==================

Steps to Reproduce:
1. Deploy OSP13 with TLS everywhere
2. Add relevant templates / parameters
3. Redeploy (re-run openstack overcloud deploy)

Added:
  BarbicanSimpleCryptoGlobalDefault: true (as parameter_defaults)

-e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \


Puddle tested
=============

	(overcloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
	13   -p 2018-05-29.2
	(overcloud) [stack@undercloud-0 ~]$ yum info openstack-tripleo-heat-templates
	Loaded plugins: search-disabled-repos
	Installed Packages
	Name        : openstack-tripleo-heat-templates
	Arch        : noarch
	Version     : 8.0.2
	Release     : 28.el7ost
	Size        : 3.3 M
	Repo        : installed
	From repo   : rhelosp-13.0-puddle
	Summary     : Heat templates for TripleO
	URL         : https://wiki.openstack.org/wiki/TripleO
	License     : ASL 2.0
	Description : OpenStack TripleO Heat Templates is a collection of templates and tools for
		    : building Heat Templates to do deployments of OpenStack.

Comment 23 Aharon Canan 2018-06-01 15:50:31 UTC
Following comment #21, reopening.

A.

Comment 33 Joe H. Rahme 2018-06-08 15:16:35 UTC
Moving this bug to VERIFIED as I was able to update an overcloud to enable barbican successfully.


Relevant info: With the introduction of containers, it's not enough anymore to simply add the THT files and redeploy. An operator has to prepare the container images and upload them to the registry too.

Here's the complete procedure to enable barbican on an existing cloud (copying the exact commands I ran for reference):

1. Deploy overcloud
    
    openstack overcloud deploy \
    --timeout 100 \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack overcloud \
    --libvirt-type kvm \
    --ntp-server clock.redhat.com \
    -e /home/stack/virt/config_lvm.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /home/stack/virt/network/network-environment.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /home/stack/virt/nodes_data.yaml \
    -e /home/stack/virt/extra_templates.yaml \
    -e /home/stack/container-parameters2.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \
    -e /home/stack/custom_params.yaml \
    --log-file overcloud_deployment_39.log

2. Custom params

	[stack@undercloud-0 ~]$ cat custom_params.yaml 
	parameter_defaults:
	  BarbicanSimpleCryptoGlobalDefault: true


3. Prepare the new images, including custom_params.yaml and the relevant THT files.

	openstack overcloud container image prepare \
	--namespace rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13 \
	--tag 2018-06-06.1 \
	--push-destination 192.168.24.1:8787 \
	--output-images-file ~/container-images-with-barbican.yaml \
	-e /home/stack/virt/config_lvm.yaml \
	-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
	-e /home/stack/virt/network/network-environment.yaml \
	-e /home/stack/virt/hostnames.yml \
	-e /home/stack/virt/nodes_data.yaml \
	-e /home/stack/virt/extra_templates.yaml \
	-e /home/stack/virt/docker-images.yaml \
	-e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
	-e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \
	-e /home/stack/custom_params.yaml


4. Upload container images to undercloud registry

	openstack overcloud container image upload --debug --config-file container-images-with-barbican.yaml 

5. Prepare the new environment file

    openstack overcloud container image prepare \
      --tag 2018-06-06.1 \
      --namespace 192.168.24.1:8787/rhosp13 \
      --output-env-file ~/container-parameters-with-barbican.yaml \
      -e /home/stack/virt/config_lvm.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
      -e /home/stack/virt/network/network-environment.yaml \
      -e /home/stack/virt/hostnames.yml \
      -e /home/stack/virt/nodes_data.yaml \
      -e /home/stack/virt/extra_templates.yaml \
      -e /home/stack/virt/docker-images.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
      -e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \
      -e /home/stack/custom_params.yaml

6. Update overcloud

    openstack overcloud deploy \
    --timeout 100 \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack overcloud \
    --libvirt-type kvm \
    --ntp-server clock.redhat.com \
    -e /home/stack/virt/config_lvm.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /home/stack/virt/network/network-environment.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /home/stack/virt/nodes_data.yaml \
    -e /home/stack/virt/extra_templates.yaml \
    -e /home/stack/container-parameters-with-barbican.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \
    -e /home/stack/custom_params.yaml \
    --log-file overcloud_deployment_38.log
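
Steps 3, 5 and 6 repeat the same list of environment files, and a file missing from one of them is easy to overlook. One way to keep the list in a single place is sketched below; the helper name env_file_args is hypothetical, and the approach assumes paths without whitespace.

```shell
# Hypothetical helper: turn a list of environment files into "-e FILE"
# pairs, one token per line, so the same list can feed every command.
env_file_args() {
    for f in "$@"; do
        printf '%s\n%s\n' '-e' "$f"
    done
}

# Example: xargs appends the generated "-e FILE" tokens to the command.
#   env_file_args /home/stack/virt/config_lvm.yaml /home/stack/custom_params.yaml \
#       | xargs openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates
```

Files that differ between steps (docker-images.yaml for the prepare commands, container-parameters-with-barbican.yaml for the deploy) would still be passed explicitly.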

Comment 34 Harry Rybacki 2018-06-10 18:51:44 UTC
Joe, thanks for your hard work and the informative update! I'll make sure the team sees this and discuss relevant doc updates.

Comment 38 errata-xmlrpc 2018-06-27 13:57:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086