Description of problem:

After a minor update to 13z9, "overcloud deploy" fails on the compute steps:

2019-11-12 22:24:24Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn.2]: CREATE_FAILED  Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:24Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn]: UPDATE_FAILED  Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh.ComputeDeployment_Step1]: UPDATE_FAILED  resources.ComputeDeployment_Step1: Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh]: UPDATE_FAILED  Resource UPDATE failed: resources.ComputeDeployment_Step1: Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [AllNodesDeploySteps]: UPDATE_FAILED  resources.ComputeDeployment_Step1: resources.AllNodesDeploySteps.Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [overcloud]: UPDATE_FAILED  Resource UPDATE failed: resources.ComputeDeployment_Step1: resources.AllNodesDeploySteps.Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-11-12 22:24:27Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn.1]: SIGNAL_IN_PROGRESS  Signal: deployment f5a25447-4dd1-4955-ae55-38e3fd7998a7 failed (2)
2019-11-12 22:24:28Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn.1]: CREATE_FAILED  Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:28Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn]: UPDATE_FAILED  Resource CREATE failed: Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2

Stack overcloud UPDATE_FAILED

overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.1:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: f5a25447-4dd1-4955-ae55-38e3fd7998a7
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
    TASK [Create puppet caching structures] ****************************************
    changed: [localhost]

    TASK [Write facter cache config] ***********************************************
    fatal: [localhost]: FAILED! => {"changed": false, "msg": "can not use content with a dir as dest"}
    	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/0a1e882a-daa3-4132-84c3-ed9396b5fcf7_playbook.retry

    PLAY RECAP *********************************************************************
    localhost : ok=27 changed=9 unreachable=0 failed=1

    (truncated, view all with --long)
  deploy_stderr: |

Version-Release number of selected component (if applicable):

How reproducible:
This environment

Steps to Reproduce:
1. Update from z8 to z9, complete the update, and then run an 'overcloud deploy'
2.
3.

Actual results:
Fails

Expected results:
Succeeds

Additional info:
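The failing task itself is straightforward to understand: Ansible's file-writing modules refuse to write inline `content` when the destination path already exists as a directory, which is exactly the error reported above. A minimal Python sketch of that guard (the `write_content` helper is hypothetical, only mimicking the module's behavior):

```python
import os
import tempfile

def write_content(content, dest):
    """Mimic the guard in Ansible's copy module: refuse to write
    inline 'content' when the destination is an existing directory."""
    if os.path.isdir(dest):
        raise ValueError("can not use content with a dir as dest")
    with open(dest, "w") as f:
        f.write(content)

# Simulate what happened on the node: facter.conf was unexpectedly
# created as a *directory*, so writing the cache config fails.
workdir = tempfile.mkdtemp()
dest = os.path.join(workdir, "facter.conf")
os.mkdir(dest)  # the directory left behind before the task ran
try:
    write_content("facts : { ttls: [] }", dest)
except ValueError as e:
    print(e)  # -> can not use content with a dir as dest
```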
It looks as though the configs were generated for fluentd/collectd/sensu prior to the facter cache being generated. The execution of the container configs for these containers would have created facter.conf as a directory if it didn't previously exist. In the logs we see:

Nov 11 15:17:49 ocd97-compute-0 systemd: Started libcontainer container 40abe582f905223301c821238d8c2272ac54170d30321b134243c508d067101f.
Nov 11 15:17:49 ocd97-compute-0 journal: + mkdir -p /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + mkdir -p /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + mkdir -p /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + rm -Rf /etc/puppet/ssl
Nov 11 15:17:49 ocd97-compute-0 journal: + rm -Rf /etc/puppet/ssl
Nov 11 15:17:49 ocd97-compute-0 journal: + rm -Rf /etc/puppet/ssl
Nov 11 15:17:49 ocd97-compute-0 journal: + echo '{"step": 6}'
Nov 11 15:17:49 ocd97-compute-0 journal: + echo '{"step": 6}'
Nov 11 15:17:49 ocd97-compute-0 journal: + echo '{"step": 6}'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS=
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS=
Nov 11 15:17:49 ocd97-compute-0 journal: + '[' -n file,file_line,concat,augeas,cron,sensu_rabbitmq_config,sensu_client_config,sensu_check_config,sensu_check ']'
Nov 11 15:17:49 ocd97-compute-0 journal: + '[' -n file,file_line,concat,augeas,cron,collectd_client_config ']'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS='--tags file,file_line,concat,augeas,cron,collectd_client_config'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS='--tags file,file_line,concat,augeas,cron,sensu_rabbitmq_config,sensu_client_config,sensu_check_config,sensu_check'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS=
Nov 11 15:17:49 ocd97-compute-0 journal: + origin_of_time=/var/lib/config-data/collectd.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + touch /var/lib/config-data/collectd.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + origin_of_time=/var/lib/config-data/sensu.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + touch /var/lib/config-data/sensu.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + '[' -n file,file_line,concat,augeas,cron,config ']'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS='--tags file,file_line,concat,augeas,cron,config'
Nov 11 15:17:49 ocd97-compute-0 journal: + origin_of_time=/var/lib/config-data/fluentd.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + touch /var/lib/config-data/fluentd.origin_of_time

This is the start of docker-puppet.py, which would mount facter.conf into the container being used to run this script. The facter.conf creation doesn't happen until 16:02:

Nov 11 16:02:49 ocd97-compute-0 python: ansible-stat Invoked with checksum_algorithm=sha1 get_checksum=True follow=False checksum_algo=sha1 path=/var/lib/container-puppet/puppetlabs/facter.conf get_md5=None get_mime=True get_attributes=True

Since this path is now a directory, the deployment fails:

TASK [Write facter cache config] ***********************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "can not use content with a dir as dest"}
	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/ceeec832-1552-48ac-9998-a748e883641d_playbook.retry

I'll need to look into the deployment templates to understand why the collectd/sensu/fluentd config execution occurred prior to the actual deployment.

@dhill, do you know if the customer is using an out-of-band configuration for collectd/sensu/fluentd?
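The ordering problem above can be simulated without a full deployment. When a container is started with a bind mount whose source path does not yet exist on the host, the container runtime creates that source as a directory; the later step that expects to write a regular file at the same path then fails. A minimal sketch, using `os.makedirs` as a stand-in for what the runtime does (the temp-dir layout is hypothetical; on the node the real path is /var/lib/container-puppet/puppetlabs/facter.conf):

```python
import os
import tempfile

# Hypothetical host state directory standing in for /var/lib/container-puppet.
state = tempfile.mkdtemp()
facter_conf = os.path.join(state, "puppetlabs", "facter.conf")

# Step 1 (~15:17): the fluentd/collectd/sensu container configs start and
# bind-mount facter.conf. Because the source does not exist yet, the
# runtime creates it as a directory -- simulated here with makedirs.
os.makedirs(facter_conf)

# Step 2 (~16:02): the deploy step finally tries to write the facter
# cache config, but the destination is now a directory, so it fails.
try:
    with open(facter_conf, "w") as f:
        f.write("facts : { ttls: [] }")
except IsADirectoryError as e:
    print("write failed:", e)
```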
Also, can we get the templates and undercloud logs?

In the meantime, I'll work on adding some additional checks to ensure we don't hit this case again.
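One shape such a check could take (a sketch only, not the actual patch): before writing the facter cache config, detect a path that a container start has left behind as a directory and remove it so a regular file can be written. The helper names here are hypothetical.

```python
import os
import shutil

def ensure_file_dest(path):
    """If an earlier container start created 'path' as a directory
    (e.g. via a bind mount with a missing source), remove it so a
    regular file can be written there. Hypothetical guard."""
    if os.path.isdir(path):
        shutil.rmtree(path)

def write_facter_cache_config(path, content):
    """Write the facter cache config defensively."""
    ensure_file_dest(path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(content)
```

With this guard in place, the "Write facter cache config" step would succeed even when a stray directory already occupies the destination.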
*** Bug 1772955 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0760