Bug 1772201
| Summary: | After a minor update to 13z9, "overcloud deploy" fails on compute steps | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | David Hill <dhill> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Alex Schultz <aschultz> |
| Status: | CLOSED ERRATA | QA Contact: | Sasha Smolyak <ssmolyak> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 13.0 (Queens) | CC: | aschultz, bshephar, cjeanner, ljozsa, mburns, michele |
| Target Milestone: | --- | Keywords: | Triaged, ZStream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-8.4.1-21.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-03-10 11:22:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
It looks as though the configs for fluentd/collectd/sensu were generated before the facter cache config was written. Running the container configs for these containers mounts facter.conf, so it would have been created as a directory if it did not already exist.
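A pre-flight check on the host side is one way to notice this condition before a container is started. The sketch below is only an illustration and not the change shipped for this bug: `find_unsafe_file_mounts` is a hypothetical helper, and it assumes the caller already knows which bind-mount sources are supposed to be regular files. It relies on the Docker behavior that a missing host path passed to `-v` is created as a directory.

```python
import os
import tempfile


def find_unsafe_file_mounts(file_mounts):
    """Return host paths that should be files but are missing or are directories.

    Hypothetical helper: `file_mounts` is a list of host paths the caller
    intends to bind-mount as single files.
    """
    unsafe = []
    for host_path in file_mounts:
        # isfile() is False both for a missing path and for a directory,
        # which are the two states that lead to "dir as dest" later on.
        if not os.path.isfile(host_path):
            unsafe.append(host_path)
    return unsafe


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        ok_file = os.path.join(root, "puppet.conf")
        open(ok_file, "w").close()
        # Stand-in for facter.conf having been created as a directory.
        stray_dir = os.path.join(root, "facter.conf")
        os.mkdir(stray_dir)
        for path in find_unsafe_file_mounts([ok_file, stray_dir]):
            print("not a regular file:", path)
```

A caller could refuse to start the config containers, or create the file first, whenever the returned list is non-empty.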
In the logs we see:
Nov 11 15:17:49 ocd97-compute-0 systemd: Started libcontainer container 40abe582f905223301c821238d8c2272ac54170d30321b134243c508d067101f.
Nov 11 15:17:49 ocd97-compute-0 journal: + mkdir -p /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + mkdir -p /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + mkdir -p /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + cp -a /tmp/puppet-etc/auth.conf /tmp/puppet-etc/hiera.yaml /tmp/puppet-etc/hieradata /tmp/puppet-etc/modules /tmp/puppet-etc/puppet.conf /tmp/puppet-etc/ssl /etc/puppet
Nov 11 15:17:49 ocd97-compute-0 journal: + rm -Rf /etc/puppet/ssl
Nov 11 15:17:49 ocd97-compute-0 journal: + rm -Rf /etc/puppet/ssl
Nov 11 15:17:49 ocd97-compute-0 journal: + rm -Rf /etc/puppet/ssl
Nov 11 15:17:49 ocd97-compute-0 journal: + echo '{"step": 6}'
Nov 11 15:17:49 ocd97-compute-0 journal: + echo '{"step": 6}'
Nov 11 15:17:49 ocd97-compute-0 journal: + echo '{"step": 6}'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS=
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS=
Nov 11 15:17:49 ocd97-compute-0 journal: + '[' -n file,file_line,concat,augeas,cron,sensu_rabbitmq_config,sensu_client_config,sensu_check_config,sensu_check ']'
Nov 11 15:17:49 ocd97-compute-0 journal: + '[' -n file,file_line,concat,augeas,cron,collectd_client_config ']'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS='--tags file,file_line,concat,augeas,cron,collectd_client_config'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS='--tags file,file_line,concat,augeas,cron,sensu_rabbitmq_config,sensu_client_config,sensu_check_config,sensu_check'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS=
Nov 11 15:17:49 ocd97-compute-0 journal: + origin_of_time=/var/lib/config-data/collectd.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + touch /var/lib/config-data/collectd.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + origin_of_time=/var/lib/config-data/sensu.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + touch /var/lib/config-data/sensu.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + '[' -n file,file_line,concat,augeas,cron,config ']'
Nov 11 15:17:49 ocd97-compute-0 journal: + TAGS='--tags file,file_line,concat,augeas,cron,config'
Nov 11 15:17:49 ocd97-compute-0 journal: + origin_of_time=/var/lib/config-data/fluentd.origin_of_time
Nov 11 15:17:49 ocd97-compute-0 journal: + touch /var/lib/config-data/fluentd.origin_of_time
This is the start of docker-puppet.py, which mounts facter.conf into the container used to run this script. The facter.conf creation is not attempted until 16:02, 45 minutes later:
Nov 11 16:02:49 ocd97-compute-0 python: ansible-stat Invoked with checksum_algorithm=sha1 get_checksum=True follow=False checksum_algo=sha1 path=/var/lib/container-puppet/puppetlabs/facter.conf get_md5=None get_mime=True get_attributes=True
Since facter.conf is already a directory by then, the deployment fails:
TASK [Write facter cache config] ***********************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "can not use content with a dir as dest"}
to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/ceeec832-1552-48ac-9998-a748e883641d_playbook.retry
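The failure above could be avoided by a guard that removes a stray directory before the file is written. The following is a minimal sketch under stated assumptions, not the fix delivered in the advisory: `ensure_regular_file` is a hypothetical helper, the content passed to it is a placeholder, and it only removes the directory when it is empty so that real data is never deleted.

```python
import os
import tempfile


def ensure_regular_file(path, content):
    """Write `content` to `path`, first removing an empty directory found there.

    Hypothetical helper. os.rmdir() raises OSError on a non-empty directory,
    so a directory that holds data is left alone and the error is surfaced.
    """
    if os.path.isdir(path) and not os.path.islink(path):
        os.rmdir(path)
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as handle:
        handle.write(content)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "puppetlabs", "facter.conf")
        # Reproduce the broken state: facter.conf exists as a directory.
        os.makedirs(target)
        ensure_regular_file(target, "# placeholder facter config\n")
        print(os.path.isfile(target))
```

Restricting the cleanup to empty directories is a deliberate choice here: a directory created as a side effect of a bind mount is empty, while anything else deserves a manual look.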
I'll need to look into the deployment templates to understand why collectd/sensu/fluentd config execution occurred prior to the actual deployment.
@dhill, do you know if the customer is using an out-of-band configuration for collectd/sensu/fluentd?
Also, can we get the templates and undercloud logs? In the meantime, I'll work on adding some additional checks to ensure we don't hit this case again.
*** Bug 1772955 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:0760
Description of problem:
After a minor update to 13z9, "overcloud deploy" fails on compute steps:
2019-11-12 22:24:24Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn.2]: CREATE_FAILED Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:24Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn]: UPDATE_FAILED Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh.ComputeDeployment_Step1]: UPDATE_FAILED resources.ComputeDeployment_Step1: Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh]: UPDATE_FAILED Resource UPDATE failed: resources.ComputeDeployment_Step1: Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [AllNodesDeploySteps]: UPDATE_FAILED resources.ComputeDeployment_Step1: resources.AllNodesDeploySteps.Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-11-12 22:24:25Z [overcloud]: UPDATE_FAILED Resource UPDATE failed: resources.ComputeDeployment_Step1: resources.AllNodesDeploySteps.Resource CREATE failed: Error: resources[2]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-11-12 22:24:27Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn.1]: SIGNAL_IN_PROGRESS Signal: deployment f5a25447-4dd1-4955-ae55-38e3fd7998a7 failed (2)
2019-11-12 22:24:28Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn.1]: CREATE_FAILED Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-11-12 22:24:28Z [overcloud-AllNodesDeploySteps-uby2uxg5d4wh-ComputeDeployment_Step1-ncqbfiswytpn]: UPDATE_FAILED Resource CREATE failed: Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
Stack overcloud UPDATE_FAILED
overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.1:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: f5a25447-4dd1-4955-ae55-38e3fd7998a7
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
    TASK [Create puppet caching structures] ****************************************
    changed: [localhost]
    TASK [Write facter cache config] ***********************************************
    fatal: [localhost]: FAILED! => {"changed": false, "msg": "can not use content with a dir as dest"}
    to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/0a1e882a-daa3-4132-84c3-ed9396b5fcf7_playbook.retry
    PLAY RECAP *********************************************************************
    localhost : ok=27 changed=9 unreachable=0 failed=1
    (truncated, view all with --long)
  deploy_stderr: |
Version-Release number of selected component (if applicable):
How reproducible: This environment
Steps to Reproduce:
1. Update from z8 to z9, complete the update, and then run an 'overcloud deploy'
Actual results: Fails
Expected results: Succeeds
Additional info: