Bug 1775731
| Summary: | Undercloud update fails during configuration generation (step1) | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Sofer Athlan-Guyot <sathlang> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Alex Schultz <aschultz> |
| Status: | CLOSED ERRATA | QA Contact: | Ronnie Rasouli <rrasouli> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 16.0 (Train) | CC: | aschultz, jfrancoa, mburns |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 16.0 (Train on RHEL 8.1) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-11.3.1-0.20191129201420.8343952.el8ost.noarch.rpm | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-02-06 14:42:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Sofer Athlan-Guyot
2019-11-22 16:52:23 UTC
In troubleshooting this, it seems to be failing when trying to clean up files that were removed via rsync. This is related to https://opendev.org/openstack/tripleo-heat-templates/commit/34107c3b1c548552f5c2c5823a57be82937f9cbd, which tried to ensure that files are properly cleaned from the puppet-generated folder. The issue in this case is that the swift ring builder creates some files in etc/swift/backup/, so this code gets a list like:

```
deleting etc/swift/backups/1574440742.container.ring.gz
deleting etc/swift/backups/1574440742.container.builder
deleting etc/swift/backups/1574440741.object.ring.gz
deleting etc/swift/backups/1574440741.object.builder
deleting etc/swift/backups/1574440741.account.ring.gz
deleting etc/swift/backups/1574440741.account.builder
deleting etc/swift/backups/1574440736.container.builder
deleting etc/swift/backups/1574440736.account.builder
deleting etc/swift/backups/1574440735.object.builder
```

These lines are written to $TMPFILE:

```
rsync -av -R --dry-run --delete-after $exclude_files $rsync_srcs ${conf_data_path} |\
  awk '/^deleting/ {print $2}' > $TMPFILE
```

The code then takes these files and tries to make sure they are removed:

```
cat $TMPFILE | xargs -n1 -r -I{} \
  bash -c "test -f ${puppet_generated_path}/{} && rm -f ${puppet_generated_path}/{}"
```

However, if a file doesn't exist, this command fails with exit code 123, causing the task to fail. Since this line is trying to remove these files, it shouldn't fail when a file is already missing:

```
[root@undercloud-0 container-puppet]# cat foo
/does/not/exist
[root@undercloud-0 container-puppet]# cat foo | xargs -n1 -r -I{} bash -c "test -f {} && echo 'hi'"
[root@undercloud-0 container-puppet]# echo $?
123
```
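A minimal sketch of the failure and one possible fix. When `test -f` fails, `bash -c` exits non-zero, and xargs reports 123 whenever any invocation exits with a status in 1-125. Dropping the `test -f` guard avoids this, because `rm -f` already returns 0 on a missing file. (The temp file and path below are illustrative, not from the actual templates.)

```shell
# Reproduce: a list containing an already-deleted path.
TMPFILE=$(mktemp)
echo "/does/not/exist" > "$TMPFILE"

# Current pattern: test -f fails -> bash -c exits 1 -> xargs exits 123.
cat "$TMPFILE" | xargs -n1 -r -I{} bash -c "test -f {} && rm -f {}"
echo "with test -f guard: $?"    # 123

# Fix sketch: rm -f alone tolerates missing files and returns 0.
cat "$TMPFILE" | xargs -n1 -r -I{} rm -f {}
echo "rm -f only: $?"            # 0

rm -f "$TMPFILE"
```

The same effect could be had by appending `|| true` inside the `bash -c` string, but removing the redundant `test -f` is simpler.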
The undercloud update failed again, this time with errors starting containers whose names are already in use:
```
2019-12-03 20:09:11 | "<13>Dec 3 20:08:59 puppet-user: Notice: /Stage[main]/Swift::Proxy/Swift_proxy_config[pipeline:main/pipeline]/value: value changed catch_errors gatekeeper healthcheck proxy-logging cache container_sync bulk tempurl ratelimit copy container-quotas account-quotas slo dlo versioned_writes proxy-logging proxy-server to catch_errors healthcheck proxy-logging cache ratelimit bulk tempurl formpost authtoken s3api s3token keystone staticweb copy container_quotas account_quotas slo dlo versioned_writes proxy-logging proxy-server",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"keepalived\" is already in use by \"892313d1465566693e604b374b75a953f0bf4e6049a1310f0a4a7d5bde3fafe2\". You have to remove that container to be able to reuse that name.: that name is already in use",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"memcached\" is already in use by \"faa5ed43b0b65e485dbddeb6949aea6547791c3e5edaf7d13f23dd68265a56db\". You have to remove that container to be able to reuse that name.: that name is already in use",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"mysql_init_logs\" is already in use by \"9a8ec7c87c646d50a4c28b034d4e4f74e3d77837fd827a09b4b2b2b67be9a30d\". You have to remove that container to be able to reuse that name.: that name is already in use",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"rabbitmq_init_logs\" is already in use by \"bfaf6fb9d51a08f0738440950a94dc5d342a504eaaa4a35c030cd489fd153e5b\". You have to remove that container to be able to reuse that name.: that name is already in use",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"haproxy\" is already in use by \"3171adcf80051d24c783bf7d2fbe2580fc14c3580ecbe4b663c5539077de976f\". You have to remove that container to be able to reuse that name.: that name is already in use",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"rabbitmq_bootstrap\" is already in use by \"3d77f3265f22ece2c0b0b91b473c38beaf44b96b67fa17eb45eadf4df1ff58b3\". You have to remove that container to be able to reuse that name.: that name is already in use",
2019-12-03 20:09:22 | "Error: error creating container storage: the container name \"rabbitmq\" is already in use by \"d59e5669e7ac4b6660fd9f56a6d755f2c5f263b7a7f390cdb3fb3358861366a6\". You have to remove that container to be able to reuse that name.: that name is already in use"
2019-12-03 20:09:22 | raise exceptions.DeploymentError('Deployment failed')
```
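One way to recover manually from the name collisions above (a hypothetical workaround, not the eventual fix; verify the stale containers are actually safe to remove first) is to delete them by name before re-running the update. The names are taken from the error log:

```shell
# Remove the stale containers whose names collide with the ones the
# update is trying to create. podman rm -f also stops a running container.
for name in keepalived memcached mysql_init_logs rabbitmq_init_logs \
            haproxy rabbitmq_bootstrap rabbitmq; do
  podman rm -f "$name" || true  # tolerate a name that is already gone
done
```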
The MariaDB log shows aborted connections from the services at the same time:

```
2019-12-03 20:03:22 3114 [Warning] Aborted connection 3114 to db: 'ironic' user: 'ironic' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:22 2763 [Warning] Aborted connection 2763 to db: 'nova_api' user: 'nova_api' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:22 2784 [Warning] Aborted connection 2784 to db: 'nova' user: 'nova' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:22 3021 [Warning] Aborted connection 3021 to db: 'heat' user: 'heat' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:22 2966 [Warning] Aborted connection 2966 to db: 'heat' user: 'heat' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:22 3139 [Warning] Aborted connection 3139 to db: 'ironic' user: 'ironic' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:22 3116 [Warning] Aborted connection 3116 to db: 'ironic' user: 'ironic' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 2755 [Warning] Aborted connection 2755 to db: 'nova' user: 'nova' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 2756 [Warning] Aborted connection 2756 to db: 'nova' user: 'nova' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 2752 [Warning] Aborted connection 2752 to db: 'nova_cell0' user: 'nova' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 3078 [Warning] Aborted connection 3078 to db: 'ovs_neutron' user: 'neutron' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 3076 [Warning] Aborted connection 3076 to db: 'ovs_neutron' user: 'neutron' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 3077 [Warning] Aborted connection 3077 to db: 'ovs_neutron' user: 'neutron' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 2749 [Warning] Aborted connection 2749 to db: 'nova_api' user: 'nova_api' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
2019-12-03 20:03:23 3012 [Warning] Aborted connection 3012 to db: 'heat' user: 'heat' host: 'undercloud-0.redhat.local' (Got an error reading communication packets)
```
This doesn't happen anymore with the latest puddle; moving to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283