Description of problem:
After update to osp10z3, I am unable to perform a successful deploy with templates that worked fine since OSP10GA. Note that using an OSP10z3 undercloud with the n-1 OSP10 images works fine (Those are 10.0-20170504.2).

Version-Release number of selected component (if applicable):
rhosp-director-images-10.0-20170615.1.el7ost.noarch
rhosp-director-images-ipa-10.0-20170615.1.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
In my environment file, I have:

ControllerEnableSwiftStorage: false
# CinderEnableIscsiBackend: false
NovaEnableRbdBackend: true
CinderEnableRbdBackend: true
CinderBackupBackend: ceph
GlanceBackend: rbd
GnocchiBackend: rbd
# GlanceBackend: swift
NovaRbdPoolName: vms
CinderRbdPoolName: volumes
GlanceRbdPoolName: images

My deploy command line is as follows:

time openstack overcloud deploy \
--templates \
--control-scale 1 \
--compute-scale 1 \
--ceph-storage-scale 1 \
--swift-storage-scale 0 \
--control-flavor control \
--compute-flavor compute \
--ceph-storage-flavor ceph-storage \
--swift-storage-flavor swift-storage \
--ntp-server '10.20.0.1", "10.20.0.2' \
--validation-errors-fatal \
-e ${TRIPLEO_DIR}/environments/network-isolation.yaml \
-e ${TRIPLEO_DIR}/environments/storage-environment.yaml \
-e ${TRIPLEO_DIR}/environments/ceph-radosgw.yaml \
-e ${TOP_DIR}/net-bond-with-vlans-with-nic4.yaml \
-e ${TOP_DIR}/rhel-registration-environment.yaml \
-e ${TOP_DIR}/storage-environment.yaml \
-e ${TOP_DIR}/krynn-environment.yaml \
-e ${TOP_DIR}/extraconfig-environment.yaml \
-e ${TOP_DIR}/enable-tls.yaml \
-e ${TOP_DIR}/inject-trust-anchor.yaml \
-e ${TRIPLEO_DIR}/environments/tls-endpoints-public-ip.yaml \
-e ${TOP_DIR}/local-environment.yaml \
-e ${TOP_DIR}/token_flush-environment.yaml \
"$@" || exit 127

Actual results:
Note that using or getting rid of ceph-radosgw.yaml doesn't make a difference.
Deploy fails with:

2017-07-05 18:46:11Z [overcloud.AllNodesDeploySteps.ComputeExtraConfigPost]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:12Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate.SwiftRingUpdate]: CREATE_COMPLETE state changed
2017-07-05 18:46:12Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate]: CREATE_COMPLETE Stack CREATE completed successfully
2017-07-05 18:46:12Z [overcloud.AllNodesDeploySteps.BlockStorageExtraConfigPost]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:12Z [overcloud.AllNodesDeploySteps.ObjectStorageExtraConfigPost]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:12Z [overcloud.AllNodesDeploySteps.CephStorageExtraConfigPost]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.ControllerExtraConfigPost]: CREATE_COMPLETE state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.BlockStorageExtraConfigPost]: CREATE_COMPLETE state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.ComputeExtraConfigPost]: CREATE_COMPLETE state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.ObjectStorageExtraConfigPost]: CREATE_COMPLETE state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.CephStorageExtraConfigPost]: CREATE_COMPLETE state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate]: CREATE_COMPLETE state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:13Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_IN_PROGRESS Stack CREATE started
2017-07-05 18:46:14Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeConfig]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:14Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeConfig]: CREATE_COMPLETE state changed
2017-07-05 18:46:14Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment]: CREATE_IN_PROGRESS state changed
2017-07-05 18:46:59Z [overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate.SwiftRingUpdate]: CREATE_FAILED Error: resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 18:46:59Z [overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate]: CREATE_FAILED Resource CREATE failed: Error: resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 18:46:59Z [overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate]: CREATE_FAILED Error: resources.ControllerSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 18:46:59Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_FAILED CREATE aborted
2017-07-05 18:46:59Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED Resource CREATE failed: Error: resources.ControllerSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 18:47:00Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment]: CREATE_FAILED CREATE aborted
2017-07-05 18:47:00Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED Error: resources.AllNodesDeploySteps.resources.ControllerSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 18:47:00Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_FAILED Resource CREATE failed: Operation cancelled
2017-07-05 18:47:00Z [overcloud]: CREATE_FAILED Resource CREATE failed: Error: resources.AllNodesDeploySteps.resources.ControllerSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
Stack overcloud CREATE_FAILED
Heat Stack create failed.

Expected results:
CREATE_COMPLETE

Additional info:
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
When I do a:
openstack stack failures list --long overcloud

I get this:

[stack@instack (osp5-rh) ~]$ openstack stack failures list --long overcloud
overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment:
  resource_type: OS::Heat::SoftwareDeployments
  physical_resource_id: 0306b08f-ed1c-4d9b-a993-4e9788f264c2
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate.SwiftRingUpdate.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: f863187e-4af7-45c4-95a6-9e5cd30fed7f
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    /tmp/tmp.YNPmpWkCxU /var/lib/heat-config/heat-config-script
    /etc/swift/account.builder
    /etc/swift/container.builder
    /etc/swift/object.builder
    /etc/swift/account.ring.gz
    /etc/swift/container.ring.gz
    /etc/swift/object.ring.gz
    /etc/swift/backups/1499279557.object.builder
    /etc/swift/backups/1499279558.container.builder
    /etc/swift/backups/1499279574.account.builder
    /etc/swift/backups/1499279592.account.builder
    /etc/swift/backups/1499279592.account.ring.gz
    /etc/swift/backups/1499279592.container.builder
    /etc/swift/backups/1499279592.container.ring.gz
    /etc/swift/backups/1499279593.object.builder
    /etc/swift/backups/1499279593.object.ring.gz
    /var/lib/heat-config/heat-config-script
  deploy_stderr: |
    tar: Removing leading `/' from member names
I'm currently attempting to re-deploy with a swift node and without radosGW.
Just tried it with:

--control-scale 1 \
--compute-scale 1 \
--ceph-storage-scale 1 \
--swift-storage-scale 1 \

It failed too.
Failure:

2017-07-05 20:25:11Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeConfig]: CREATE_IN_PROGRESS state changed
2017-07-05 20:25:11Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeConfig]: CREATE_COMPLETE state changed
2017-07-05 20:25:11Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment]: CREATE_IN_PROGRESS state changed
2017-07-05 20:25:50Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate.SwiftRingUpdate]: CREATE_FAILED Error: resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 20:25:50Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate]: CREATE_FAILED Resource CREATE failed: Error: resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate]: CREATE_FAILED Error: resources.ObjectStorageSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_FAILED CREATE aborted
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate]: CREATE_FAILED CREATE aborted
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED Resource CREATE failed: Error: resources.ObjectStorageSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate.SwiftRingUpdate]: CREATE_FAILED CREATE aborted
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate]: CREATE_FAILED Resource CREATE failed: Operation cancelled
2017-07-05 20:25:52Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment]: CREATE_FAILED CREATE aborted
2017-07-05 20:25:52Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_FAILED Resource CREATE failed: Operation cancelled
2017-07-05 20:25:52Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED Error: resources.AllNodesDeploySteps.resources.ObjectStorageSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 20:25:52Z [overcloud]: CREATE_FAILED Resource CREATE failed: Error: resources.AllNodesDeploySteps.resources.ObjectStorageSwiftRingUpdate.resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
Stack overcloud CREATE_FAILED
Heat Stack create failed.

real 57m42.885s
user 0m20.362s
sys 0m2.654s
+ exit 127
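In this event stream only the first CREATE_FAILED entry carries the real error; the entries that follow are mostly "CREATE aborted" or "Operation cancelled" cascades from it. A minimal sketch for pulling out that first failed resource, assuming the deploy output was saved one event per line. The helper name first_failed_resource is made up for illustration, and the sample lines are copied from the output above.

```shell
#!/bin/bash
# Hypothetical helper (not part of any OpenStack tool): read a Heat event log
# on stdin and print the resource name of the first CREATE_FAILED event.
first_failed_resource() {
    # grep -m1 stops at the first matching line; sed keeps only the text
    # between the first "[" and the "]" that follows it.
    grep -m1 'CREATE_FAILED' | sed -E 's/^[^[]*\[([^]]+)\]:.*/\1/'
}

# Three events copied from the log above.
first_failed_resource <<'EOF'
2017-07-05 20:25:11Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment]: CREATE_IN_PROGRESS state changed
2017-07-05 20:25:50Z [overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate.SwiftRingUpdate]: CREATE_FAILED Error: resources.SwiftRingUpdate.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1
2017-07-05 20:25:51Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_FAILED CREATE aborted
EOF
```

For these sample lines it prints overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate.SwiftRingUpdate, which is the resource to look up in the failures list.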
overcloud.AllNodesDeploySteps.ControllerPostPuppet.ControllerPostPuppetMaintenanceModeDeployment:
  resource_type: OS::Heat::SoftwareDeployments
  physical_resource_id: c7bbee29-71f9-463f-b09f-f10327f7d3ba
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.ControllerSwiftRingUpdate.SwiftRingUpdate.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: a47e5e03-bd65-4f53-aa37-bcb1b4d2e27e
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    /tmp/tmp.GvgLN8QmsH /var/lib/heat-config/heat-config-script
    /etc/swift/account.builder
    /etc/swift/container.builder
    /etc/swift/object.builder
    /etc/swift/account.ring.gz
    /etc/swift/container.ring.gz
    /etc/swift/object.ring.gz
    /etc/swift/backups/1499285410.container.builder
    /etc/swift/backups/1499285410.object.builder
    /etc/swift/backups/1499285428.account.builder
    /etc/swift/backups/1499285448.account.builder
    /etc/swift/backups/1499285448.account.ring.gz
    /etc/swift/backups/1499285449.container.builder
    /etc/swift/backups/1499285449.container.ring.gz
    /etc/swift/backups/1499285449.object.builder
    /etc/swift/backups/1499285449.object.ring.gz
    /var/lib/heat-config/heat-config-script
  deploy_stderr: |
    tar: Removing leading `/' from member names
overcloud.AllNodesDeploySteps.ObjectStorageSwiftRingUpdate.SwiftRingUpdate.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 3dff28ab-e85b-422f-81c9-4509283bcd67
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    /tmp/tmp.3B19ye8RTg /var/lib/heat-config/heat-config-script
    /etc/swift/account.builder
    /etc/swift/container.builder
    /etc/swift/object.builder
    /etc/swift/account.ring.gz
    /etc/swift/container.ring.gz
    /etc/swift/object.ring.gz
    /etc/swift/backups/1499285369.object.builder
    /etc/swift/backups/1499285370.account.builder
    /etc/swift/backups/1499285370.container.builder
    /etc/swift/backups/1499285374.account.builder
    /etc/swift/backups/1499285374.account.ring.gz
    /etc/swift/backups/1499285375.container.builder
    /etc/swift/backups/1499285375.container.ring.gz
    /etc/swift/backups/1499285375.object.builder
    /etc/swift/backups/1499285375.object.ring.gz
    /var/lib/heat-config/heat-config-script
  deploy_stderr: |
    tar: Removing leading `/' from member names
I think this is related: I logged in to ctrl0 and ran the same commands as those attempted during the deploy:

[root@krynn-ctrl-0 swift]# /var/lib/heat-config/heat-config-script/136d4baf-ce45-4c00-8206-0289dd2e65bd
/tmp/tmp.EtO3DVnBZW /etc/swift
tar: Removing leading `/' from member names
/etc/swift/account.builder
/etc/swift/container.builder
/etc/swift/object.builder
/etc/swift/account.ring.gz
/etc/swift/container.ring.gz
/etc/swift/object.ring.gz
/etc/swift/backups/1499285410.container.builder
/etc/swift/backups/1499285410.object.builder
/etc/swift/backups/1499285428.account.builder
/etc/swift/backups/1499285448.account.builder
/etc/swift/backups/1499285448.account.ring.gz
/etc/swift/backups/1499285449.container.builder
/etc/swift/backups/1499285449.container.ring.gz
/etc/swift/backups/1499285449.object.builder
/etc/swift/backups/1499285449.object.ring.gz
/etc/swift
[root@krynn-ctrl-0 swift]# echo $?
1
[root@krynn-ctrl-0 swift]# cat /var/lib/heat-config/heat-config-script/136d4baf-ce45-4c00-8206-0289dd2e65bd
#!/bin/sh
TMP_DATA=$(mktemp -d)
function cleanup {
  rm -Rf "$TMP_DATA"
}
trap cleanup EXIT
# sanity check in case rings are not consistent within cluster
swift-recon --md5 | grep -q "doesn't match" && exit 1
pushd ${TMP_DATA}
tar -cvzf swift-rings.tar.gz /etc/swift/*.builder /etc/swift/*.ring.gz /etc/swift/backups/*
resp=`curl --insecure --silent -X PUT "${swift_ring_put_tempurl}" --write-out "%{http_code}" --data-binary @swift-rings.tar.gz`
popd
if [ "$resp" != "201" ]; then
  exit 1
fi
[root@krynn-ctrl-0 swift]# bash -x /var/lib/heat-config/heat-config-script/136d4baf-ce45-4c00-8206-0289dd2e65bd
++ mktemp -d
+ TMP_DATA=/tmp/tmp.kQ0Zom6oCg
+ trap cleanup EXIT
+ swift-recon --md5
+ grep -q 'doesn'\''t match'
+ pushd /tmp/tmp.kQ0Zom6oCg
/tmp/tmp.kQ0Zom6oCg /etc/swift
+ tar -cvzf swift-rings.tar.gz /etc/swift/account.builder /etc/swift/container.builder /etc/swift/object.builder /etc/swift/account.ring.gz /etc/swift/container.ring.gz /etc/swift/object.ring.gz /etc/swift/backups/1499285410.container.builder /etc/swift/backups/1499285410.object.builder /etc/swift/backups/1499285428.account.builder /etc/swift/backups/1499285448.account.builder /etc/swift/backups/1499285448.account.ring.gz /etc/swift/backups/1499285449.container.builder /etc/swift/backups/1499285449.container.ring.gz /etc/swift/backups/1499285449.object.builder /etc/swift/backups/1499285449.object.ring.gz
tar: Removing leading `/' from member names
/etc/swift/account.builder
/etc/swift/container.builder
/etc/swift/object.builder
/etc/swift/account.ring.gz
/etc/swift/container.ring.gz
/etc/swift/object.ring.gz
/etc/swift/backups/1499285410.container.builder
/etc/swift/backups/1499285410.object.builder
/etc/swift/backups/1499285428.account.builder
/etc/swift/backups/1499285448.account.builder
/etc/swift/backups/1499285448.account.ring.gz
/etc/swift/backups/1499285449.container.builder
/etc/swift/backups/1499285449.container.ring.gz
/etc/swift/backups/1499285449.object.builder
/etc/swift/backups/1499285449.object.ring.gz
++ curl --insecure --silent -X PUT '' --write-out '%{http_code}' --data-binary @swift-rings.tar.gz
+ resp=000
+ popd
/etc/swift
+ '[' 000 '!=' 201 ']'
+ exit 1
+ cleanup
+ rm -Rf /tmp/tmp.kQ0Zom6oCg
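One caveat about this manual run: the trace shows curl being called with an empty URL (PUT ''), so swift_ring_put_tempurl was not set in this shell, and curl prints 000 as the HTTP code when it gets no response at all. That means the exit status 1 seen here may not be the same failure the deployment hit, where the variable is presumably supplied as a deployment input (that part is an assumption). A minimal sketch of how the two cases could be told apart; explain_upload_result is a hypothetical helper for illustration, not part of the deployed script.

```shell
#!/bin/bash
# Hypothetical helper: given the temp URL and the HTTP code reported by
# curl's --write-out "%{http_code}", print a one-line diagnosis.
explain_upload_result() {
    local url="$1" code="$2"
    if [ -z "$url" ]; then
        # This is the case seen in the bash -x trace: PUT ''.
        echo "swift_ring_put_tempurl is empty: curl had no URL to contact"
    elif [ "$code" = "000" ]; then
        # curl reports 000 when no HTTP response was received.
        echo "no HTTP response received from ${url}"
    elif [ "$code" = "201" ]; then
        # 201 is the only code the original script treats as success.
        echo "upload accepted"
    else
        echo "upload rejected with HTTP ${code}"
    fi
}

# Values taken from the trace: empty URL, resp=000.
explain_upload_result "" "000"
```

To reproduce the deployment's behaviour by hand, the variable would need to be exported with the real temp URL before running the script; where to obtain that URL is not covered here.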
One thing I don't understand is why it's trying to do a ControllerSwiftRingUpdate.SwiftRingUpdate at all, since this is a fresh deploy (not an update) and I don't have any swift nodes. I've commented this stuff out on my undercloud and I'm attempting to re-deploy.
One other thing: I did a clean re-install of one of my OSP10 underclouds and it did NOT run into this issue using the -same- templates. I wonder if this could be happening because the underclouds I used had been 'upgraded' (as in 'yum update') to reach OSP10z3. I still have a few non-working underclouds and will try to investigate there.
If the undercloud cannot be re-installed from scratch, please execute:

1) mistral environment-list | grep -v -e ID -e "---" | cut -f2 -d"|" | mistral environment-delete
2) mistral workflow-list | grep -v -e ID -e "---" | cut -f2 -d"|" | mistral workflow-delete
3) rm -rf /srv/node/*
4) openstack undercloud install
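The grep/cut part of steps 1 and 2 strips the border and header lines from the table that the list command prints and keeps the first column. A small sketch of just that filter, run against a made-up table (the IDs and names below are placeholders, not real mistral output). The function name first_table_column is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper wrapping the filter used in steps 1 and 2:
# drop lines containing "ID" or "---", then keep the text between the
# first and second "|" (the first table column, padding included).
first_table_column() {
    grep -v -e ID -e "---" | cut -f2 -d"|"
}

# Made-up sample table in the usual ASCII layout.
sample_table() {
    cat <<'EOF'
+---------------+-------+
| ID            | Name  |
+---------------+-------+
| 11111111-aaaa | wf_a  |
| 22222222-bbbb | wf_b  |
+---------------+-------+
EOF
}

sample_table | first_table_column
```

Two things to keep in mind: the filter would also drop any data row that happens to contain the string "ID", and the values come out with their column padding. The steps above pipe the result straight into the delete command; whether that command reads names from standard input is not verified here, and passing them as arguments (for example through xargs) would be an assumption on my part.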
This failed for me with another customer, but I got a slightly different error message, so I'm unsure whether this is the same bug:

heat deployment-show de73f634-7334-4e80-9c64-70cfa6dbc6d3
(...)
"deploy_stderr": "exception: connect failed\n\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Ceilometer]): Both $metering_secret and $telemetry_secret defined, using $telemetry_secret\u001b[0m\n\u001b[1;31mWarning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.\u001b[0m\n\u001b[1;31mError: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[account]/Exec[rebalance_account]: Failed to call refresh: swift-ring-builder /etc/swift/account.builder rebalance 999 returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[account]/Exec[rebalance_account]: swift-ring-builder /etc/swift/account.builder rebalance 999 returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[container]/Exec[rebalance_container]: Failed to call refresh: swift-ring-builder /etc/swift/container.builder rebalance 999 returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[container]/Exec[rebalance_container]: swift-ring-builder /etc/swift/container.builder rebalance 999 returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[object]/Exec[rebalance_object]: Failed to call refresh: swift-ring-builder /etc/swift/object.builder rebalance 999 returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Tripleo::Profile::Base::Swift::Ringbuilder/Swift::Ringbuilder::Rebalance[object]/Exec[rebalance_object]: swift-ring-builder /etc/swift/object.builder rebalance 999 returned 1 instead of one of [0]\u001b[0m\n",
"deploy_status_code": 6
@Andreas: I think you stumbled upon this: https://bugzilla.redhat.com/show_bug.cgi?id=1459919
Can we confirm this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1459919 and if so, is it sufficient for that fix to be present in the next 10z?