Description of problem:
When trying to delete a failed stack with 'openstack stack delete overcloud --wait --yes', the delete process fails.

Version-Release number of selected component (if applicable):
13 -p 2018-06-15.2

How reproducible:
Sometimes

Steps to Reproduce:
1. Install RHOS 13 with 3 controllers and 2 computes.
2. Include /home/stack/swift.yaml in overcloud_deploy.sh.
3. Include the barbican parameters in overcloud_deploy.sh.
4. The deploy fails.
5. Try to delete the stack with: openstack stack delete overcloud --wait --yes

Actual results:
(undercloud) [stack@undercloud-0 ~]$ openstack stack delete overcloud --wait --yes
2018-06-19 09:51:42Z [overcloud]: DELETE_IN_PROGRESS  Stack DELETE started
2018-06-19 09:51:43Z [overcloud.Compute]: DELETE_IN_PROGRESS  state changed
2018-06-19 09:51:43Z [overcloud.Compute]: DELETE_FAILED  ResourceInError: resources.Compute.resources[0].resources.NovaCompute: Went to status ERROR due to "Server compute-0 delete failed: (500) Error destroying the instance on node ca8d3c45-d79b-4217-a08f-64c3233f01d1. Provision state still 'deleting'."
2018-06-19 09:51:43Z [overcloud]: DELETE_FAILED  Resource DELETE failed: ResourceInError: resources.Compute.resources[0].resources.NovaCompute: Went to status ERROR due to "Server compute-0 delete failed: (500) Error destroying the instance on node ca8d3c45-d79b-4217-a08f-64c3233f01d1. Provision state s

 Stack overcloud DELETE_FAILED

Unable to delete 1 of the 1 stacks.
(undercloud) [stack@undercloud-0 ~]$

Expected results:
The overcloud gets deleted.

Additional info:
(undercloud) [stack@undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash

openstack overcloud deploy \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/extra_templates.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \
-e /home/stack/swift.yaml \
--log-file overcloud_deployment_69.log

(undercloud) [stack@undercloud-0 ~]$ cat swift.yaml
parameter_defaults:
  BarbicanSimpleCryptoGlobalDefault: True
  SwiftEncryptionEnabled: True
  DockerInsecureRegistryAddress: rhos-qe-mirror-tlv.usersys.redhat.com:5000
  DockerBarbicanApiImage: rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13/openstack-barbican-api:2018-06-15.2
  DockerBarbicanConfigImage: rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13/openstack-barbican-api:2018-06-15.2
  DockerBarbicanKeystoneListenerConfigImage: rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13/openstack-barbican-keystone-listener:2018-06-15.2
  DockerBarbicanKeystoneListenerImage: rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13/openstack-barbican-keystone-listener:2018-06-15.2
  DockerBarbicanWorkerConfigImage: rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13/openstack-barbican-worker:2018-06-15.2
  DockerBarbicanWorkerImage: rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp13/openstack-barbican-worker:2018-06-15.2
(undercloud) [stack@undercloud-0 ~]$
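For triage, a minimal sketch of how to see which nested resources failed and what state the backing Ironic node is in (run from the undercloud with stackrc sourced; the node UUID below is taken from the DELETE_FAILED message above):

#!/bin/bash
source ~/stackrc

# List all failed resources across the nested overcloud stacks.
openstack stack resource list --nested-depth 5 overcloud | grep -i failed

# A node stuck in the 'deleting' provision state will block the Heat
# stack delete; last_error usually names the underlying failure.
openstack baremetal node show ca8d3c45-d79b-4217-a08f-64c3233f01d1 \
    --fields provision_state power_state last_error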
Created attachment 1453757 [details]
ironic-conductor log

This likely failed because the IPMI call failed. From a Heat perspective, Heat just issues a server delete, which goes through Nova to Ironic. Since the node was still in the 'deleting' provision state (probably because the IPMI command failed), the stack delete failed.

2018-06-19 05:51:55.575 21918 ERROR ironic.drivers.modules.ipmitool [req-aca21f86-bd26-4a51-9962-4321a7516160 70a32a35f8ab413aacf1626134da7e1c df8a44d7ac5547eca5426e50b8ebe8a7 - default default] IPMI Error while attempting "ipmitool -I lanplus -H 172.16.0.1 -L ADMINISTRATOR -p 6234 -U admin -R 12 -N 5 -f /tmp/tmpe4jk39 power status" for node ca8d3c45-d79b-4217-a08f-64c3233f01d1. Error: Unexpected error while running command.
Command: ipmitool -I lanplus -H 172.16.0.1 -L ADMINISTRATOR -p 6234 -U admin -R 12 -N 5 -f /tmp/tmpe4jk39 power status
Exit code: 1
Stdout: u''
Stderr: u'Error: Unable to establish IPMI v2 / RMCP+ session\n': ProcessExecutionError: Unexpected error while running command.

2018-06-19 05:51:55.577 21918 WARNING ironic.drivers.modules.ipmitool [req-aca21f86-bd26-4a51-9962-4321a7516160 70a32a35f8ab413aacf1626134da7e1c df8a44d7ac5547eca5426e50b8ebe8a7 - default default] IPMI power status failed for node ca8d3c45-d79b-4217-a08f-64c3233f01d1 with error: Unexpected error while running command.
Command: ipmitool -I lanplus -H 172.16.0.1 -L ADMINISTRATOR -p 6234 -U admin -R 12 -N 5 -f /tmp/tmpe4jk39 power status
Exit code: 1
Stdout: u''
Stderr: u'Error: Unable to establish IPMI v2 / RMCP+ session\n'.: ProcessExecutionError: Unexpected error while running command.

2018-06-19 05:51:55.603 21918 DEBUG ironic.conductor.manager [req-0a87110a-2e95-4aa3-954a-81598360c859 70a32a35f8ab413aacf1626134da7e1c df8a44d7ac5547eca5426e50b8ebe8a7 - default default] RPC vif_detach called for the node ca8d3c45-d79b-4217-a08f-64c3233f01d1 with vif_id ea4f72d6-8329-4bfe-a010-ae00bbf743ab vif_detach /usr/lib/python2.7/site-packages/ironic/conductor/manager.py:2991

2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager [req-aca21f86-bd26-4a51-9962-4321a7516160 70a32a35f8ab413aacf1626134da7e1c df8a44d7ac5547eca5426e50b8ebe8a7 - default default] Error in tear_down of node ca8d3c45-d79b-4217-a08f-64c3233f01d1: IPMI call failed: power status.: IPMIFailure: IPMI call failed: power status.
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager Traceback (most recent call last):
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 909, in _do_node_tear_down
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     task.driver.deploy.tear_down(task)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     result = f(*args, **kwargs)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     return f(*args, **kwargs)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/iscsi_deploy.py", line 498, in tear_down
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     manager_utils.node_power_action(task, states.POWER_OFF)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     return f(*args, **kwargs)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/conductor/utils.py", line 209, in node_power_action
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     if _can_skip_state_change(task, new_state):
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/conductor/utils.py", line 168, in _can_skip_state_change
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     fields.NotificationStatus.ERROR, new_state)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     self.force_reraise()
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     six.reraise(self.type_, self.value, self.tb)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/conductor/utils.py", line 158, in _can_skip_state_change
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     curr_state = task.driver.power.get_power_state(task)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     result = f(*args, **kwargs)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/ipmitool.py", line 781, in get_power_state
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     return _power_status(driver_info)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/ipmitool.py", line 564, in _power_status
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager     raise exception.IPMIFailure(cmd=cmd)
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager IPMIFailure: IPMI call failed: power status.
2018-06-19 05:51:55.644 21918 ERROR ironic.conductor.manager
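To confirm this is a BMC problem rather than an Ironic one, the failing IPMI call can be reproduced by hand. A sketch, with the host and port taken from the log above; the password is a placeholder (the conductor passes it via the -f temp file, so the actual value must be filled in locally):

#!/bin/bash
IPMI_PASSWORD='<ipmi password for the node>'   # assumption: fill in locally

ipmitool -I lanplus -H 172.16.0.1 -p 6234 -U admin -P "$IPMI_PASSWORD" \
    power status
# 'Error: Unable to establish IPMI v2 / RMCP+ session' here means the BMC
# endpoint itself is unreachable or wedged, matching the conductor log.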
Are these virtual nodes or baremetal? If virtual, the IPMI failures you are seeing are likely VBMC failures caused by the libvirt bug in RHEL 7.4/7.5 - https://bugzilla.redhat.com/show_bug.cgi?id=1581364. The location of a patch to install can be found in that bug; otherwise the fix is pending in the next RHEL release. For reference, see this similar delete bug that was due to the same libvirt issue - https://bugzilla.redhat.com/show_bug.cgi?id=1549571. If these are baremetal nodes, there is an issue with the BM hardware that is causing the IPMI failures. Most likely, though, these are virtual nodes and you are hitting the libvirt issue.
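For the virtual case, a quick check on the hypervisor (assuming the virtualbmc package is installed there, as is typical for virt setups like this):

#!/bin/bash
# A BMC shown as 'down' for the node's domain explains the session failure.
vbmc list

# Restarting the affected BMC is a common workaround while the libvirt fix
# from bug 1581364 is pending. 'compute-0' is an assumed libvirt domain
# name here; use the name shown by 'vbmc list'.
vbmc stop compute-0
vbmc start compute-0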
Closing this as a duplicate. Please reopen if this does not appear to be due to the IPMI issue with vbmc/libvirt.

*** This bug has been marked as a duplicate of bug 1581364 ***
This bug was originally created against a virtual environment and was due to an IPMI issue with vbmc/libvirt, which made it a clear duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1581364. This new problem does not appear to be related. Please open a NEW bug and provide the following (a gathering sketch follows below):

- a sosreport captured when the problem occurs
- the related case linked via the external tracker
- versions and package IDs for nova, ironic, etc.

We will close this bug again as a duplicate.
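A minimal sketch for gathering the requested data on the undercloud before filing the new bug (sosreport needs root; the grep pattern is an assumption and can be widened as needed):

#!/bin/bash
# Capture a sosreport while or shortly after the problem occurs.
sudo sosreport

# Record package versions for nova, ironic, and related components.
rpm -qa | grep -Ei 'nova|ironic|heat' | sort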
*** This bug has been marked as a duplicate of bug 1581364 ***