Description of problem:

The overcloud deployment fails when a failed disk is replaced by a new disk that comes up under the same device name, 'sdi'.

Version-Release number of selected component (if applicable):

Red Hat OpenStack Platform 13

How reproducible:

- There were many 'Medium' errors for disk sdi, which is the backing device for osd.66 on host prd-ceph5.
- The OSD was removed from the Ceph cluster manually.
- The disk was physically replaced; the new disk came up under the same device name, 'sdi'.
- To add the replacement disk to the cluster, the openstack overcloud deploy command was run.

Steps to Reproduce:
1. Run the openstack overcloud deploy command to replace a disk in the Ceph cluster.

Actual results:

overcloud.CephStorageAllNodesDeployment.5:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 9d68eac1-2e8c-4783-9da2-1379dc23a963
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted (Task create from StructuredDeployment "5" Stack
    "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv"
    [064846ba-147a-4012-b1db-5884213e43b1] Timed out)
  deploy_stdout: |
    None
  deploy_stderr: |
    None

# openstack stack list
| 2c329dbb-6179-4b36-9cc6-ad58a9ad1cb6 | overcloud | 30b77608e23f400a8fc058fecf260bf4 | UPDATE_FAILED | 2018-09-23T11:52:10Z | 2019-12-04T08:26:45Z |

# openstack stack resource list 2c329dbb-6179-4b36-9cc6-ad58a9ad1cb6
| CephStorageAllNodesDeployment | 064846ba-147a-4012-b1db-5884213e43b1 | OS::TripleO::AllNodesDeployment | UPDATE_FAILED | 2019-12-04T08:44:09Z |

From the heat-engine log:

2019-12-04 13:26:44.635 3099 DEBUG heat.engine.scheduler [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-12-04 13:26:44.684 3099 DEBUG heat.engine.scheduler [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] sleeping _sleep /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:150
2019-12-04 13:26:45.582 3098 INFO heat.engine.scheduler [req-07e03e5f-85bc-4b36-8440-ca625e440b37 - - - - -] Task update from StructuredDeployments "CephStorageAllNodesDeployment" [064846ba-147a-4012-b1db-5884213e43b1] Stack "overcloud" [2c329dbb-6179-4b36-9cc6-ad58a9ad1cb6] timed out
2019-12-04 13:26:45.638 3099 INFO heat.engine.service [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Starting cancel of updating stack overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv
2019-12-04 13:26:45.657 3099 INFO heat.engine.stack [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Stack UPDATE FAILED (overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv): Stack UPDATE cancelled
2019-12-04 13:26:45.668 3099 DEBUG heat.engine.stack [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Persisting stack overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv status UPDATE FAILED _send_notification_and_add_event /usr/lib/python2.7/site-packages/heat/engine/stack.py:1020
2019-12-04 13:26:45.685 3099 DEBUG heat.engine.scheduler [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-12-04 13:26:45.720 3098 INFO heat.engine.stack [req-07e03e5f-85bc-4b36-8440-ca625e440b37 - - - - -] Stack UPDATE FAILED (overcloud): Timed out
2019-12-04 13:26:45.739 3098 DEBUG heat.engine.stack [req-07e03e5f-85bc-4b36-8440-ca625e440b37 - - - - -] Persisting stack overcloud status UPDATE FAILED _send_notification_and_add_event /usr/lib/python2.7/site-packages/heat/engine/stack.py:1020

Expected results:

The stack update succeeds.

Additional info:

----
CephAnsibleDisksConfig:
  devices:
    - /dev/sdd
    - /dev/sde
    - /dev/sdf
    - /dev/sdg
    - /dev/sdh
    - /dev/sdi    --------> The disk to be replaced
    - /dev/sdj
    - /dev/sdk
    - /dev/sdl
    - /dev/sdm
    - /dev/sdn
    - /dev/sdo
  dedicated_devices:
    - /dev/sda
    - /dev/sdb
    - /dev/sdc
  osd_scenario: non-collocated

$ cat sos_commands/block/lsblk | grep sdi
sdi   8:128   0   3.7T   0 disk
As the reporter described the issue, the disk in question still had an OSD directory on it, so the deployment failed as intended. Closing as not a bug.
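For reference (not part of the original report): before re-running the overcloud deploy, the replacement disk has to be free of leftover partition and OSD signatures, or ceph-ansible will refuse to prepare it. A minimal sketch using standard tools; the device name /dev/sdi is taken from this report and must be double-checked on the storage node, since the wipe step is destructive:

```shell
# Read-only inspection first: the replacement disk should show no
# child partitions and no leftover filesystem/partition signatures.
lsblk /dev/sdi
wipefs /dev/sdi        # with no options, only lists signatures

# DESTRUCTIVE: erase old GPT/OSD signatures from the replacement disk.
# Verify the device name before running -- this destroys all data on it.
sgdisk --zap-all /dev/sdi
wipefs --all /dev/sdi
```

After the wipe, re-running the openstack overcloud deploy command should let ceph-ansible prepare the disk as a fresh OSD.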