Bug 1786063 - Deploy fails when trying to replace a disk with the same name
Summary: Deploy fails when trying to replace a disk with the same name
Keywords:
Status: CLOSED DUPLICATE of bug 1747126
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: John Fulton
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-12-23 09:44 UTC by Luigi Tamagnone
Modified: 2023-03-24 16:33 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-20 15:25:34 UTC
Target Upstream Version:
Embargoed:



Description Luigi Tamagnone 2019-12-23 09:44:28 UTC
Description of problem:
The deployment fails when trying to replace a disk whose replacement comes up with the same device name, 'sdi'.

Version-Release number of selected component (if applicable):
Red Hat Openstack 13

How reproducible:
- There were a lot of 'Medium' errors for disk sdi, which is the backing device for osd.66 on host prd-ceph5
- Removed the OSD from the Ceph cluster manually
- Replaced the disk; the new disk came up with the same device name, 'sdi'
- To add the replacement disk back to the cluster, ran the openstack overcloud deploy command
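
The manual OSD removal in the steps above can be sketched as follows. This is a typical sequence for the Ceph release shipped with OSP 13 and is an assumption about what the reporter ran, not a transcript; by default the script only prints each command.

```shell
#!/bin/sh
# Hedged sketch of manually removing osd.66 (host prd-ceph5, backed by
# /dev/sdi, per this report). Standard Ceph commands, but an assumption
# about the reporter's exact procedure. Set RUN=1 to actually execute;
# with RUN unset the script only prints the plan.
OSD_ID=66
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run ceph osd out "${OSD_ID}"               # stop placing data on the OSD
run systemctl stop "ceph-osd@${OSD_ID}"    # run this one on prd-ceph5
run ceph osd crush remove "osd.${OSD_ID}"  # drop it from the CRUSH map
run ceph auth del "osd.${OSD_ID}"          # delete its cephx key
run ceph osd rm "${OSD_ID}"                # remove the OSD from the map
```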


Steps to Reproduce:
1. Run the overcloud deploy to add the replacement disk to the Ceph cluster.

Actual results:
overcloud.CephStorageAllNodesDeployment.5:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 9d68eac1-2e8c-4783-9da2-1379dc23a963
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted (Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] Timed out)
  deploy_stdout: |
None
  deploy_stderr: |
None

#  openstack stack list
| 2c329dbb-6179-4b36-9cc6-ad58a9ad1cb6 | overcloud  | 30b77608e23f400a8fc058fecf260bf4 | UPDATE_FAILED | 2018-09-23T11:52:10Z | 2019-12-04T08:26:45Z |
# openstack stack resource list 2c329dbb-6179-4b36-9cc6-ad58a9ad1cb6
| CephStorageAllNodesDeployment           | 064846ba-147a-4012-b1db-5884213e43b1                    | OS::TripleO::AllNodesDeployment                  | UPDATE_FAILED   | 2019-12-04T08:44:09Z |

2019-12-04 13:26:44.635 3099 DEBUG heat.engine.scheduler [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-12-04 13:26:44.684 3099 DEBUG heat.engine.scheduler [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] sleeping _sleep /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:150
2019-12-04 13:26:45.582 3098 INFO heat.engine.scheduler [req-07e03e5f-85bc-4b36-8440-ca625e440b37 - - - - -] Task update from StructuredDeployments "CephStorageAllNodesDeployment" [064846ba-147a-4012-b1db-5884213e43b1] Stack "overcloud" [2c329dbb-6179-4b36-9cc6-ad58a9ad1cb6] timed out
2019-12-04 13:26:45.638 3099 INFO heat.engine.service [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Starting cancel of updating stack overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv
2019-12-04 13:26:45.657 3099 INFO heat.engine.stack [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Stack UPDATE FAILED (overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv): Stack UPDATE cancelled
2019-12-04 13:26:45.668 3099 DEBUG heat.engine.stack [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Persisting stack overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv status UPDATE FAILED _send_notification_and_add_event /usr/lib/python2.7/site-packages/heat/engine/stack.py:1020
2019-12-04 13:26:45.685 3099 DEBUG heat.engine.scheduler [req-81bf6a7c-50f1-4a0c-8bbf-c0f365e21fa2 - admin - default default] Task create from StructuredDeployment "5" Stack "overcloud-CephStorageAllNodesDeployment-hu2i3kkm6ezv" [064846ba-147a-4012-b1db-5884213e43b1] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:209
2019-12-04 13:26:45.720 3098 INFO heat.engine.stack [req-07e03e5f-85bc-4b36-8440-ca625e440b37 - - - - -] Stack UPDATE FAILED (overcloud): Timed out
2019-12-04 13:26:45.739 3098 DEBUG heat.engine.stack [req-07e03e5f-85bc-4b36-8440-ca625e440b37 - - - - -] Persisting stack overcloud status UPDATE FAILED _send_notification_and_add_event /usr/lib/python2.7/site-packages/heat/engine/stack.py:1020

Expected results:
The stack update succeeds.


Additional info:
----
CephAnsibleDisksConfig:
    devices:
      - /dev/sdd
      - /dev/sde
      - /dev/sdf
      - /dev/sdg
      - /dev/sdh
      - /dev/sdi   # <-- the disk to be replaced
      - /dev/sdj
      - /dev/sdk
      - /dev/sdl
      - /dev/sdm
      - /dev/sdn
      - /dev/sdo
    dedicated_devices:
      - /dev/sda
      - /dev/sdb
      - /dev/sdc
    osd_scenario: non-collocated

$ cat sos_commands/block/lsblk | grep sdi
sdi      8:128  0   3.7T  0 disk

Comment 4 Yogev Rabl 2020-01-10 18:11:57 UTC
As the reporter described, the disk in question still had an OSD directory on it, so the deployment failed as intended.

Closing as not a bug.
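
Given that explanation, the usual remediation before rerunning the deploy is to wipe the leftover OSD data from the replacement disk so it is treated as a fresh device. A hedged sketch, assuming the device name /dev/sdi from this report; by default it only prints the commands.

```shell
#!/bin/sh
# Sketch of cleaning leftover OSD data from the replacement disk before
# rerunning the overcloud deploy. /dev/sdi comes from this report; the
# wipe commands are an assumption about the usual remediation, not a
# documented fix for this bug. Set RUN=1 to actually execute them.
DEV=/dev/sdi
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run lsblk "${DEV}"             # confirm which partitions are left over
run sgdisk --zap-all "${DEV}"  # destroy GPT and MBR data structures
run wipefs --all "${DEV}"      # clear remaining filesystem signatures
```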

