Bug 1516429

Summary: openstack-nova: failed to launch instance on OC: "Host 'overcloud-compute-1.localdomain' is not mapped to any cell" after Scale-UP
Product: Red Hat OpenStack
Reporter: Artem Hrechanychenko <ahrechan>
Component: openstack-tripleo-heat-templates
Assignee: Dan Prince <dprince>
Status: CLOSED ERRATA
QA Contact: Artem Hrechanychenko <ahrechan>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 12.0 (Pike)
CC: ahrechan, dprince, geguileo, maandre, m.andre, mburns, ohochman, owalsh, rhel-osp-director-maint, sasha, sgordon
Target Milestone: rc
Keywords: AutomationBlocker, Regression, TestBlocker, Triaged
Target Release: 12.0 (Pike)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-7.0.3-13.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-13 22:22:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Artem Hrechanychenko 2017-11-22 15:38:04 UTC
Description of problem:

Launching a new instance fails after scaling up the compute nodes on a cluster with 3 controllers + 1 compute node (3ctrl+1comp).

| fault                                | {"message": "Host 'compute-1.localdomain' is not mapped to any cell", "code": 400, "created": "2017-11-22T15:02:58Z"} |

At the same time, I can live-migrate an instance from compute-0 to compute-1.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.3-10.el7ost.noarch


How reproducible:

Always

Steps to Reproduce:
1. Deploy 3 controllers + 1 compute node
2. Scale up the compute role
3. Launch an instance on the new compute node

Actual results:
{"message": "Host 'compute-1.localdomain' is not mapped to any cell", "code": 400, "created": "2017-11-22T15:02:58Z"}

Expected results:
The instance launches successfully.

Additional info:
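
For automated checks, the fault reported above can be recognised programmatically. The following is only a minimal sketch: the fault text is copied from this report, while the helper name `unmapped_host` and the regular expression are illustrative assumptions, not part of any OpenStack API.

```python
import json
import re

# Fault text copied from the report above.
FAULT = '{"message": "Host \'compute-1.localdomain\' is not mapped to any cell", "code": 400, "created": "2017-11-22T15:02:58Z"}'

# Pattern for the specific "unmapped host" message seen in this bug.
UNMAPPED_RE = re.compile(r"Host '(?P<host>[^']+)' is not mapped to any cell")


def unmapped_host(fault_json):
    """Return the host named in a 'not mapped to any cell' fault, else None."""
    fault = json.loads(fault_json)
    match = UNMAPPED_RE.search(fault.get("message", ""))
    return match.group("host") if match else None


print(unmapped_host(FAULT))
```

A fault with any other message makes the helper return None, so it can be used to separate this failure from unrelated scheduling errors.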

Comment 3 Dan Prince 2017-11-22 18:45:19 UTC
One idea for fixing this is to append the DeployIdentifier to the environment of the discover hosts "init container", thus forcing it to re-run during an update.

The more concerning issue here is this likely isn't the only container we need to do this to. Scaling out and adding a net-new OpenStack service or role for example is likely also broken as it would require us to re-run some of the DB and Keystone init tasks as well. Similar fixes would apply there I think.
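
To illustrate the idea, here is a rough sketch of why a per-deployment value in the environment would force a re-run. It assumes the tooling only re-runs an init container when its definition differs from the one already applied; that assumption, the field values, the variable name `DEPLOY_IDENTIFIER` and the example identifiers are all hypothetical and not taken from t-h-t.

```python
import hashlib
import json


def config_hash(container_config):
    """Stable hash of a container definition (sorted keys for determinism)."""
    blob = json.dumps(container_config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def discover_hosts_config(deploy_identifier=None):
    """Hypothetical init-container definition; all field values are illustrative."""
    config = {
        "image": "nova-image-placeholder",
        "detach": False,
        "command": "discover-hosts-placeholder",
        "environment": [],
    }
    if deploy_identifier is not None:
        # Hypothetical variable name; it only needs to change per deployment run.
        config["environment"].append("DEPLOY_IDENTIFIER=%s" % deploy_identifier)
    return config


# Without the identifier the definition is identical on every run ...
static_same = config_hash(discover_hosts_config()) == config_hash(discover_hosts_config())
# ... with it, each stack update yields a different definition.
changed = config_hash(discover_hosts_config("1511363058")) != config_hash(discover_hosts_config("1511449458"))
print(static_same, changed)
```

Under that assumption, the unchanged definition is what lets a scale-up skip the container, and the changing identifier is what makes it run again.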

Comment 5 Ollie Walsh 2017-11-22 19:09:32 UTC
(In reply to Dan Prince from comment #3)

> The more concerning issue here is this likely isn't the only container we
> need to do this to. Scaling out and adding a net-new OpenStack service or
> role for example is likely also broken as it would require us to re-run some
> of the DB and Keystone init tasks as well. Similar fixes would apply there I
> think.

Yes, that's my concern. For example, dbsync tasks need to be run for all services during a minor update.

Comment 6 Ollie Walsh 2017-11-22 19:13:20 UTC
Maybe this should be the default for docker config steps with detach: false, since most will be based on puppet exec resources.
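
A rough sketch of what such a default could look like, written against a simplified, hypothetical data structure (step name -> container name -> settings). The function name, the container names, the `DEPLOY_IDENTIFIER` variable and the treatment of a missing `detach` key as detached are assumptions made for illustration only.

```python
import copy


def add_deploy_identifier(steps, deploy_identifier):
    """Return a copy of `steps` where every detach: false container gets the identifier."""
    updated = copy.deepcopy(steps)
    for containers in updated.values():
        for config in containers.values():
            # Containers that omit "detach" are treated as detached here (an assumption).
            if config.get("detach", True) is False:
                env = config.setdefault("environment", [])
                env.append("DEPLOY_IDENTIFIER=%s" % deploy_identifier)
    return updated


steps = {
    "step_3": {"example_db_sync": {"detach": False, "command": "placeholder"}},
    "step_4": {"example_api": {"command": "placeholder"}},
}
result = add_deploy_identifier(steps, "1511363058")
print(result["step_3"]["example_db_sync"]["environment"])
print("environment" in result["step_4"]["example_api"])
```

Only the one-shot (detach: false) container is touched; the long-running service container is left alone, and the input dictionary is not modified.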

Comment 7 Dan Prince 2017-11-22 22:08:20 UTC
I've posted a potential upstream fix here for the Nova discover hosts issue:

https://review.openstack.org/522397 Add DeployIdentifier to Nova discover hosts container

I'm going to try to test a scale-out locally. Given the timing, it would be good to have early QE feedback on this as well. Simply patching it into your t-h-t directory on the undercloud is all you'd need to do.

Artem: does it work for you?

Comment 8 Artem Hrechanychenko 2017-11-23 18:06:09 UTC
I applied your patch from comment 7 to my local t-h-t after the deploy and before the scale-up.

The scale-up was done using infrared:
infrared virsh -v --host-address $HOST --host-key ~/.ssh/id_rsa  --topology-nodes compute:1 --image-url http://download-node-02.eng.bos.redhat.com/brewroot/packages/rhel-guest-image/7.4/176/images/rhel-guest-image-7.4-191.x86_64.qcow2  --topology-extend True

infrared cloud-config --tasks scale_up --scale-nodes compute-1

infrared cloud-config --tasks add_overcloud_hosts


(undercloud) [stack@undercloud-0 ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time         | project                          |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| fa1f28ce-4ec1-4f6a-b285-b237aa49af2e | overcloud  | UPDATE_COMPLETE | 2017-11-23T16:35:38Z | 2017-11-23T17:25:25Z | ada9b3af540e42bd84fce45e9972d958 |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
(undercloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 6f3ea5a4-3d5c-4270-bb29-c755319811c8 | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.9  |
| 59bea50d-aebb-4f42-8f05-b4e6a046f2d3 | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
| 9a1b15fd-95bc-470a-a559-aedca2b742e9 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 3c101b5d-3e9a-4326-a4b2-07c3451479a8 | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
| ea210cbe-be51-49d6-80a5-e13168d6637b | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+


(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor list
+----+-----------------------+-----------------+-------------+-------+
| ID | Hypervisor Hostname   | Hypervisor Type | Host IP     | State |
+----+-----------------------+-----------------+-------------+-------+
|  2 | compute-0.localdomain | QEMU            | 172.17.1.13 | up    |
|  5 | compute-1.localdomain | QEMU            | 172.17.1.12 | up    |
+----+-----------------------+-----------------+-------------+-------+
(overcloud) [stack@undercloud-0 ~]$ nova show new_instance |grep hyp
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-1.localdomain



Live migration also works as expected:
(overcloud) [stack@undercloud-0 ~]$ nova live-migration after_deploy compute-1.localdomain
(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+---------------------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks                              |
+--------------------------------------+--------------+--------+------------+-------------+---------------------------------------+
| e5058bab-335e-47a2-b644-fb8c653be3cf | after_deploy | ACTIVE | -          | Running     | tenantvxlan=192.168.32.11, 10.0.0.182 |
| 3d67dfb8-302e-498a-a092-9927d5a97299 | new_instance | ACTIVE | -          | Running     | tenantvxlan=192.168.32.10, 10.0.0.185 |
+--------------------------------------+--------------+--------+------------+-------------+---------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova show after_deploy |grep hyper
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-1.localdomain

Comment 10 Artem Hrechanychenko 2017-11-28 12:07:32 UTC
The instance launched, but I cannot create and attach a cinder volume.

/var/log/cinder/cinder-api.log:2017-11-28 10:50:01.786 5936 DEBUG cinder.volume.api [req-10c1b98e-66ad-4dca-b990-e79cf5a3abf3 2cb5d5cc90e644cb965649f5ce42a637 1a2a0f8c85224dfeb5180fddca4062aa - default default] Task 'cinder.volume.flows.api.create_volume.EntryCreateTask;volume:create' (3d62a65d-a47b-4548-99c8-6b55d5f9c2a2) transitioned into state 'SUCCESS' from state 'RUNNING' with result '{'volume': Volume(_name_id=None,admin_metadata=<?>,attach_status='detached',availability_zone='nova',bootable=False,cluster=<?>,cluster_name=None,consistencygroup=<?>,consistencygroup_id=None,created_at=2017-11-28T10:50:01Z,deleted=False,deleted_at=None,display_description=None,display_name=None,ec2_id=None,encryption_key_id=None,glance_metadata=<?>,group=<?>,group_id=None,host=None,id=5345b78f-423c-4a79-bfb0-d3f2f4cdaa97,launched_at=None,metadata={},migration_status=None,multiattach=False,previous_status=None,project_id='1a2a0f8c85224dfeb5180fddca4062aa',provider_auth=None,provider_geometry=None,provider_id=None,provider_location=None,replication_driver_data=None,replication_extended_status=None,replication_status=None,scheduled_at=None,size=1,snapshot_id=None,snapshots=<?>,source_volid=None,status='creating',terminated_at=None,updated_at=None,user_id='2cb5d5cc90e644cb965649f5ce42a637',volume_attachment=<?>,volume_type=<?>,volume_type_id=None), 'volume_properties': VolumeProperties(attach_status='detached',availability_zone='nova',cgsnapshot_id=None,consistencygroup_id=None,display_description=None,display_name=None,encryption_key_id=None,group_id=None,group_type_id=<?>,metadata={},multiattach=False,project_id='1a2a0f8c85224dfeb5180fddca4062aa',qos_specs=None,replication_status=<?>,reservations=['36e23726-3f60-4649-98cd-df242c4bc8b9','81a5f77d-43d5-4c58-b719-ac700f96270e'],size=1,snapshot_id=None,source_replicaid=None,source_volid=None,status='creating',user_id='2cb5d5cc90e644cb965649f5ce42a637',volume_type_id=None), 'volume_id': '5345b78f-423c-4a79-bfb0-d3f2f4cdaa97'}' 
_task_receiver /usr/lib/python2.7/site-packages/taskflow/listeners/logging.py:183


(overcloud) [stack@undercloud-0 ~]$ openstack volume show 5345b78f-423c-4a79-bfb0-d3f2f4cdaa97  
+--------------------------------+--------------------------------------+
| Field                          | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2017-11-28T10:50:01.000000           |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | 5345b78f-423c-4a79-bfb0-d3f2f4cdaa97 |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | None                                 |
| os-vol-host-attr:host          | None                                 |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 1a2a0f8c85224dfeb5180fddca4062aa     |
| properties                     |                                      |
| replication_status             | None                                 |
| size                           | 1                                    |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | error                                |
| type                           | None                                 |
| updated_at                     | 2017-11-28T10:50:01.000000           |
| user_id                        | 2cb5d5cc90e644cb965649f5ce42a637     |
+--------------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack volume list
+--------------------------------------+------+--------+------+---------------------------------------+
| ID                                   | Name | Status | Size | Attached to                           |
+--------------------------------------+------+--------+------+---------------------------------------+
| 5345b78f-423c-4a79-bfb0-d3f2f4cdaa97 | None | error  |    1 |                                       |
| f08b4a96-fb66-4f4a-a0d8-7c034bd20b29 | None | error  |    1 |                                       |
| 22463cd4-a97e-480b-8b2b-b4bab85df973 | None | in-use |    1 | Attached to after_deploy on /dev/vdb  |
+--------------------------------------+------+--------+------+---------------------------------------+


[stack@undercloud-0 ~]$ source stackrc 
(undercloud) [stack@undercloud-0 ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time         | project                          |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| 66c61d2e-79ca-4b7c-9888-d595f3d5c9d7 | overcloud  | UPDATE_COMPLETE | 2017-11-27T15:38:10Z | 2017-11-27T16:33:39Z | ba1fdf4e130746ab897197a48257a565 |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
(undercloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| d6c3464d-2935-4ed0-ba5a-3fccf29f4620 | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| f5c15a80-2663-4ddf-a0a4-484ea37cf0ec | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 2fceacc9-e004-4b5d-8d30-d0758fb771a9 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| 23047b29-0b57-4d44-a70b-dba0fa87d146 | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.14 |
| 4041ba13-fa00-49cb-b2b2-dbb075573038 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+----------------+---------+------------+-------------+---------------------------------------+
| ID                                   | Name           | Status  | Task State | Power State | Networks                              |
+--------------------------------------+----------------+---------+------------+-------------+---------------------------------------+
| 3fde8ce0-c8c2-438c-81ba-6089ea55fc07 | after_deploy   | SHUTOFF | -          | NOSTATE     | tenantvxlan=192.168.32.14, 10.0.0.177 |
| 38029aac-ee44-4ffe-b765-b2ae6891eedc | after_reboot   | ACTIVE  | -          | Running     | tenantvxlan=192.168.32.6, 10.0.0.180  |
| 41e3c029-e5ac-4ba7-859b-701ae6688505 | after_reboot_2 | ACTIVE  | -          | Running     | tenantvxlan=192.168.32.12, 10.0.0.183 |
+--------------------------------------+----------------+---------+------------+-------------+---------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova show after_reboot|grep hyp
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-1.localdomain                                    |
(overcloud) [stack@undercloud-0 ~]$ nova show after_reboot_2|grep hyp
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute-0.localdomain                                    |
(overcloud) [stack@undercloud-0 ~]$ ping 10.0.0.180
PING 10.0.0.180 (10.0.0.180) 56(84) bytes of data.
64 bytes from 10.0.0.180: icmp_seq=1 ttl=63 time=3.08 ms
^C
--- 10.0.0.180 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.087/3.087/3.087/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping 10.0.0.183
PING 10.0.0.183 (10.0.0.183) 56(84) bytes of data.
64 bytes from 10.0.0.183: icmp_seq=1 ttl=63 time=2.03 ms
^C
--- 10.0.0.183 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.031/2.031/2.031/0.000 ms

Comment 11 Artem Hrechanychenko 2017-11-28 13:09:14 UTC
VERIFIED

The volume wasn't created because of https://bugzilla.redhat.com/show_bug.cgi?id=1412661

openstack-tripleo-heat-templates-7.0.3-13.el7ost.noarch

Comment 15 errata-xmlrpc 2017-12-13 22:22:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462