Bug 1516429
| Summary: | openstack-nova: failed to launch instance on OC: "Host 'overcloud-compute-1.localdomain' is not mapped to any cell" after Scale-UP | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Artem Hrechanychenko <ahrechan> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Dan Prince <dprince> |
| Status: | CLOSED ERRATA | QA Contact: | Artem Hrechanychenko <ahrechan> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 12.0 (Pike) | CC: | ahrechan, dprince, geguileo, maandre, m.andre, mburns, ohochman, owalsh, rhel-osp-director-maint, sasha, sgordon |
| Target Milestone: | rc | Keywords: | AutomationBlocker, Regression, TestBlocker, Triaged |
| Target Release: | 12.0 (Pike) | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-7.0.3-13.el7ost | Doc Type: | If docs needed, set a value |
| Last Closed: | 2017-12-13 22:22:16 UTC | Type: | Bug |
One idea for fixing this would be to append the DeployIdentifier to the environment of the discover hosts "init container", thus forcing it to re-run during an update.

The more concerning issue here is that this likely isn't the only container we need to do this to. Scaling out and adding a net-new OpenStack service or role, for example, is likely also broken, as it would require us to re-run some of the DB and Keystone init tasks as well. Similar fixes would apply there, I think.

(In reply to Dan Prince from comment #3)
> The more concerning issue here is this likely isn't the only container we
> need to do this to. Scaling out and adding a net-new OpenStack service or
> role for example is likely also broken as it would require us to re-run some
> of the DB and Keystone init tasks as well. Similar fixes would apply there I
> think.

Yeah, that's my concern. dbsync tasks need to be run for all services during a minor update, for example. Maybe this should be the default for docker config steps with detach: false, since most will be based on puppet exec resources.

I've posted a potential upstream fix here for the Nova discover hosts issue:

https://review.openstack.org/522397 Add DeployIdentifier to Nova discover hosts container

I'm going to try to test a scale-out locally. Given the timing, it would be good to have early QE feedback on this as well. Simply patching it into your t-h-t directory on the Undercloud would be all you'd need to do.

Artem: does it work for you?
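The idea discussed above, adding the DeployIdentifier to a container's environment so that its definition differs on every deploy, can be illustrated with a small Python sketch. This is not the code from the upstream review and not how the TripleO tooling is actually implemented; it only assumes that a one-shot container is re-run when its definition changes. The function names, the `DEPLOY_IDENTIFIER` variable name, and the simplified config fields are all hypothetical.

```python
import hashlib
import json


def config_hash(container_config):
    """Return a stable hash of a container definition (hypothetical helper)."""
    serialized = json.dumps(container_config, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def needs_rerun(container_config, previous_hash):
    """Assumption: a one-shot container re-runs only if its definition changed."""
    return config_hash(container_config) != previous_hash


def discover_hosts_config(deploy_identifier=None):
    """Build a simplified, illustrative definition of the discover hosts container."""
    config = {
        "image": "nova-api",  # placeholder image name
        "detach": False,
        "command": "nova-manage cell_v2 discover_hosts",
        "environment": [],
    }
    if deploy_identifier is not None:
        # A value that changes on every deploy makes the definition differ,
        # which is what forces the re-run in this sketch.
        config["environment"].append("DEPLOY_IDENTIFIER=%s" % deploy_identifier)
    return config


if __name__ == "__main__":
    # Without the identifier the definition is identical across deploys.
    unchanged = config_hash(discover_hosts_config())
    print(needs_rerun(discover_hosts_config(), unchanged))

    # With a per-deploy identifier the definition changes on scale-up.
    first_deploy = config_hash(discover_hosts_config("deploy-1"))
    print(needs_rerun(discover_hosts_config("deploy-2"), first_deploy))
```

Under this assumption, the same approach would extend to the dbsync and Keystone init containers mentioned in the thread, since each of them would also pick up a changed definition on every deploy.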
I applied your patch[0] to my local t-h-t after deploy, before scale-up.

After scale-up using infrared:

infrared virsh -v --host-address $HOST --host-key ~/.ssh/id_rsa --topology-nodes compute:1 --image-url http://download-node-02.eng.bos.redhat.com/brewroot/packages/rhel-guest-image/7.4/176/images/rhel-guest-image-7.4-191.x86_64.qcow2 --topology-extend True
infrared cloud-config --tasks scale_up --scale-nodes compute-1
infrared cloud-config --tasks add_overcloud_hosts

(undercloud) [stack@undercloud-0 ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| id | stack_name | stack_status | creation_time | updated_time | project |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| fa1f28ce-4ec1-4f6a-b285-b237aa49af2e | overcloud | UPDATE_COMPLETE | 2017-11-23T16:35:38Z | 2017-11-23T17:25:25Z | ada9b3af540e42bd84fce45e9972d958 |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
(undercloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 6f3ea5a4-3d5c-4270-bb29-c755319811c8 | compute-0 | ACTIVE | - | Running | ctlplane=192.168.24.9 |
| 59bea50d-aebb-4f42-8f05-b4e6a046f2d3 | compute-1 | ACTIVE | - | Running | ctlplane=192.168.24.13 |
| 9a1b15fd-95bc-470a-a559-aedca2b742e9 | controller-0 | ACTIVE | - | Running | ctlplane=192.168.24.15 |
| 3c101b5d-3e9a-4326-a4b2-07c3451479a8 | controller-1 | ACTIVE | - | Running | ctlplane=192.168.24.16 |
| ea210cbe-be51-49d6-80a5-e13168d6637b | controller-2 | ACTIVE | - | Running | ctlplane=192.168.24.11 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack hypervisor list
+----+-----------------------+-----------------+-------------+-------+
| ID | Hypervisor Hostname | Hypervisor Type | Host IP | State |
+----+-----------------------+-----------------+-------------+-------+
| 2 | compute-0.localdomain | QEMU | 172.17.1.13 | up |
| 5 | compute-1.localdomain | QEMU | 172.17.1.12 | up |
+----+-----------------------+-----------------+-------------+-------+
(overcloud) [stack@undercloud-0 ~]$ nova show new_instance |grep hyp
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-1.localdomain

Live migration is also working as expected:

(overcloud) [stack@undercloud-0 ~]$ nova live-migration after_deploy compute-1.localdomain
(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+---------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------------+--------+------------+-------------+---------------------------------------+
| e5058bab-335e-47a2-b644-fb8c653be3cf | after_deploy | ACTIVE | - | Running | tenantvxlan=192.168.32.11, 10.0.0.182 |
| 3d67dfb8-302e-498a-a092-9927d5a97299 | new_instance | ACTIVE | - | Running | tenantvxlan=192.168.32.10, 10.0.0.185 |
+--------------------------------------+--------------+--------+------------+-------------+---------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova show after_deploy |grep hyper
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-1.localdomain

The instance was launched, but I cannot create and attach a Cinder volume.
/var/log/cinder/cinder-api.log:2017-11-28 10:50:01.786 5936 DEBUG cinder.volume.api [req-10c1b98e-66ad-4dca-b990-e79cf5a3abf3 2cb5d5cc90e644cb965649f5ce42a637 1a2a0f8c85224dfeb5180fddca4062aa - default default] Task 'cinder.volume.flows.api.create_volume.EntryCreateTask;volume:create' (3d62a65d-a47b-4548-99c8-6b55d5f9c2a2) transitioned into state 'SUCCESS' from state 'RUNNING' with result '{'volume': Volume(_name_id=None,admin_metadata=<?>,attach_status='detached',availability_zone='nova',bootable=False,cluster=<?>,cluster_name=None,consistencygroup=<?>,consistencygroup_id=None,created_at=2017-11-28T10:50:01Z,deleted=False,deleted_at=None,display_description=None,display_name=None,ec2_id=None,encryption_key_id=None,glance_metadata=<?>,group=<?>,group_id=None,host=None,id=5345b78f-423c-4a79-bfb0-d3f2f4cdaa97,launched_at=None,metadata={},migration_status=None,multiattach=False,previous_status=None,project_id='1a2a0f8c85224dfeb5180fddca4062aa',provider_auth=None,provider_geometry=None,provider_id=None,provider_location=None,replication_driver_data=None,replication_extended_status=None,replication_status=None,scheduled_at=None,size=1,snapshot_id=None,snapshots=<?>,source_volid=None,status='creating',terminated_at=None,updated_at=None,user_id='2cb5d5cc90e644cb965649f5ce42a637',volume_attachment=<?>,volume_type=<?>,volume_type_id=None), 'volume_properties': VolumeProperties(attach_status='detached',availability_zone='nova',cgsnapshot_id=None,consistencygroup_id=None,display_description=None,display_name=None,encryption_key_id=None,group_id=None,group_type_id=<?>,metadata={},multiattach=False,project_id='1a2a0f8c85224dfeb5180fddca4062aa',qos_specs=None,replication_status=<?>,reservations=['36e23726-3f60-4649-98cd-df242c4bc8b9','81a5f77d-43d5-4c58-b719-ac700f96270e'],size=1,snapshot_id=None,source_replicaid=None,source_volid=None,status='creating',user_id='2cb5d5cc90e644cb965649f5ce42a637',volume_type_id=None), 'volume_id': '5345b78f-423c-4a79-bfb0-d3f2f4cdaa97'}' 
_task_receiver /usr/lib/python2.7/site-packages/taskflow/listeners/logging.py:183
(overcloud) [stack@undercloud-0 ~]$ openstack volume show 5345b78f-423c-4a79-bfb0-d3f2f4cdaa97
+--------------------------------+--------------------------------------+
| Field | Value |
+--------------------------------+--------------------------------------+
| attachments | [] |
| availability_zone | nova |
| bootable | false |
| consistencygroup_id | None |
| created_at | 2017-11-28T10:50:01.000000 |
| description | None |
| encrypted | False |
| id | 5345b78f-423c-4a79-bfb0-d3f2f4cdaa97 |
| migration_status | None |
| multiattach | False |
| name | None |
| os-vol-host-attr:host | None |
| os-vol-mig-status-attr:migstat | None |
| os-vol-mig-status-attr:name_id | None |
| os-vol-tenant-attr:tenant_id | 1a2a0f8c85224dfeb5180fddca4062aa |
| properties | |
| replication_status | None |
| size | 1 |
| snapshot_id | None |
| source_volid | None |
| status | error |
| type | None |
| updated_at | 2017-11-28T10:50:01.000000 |
| user_id | 2cb5d5cc90e644cb965649f5ce42a637 |
+--------------------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack volume list
+--------------------------------------+------+--------+------+---------------------------------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+------+--------+------+---------------------------------------+
| 5345b78f-423c-4a79-bfb0-d3f2f4cdaa97 | None | error | 1 | |
| f08b4a96-fb66-4f4a-a0d8-7c034bd20b29 | None | error | 1 | |
| 22463cd4-a97e-480b-8b2b-b4bab85df973 | None | in-use | 1 | Attached to after_deploy on /dev/vdb |
+--------------------------------------+------+--------+------+---------------------------------------+
[stack@undercloud-0 ~]$ source stackrc
(undercloud) [stack@undercloud-0 ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| id | stack_name | stack_status | creation_time | updated_time | project |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
| 66c61d2e-79ca-4b7c-9888-d595f3d5c9d7 | overcloud | UPDATE_COMPLETE | 2017-11-27T15:38:10Z | 2017-11-27T16:33:39Z | ba1fdf4e130746ab897197a48257a565 |
+--------------------------------------+------------+-----------------+----------------------+----------------------+----------------------------------+
(undercloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| d6c3464d-2935-4ed0-ba5a-3fccf29f4620 | compute-0 | ACTIVE | - | Running | ctlplane=192.168.24.6 |
| f5c15a80-2663-4ddf-a0a4-484ea37cf0ec | compute-1 | ACTIVE | - | Running | ctlplane=192.168.24.15 |
| 2fceacc9-e004-4b5d-8d30-d0758fb771a9 | controller-0 | ACTIVE | - | Running | ctlplane=192.168.24.18 |
| 23047b29-0b57-4d44-a70b-dba0fa87d146 | controller-1 | ACTIVE | - | Running | ctlplane=192.168.24.14 |
| 4041ba13-fa00-49cb-b2b2-dbb075573038 | controller-2 | ACTIVE | - | Running | ctlplane=192.168.24.12 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+----------------+---------+------------+-------------+---------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+----------------+---------+------------+-------------+---------------------------------------+
| 3fde8ce0-c8c2-438c-81ba-6089ea55fc07 | after_deploy | SHUTOFF | - | NOSTATE | tenantvxlan=192.168.32.14, 10.0.0.177 |
| 38029aac-ee44-4ffe-b765-b2ae6891eedc | after_reboot | ACTIVE | - | Running | tenantvxlan=192.168.32.6, 10.0.0.180 |
| 41e3c029-e5ac-4ba7-859b-701ae6688505 | after_reboot_2 | ACTIVE | - | Running | tenantvxlan=192.168.32.12, 10.0.0.183 |
+--------------------------------------+----------------+---------+------------+-------------+---------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ nova show after_reboot|grep hyp
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-1.localdomain |
(overcloud) [stack@undercloud-0 ~]$ nova show after_reboot_2|grep hyp
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0.localdomain |
(overcloud) [stack@undercloud-0 ~]$ ping 10.0.0.180
PING 10.0.0.180 (10.0.0.180) 56(84) bytes of data.
64 bytes from 10.0.0.180: icmp_seq=1 ttl=63 time=3.08 ms
^C
--- 10.0.0.180 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.087/3.087/3.087/0.000 ms
(overcloud) [stack@undercloud-0 ~]$ ping 10.0.0.183
PING 10.0.0.183 (10.0.0.183) 56(84) bytes of data.
64 bytes from 10.0.0.183: icmp_seq=1 ttl=63 time=2.03 ms
^C
--- 10.0.0.183 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.031/2.031/2.031/0.000 ms
VERIFIED

The volume wasn't created because of https://bugzilla.redhat.com/show_bug.cgi?id=1412661

openstack-tripleo-heat-templates-7.0.3-13.el7ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462
Description of problem:
Failed to launch a new instance after scaling up the compute nodes on a cluster with 3ctrl+1comp.

| fault | {"message": "Host 'compute-1.localdomain' is not mapped to any cell", "code": 400, "created": "2017-11-22T15:02:58Z"} |

At the same time, however, I can live-migrate an instance from compute-0 to compute-1.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.3-10.el7ost.noarch

How reproducible:
always

Steps to Reproduce:
1. Deploy 3ctrl+1comp
2. Scale compute
3. Launch instance on new compute

Actual results:
{"message": "Host 'compute-1.localdomain' is not mapped to any cell", "code": 400, "created": "2017-11-22T15:02:58Z"}

Expected results:
The instance is launched.

Additional info: