Description of problem:
During a deployment with a composable network role, I repeatedly saw the "no valid host" error message for one of the nodes. On checking further, I found that it happens because of a wrong mapping between instance IDs and ironic node IDs.

++++++++++++++++++
Flavor information
++++++++++++++++++

[stack@instack ~]$ openstack flavor list
+--------------------------------------+-----------+------+------+-----------+-------+-----------+
| ID                                   | Name      | RAM  | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+-----------+------+------+-----------+-------+-----------+
| 7d63b254-2101-4989-932e-4130f107b469 | control   | 4000 | 40   | 0         | 2     | True      |
| 919ef9d2-a0a5-4a4f-9a6d-c6f4309de2d3 | baremetal | 4096 | 40   | 0         | 1     | True      |
| e02dd064-8940-49cd-ac5b-c1614366e45b | compute   | 3000 | 40   | 0         | 1     | True      |
| e6a72a1b-d3d3-4264-9cb3-ac0abae13172 | networker | 4000 | 40   | 0         | 2     | True      |
+--------------------------------------+-----------+------+------+-----------+-------+-----------+

[stack@instack ~]$ openstack flavor show control | grep properties
| properties | capabilities:boot_option='local', capabilities:profile='control', cpu_arch='x86_64' |
[stack@instack ~]$ openstack flavor show compute | grep properties
| properties | capabilities:boot_option='local', capabilities:profile='compute', cpu_arch='x86_64' |
[stack@instack ~]$ openstack flavor show networker | grep properties
| properties | capabilities:boot_option='local', capabilities:profile='networker' |

++++++++++++++
Overcloud node
++++++++++++++

[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| c0abb67c-4471-4e87-ac13-b0a948d9e83d | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| fa09ec6a-ab21-4446-8e83-234ad79064c1 | overcloud-controller-0 | ERROR  | -          | NOSTATE     |                     |
| e712ca42-5d84-4e5f-b446-44ef52980a37 | overcloud-networking-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+

[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| 12c0b6b9-0724-4e03-98ae-1328240aa846 | None | e712ca42-5d84-4e5f-b446-44ef52980a37 | power on    | active             | False       |
| 2bd2979a-b033-480f-a96f-496d407e1322 | None | c0abb67c-4471-4e87-ac13-b0a948d9e83d | power on    | active             | False       |
| a2e537db-0841-4dce-8f50-d8e5b8f05b4c | None | None                                 | power off   | available          | False       |
| 43c408fd-9b06-480b-a9ea-0ebacec1ffd7 | None | None                                 | power off   | available          | True        |
| 4df866f2-1a7f-4184-8b1e-e200b6cb247b | None | None                                 | power off   | available          | True        |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+

++++++++++++++++++
ironic information
++++++++++++++++++

Wrong mapping for network node.
$ for i in `ironic node-list | awk '/^\|/ {print $2}' | grep -v UUID` ; do echo "*******$i********" ; ironic node-show $i | egrep -A1 "instance_info|properties" ; done
*******12c0b6b9-0724-4e03-98ae-1328240aa846********
| instance_info | {u'root_gb': u'40', u'display_name': u'overcloud-networking-0',          |
|               | u'image_source': u'fb6bd3ff-299b-42d7-8827-5ec9d24a0b41',                |
--
| properties    | {u'memory_mb': u'6096', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|               | u'cpus': u'2', u'capabilities': u'profile:control,cpu_hugepages:true,boo |
*******2bd2979a-b033-480f-a96f-496d407e1322********
| instance_info | {u'root_gb': u'40', u'display_name': u'overcloud-compute-0',             |
|               | u'image_source': u'fb6bd3ff-299b-42d7-8827-5ec9d24a0b41',                |
--
| properties    | {u'memory_mb': u'3096', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|               | u'cpus': u'1', u'capabilities': u'profile:compute,cpu_hugepages:true,boo |
*******a2e537db-0841-4dce-8f50-d8e5b8f05b4c********
| instance_info | {}                                                                       |
| instance_uuid | None                                                                     |
--
| properties    | {u'memory_mb': u'4096', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|               | u'cpus': u'2', u'capabilities': u'profile:networker,cpu_hugepages:true,b |
*******43c408fd-9b06-480b-a9ea-0ebacec1ffd7********
| instance_info | {}                                                                       |
| instance_uuid | None                                                                     |
--
| properties    | {u'memory_mb': u'6144', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|               | u'cpus': u'1', u'capabilities': u'boot_option:local'}                    |
*******4df866f2-1a7f-4184-8b1e-e200b6cb247b********
| instance_info | {}                                                                       |
| instance_uuid | None                                                                     |
--
| properties    | {u'memory_mb': u'6144', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|               | u'cpus': u'1', u'capabilities': u'boot_option:local'}                    |

Version-Release number of selected component (if applicable):
RHEL OSP 10

How reproducible:
I was able to reproduce this consistently.

Steps to Reproduce:
1. Create the networker and controller flavors with the same specification.
2. Run the deployment.
3. One of the nodes goes into the ERROR state because of the "no valid host" error.

Actual results:
The deployment fails because of the wrong mapping.

Expected results:
The deployment should not fail.

Additional info:
I had to change the controller flavor specification to make the deployment succeed.
Please provide more information regarding the inputs to this deployment, as I suspect either the nodes aren't tagged correctly to match the flavors, or you're not selecting the correct flavor for your new role (probably the latter - I think it's using the default "baremetal" flavor, which will pick any node).

Please provide:
1. The full CLI command used to launch the deployment
2. The output of ironic node-show for each node
3. The custom roles_data.yaml file you used to launch the new role
4. Any additional environment files you're using to pass parameters in for the new role.

I think you probably need to add an environment file like:

parameter_defaults:
  OvercloudNetworkingFlavor: networker

Where "Networking" is the role name in roles_data.yaml, and "networker" is the flavor you added to nova.
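As a rough sketch of the suggestion above, the environment file could be created like this. The file name `networker-flavor.yaml` is hypothetical, and the parameter name assumes the role really is called "Networking" in roles_data.yaml. The deploy command is shown only as a comment, with the rest of its arguments left out.

```shell
#!/bin/sh
# Hypothetical file name; keep it next to your other environment files.
ENV_FILE="${ENV_FILE:-./networker-flavor.yaml}"

# Overcloud<RoleName>Flavor selects the nova flavor for a role,
# as described in the comment above ("Networking" -> "networker").
cat > "$ENV_FILE" <<'EOF'
parameter_defaults:
  OvercloudNetworkingFlavor: networker
EOF

# Then pass the file with -e alongside the existing environment files, e.g.:
#   openstack overcloud deploy --templates -r roles_data.yaml -e "$ENV_FILE" ...
```

Without such a parameter the new role would, per the comment above, probably fall back to the default "baremetal" flavor.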
Today I saw another mapping issue that is relevant to this bug. An ironic node tagged with the control profile was mapped to a compute node during the deployment, which is wrong. The deployment completed successfully without any issue.

Here is the output from my setup:

- The control profile was associated with ironic node "86f079e6-a49d-4789-ab64-d1475cd18ac4".

~~~
[stack@instack ~]$ openstack overcloud profiles list
+--------------------------------------+-----------+-----------------+-----------------+-------------------+
| Node UUID                            | Node Name | Provision State | Current Profile | Possible Profiles |
+--------------------------------------+-----------+-----------------+-----------------+-------------------+
| 86f079e6-a49d-4789-ab64-d1475cd18ac4 |           | active          | control         |                   |
| 2dbfc155-2ea3-4507-85d7-1ec100ba0157 |           | active          | Compute_1       |                   |
| f2bd041c-03e8-46a0-a6c2-b5966b0982b6 |           | active          | Compute_2       |                   |
+--------------------------------------+-----------+-----------------+-----------------+-------------------+
~~~

- Here is the nova flavor-list output.

~~~
[stack@instack ~]$ nova flavor-list
+--------------------------------------+--------------+-----------+------+-----------+------+-------+-------------+-----------+
| ID                                   | Name         | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+--------------------------------------+--------------+-----------+------+-----------+------+-------+-------------+-----------+
| 119ee68f-9e94-4f09-a12b-67dcf00ecc56 | control      | 4000      | 40   | 0         |      | 2     | 1.0         | True      |
| 4aece94b-8d2f-4b29-b3f3-65fbcf100cb0 | baremetal    | 4096      | 40   | 0         |      | 1     | 1.0         | True      |
| 5e7848ae-1830-4398-af20-4072894b64d7 | compute_1    | 4000      | 40   | 0         |      | 1     | 1.0         | True      |
| 9740f691-6df0-4162-8c7e-a4f8ba544823 | ceph-storage | 5102      | 40   | 0         |      | 1     | 1.0         | True      |
| e94a9a87-539a-4b28-a564-81e3ab722a13 | compute_2    | 5000      | 40   | 0         |      | 1     | 1.0         | True      |
| ef7ae30d-e144-4bdf-bf6a-312627bdb9a7 | compute      | 3000      | 40   | 0         |      | 1     | 1.0         | True      |
+--------------------------------------+--------------+-----------+------+-----------+------+-------+-------------+-----------+

[stack@instack ~]$ nova flavor-show compute_1
+----------------------------+--------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                            |
+----------------------------+--------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                            |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                |
| disk                       | 40                                                                                               |
| extra_specs                | {"capabilities:boot_option": "local", "cpu_arch": "x86_64", "capabilities:profile": "Compute_1"} |
| id                         | 5e7848ae-1830-4398-af20-4072894b64d7                                                             |
| name                       | compute_1                                                                                        |
| os-flavor-access:is_public | True                                                                                             |
| ram                        | 4000                                                                                             |
| rxtx_factor                | 1.0                                                                                              |
| swap                       |                                                                                                  |
| vcpus                      | 1                                                                                                |
+----------------------------+--------------------------------------------------------------------------------------------------+

[stack@instack ~]$ nova flavor-show compute_2
+----------------------------+--------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                            |
+----------------------------+--------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                            |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                |
| disk                       | 40                                                                                               |
| extra_specs                | {"capabilities:boot_option": "local", "cpu_arch": "x86_64", "capabilities:profile": "Compute_2"} |
| id                         | e94a9a87-539a-4b28-a564-81e3ab722a13                                                             |
| name                       | compute_2                                                                                        |
| os-flavor-access:is_public | True                                                                                             |
| ram                        | 5000                                                                                             |
| rxtx_factor                | 1.0                                                                                              |
| swap                       |                                                                                                  |
| vcpus                      | 1                                                                                                |
+----------------------------+--------------------------------------------------------------------------------------------------+

[stack@instack ~]$ nova flavor-show control
+----------------------------+------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                          |
+----------------------------+------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                          |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                              |
| disk                       | 40                                                                                             |
| extra_specs                | {"capabilities:boot_option": "local", "cpu_arch": "x86_64", "capabilities:profile": "control"} |
| id                         | 119ee68f-9e94-4f09-a12b-67dcf00ecc56                                                           |
| name                       | control                                                                                        |
| os-flavor-access:is_public | True                                                                                           |
| ram                        | 4000                                                                                           |
| rxtx_factor                | 1.0                                                                                            |
| swap                       |                                                                                                |
| vcpus                      | 2                                                                                              |
+----------------------------+------------------------------------------------------------------------------------------------+
~~~

- After successful deployment.

~~~
[stack@instack ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+--------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time |
+--------------------------------------+------------+-----------------+----------------------+--------------+
| 2cbf1474-bec6-4574-b21b-9be55dc79d10 | overcloud  | CREATE_COMPLETE | 2017-03-20T12:19:43Z | None         |
+--------------------------------------+------------+-----------------+----------------------+--------------+
~~~

Here is the wrong mapping between the nova servers and the ironic nodes.

~~~
[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| 86f079e6-a49d-4789-ab64-d1475cd18ac4 | None | 1237bbbe-0eff-4079-a480-8e8902d50af9 | power on    | active             | False       |
| 2dbfc155-2ea3-4507-85d7-1ec100ba0157 | None | cd74107e-7be1-4f94-9aeb-ed80ae34a0f2 | power on    | active             | False       |
| f2bd041c-03e8-46a0-a6c2-b5966b0982b6 | None | 76069553-70bd-49ab-a1ce-01e689b6b6b0 | power on    | active             | False       |
| 8d09756c-af60-4d49-bb4d-c58238169211 | None | None                                 | power off   | available          | True        |
| bb0ab078-ba7c-43e6-be5e-2ab762498d5c | None | None                                 | power off   | available          | True        |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+

[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 1237bbbe-0eff-4079-a480-8e8902d50af9 | overcloud-compute_1-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |
| 76069553-70bd-49ab-a1ce-01e689b6b6b0 | overcloud-compute_2-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.6  |
| cd74107e-7be1-4f94-9aeb-ed80ae34a0f2 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
~~~

The overcloud compute instance with ID "1237bbbe-0eff-4079-a480-8e8902d50af9" got mapped to the controller ironic node with ID "86f079e6-a49d-4789-ab64-d1475cd18ac4", which is wrong.

Deployment command that I used:

~~~
#nohup openstack overcloud deploy --templates -r ~/compute-composable/roles_data.yaml -e ~/compute-composable/network-isolation.yaml -e ~/compute-composable/network-environment.yaml --ntp-server pool.ntp.org --libvirt-type qemu &
~~~
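The comparison being made by hand in this report (node profile versus the profile requested by the instance's flavor) can be sketched as a small shell helper. This is only an illustration: `node_profile` and `check_mapping` are hypothetical names, not part of any OpenStack tool, and the inputs are strings copied from `ironic node-show` (the `capabilities` property) and from the flavor's `capabilities:profile` extra spec. The sample call uses node 12c0b6b9-0724-4e03-98ae-1328240aa846 from the original description, whose capabilities string is truncated in the report, so only the visible part is used.

```shell
#!/bin/sh
# Extract the profile from an ironic capabilities string such as
# "profile:control,cpu_hugepages:true". Prints nothing if no profile is set.
node_profile() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^profile://p'
}

# Compare a node's capabilities ($1) with the profile the flavor asks for ($2).
# Prints "OK" when they agree, otherwise a MISMATCH line.
check_mapping() {
  actual=$(node_profile "$1")
  if [ "$actual" = "$2" ]; then
    echo "OK"
  else
    echo "MISMATCH: node profile '${actual:-none}', flavor profile '$2'"
  fi
}

# Node tagged "control" is hosting overcloud-networking-0; this assumes that
# instance was meant to use the "networker" flavor profile.
check_mapping "profile:control,cpu_hugepages:true" "networker"
```

A node with no profile capability (like the two nodes that only carry boot_option:local) is reported with profile 'none', which is what a flavor without a profile requirement, such as the default "baremetal" flavor, would be able to land on.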
Created attachment 1264936 [details] deployment templates
Created attachment 1265032 [details] outputs.txt
Created attachment 1265033 [details] scheduler.log
Looking at this bug for first time in HardProv DFG...

The initial description of this bug and comments 1-3 aren't sufficient to make progress, as they are missing quite a bit of the requested data. There is more useful data in comments 5-7, so we'll concentrate on that. However, as it's been a year since those comments were posted, I assume the additional logs needed to debug this, for example ironic-conductor.log, are no longer available.

Note that the "No valid hosts" error is quite a common configuration issue, hence this troubleshooting guide: https://docs.openstack.org/ironic/latest/admin/troubleshooting.html

Jaison- have you had other occurrences like this in the year since this was created?
(In reply to Bob Fournier from comment #11)
> Looking at this bug for first time in HardProv DFG...
> Jaison- have you had other occurrences like this in the year since this was
> created?

I haven't noticed this recently. I think this may have been related to https://bugzilla.redhat.com/show_bug.cgi?id=1500157 , which was closed due to lack of reproducer.
Thanks Jaison. I'm going to close this for now, as we don't have enough to go on and this is a pretty common issue that is normally traced back to configuration. Please reopen this if you get another occurrence.