Bug 1398652

Summary: No valid host error if the network and controller node have same specifications
Product: Red Hat OpenStack
Reporter: VIKRANT <vaggarwa>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Omri Hochman <ohochman>
Severity: medium
Priority: medium
Version: 10.0 (Newton)
CC: aschultz, bfournie, chih-hsien.chien, dbecker, dtantsur, jraju, mburns, mcornea, mlammon, morazi, rhel-osp-director-maint, shardy
Target Milestone: ---
Keywords: ZStream
Target Release: 10.0 (Newton)
Hardware: All
OS: Linux
Last Closed: 2018-03-28 14:19:22 UTC
Type: Bug
Attachments:
Description            Flags
deployment templates   none
outputs.txt            none
scheduler.log          none

Description VIKRANT 2016-11-25 13:31:44 UTC
Description of problem:

During a deployment with a composable network role, I repeatedly saw the "No valid host" error for one of the nodes. On checking further, I found that it happens because of a wrong mapping between instance IDs and ironic node IDs.

++++++++++++++++++
Flavor information
++++++++++++++++++
 
[stack@instack ~]$ openstack flavor list
+--------------------------------------+-----------+------+------+-----------+-------+-----------+
| ID                                   | Name      |  RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+-----------+------+------+-----------+-------+-----------+
| 7d63b254-2101-4989-932e-4130f107b469 | control   | 4000 |   40 |         0 |     2 | True      |
| 919ef9d2-a0a5-4a4f-9a6d-c6f4309de2d3 | baremetal | 4096 |   40 |         0 |     1 | True      |
| e02dd064-8940-49cd-ac5b-c1614366e45b | compute   | 3000 |   40 |         0 |     1 | True      |
| e6a72a1b-d3d3-4264-9cb3-ac0abae13172 | networker | 4000 |   40 |         0 |     2 | True      |
+--------------------------------------+-----------+------+------+-----------+-------+-----------+
 
[stack@instack ~]$ openstack flavor show control  | grep properties
| properties                 | capabilities:boot_option='local', capabilities:profile='control', cpu_arch='x86_64' |
[stack@instack ~]$ openstack flavor show compute  | grep properties
| properties                 | capabilities:boot_option='local', capabilities:profile='compute', cpu_arch='x86_64' |
[stack@instack ~]$ openstack flavor show networker  | grep properties
| properties                 | capabilities:boot_option='local', capabilities:profile='networker' |
 
++++++++++++++
Overcloud node
++++++++++++++
 
[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| c0abb67c-4471-4e87-ac13-b0a948d9e83d | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| fa09ec6a-ab21-4446-8e83-234ad79064c1 | overcloud-controller-0 | ERROR  | -          | NOSTATE     |                     |
| e712ca42-5d84-4e5f-b446-44ef52980a37 | overcloud-networking-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
 
 
[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| 12c0b6b9-0724-4e03-98ae-1328240aa846 | None | e712ca42-5d84-4e5f-b446-44ef52980a37 | power on    | active             | False       |
| 2bd2979a-b033-480f-a96f-496d407e1322 | None | c0abb67c-4471-4e87-ac13-b0a948d9e83d | power on    | active             | False       |
| a2e537db-0841-4dce-8f50-d8e5b8f05b4c | None | None                                 | power off   | available          | False       |
| 43c408fd-9b06-480b-a9ea-0ebacec1ffd7 | None | None                                 | power off   | available          | True        |
| 4df866f2-1a7f-4184-8b1e-e200b6cb247b | None | None                                 | power off   | available          | True        |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
 
 
++++++++++++++++++
ironic information
++++++++++++++++++
 
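Wrong mapping for the network node: overcloud-networking-0 is on ironic node 12c0b6b9-0724-4e03-98ae-1328240aa846, which is tagged profile:control, while the node tagged profile:networker (a2e537db-0841-4dce-8f50-d8e5b8f05b4c) is still available.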
 
$ for i in `ironic node-list | awk '/^\|/ {print $2}' | grep -v UUID` ; do  echo "*******$i********" ; ironic node-show $i | egrep -A1  "instance_info|properties"   ; done
*******12c0b6b9-0724-4e03-98ae-1328240aa846********
| instance_info          | {u'root_gb': u'40', u'display_name': u'overcloud-networking-0',          |
|                        | u'image_source': u'fb6bd3ff-299b-42d7-8827-5ec9d24a0b41',                |
--
| properties             | {u'memory_mb': u'6096', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|                        | u'cpus': u'2', u'capabilities': u'profile:control,cpu_hugepages:true,boo |
*******2bd2979a-b033-480f-a96f-496d407e1322********
| instance_info          | {u'root_gb': u'40', u'display_name': u'overcloud-compute-0',             |
|                        | u'image_source': u'fb6bd3ff-299b-42d7-8827-5ec9d24a0b41',                |
--
| properties             | {u'memory_mb': u'3096', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|                        | u'cpus': u'1', u'capabilities': u'profile:compute,cpu_hugepages:true,boo |
*******a2e537db-0841-4dce-8f50-d8e5b8f05b4c********
| instance_info          | {}                                                                       |
| instance_uuid          | None                                                                     |
--
| properties             | {u'memory_mb': u'4096', u'cpu_arch': u'x86_64', u'local_gb': u'40',      |
|                        | u'cpus': u'2', u'capabilities': u'profile:networker,cpu_hugepages:true,b |
*******43c408fd-9b06-480b-a9ea-0ebacec1ffd7********
| instance_info          | {}                                                                      |
| instance_uuid          | None                                                                    |
--
| properties             | {u'memory_mb': u'6144', u'cpu_arch': u'x86_64', u'local_gb': u'40',     |
|                        | u'cpus': u'1', u'capabilities': u'boot_option:local'}                   |
*******4df866f2-1a7f-4184-8b1e-e200b6cb247b********
| instance_info          | {}                                                                      |
| instance_uuid          | None                                                                    |
--
| properties             | {u'memory_mb': u'6144', u'cpu_arch': u'x86_64', u'local_gb': u'40',     |
|                        | u'cpus': u'1', u'capabilities': u'boot_option:local'}                   |
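
The check above can be sketched in a few lines of Python. This is only an illustration built from the listings in this description, not a director tool. The `EXPECTED_PROFILE` mapping (which profile each instance should land on) is an assumption, as are all function names. Capability strings that are cut off in the output above are kept only up to the last complete pair.

```python
# Node data copied from the "ironic node-show" output above.
# Capability strings truncated in the report are cut at the last complete pair.
NODES = [
    {"uuid": "12c0b6b9-0724-4e03-98ae-1328240aa846",
     "instance_name": "overcloud-networking-0",
     "capabilities": "profile:control,cpu_hugepages:true"},
    {"uuid": "2bd2979a-b033-480f-a96f-496d407e1322",
     "instance_name": "overcloud-compute-0",
     "capabilities": "profile:compute,cpu_hugepages:true"},
    {"uuid": "a2e537db-0841-4dce-8f50-d8e5b8f05b4c",
     "instance_name": None,
     "capabilities": "profile:networker,cpu_hugepages:true"},
    {"uuid": "43c408fd-9b06-480b-a9ea-0ebacec1ffd7",
     "instance_name": None,
     "capabilities": "boot_option:local"},
    {"uuid": "4df866f2-1a7f-4184-8b1e-e200b6cb247b",
     "instance_name": None,
     "capabilities": "boot_option:local"},
]

# ASSUMPTION: the profile each instance is meant to use. The report does not
# show which flavor the deployment actually requested for each role.
EXPECTED_PROFILE = {
    "overcloud-networking-0": "networker",
    "overcloud-compute-0": "compute",
}


def parse_capabilities(caps):
    """Turn 'key:value,key:value' into a dict, skipping items without a colon."""
    result = {}
    for item in caps.split(","):
        key, sep, value = item.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def find_mismatches(nodes, expected_profile):
    """Return (node uuid, instance, node profile, expected profile) for bad pairs."""
    mismatches = []
    for node in nodes:
        name = node["instance_name"]
        if name is None:
            continue  # node is not deployed, nothing to compare
        node_profile = parse_capabilities(node["capabilities"]).get("profile")
        wanted = expected_profile.get(name)
        if node_profile != wanted:
            mismatches.append((node["uuid"], name, node_profile, wanted))
    return mismatches


def unused_nodes_with_profile(nodes, profile):
    """Return UUIDs of undeployed nodes tagged with the given profile."""
    return [
        node["uuid"]
        for node in nodes
        if node["instance_name"] is None
        and parse_capabilities(node["capabilities"]).get("profile") == profile
    ]


if __name__ == "__main__":
    for row in find_mismatches(NODES, EXPECTED_PROFILE):
        print("mismatch:", row)
    print("free networker nodes:", unused_nodes_with_profile(NODES, "networker"))
```

With the data above, the sketch flags overcloud-networking-0 on the control-tagged node and lists a2e537db-0841-4dce-8f50-d8e5b8f05b4c as the unused networker node.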

Version-Release number of selected component (if applicable):
RHEL OSP 10

How reproducible:
I was able to reproduce this consistently.

Steps to Reproduce:
1.  Create the networker and controller flavors with the same specification.
2.  Run the deployment.
3.  One of the nodes goes into ERROR state with a "No valid host" error.

Actual results:
The deployment fails because of the wrong mapping.

Expected results:
The deployment should not fail.

Additional info:

I had to change the controller flavor specification to make the deployment succeed.

Comment 1 Steven Hardy 2016-11-28 09:51:33 UTC
Please provide more information regarding the inputs to this deployment, as I suspect either the nodes aren't tagged correctly to match the flavors, or you're not selecting the correct flavor for your new role (probably the latter - I think it's using the default "baremetal" flavor, which will pick any node).

Please provide:

1. The full CLI command used to launch the deployment

2. The output of ironic node-show for each node

3. The custom roles_data.yaml file you used to launch the new role

4. Any additional environment files you're using to pass parameters in for the new role.

I think you probably need to add an environment file like:

parameter_defaults:
  OvercloudNetworkingFlavor: networker

Where "Networking" is the role name in roles_data.yaml, and "networker" is the flavor you added to nova.
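
As a rough illustration of the naming pattern in the example above, the Python sketch below builds the same environment file content from a role-to-flavor mapping. The helper names are hypothetical, and the `Overcloud<RoleName>Flavor` pattern is inferred only from the single example in this comment. Built-in roles may use different parameter names, so check the templates before relying on it.

```python
from typing import Dict


def flavor_parameter(role_name: str) -> str:
    """Build the flavor parameter name for a role.

    ASSUMPTION: follows the pattern of the example in this comment,
    where role "Networking" gives "OvercloudNetworkingFlavor".
    """
    return "Overcloud{}Flavor".format(role_name)


def render_environment(role_flavors: Dict[str, str]) -> str:
    """Render a parameter_defaults block mapping each role to its nova flavor."""
    lines = ["parameter_defaults:"]
    for role_name, flavor in role_flavors.items():
        lines.append("  {}: {}".format(flavor_parameter(role_name), flavor))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "Networking" is the role name from roles_data.yaml,
    # "networker" is the flavor added to nova.
    print(render_environment({"Networking": "networker"}), end="")
```

The rendered text would then be saved to a file and passed to the deploy command with `-e`, like the other environment files.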

Comment 3 VIKRANT 2017-03-21 08:19:00 UTC
Today I saw another mapping issue that is relevant to this bug. An ironic node tagged with the control profile was mapped as a compute node during the deployment, which is wrong. The deployment nevertheless completed successfully.

Here is the output from my setup:

- Control profile was associated with ironic node "86f079e6-a49d-4789-ab64-d1475cd18ac4".

~~~
[stack@instack ~]$ openstack overcloud profiles list
+--------------------------------------+-----------+-----------------+-----------------+-------------------+
| Node UUID                            | Node Name | Provision State | Current Profile | Possible Profiles |
+--------------------------------------+-----------+-----------------+-----------------+-------------------+
| 86f079e6-a49d-4789-ab64-d1475cd18ac4 |           | active          | control         |                   |
| 2dbfc155-2ea3-4507-85d7-1ec100ba0157 |           | active          | Compute_1       |                   |
| f2bd041c-03e8-46a0-a6c2-b5966b0982b6 |           | active          | Compute_2       |                   |
+--------------------------------------+-----------+-----------------+-----------------+-------------------+
~~~

- Here is the nova flavor-list output.

~~~
[stack@instack ~]$ nova flavor-list
+--------------------------------------+--------------+-----------+------+-----------+------+-------+-------------+-----------+
| ID                                   | Name         | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+--------------------------------------+--------------+-----------+------+-----------+------+-------+-------------+-----------+
| 119ee68f-9e94-4f09-a12b-67dcf00ecc56 | control      | 4000      | 40   | 0         |      | 2     | 1.0         | True      |
| 4aece94b-8d2f-4b29-b3f3-65fbcf100cb0 | baremetal    | 4096      | 40   | 0         |      | 1     | 1.0         | True      |
| 5e7848ae-1830-4398-af20-4072894b64d7 | compute_1    | 4000      | 40   | 0         |      | 1     | 1.0         | True      |
| 9740f691-6df0-4162-8c7e-a4f8ba544823 | ceph-storage | 5102      | 40   | 0         |      | 1     | 1.0         | True      |
| e94a9a87-539a-4b28-a564-81e3ab722a13 | compute_2    | 5000      | 40   | 0         |      | 1     | 1.0         | True      |
| ef7ae30d-e144-4bdf-bf6a-312627bdb9a7 | compute      | 3000      | 40   | 0         |      | 1     | 1.0         | True      |
+--------------------------------------+--------------+-----------+------+-----------+------+-------+-------------+-----------+


[stack@instack ~]$ nova flavor-show compute_1
+----------------------------+--------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                            |
+----------------------------+--------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                            |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                |
| disk                       | 40                                                                                               |
| extra_specs                | {"capabilities:boot_option": "local", "cpu_arch": "x86_64", "capabilities:profile": "Compute_1"} |
| id                         | 5e7848ae-1830-4398-af20-4072894b64d7                                                             |
| name                       | compute_1                                                                                        |
| os-flavor-access:is_public | True                                                                                             |
| ram                        | 4000                                                                                             |
| rxtx_factor                | 1.0                                                                                              |
| swap                       |                                                                                                  |
| vcpus                      | 1                                                                                                |
+----------------------------+--------------------------------------------------------------------------------------------------+
[stack@instack ~]$ nova flavor-show compute_2
+----------------------------+--------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                            |
+----------------------------+--------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                            |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                |
| disk                       | 40                                                                                               |
| extra_specs                | {"capabilities:boot_option": "local", "cpu_arch": "x86_64", "capabilities:profile": "Compute_2"} |
| id                         | e94a9a87-539a-4b28-a564-81e3ab722a13                                                             |
| name                       | compute_2                                                                                        |
| os-flavor-access:is_public | True                                                                                             |
| ram                        | 5000                                                                                             |
| rxtx_factor                | 1.0                                                                                              |
| swap                       |                                                                                                  |
| vcpus                      | 1                                                                                                |
+----------------------------+--------------------------------------------------------------------------------------------------+
[stack@instack ~]$ nova flavor-show control
+----------------------------+------------------------------------------------------------------------------------------------+
| Property                   | Value                                                                                          |
+----------------------------+------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                          |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                              |
| disk                       | 40                                                                                             |
| extra_specs                | {"capabilities:boot_option": "local", "cpu_arch": "x86_64", "capabilities:profile": "control"} |
| id                         | 119ee68f-9e94-4f09-a12b-67dcf00ecc56                                                           |
| name                       | control                                                                                        |
| os-flavor-access:is_public | True                                                                                           |
| ram                        | 4000                                                                                           |
| rxtx_factor                | 1.0                                                                                            |
| swap                       |                                                                                                |
| vcpus                      | 2                                                                                              |
+----------------------------+------------------------------------------------------------------------------------------------+
~~~

- After successful deployment.

~~~
[stack@instack ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+--------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time |
+--------------------------------------+------------+-----------------+----------------------+--------------+
| 2cbf1474-bec6-4574-b21b-9be55dc79d10 | overcloud  | CREATE_COMPLETE | 2017-03-20T12:19:43Z | None         |
+--------------------------------------+------------+-----------------+----------------------+--------------+
~~~

Here is the wrong mapping between nova servers and ironic nodes.

~~~
[stack@instack ~]$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+
| 86f079e6-a49d-4789-ab64-d1475cd18ac4 | None | 1237bbbe-0eff-4079-a480-8e8902d50af9 | power on    | active             | False       |
| 2dbfc155-2ea3-4507-85d7-1ec100ba0157 | None | cd74107e-7be1-4f94-9aeb-ed80ae34a0f2 | power on    | active             | False       |
| f2bd041c-03e8-46a0-a6c2-b5966b0982b6 | None | 76069553-70bd-49ab-a1ce-01e689b6b6b0 | power on    | active             | False       |
| 8d09756c-af60-4d49-bb4d-c58238169211 | None | None                                 | power off   | available          | True        |
| bb0ab078-ba7c-43e6-be5e-2ab762498d5c | None | None                                 | power off   | available          | True        |
+--------------------------------------+------+--------------------------------------+-------------+--------------------+-------------+

[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 1237bbbe-0eff-4079-a480-8e8902d50af9 | overcloud-compute_1-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |
| 76069553-70bd-49ab-a1ce-01e689b6b6b0 | overcloud-compute_2-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.6  |
| cd74107e-7be1-4f94-9aeb-ed80ae34a0f2 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
~~~

The overcloud compute instance with ID "1237bbbe-0eff-4079-a480-8e8902d50af9" was mapped to the ironic node tagged with the control profile, "86f079e6-a49d-4789-ab64-d1475cd18ac4", which is wrong.
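
The same comparison can be done mechanically by joining the three listings above on their UUIDs. The Python sketch below is only an illustration using the data from this comment. The `EXPECTED_PROFILE` mapping is an assumption about which profile each instance was meant to get, and the function names are made up for this example.

```python
# Data copied from "nova list", "ironic node-list" and
# "openstack overcloud profiles list" in this comment.
NOVA_SERVERS = {
    "1237bbbe-0eff-4079-a480-8e8902d50af9": "overcloud-compute_1-0",
    "76069553-70bd-49ab-a1ce-01e689b6b6b0": "overcloud-compute_2-0",
    "cd74107e-7be1-4f94-9aeb-ed80ae34a0f2": "overcloud-controller-0",
}

# ironic node UUID -> instance UUID (None when the node is not deployed)
IRONIC_NODES = {
    "86f079e6-a49d-4789-ab64-d1475cd18ac4": "1237bbbe-0eff-4079-a480-8e8902d50af9",
    "2dbfc155-2ea3-4507-85d7-1ec100ba0157": "cd74107e-7be1-4f94-9aeb-ed80ae34a0f2",
    "f2bd041c-03e8-46a0-a6c2-b5966b0982b6": "76069553-70bd-49ab-a1ce-01e689b6b6b0",
    "8d09756c-af60-4d49-bb4d-c58238169211": None,
    "bb0ab078-ba7c-43e6-be5e-2ab762498d5c": None,
}

# ironic node UUID -> current profile
NODE_PROFILES = {
    "86f079e6-a49d-4789-ab64-d1475cd18ac4": "control",
    "2dbfc155-2ea3-4507-85d7-1ec100ba0157": "Compute_1",
    "f2bd041c-03e8-46a0-a6c2-b5966b0982b6": "Compute_2",
}

# ASSUMPTION: the profile each instance was meant to be scheduled on.
EXPECTED_PROFILE = {
    "overcloud-compute_1-0": "Compute_1",
    "overcloud-compute_2-0": "Compute_2",
    "overcloud-controller-0": "control",
}


def map_instances(nova_servers, ironic_nodes, node_profiles):
    """Join the listings: (instance name, node UUID, node profile) per deployed node."""
    rows = []
    for node_uuid, instance_uuid in ironic_nodes.items():
        if instance_uuid is None:
            continue  # undeployed node
        rows.append((nova_servers[instance_uuid], node_uuid, node_profiles.get(node_uuid)))
    return rows


def find_mismatches(rows, expected_profile):
    """Return rows whose node profile differs from the expected profile."""
    return [
        (name, node_uuid, profile, expected_profile[name])
        for name, node_uuid, profile in rows
        if profile != expected_profile[name]
    ]


if __name__ == "__main__":
    rows = map_instances(NOVA_SERVERS, IRONIC_NODES, NODE_PROFILES)
    for row in find_mismatches(rows, EXPECTED_PROFILE):
        print("mismatch:", row)
```

With these listings the sketch flags overcloud-compute_1-0 on the control-tagged node, and also overcloud-controller-0 on the node tagged Compute_1. Whether that counts as a scheduling error depends on which flavor each role actually requested.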

Deployment command that I used:

~~~
#nohup openstack overcloud deploy --templates -r ~/compute-composable/roles_data.yaml -e ~/compute-composable/network-isolation.yaml -e ~/compute-composable/network-environment.yaml --ntp-server pool.ntp.org --libvirt-type qemu &
~~~

Comment 4 VIKRANT 2017-03-21 08:19:54 UTC
Created attachment 1264936 [details]
deployment templates

Comment 6 Jaison Raju 2017-03-21 12:02:34 UTC
Created attachment 1265032 [details]
outputs.txt

Comment 7 Jaison Raju 2017-03-21 12:04:15 UTC
Created attachment 1265033 [details]
scheduler.log

Comment 11 Bob Fournier 2018-03-27 19:34:43 UTC
Looking at this bug for first time in HardProv DFG...

The initial description and comments 1-3 aren't sufficient to make progress, as quite a bit of the requested data is missing. There is more useful data in comments 5-7, so we'll concentrate on that. However, as it's been a year since those comments were posted, I assume the additional logs needed to debug this (for example ironic-conductor.log) are no longer available.

Note that the "No valid hosts" error is quite a common configuration issue, hence this troubleshooting guide: https://docs.openstack.org/ironic/latest/admin/troubleshooting.html

Jaison, have you had other occurrences like this in the year since this was created?

Comment 12 Jaison Raju 2018-03-28 05:04:39 UTC
(In reply to Bob Fournier from comment #11)
> Looking at this bug for first time in HardProv DFG...

> Jaison- have you had other occurrences like this in the year since this was
> created?
I haven't noticed this recently.
I think this may have been related to https://bugzilla.redhat.com/show_bug.cgi?id=1500157, which was closed due to lack of a reproducer.

Comment 13 Bob Fournier 2018-03-28 14:19:22 UTC
Thanks Jaison.  I'm going to close this for now as we don't have enough to go on, and this is a pretty common issue normally traced back to configuration.  Please reopen this if you get another occurrence.