Bug 2067261 - Using the baremetal node name results in "provide" failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: tripleo-ansible
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: 17.0
Assignee: Brendan Shephard
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-23 16:38 UTC by Brendan Shephard
Modified: 2022-09-21 12:20 UTC (History)

Fixed In Version: tripleo-ansible-3.3.1-0.20220703001824.c562397.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-21 12:19:42 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 834926 | 0 | None | NEW | Use node UUID for bridge_mapping agent check | 2022-03-23 16:42:46 UTC
Red Hat Issue Tracker | OSP-14196 | 0 | None | None | None | 2022-03-23 17:31:30 UTC
Red Hat Product Errata | RHEA-2022:6543 | 0 | None | None | None | 2022-09-21 12:20:14 UTC

Description Brendan Shephard 2022-03-23 16:38:04 UTC
Description of problem:
If you run introspection with --provide, or just run openstack overcloud node provide, and pass the Name of a baremetal node rather than its UUID, the command fails when it checks for bridge mappings:

PLAY [Overcloud Node Provide] **************************************************                                                                                               
2022-03-23 11:58:22.020331 | 5254009e-487c-eb1a-beba-000000000008 |       TASK | Check for required inputs                                                                     
2022-03-23 11:58:22.059015 | 5254009e-487c-eb1a-beba-000000000008 |    SKIPPED | Check for required inputs | localhost | item=node_uuids                                       
2022-03-23 11:58:22.061425 | 5254009e-487c-eb1a-beba-000000000008 |     TIMING | Check for required inputs | localhost | 0:00:00.101856 | 0.04s                                
2022-03-23 11:58:22.066065 | 5254009e-487c-eb1a-beba-00000000000a |       TASK | Set node_uuids_provide fact                                                                   
2022-03-23 11:58:22.101938 | 5254009e-487c-eb1a-beba-00000000000a |         OK | Set node_uuids_provide fact | localhost                                                       
2022-03-23 11:58:22.103026 | 5254009e-487c-eb1a-beba-00000000000a |     TIMING | Set node_uuids_provide fact | localhost | 0:00:00.143449 | 0.04s                              
2022-03-23 11:58:22.106783 | 5254009e-487c-eb1a-beba-00000000000c |       TASK | Notice                                                                                        
2022-03-23 11:58:22.132188 | 5254009e-487c-eb1a-beba-00000000000c |    SKIPPED | Notice | localhost                                                                            
2022-03-23 11:58:22.133342 | 5254009e-487c-eb1a-beba-00000000000c |     TIMING | Notice | localhost | 0:00:00.173775 | 0.03s                                                   
2022-03-23 11:58:22.142465 | 5254009e-487c-eb1a-beba-00000000000f |       TASK | Make nodes available                                                                          
2022-03-23 12:02:45.268808 | 5254009e-487c-eb1a-beba-00000000000f |      FATAL | Make nodes available | localhost | error={"changed": false, "msg": "Timeout waiting for node r530-13 to have bridge_mappings set in the ironic-neutron-agent entry"}





[stack@ccsosp-undercloud ~]$ openstack baremetal node list                                                                                                                     
/usr/lib64/python3.6/site-packages/_yaml/__init__.py:23: DeprecationWarning: The _yaml extension module is now located at yaml._yaml and its location is subject to change.  To
use the LibYAML-based parser and emitter, import from `yaml`: `from yaml import CLoader as Loader, CDumper as Dumper`.                                                         
  DeprecationWarning
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+                                                            
| UUID                                 | Name    | Instance UUID | Power State | Provisioning State | Maintenance |                                                            
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+                                                            
| f75f6ed4-4fd4-4e58-90f4-5d07a59026e2 | r530-14 | None          | power off   | available          | False       |                                                            
| 45f7199b-a79d-4064-a62d-1118011f64e6 | r530-13 | None          | power off   | manageable         | False       |

[stack@ccsosp-undercloud ~]$ openstack overcloud node introspect 45f7199b-a79d-4064-a62d-1118011f64e6 --provide



Version-Release number of selected component (if applicable):
RHOSP17
[stack@ccsosp-undercloud ~]$ rpm -qa | grep tripleo-ansible
tripleo-ansible-3.3.1-0.20220307002209.130185a.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Try to introspect a node with --provide using the node name rather than the UUID
2. The run fails on wait_for_bridge_mappings



Additional info:
>>> [agent.id for agent in conn.network.agents(host="45f7199b-a79d-4064-a62d-1118011f64e6", binary="ironic-neutron-agent")]                                                    
['903ce241-d95c-479c-8d97-21d6feacfa4d']                                                                                                                                       
>>> [agent.id for agent in conn.network.agents(host="r530-13", binary="ironic-neutron-agent")]
[]

Comment 1 Brendan Shephard 2022-03-23 23:17:34 UTC
It was 3 am when I wrote this, and I'm just now realising it's a bit of a mess and maybe hard to follow.

So introspecting like this fails:
[stack@ccsosp-undercloud ~]$ openstack overcloud node introspect r530-13 --provide

It gets to the part where it checks for the neutron agent:
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/ansible_plugins/modules/os_baremetal_provide_node.py#L386

This is using the node name of r530-13 to check for the Neutron agent. So like this:
>>> [agent.id for agent in conn.network.agents(host="r530-13", binary="ironic-neutron-agent")]                                                                                                                                                                                                                                                               
[]

If I use the UUID instead, it works, because that is how the agents are listed:
[stack@ccsosp-undercloud ~]$ openstack network agent list

+--------------------------------------+--------------------+--------------------------------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host                                 | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+--------------------------------------+-------------------+-------+-------+---------------------------+
| 11a44062-4201-499b-b04e-4a3c1984fdf6 | Baremetal Node     | 74f8ff17-7d2e-4465-8fa7-c01f0591d427 | None              | :-)   | UP    | ironic-neutron-agent      |
| 5a275c82-8eed-4b4f-85ee-e662f1a05baf | Baremetal Node     | f75f6ed4-4fd4-4e58-90f4-5d07a59026e2 | None              | :-)   | UP    | ironic-neutron-agent      |
| 629e9f6c-d5bc-40d7-976a-b3cffa728e63 | Baremetal Node     | e3251131-8b31-4268-a6cb-44e84a05e50f | None              | :-)   | UP    | ironic-neutron-agent      |
| 647f5ab6-9bf0-4f22-aec6-12c618ae5368 | Baremetal Node     | 00c7aa8d-d6ff-4e3f-ba1a-6a855ce1a37e | None              | :-)   | UP    | ironic-neutron-agent      |
| 678fc9d7-841c-404a-9992-ca5069dd9f1f | Baremetal Node     | 4880593d-3bd9-4d2e-8db6-51d4b1a6f1a8 | None              | :-)   | UP    | ironic-neutron-agent      |
| 7404ad43-402a-4400-be81-0c87cefaa690 | Baremetal Node     | 4535d9cb-6c8b-431d-a387-154ee55e3093 | None              | :-)   | UP    | ironic-neutron-agent      |
| 903ce241-d95c-479c-8d97-21d6feacfa4d | Baremetal Node     | 45f7199b-a79d-4064-a62d-1118011f64e6 | None              | :-)   | UP    | ironic-neutron-agent      |
| 9ce9c217-8f8a-4c63-a868-d0639a6d1dc2 | DHCP agent         | ccsosp-undercloud.localdomain        | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 9f36e4ca-bfe1-45ee-8333-32034d595ab4 | Baremetal Node     | 4555269f-fbf0-405f-b6a7-9c9ca98beb38 | None              | :-)   | UP    | ironic-neutron-agent      |
| a381bb9d-d5fa-4cd5-99a0-fab74cc843d1 | Baremetal Node     | 6b289c67-72c8-46e1-b5ce-d9a7f561d325 | None              | :-)   | UP    | ironic-neutron-agent      |
| a5015e34-87fc-4634-b83c-9f2a0fe5deb9 | Baremetal Node     | cd35c842-005f-4fb0-b567-ea4c69c626ac | None              | :-)   | UP    | ironic-neutron-agent      |
| b1b9b08d-4b10-4bed-88bc-111435c9330b | L3 agent           | ccsosp-undercloud.localdomain        | nova              | :-)   | UP    | neutron-l3-agent          |
| b4a40d93-59c4-43f8-95fb-c28bb77450e4 | Open vSwitch agent | ccsosp-undercloud.localdomain        | None              | :-)   | UP    | neutron-openvswitch-agent |
| b7e90858-1483-4d8d-8e20-f2943206e674 | Baremetal Node     | de8cb33b-9da0-4c5d-b6ae-6697b5b958a8 | None              | :-)   | UP    | ironic-neutron-agent      |
| bb9d16d1-a925-48ff-8d5c-24eab61188bb | Baremetal Node     | c5c0ae8d-6ad8-48e3-b387-129948f9f846 | None              | :-)   | UP    | ironic-neutron-agent      |
+--------------------------------------+--------------------+--------------------------------------+-------------------+-------+-------+---------------------------+

So, instead, we need to always use the ID rather than the name:
>>> [agent.id for agent in conn.network.agents(host="45f7199b-a79d-4064-a62d-1118011f64e6", binary="ironic-neutron-agent")]                                                    
['903ce241-d95c-479c-8d97-21d6feacfa4d']

My patch proposes that we take whatever input the user provides (name or ID) and use it to look up the node ID:
https://review.opendev.org/c/openstack/tripleo-ansible/+/834926/1/tripleo_ansible/ansible_plugins/modules/os_baremetal_provide_node.py#386
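
To illustrate the idea, here is a minimal sketch of resolving the user's input to the node UUID before querying Neutron. It is not the code from the patch. The function name find_ironic_neutron_agents is hypothetical, and conn is assumed to be an openstacksdk connection like the one used in the interpreter examples above:

```python
def find_ironic_neutron_agents(conn, name_or_id):
    """Return ironic-neutron-agent IDs for a node given its name or UUID.

    Hypothetical helper for illustration only; `conn` is assumed to be an
    openstacksdk Connection.
    """
    # Resolve whatever the user typed (name or UUID) to the Ironic node.
    # ignore_missing=False makes the SDK raise instead of returning None.
    node = conn.baremetal.find_node(name_or_id, ignore_missing=False)

    # The agent "host" field holds the node UUID, so always query by node.id.
    return [
        agent.id
        for agent in conn.network.agents(
            host=node.id, binary="ironic-neutron-agent"
        )
    ]
```

With this, passing either r530-13 or 45f7199b-a79d-4064-a62d-1118011f64e6 should end up querying Neutron with the same UUID, at the cost of one extra Ironic lookup per node.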

Comment 2 Marian Krcmarik 2022-03-29 04:25:44 UTC
I hit that bug too, but I couldn't figure out why it worked with some nodes and didn't with others (when I was scaling up). The culprit is the use of IDs vs. names :-/. I should have searched Bugzilla more.

Comment 3 Brendan Shephard 2022-03-29 04:31:39 UTC
(In reply to Marian Krcmarik from comment #2)
> I hit that bug too, but I couldn't figure out why it worked with some nodes
> and didn't with others (when I was scaling up). The culprit is the use of
> IDs vs. names :-/. I should have searched Bugzilla more.

FWIW, it took me longer than I care to admit to figure it out as well.

The proposed solution can probably use some optimising, but it should fix the immediate problem for now.

Comment 4 Cédric Jeanneret 2022-06-30 12:07:38 UTC
Hello Brendan,

The patch merged in master back in April. Care to push it to Wallaby? We will most likely miss the July 6th deadline, though. Is it something to consider for 17.1 instead?

Thanks!

Comment 5 Brendan Shephard 2022-06-30 12:12:06 UTC
Ah, done. My bad:
https://review.opendev.org/c/openstack/tripleo-ansible/+/848129

Comment 12 errata-xmlrpc 2022-09-21 12:19:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543

