Document URL: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.0/html/director_installation_and_usage/assembly_scaling-overcloud-nodes#proc_scaling-down-bare-metal-nodes_scaling-overcloud-nodes

Section Number and Name:
Section Number: 19.4
Section Name: Removing or replacing a Compute node

Describe the issue:
We tried to delete a Compute node that is shut down and unreachable, as described in Section 19.4 of the RHOSP 17.0 document, but the overcloud node delete command failed with an SSH error.

Steps to reproduce:
Steps as mentioned in Section 19.4 of the document.

1. Disable the compute service on the node to be deleted.

(overcloud)[stack@manager ~]$ openstack compute service set overcloud-novacompute-0.example.com nova-compute --disable

2. Verify that the service is disabled.

(overcloud)[stack@manager ~]$ openstack compute service list
+--------------------------------------+----------------+-------------------------------------+----------+----------+-------+----------------------------+
| ID                                   | Binary         | Host                                | Zone     | Status   | State | Updated At                 |
+--------------------------------------+----------------+-------------------------------------+----------+----------+-------+----------------------------+
| 1d50d9f3-c871-4d95-ad1a-1692f98673e9 | nova-compute   | overcloud-novacompute-0.example.com | nova     | disabled | up    | 2022-11-24T12:27:48.000000 |
+--------------------------------------+----------------+-------------------------------------+----------+----------+-------+----------------------------+

3. Power off the Compute node with the baremetal node command.

(undercloud) [stack@manager ~]$ openstack baremetal node power off a53f0d4b-3436-44ac-b83a-74a7413d4863
(undercloud) [stack@manager ~]$ openstack baremetal node list
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name        | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| a53f0d4b-3436-44ac-b83a-74a7413d4863 | compute2    | 39522868-301d-468d-a1e1-33e91a7e6e37 | power off   | active             | False       |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+

4. Check reachability of the node with ping (the node is unreachable).

(undercloud) [stack@manager ~]$ ping 192.168.100.188
PING 192.168.100.188 (192.168.100.188) 56(84) bytes of data.
From 192.168.100.30 icmp_seq=10 Destination Host Unreachable
From 192.168.100.30 icmp_seq=11 Destination Host Unreachable

5. Update the node count in the "overcloud-baremetal-deploy.yaml" file and add the parameter "provisioned: false".

- name: Compute
  count: 1
  instances:
  - hostname: overcloud-novacompute-0
    name: compute2
    provisioned: false

6. Execute the overcloud deployment command. The command failed with an SSH error message.

[WARNING]: Unhandled error in Python interpreter discovery for host 192.168.100.188: Failed to connect to the host via ssh: ssh: connect to host 192.168.100.188 port 22: No route to host
2022-11-24 18:56:27.812466 | 52540049-b3bc-047a-4d73-000000000038 | FATAL | Wait for connection to become available | 192.168.100.188 | error={"changed": false, "elapsed": 2416, "msg": "timed out waiting for ping module test: Data could not be sent to remote host \"192.168.100.188\". Make sure this host can be reached over ssh: ssh: connect to host 192.168.100.188 port 22: No route to host\r\n"}
2022-11-24 18:56:27.817609 | 52540049-b3bc-047a-4d73-000000000038 | TIMING | Wait for connection to become available | 192.168.100.188 | 0:40:23.580331 | 2416.99s

7. Execute the overcloud node delete command. This command also failed, with the error message below.

(overcloud)[stack@manager ~]$ openstack overcloud node delete --stack overcloud --baremetal-deployment /home/stack/templates/overcloud-baremetal-deploy.yaml
2022-11-24 19:09:01.596360 | 52540049-b3bc-81e2-d9b9-00000000000c | TIMING | Expand roles | localhost | 0:00:02.554421 | 2.29s
2022-11-24 19:09:01.611634 | 52540049-b3bc-81e2-d9b9-00000000000d | TASK | Find existing instances
2022-11-24 19:09:04.965473 | 52540049-b3bc-81e2-d9b9-00000000000d | FATAL | Find existing instances | localhost | error={"changed": false, "msg": "Instance overcloud-novacompute-0 is not specified as pre-provisioned (managed: False), and no connection to the baremetal service was provided."}
2022-11-24 19:09:04.968353 | 52540049-b3bc-81e2-d9b9-00000000000d | TIMING | Find existing instances | localhost | 0:00:05.926413 | 3.35s

8. As mentioned in Section 19.4.1 of the document, if the overcloud node delete command fails because of an unreachable node, proceed with the manual node deletion procedure.

9. Set the bare metal node to maintenance mode.

(undercloud) [stack@manager ~]$ openstack baremetal node maintenance set a53f0d4b-3436-44ac-b83a-74a7413d4863

10. Verify the status.

(undercloud) [stack@manager ~]$ openstack baremetal node list
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name        | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| ccf89e1a-b45f-437e-8e9f-3a605b614b1e | compute1    | fadcbce5-e925-4846-a996-f616a6b26ff5 | power on    | active             | False       |
| 825b1dc8-9875-44c5-b32c-2548d74797d4 | controller0 | b146a976-2258-4684-a0b3-157ecfe16beb | power on    | active             | False       |
| 316247a9-a06a-4e29-aefe-9a92ee77eb2c | controller1 | 96021f32-a27f-423b-ab07-ccc211f5875c | power on    | active             | False       |
| aace3156-d428-4464-a590-81d69a15d5d1 | controller2 | 9f802b46-8c21-44d7-b125-fb1700fe5d77 | power on    | active             | False       |
| 777a0672-ce34-49db-93ee-caec2a6dcf03 | storage1    | 05e73efe-139c-4635-9f1d-469832015355 | power on    | active             | False       |
| bc35858a-adf8-437c-8de3-d498bbff621d | storage2    | 17a28eb9-8b23-4c26-b27a-703cab2d50a8 | power on    | active             | False       |
| 54fff889-5d8b-496e-a0fb-affdea006bc1 | storage0    | 440fa182-7dc7-4ed4-bc1e-3695ed34e644 | power on    | active             | False       |
| a53f0d4b-3436-44ac-b83a-74a7413d4863 | compute2    | 39522868-301d-468d-a1e1-33e91a7e6e37 | power off   | active             | True        |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+

11. Delete the network agents for that node.

(overcloud) [stack@manager ~]$ for AGENT in $(openstack network agent list --host overcloud-novacompute-0.example.com -c ID -f value) ; do openstack network agent delete $AGENT ; done
(overcloud) [stack@manager ~]$ openstack network agent list
+--------------------------------------+------------------------------+-------------------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                                | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+-------------------------------------+-------------------+-------+-------+----------------------------+
| 8c0778ab-bf49-4a35-ba0a-4f9921c06536 | OVN Controller Gateway agent | overcloud-controller-0.example.com  |                   | :-)   | UP    | ovn-controller             |
| 8c433437-ee33-4970-a7ab-6777813c987d | OVN Controller agent         | overcloud-novacompute-1.example.com |                   | :-)   | UP    | ovn-controller             |
| 867828e6-b41e-5ba7-94d1-ff9eede53b01 | OVN Metadata agent           | overcloud-novacompute-1.example.com |                   | :-)   | UP    | neutron-ovn-metadata-agent |
| 3a60c798-8325-41ef-9551-b831f714b05b | OVN Controller Gateway agent | overcloud-controller-2.example.com  |                   | :-)   | UP    | ovn-controller             |
| 6f5ab488-3b77-4802-a798-5eb3b997a620 | OVN Controller Gateway agent | overcloud-controller-1.example.com  |                   | :-)   | UP    | ovn-controller             |
+--------------------------------------+------------------------------+-------------------------------------+-------------------+-------+-------+----------------------------+

12. Delete the resource provider for that node.

(overcloud) [stack@manager ~]$ openstack resource provider list
+--------------------------------------+-------------------------------------+------------+
| uuid                                 | name                                | generation |
+--------------------------------------+-------------------------------------+------------+
| 434c4c7a-cbd0-42c4-815c-3d85199f9ee9 | overcloud-novacompute-1.example.com | 699        |
| 54522276-0d1d-491d-9760-ba7863d90611 | overcloud-novacompute-0.example.com | 17         |
+--------------------------------------+-------------------------------------+------------+
(overcloud) [stack@manager ~]$ openstack resource provider delete 54522276-0d1d-491d-9760-ba7863d90611
(overcloud) [stack@manager ~]$ openstack resource provider list
+--------------------------------------+-------------------------------------+------------+
| uuid                                 | name                                | generation |
+--------------------------------------+-------------------------------------+------------+
| 434c4c7a-cbd0-42c4-815c-3d85199f9ee9 | overcloud-novacompute-1.example.com | 699        |
+--------------------------------------+-------------------------------------+------------+

13. Execute the overcloud node delete command again.

(undercloud) [stack@manager ~]$ openstack overcloud node delete --stack overcloud overcloud-novacompute-0
Are you sure you want to delete these overcloud nodes [y/N]? y
[DEPRECATION WARNING]: ANSIBLE_CALLBACK_WHITELIST option, normalizing names to new standard, use ANSIBLE_CALLBACKS_ENABLED instead. This feature will be removed from ansible-core in version 2.15. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
PLAY [Check if required variables are defined] *********************************
skipping: no hosts matched
PLAY [Clear cached facts] ******************************************************
PLAY [Gather facts] ************************************************************
2022-11-25 11:25:22.141140 | 52540049-b3bc-b519-c4be-00000000003f | TASK | Gathering Facts
[WARNING]: Unhandled error in Python interpreter discovery for host overcloud-novacompute-0: Failed to connect to the host via ssh: ssh: connect to host 192.168.100.188 port 22: No route to host
2022-11-25 11:25:47.518861 | 52540049-b3bc-b519-c4be-00000000003f | UNREACHABLE | Gathering Facts | overcloud-novacompute-0
2022-11-25 11:25:47.525352 | 52540049-b3bc-b519-c4be-00000000003f | TIMING | Gathering Facts | overcloud-novacompute-0 | 0:00:25.484844 | 25.38s
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
overcloud-novacompute-0 : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
2022-11-25 11:25:47.537067 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-11-25 11:25:47.538103 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-11-25 11:25:47.539144 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:00:25.498694 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-11-25 11:25:47.540284 | UUID | Info | Host | Task Name | Run Time
2022-11-25 11:25:47.541269 | 52540049-b3bc-b519-c4be-00000000003f | SUMMARY | overcloud-novacompute-0 | Gathering Facts | 25.38s
2022-11-25 11:25:47.542267 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-11-25 11:25:47.543349 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-11-25 11:25:47.544421 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~
2022-11-25 11:25:47.545436 | The following node(s) had failures: overcloud-novacompute-0
2022-11-25 11:25:47.546457 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Again, the node was not deleted from the system and the overcloud node delete command failed.

(undercloud) [stack@manager ~]$ openstack baremetal node list
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name        | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+
| ccf89e1a-b45f-437e-8e9f-3a605b614b1e | compute1    | fadcbce5-e925-4846-a996-f616a6b26ff5 | power on    | active             | False       |
| 825b1dc8-9875-44c5-b32c-2548d74797d4 | controller0 | b146a976-2258-4684-a0b3-157ecfe16beb | power on    | active             | False       |
| 316247a9-a06a-4e29-aefe-9a92ee77eb2c | controller1 | 96021f32-a27f-423b-ab07-ccc211f5875c | power on    | active             | False       |
| aace3156-d428-4464-a590-81d69a15d5d1 | controller2 | 9f802b46-8c21-44d7-b125-fb1700fe5d77 | power on    | active             | False       |
| 777a0672-ce34-49db-93ee-caec2a6dcf03 | storage1    | 05e73efe-139c-4635-9f1d-469832015355 | power on    | active             | False       |
| bc35858a-adf8-437c-8de3-d498bbff621d | storage2    | 17a28eb9-8b23-4c26-b27a-703cab2d50a8 | power on    | active             | False       |
| 54fff889-5d8b-496e-a0fb-affdea006bc1 | storage0    | 440fa182-7dc7-4ed4-bc1e-3695ed34e644 | power on    | active             | False       |
| a53f0d4b-3436-44ac-b83a-74a7413d4863 | compute2    | 39522868-301d-468d-a1e1-33e91a7e6e37 | power off   | active             | True        |
+--------------------------------------+-------------+--------------------------------------+-------------+--------------------+-------------+

Actual Result: The overcloud node delete command fails with an error message.

Expected Result: The overcloud node delete command executes successfully.
If it incurs a procedural change, please include this in the documentation change log.

Thanks,
Andy
(In reply to Andy Stillman from comment #1)
> If it incurs a procedural change, please include this in the documentation
> change log.
>
> Thanks,
> Andy

Hi Andy,

Thanks for the response. Yes, this is a procedural change: the procedure as documented does not work as expected and needs to be updated with the correct steps. We have already given the reproduction steps in the description. We request the Red Hat team to look into it.

>> please include this in the documentation change log.

This statement is not clear to us. Can you explain it a little more? If this bug does not fall in the documentation category, please assign it to the correct owner.

Thanks & Regards
Rahul Kaushal
This bug is linked: https://bugzilla.redhat.com/show_bug.cgi?id=2147614

Hey Harald,

Could you help with this procedure?

Thanks
Fiona
You're running node delete after sourcing overcloudrc [1]. You should source stackrc before running the command. The error is clear:

7. Execute overcloud node delete command, this command also failed below error mesg
"no connection to the baremetal service was provided."

[1] (overcloud)[stack@manager ~]$ openstack overcloud node delete --stack overcloud --baremetal-deployment /home/stack/templates/overcloud-baremetal-deploy.yaml
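The failure above comes from running an undercloud-only command with overcloud credentials loaded. A minimal sketch of a guard against that follows. The function name require_undercloud is hypothetical (it is not part of director), and the check assumes that stackrc exports OS_CLOUD=undercloud while overcloudrc exports OS_CLOUD=overcloud, which is what the (undercloud) and (overcloud) prompt prefixes in this report suggest. Confirm that against your own rc files before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of director: succeed only when the
# undercloud credentials appear to be loaded in the current shell.
# Assumption: stackrc exports OS_CLOUD=undercloud and overcloudrc exports
# OS_CLOUD=overcloud; verify this in your own rc files.
require_undercloud() {
    if [ "${OS_CLOUD:-}" != "undercloud" ]; then
        # Report what is currently loaded and how to fix it.
        echo "Undercloud credentials not loaded (OS_CLOUD='${OS_CLOUD:-}'); run: source ~/stackrc" >&2
        return 1
    fi
    return 0
}
```

One possible use: run "source ~/stackrc", then "require_undercloud && openstack overcloud node delete --stack overcloud --baremetal-deployment /home/stack/templates/overcloud-baremetal-deploy.yaml", so that the delete is attempted only when the undercloud environment is active.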
(In reply to Rabi Mishra from comment #8)
> You're running node delete after sourcing overcloudrc[1]. You should source
> stackrc before running the command. The error is clear
>
> 7. Execute overcloud node delete command, this command also failed below
> error mesg "no connection to the baremetal service was provided."
>
> [1] (overcloud)[stack@manager ~]$ openstack overcloud node delete --stack
> overcloud --baremetal-deployment
> /home/stack/templates/overcloud-baremetal-deploy.yaml

Thanks, Rabi, for pointing out the mistake in our execution procedure. We will run the procedure again after sourcing stackrc and share our findings on this bug.
Dear All,

Thank you for your investigation and analysis of this bug. We repeated the node deletion procedure after sourcing stackrc, and this time the procedure worked according to the steps in the document. The functionality is working as expected, so we are closing this bug.

Thanks & Regards,
Piyush