Bug 1849166 - [Scale down] OVN agents cannot be deleted and OVS agents get recreated after deletion
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Terry Wilson
QA Contact: Eran Kuris
URL:
Whiteboard:
Duplicates: 1975264 2064794 2068069 2168403
Depends On:
Blocks: 1768678 1841011
Reported: 2020-06-19 18:11 UTC by Archit Modi
Modified: 2023-09-18 00:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1946835 (view as bug list)
Environment:
Last Closed: 2022-08-04 21:12:14 UTC
Target Upstream Version:
Embargoed:
pmannidi: needinfo-


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-1435 0 None None None 2021-11-18 14:36:39 UTC
Red Hat Knowledge Base (Solution) 5393161 0 None None None 2020-09-11 18:40:29 UTC
Red Hat Knowledge Base (Solution) 6958605 0 None None None 2023-03-13 14:36:07 UTC

Description Archit Modi 2020-06-19 18:11:49 UTC
Description of problem: After a compute scale-down on RHOS 16.1, network agents for the deleted compute host still exist. For OVN deployments, these agents cannot be deleted. For OVS deployments, the agents reappear after they have been deleted.

Related Docs BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1841011

How reproducible: always

Steps to Reproduce:
1. Deploy RHOS 16.1 with at least 2 compute nodes
2. Scale down 1 compute host
3. OVN agents for the deleted compute host are still listed (State UP, Alive XXX)

Actual results:
(overcloud) [stack@undercloud-0 ~]$ openstack network agent list
+--------------------------------------+----------------------+---------------------------+-------------------+-------+-------+-------------------------------+
| ID                                   | Agent Type           | Host                      | Availability Zone | Alive | State | Binary                        |
+--------------------------------------+----------------------+---------------------------+-------------------+-------+-------+-------------------------------+
| 79362159-1532-4473-b4cf-295ac7970cb9 | OVN Controller agent | compute-0.redhat.local    | n/a               | XXX   | UP    | ovn-controller                |
| 49991374-00fa-4d70-9dc1-598e9c4c83d9 | OVN Metadata agent   | compute-0.redhat.local    | n/a               | XXX   | UP    | networking-ovn-metadata-agent |
| a0eeb079-4fe1-4bb0-be14-0a57a0c487ce | OVN Controller agent | compute-1.redhat.local    | n/a               | :-)   | UP    | ovn-controller                |
| 2cee6616-96c5-4bd8-89e9-12416e663a3d | OVN Metadata agent   | compute-1.redhat.local    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 8b8b2933-9ab1-4740-9358-fcabf5b4b88e | OVN Controller agent | controller-0.redhat.local | n/a               | :-)   | UP    | ovn-controller                |
+--------------------------------------+----------------------+---------------------------+-------------------+-------+-------+-------------------------------+
(overcloud) [stack@undercloud-0 ~]$  openstack network agent delete 79362159-1532-4473-b4cf-295ac7970cb9
Failed to delete network agent with ID '79362159-1532-4473-b4cf-295ac7970cb9': BadRequestException: 400: Client Error for url: http://10.0.0.148:9696/v2.0/agents/79362159-1532-4473-b4cf-295ac7970cb9, Bad agent request: OVN agents cannot be deleted.
1 of 1 network agents failed to delete.
(overcloud) [stack@undercloud-0 ~]$  openstack network agent delete 49991374-00fa-4d70-9dc1-598e9c4c83d9
Failed to delete network agent with ID '49991374-00fa-4d70-9dc1-598e9c4c83d9': BadRequestException: 400: Client Error for url: http://10.0.0.148:9696/v2.0/agents/49991374-00fa-4d70-9dc1-598e9c4c83d9, Bad agent request: OVN agents cannot be deleted.
1 of 1 network agents failed to delete.

Expected results:
Agents for the deleted node should not exist at all. Their removal should be done as part of the TripleO scale-down tasks.

Additional info:
Documentation: Section 15.3. Removing Compute nodes
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/director_installation_and_usage/scaling-overcloud-nodes

From the related BZ, notes:
```
Dan Macpherson tested this on 16.1 - results:
9. Remove the Open vSwitch agent from the node: 

DDF: agents can't be deleted

Dan: This is true for OVN agents. This step can be deleted.

For OVS agents (not applicable to this procedure, though): You can delete the agent, but it gets recreated after you delete it. So we probably need to find out what keeps recreating the agent and disable it if possible.
```

Comment 26 PURANDHAR SAIRAM MANNIDI 2021-07-23 01:25:08 UTC
*** Bug 1975264 has been marked as a duplicate of this bug. ***

Comment 29 Riccardo Bruzzone 2022-05-12 15:40:52 UTC
Hello,
My customer (Bank Of Italy) is asking for a fix for this problem in OSP 16.2.2 (or the next Z stream).
Could you provide a progress update on this Bugzilla?

Thank you so much
Riccardo

Comment 30 Riccardo Bruzzone 2022-05-12 16:01:01 UTC
Hello,
Regarding the point raised in Comment 25, could you add this Bugzilla (and the solution connected to it) to the "Removing Compute nodes" procedure [1]?

Thank you so much in advance
Riccardo


[1]
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/director_installation_and_usage/assembly_scaling-overcloud-nodes#proc_removing-compute-nodes_scaling-overcloud-nodes

Comment 31 Terry Wilson 2022-05-26 15:51:21 UTC
In general, the agent api is designed so that one *has* to manually call openstack network agent delete when they want an agent to be deleted. It will not (and should not) ever disappear on its own. It should show as down if the agent has not been reachable in DEFAULT.agent_down_time seconds.
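
The down-state rule described above can be sketched roughly as follows. This is an illustrative model only, not Neutron's actual code: `is_agent_alive` and its parameters are hypothetical names, and the 75-second value is an assumption for DEFAULT.agent_down_time that should be checked against the deployment's neutron.conf.

```python
from datetime import datetime, timedelta, timezone

# Assumed value for DEFAULT.agent_down_time in seconds (check neutron.conf).
AGENT_DOWN_TIME = 75


def is_agent_alive(last_heartbeat, now, agent_down_time=AGENT_DOWN_TIME):
    """Return True if the agent reported within agent_down_time seconds.

    Hypothetical helper: it only models the "show as down" rule and never
    removes the agent record, which still needs an explicit delete call.
    """
    return (now - last_heartbeat) <= timedelta(seconds=agent_down_time)


now = datetime(2020, 6, 19, 18, 11, 49, tzinfo=timezone.utc)
recent = now - timedelta(seconds=30)   # agent on a node that still exists
stale = now - timedelta(hours=1)       # agent on a node that was removed

print(is_agent_alive(recent, now))  # True
print(is_agent_alive(stale, now))   # False
```

Under this model, an agent on a removed node would stay in the agent list and only change its Alive column (XXX in the output from the description) until someone deletes it.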

Comment 32 Terry Wilson 2022-05-26 15:54:47 UTC
So if we want those agents to disappear as part of a scale-down operation, then TripleO (or whatever is performing the scale-down procedure) will need to call the agent delete command for the agents that are on those nodes. The docs here https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/director_installation_and_usage/assembly_scaling-overcloud-nodes#proc_removing-compute-nodes_scaling-overcloud-nodes also mention having to call 'openstack network agent delete'.
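
A scale-down tool could implement this step along the lines of the sketch below. It is a minimal illustration, not TripleO's implementation: `delete_agents_for_host` and the injectable `run` parameter are hypothetical, and the `--host`, `-f value` and `-c ID` options are assumed to be available in the installed openstack client. Because the description shows the delete call failing with a 400 for OVN agents on 16.1, failures are collected instead of aborting.

```python
import subprocess


def _run(cmd):
    """Run a CLI command and return (exit code, stdout)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def delete_agents_for_host(host, run=_run):
    """Delete every network agent registered on `host`.

    Hypothetical helper. Returns (deleted_ids, failed_ids) so the caller
    can report agents that the API refused to delete.
    """
    # List only the agent IDs for the given host, one per line.
    code, out = run(["openstack", "network", "agent", "list",
                     "--host", host, "-f", "value", "-c", "ID"])
    if code != 0:
        raise RuntimeError(f"could not list network agents for {host}")

    deleted, failed = [], []
    for agent_id in out.split():
        code, _ = run(["openstack", "network", "agent", "delete", agent_id])
        (deleted if code == 0 else failed).append(agent_id)
    return deleted, failed
```

With overcloud credentials loaded, calling `delete_agents_for_host("compute-0.redhat.local")` after the node has been removed would attempt to delete both agents shown for that host in the description. Passing a different `run` callable makes the function usable in a dry run or a unit test.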

Comment 33 Miro Tomaska 2022-06-09 13:28:40 UTC
[600+ days bug note] Terry to follow up with support if comment 32 is sufficient resolution

Comment 41 Brendan Shephard 2023-02-12 23:55:13 UTC
*** Bug 2168403 has been marked as a duplicate of this bug. ***

Comment 42 Jakub Libosvar 2023-03-13 14:36:08 UTC
*** Bug 2068069 has been marked as a duplicate of this bug. ***

Comment 43 Jakub Libosvar 2023-03-13 14:36:53 UTC
*** Bug 2064794 has been marked as a duplicate of this bug. ***

Comment 44 Red Hat Bugzilla 2023-09-18 00:21:22 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

