Bug 2139393

Summary: [OSP16.2] OVN Metadata agent stuck after crash or reboot
Product: Red Hat OpenStack
Reporter: Ricardo Ramos Thomas <riramos>
Component: puppet-ovn
Assignee: Jakub Libosvar <jlibosva>
Status: CLOSED DUPLICATE
QA Contact: Udi Shkalim <ushkalim>
Severity: high
Docs Contact:
Priority: unspecified
Version: 16.2 (Train)
CC: cshastri, mtomaska
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-08 14:33:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ricardo Ramos Thomas 2022-11-02 11:48:15 UTC
Description of problem:

We are seeing an intermittent issue on random compute nodes: the OVN metadata agent gets stuck and newly spawned instances cannot be accessed. The 'ovn_metadata_agent' container keeps running, but 'ip netns' returns no output even though instances are running on the node. I will attach a sosreport from one such node.

This is observed after a server crashes or is deliberately rebooted, but it does not happen every time.

~~~
[root@cpt-xxx /]# virsh list | grep -i running | wc -l
113
[root@cpt-xxx ~]# ip netns
[root@cpt-xxx ~]#
~~~
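
The manual check above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, it assumes 'virsh' and 'ip' are both reachable from the shell where it runs (on a containerized compute node 'virsh' may only be available inside the libvirt container), and it counts every network namespace rather than only those owned by the metadata agent.

```shell
#!/bin/sh
# Sketch: flag a compute node that has running instances but no
# network namespaces, as in the transcript above.

# Count running libvirt domains (one name per line, blank lines skipped).
count_running_instances() {
    virsh list --state-running --name | grep -c .
}

# Count network namespaces reported by iproute2.
count_netns() {
    ip netns list | grep -c .
}

# Return 0 (stuck) when instances are running but no namespace exists.
# Arguments: <running-instance-count> <namespace-count>
agent_looks_stuck() {
    [ "$1" -gt 0 ] && [ "$2" -eq 0 ]
}

# Hypothetical wrapper tying the pieces together on a live node.
check_node() {
    instances=$(count_running_instances)
    namespaces=$(count_netns)
    if agent_looks_stuck "$instances" "$namespaces"; then
        echo "STUCK: $instances running instances, $namespaces namespaces"
        return 1
    fi
    echo "OK: $instances running instances, $namespaces namespaces"
}
```

Calling 'check_node' on an affected node would report STUCK and return a non-zero status; a node with no running instances is reported as OK, since an empty 'ip netns' is expected there.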

The workaround is to restart the tripleo_ovn_metadata_agent service, after which the network namespaces are created.
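
A minimal sketch of applying the workaround and confirming that it took effect. The helper name, the 60-second default timeout, and the 5-second poll interval are assumptions for illustration, not values taken from this report; only the service name comes from the description above.

```shell
#!/bin/sh
# Hypothetical helper: restart the metadata agent service, then poll
# until at least one network namespace appears or the timeout expires.
# Argument: <timeout-seconds> (defaults to 60, an arbitrary choice).
restart_and_wait() {
    timeout=${1:-60}
    systemctl restart tripleo_ovn_metadata_agent || return 1
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # Any non-empty line from 'ip netns list' counts as a namespace.
        if [ "$(ip netns list | grep -c .)" -gt 0 ]; then
            echo "namespaces present after ${waited}s"
            return 0
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "no namespaces after ${timeout}s" >&2
    return 1
}
```

Run as root on the affected compute node, for example 'restart_and_wait 120'. A non-zero return means either the restart failed or no namespace showed up within the timeout.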


Version-Release number of selected component (if applicable):

RHOSP 16.2.2 (Train)

How reproducible:

Intermittent. Seen after a crash or reboot, but not every time.

Steps to Reproduce:
1.
2.
3.

Actual results:

Intermittently, on random compute nodes, the OVN metadata agent gets stuck and 'ip netns' returns no output.

Expected results:

No errors: after a crash or reboot, the network namespaces are created and 'ip netns' lists them.

Additional info:

Sosreports from the affected nodes are available.