
Bug 1721560

Summary: OVN controller network agents are not alive after rebooting all controller nodes
Product: Red Hat OpenStack
Reporter: Roman Safronov <rsafrono>
Component: python-networking-ovn
Assignee: Assaf Muller <amuller>
Status: CLOSED DUPLICATE
QA Contact: Eran Kuris <ekuris>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 15.0 (Stein)
CC: apevec, lhh, lmartins, majopela, scohen
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-25 12:10:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Roman Safronov 2019-06-18 14:45:22 UTC
Description of problem:

The OVN controller network agents are reported as not alive after all controllers are powered off and back on, as the following output shows:

(overcloud) [stack@undercloud-0 ~]$ openstack network agent list
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                        |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| cd0be7d2-5965-4bf2-aaaf-d69290558ff8 | OVN Controller agent         | compute-1.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| 3f33c7ef-7226-430f-8e36-11851c2b3e6e | OVN Metadata agent           | compute-1.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 862f5204-0b50-4412-8263-a08a7c762832 | OVN Controller Gateway agent | controller-0.localdomain | n/a               | XXX   | UP    | ovn-controller                |
| 3f8bd2c6-49bc-4a7d-93fe-c31ef0be041e | OVN Controller Gateway agent | controller-1.localdomain | n/a               | XXX   | UP    | ovn-controller                |
| 6636500c-c048-4e00-9dd4-9cd0e0440952 | OVN Controller agent         | compute-0.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| 3fd9452b-bd0b-4dcf-9b44-712ddb98dbe8 | OVN Metadata agent           | compute-0.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 84f405f4-03c8-4a0a-acb5-2beeead4f1c5 | OVN Controller Gateway agent | controller-2.localdomain | n/a               | XXX   | UP    | ovn-controller                |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+


Version-Release number of selected component (if applicable):
15.0-RHEL-8/RHOS_TRUNK-15.0-RHEL-8-20190527.n.0

How reproducible:
100%

Steps to Reproduce:
Use an environment with 3 controllers.
1. Turn off all 3 controllers (I ran 'pcs cluster stop' and then 'shutdown -h now' on each).
2. Turn on all 3 controllers and wait until they are up and ovn-dbs-bundle has a master node. Make sure 'sudo pcs status' shows all resources running properly.
3. Run 'openstack network agent list' and check the Alive column.
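
The check in step 3 can be automated. The following is a minimal Python sketch, assuming the table layout shown in the description above (an "Alive" column in which ":-)" means alive and "XXX" means not alive). The function names parse_agent_table and agents_not_alive are hypothetical helpers for illustration and are not part of any OpenStack tool.

```python
# Sketch: parse the ASCII table printed by 'openstack network agent list'
# and report the agents whose Alive column is not ":-)".
# Assumption: the table layout matches the output quoted in this report.

SAMPLE = """\
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                        |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| cd0be7d2-5965-4bf2-aaaf-d69290558ff8 | OVN Controller agent         | compute-1.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| 3f33c7ef-7226-430f-8e36-11851c2b3e6e | OVN Metadata agent           | compute-1.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
"""


def _split_row(line):
    # Drop the outer "|" characters, then split the cells and trim padding.
    return [cell.strip() for cell in line.strip()[1:-1].split("|")]


def parse_agent_table(text):
    """Return one dict per agent row, keyed by the table header names."""
    # Border lines start with "+", so only header and data rows are kept.
    rows = [line for line in text.splitlines() if line.strip().startswith("|")]
    if not rows:
        return []
    header = _split_row(rows[0])
    return [dict(zip(header, _split_row(row))) for row in rows[1:]]


def agents_not_alive(agents):
    """Return the agents whose Alive column is anything other than ':-)'."""
    return [agent for agent in agents if agent.get("Alive") != ":-)"]


if __name__ == "__main__":
    for agent in agents_not_alive(parse_agent_table(SAMPLE)):
        print(agent["Host"], agent["Agent Type"], agent["Alive"])
```

To use it against a live environment, the captured output of the command would be passed to parse_agent_table in place of SAMPLE.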

Actual results:
'openstack network agent list' shows that the OVN controller agents are not alive.
It is not possible to run a new instance; it is created in ERROR state.

Expected results:
The OVN controller agents are alive.
It is possible to run new instances.

Additional info:

The following error is shown in the ovn-controller log:
2019-06-18T13:24:13Z|00025|physical|ERR|No tunnel endpoint found for HA chassis in HA chassis group of port cr-lrp-e18b45c0-aa77-4908-a7e3-07dba18dc73a
2019-06-18T13:24:13Z|00026|binding|INFO|Releasing lport cr-lrp-7e9e8da9-687d-459e-8ba4-c0b29098254c from this chassis.
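
To pick such messages out of a longer log, the lines can be split on "|". The following is a minimal Python sketch, assuming every line follows the five-field layout seen in the two lines above (timestamp, sequence number, module, level, message). VlogRecord, parse_vlog_line and errors_only are hypothetical names introduced for illustration.

```python
# Sketch: split ovn-controller log lines into fields and keep only errors.
# Assumption: lines look like "timestamp|sequence|module|level|message",
# as in the two lines quoted in this report.
from typing import Iterable, List, NamedTuple, Optional


class VlogRecord(NamedTuple):
    timestamp: str
    sequence: int
    module: str
    level: str
    message: str


def parse_vlog_line(line: str) -> Optional[VlogRecord]:
    """Parse one log line; return None if it does not match the layout."""
    # Split at most 4 times so that "|" inside the message is preserved.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5 or not parts[1].isdigit():
        return None
    timestamp, sequence, module, level, message = parts
    return VlogRecord(timestamp, int(sequence), module, level, message)


def errors_only(lines: Iterable[str]) -> List[VlogRecord]:
    """Return the records whose level is ERR or EMER."""
    records = (parse_vlog_line(line) for line in lines)
    return [r for r in records if r is not None and r.level in ("ERR", "EMER")]


if __name__ == "__main__":
    log = [
        "2019-06-18T13:24:13Z|00025|physical|ERR|No tunnel endpoint found for HA chassis in HA chassis group of port cr-lrp-e18b45c0-aa77-4908-a7e3-07dba18dc73a",
        "2019-06-18T13:24:13Z|00026|binding|INFO|Releasing lport cr-lrp-7e9e8da9-687d-459e-8ba4-c0b29098254c from this chassis.",
    ]
    for record in errors_only(log):
        print(record.timestamp, record.module, record.message)
```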


Note: all instances that were created previously remain running and accessible.
After restarting all ovn-controller containers, the network agents are alive and it is possible to run instances.

Comment 2 Lucas Alvares Gomes 2019-06-25 12:10:26 UTC

*** This bug has been marked as a duplicate of bug 1720947 ***