Bug 1721560 - OVN controller network agents are not alive after rebooting all controller nodes
Keywords:
Status: CLOSED DUPLICATE of bug 1720947
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Assaf Muller
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-06-18 14:45 UTC by Roman Safronov
Modified: 2019-09-09 13:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-25 12:10:26 UTC
Target Upstream Version:
Embargoed:



Description Roman Safronov 2019-06-18 14:45:22 UTC
Description of problem:

OVN controller network agents are not alive after powering all controllers off and back on, see below:

(overcloud) [stack@undercloud-0 ~]$ openstack network agent list
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                        |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| cd0be7d2-5965-4bf2-aaaf-d69290558ff8 | OVN Controller agent         | compute-1.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| 3f33c7ef-7226-430f-8e36-11851c2b3e6e | OVN Metadata agent           | compute-1.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 862f5204-0b50-4412-8263-a08a7c762832 | OVN Controller Gateway agent | controller-0.localdomain | n/a               | XXX   | UP    | ovn-controller                |
| 3f8bd2c6-49bc-4a7d-93fe-c31ef0be041e | OVN Controller Gateway agent | controller-1.localdomain | n/a               | XXX   | UP    | ovn-controller                |
| 6636500c-c048-4e00-9dd4-9cd0e0440952 | OVN Controller agent         | compute-0.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| 3fd9452b-bd0b-4dcf-9b44-712ddb98dbe8 | OVN Metadata agent           | compute-0.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 84f405f4-03c8-4a0a-acb5-2beeead4f1c5 | OVN Controller Gateway agent | controller-2.localdomain | n/a               | XXX   | UP    | ovn-controller                |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+-------------------------------+
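For anyone who wants to check this state from a script rather than by eye, here is a minimal sketch that parses the table printed above and lists the agents whose Alive column is not ':-)'. The function names parse_agent_table and dead_agents are hypothetical helpers, not part of any OpenStack library, and the sketch assumes the plain ASCII table layout shown in this report.

```python
def parse_agent_table(text):
    """Parse the ASCII table from 'openstack network agent list' into dicts.

    Assumes the layout shown in this report: '|'-delimited rows, with the
    first such row being the column header.
    """
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip '+---+' borders, shell prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' row carries the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def dead_agents(rows):
    """Return the rows whose Alive column is anything other than ':-)'."""
    return [row for row in rows if row.get("Alive") != ":-)"]
```

Fed the output above, dead_agents(parse_agent_table(output)) would return the five ovn-controller rows marked XXX and skip the two metadata agents.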


Version-Release number of selected component (if applicable):
15.0-RHEL-8/RHOS_TRUNK-15.0-RHEL-8-20190527.n.0

How reproducible:
100%

Steps to Reproduce:
Using an environment with 3 controllers. 
1. Turn off all 3 controllers (I ran 'pcs cluster stop' then 'shutdown -h now' on each)
2. Turn on all 3 controllers and wait until they are up and ovn-dbs-bundle has a master node. Make sure 'sudo pcs status' shows all resources running properly.
3. Check 'openstack network agent list'
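Step 3 can also be run as a timed check instead of a one-off command. The following is a rough sketch, not a tested reproducer: wait_for_agents_alive and its fetch_rows argument are hypothetical names, and fetch_rows is assumed to be any callable that returns one dict per agent with an "Alive" key (for example, something built around 'openstack network agent list'). Both the ':-)' marker from the table output and a boolean True are accepted as "alive", since the exact value depends on the output format used.

```python
import time


def wait_for_agents_alive(fetch_rows, timeout=600.0, interval=15.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll until every agent reports alive, or until the timeout expires.

    fetch_rows is a caller-supplied (hypothetical) callable returning a list
    of dicts, one per agent, each with an "Alive" key.
    Returns True if all agents became alive in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        rows = fetch_rows()
        dead = [row for row in rows if row.get("Alive") not in (True, ":-)")]
        if rows and not dead:
            return True  # at least one agent listed and none reported dead
        if clock() >= deadline:
            return False  # still dead (or empty) agents when time ran out
        sleep(interval)
```

With the behaviour described in this report the function would be expected to return False, because the ovn-controller agents stay at XXX until the containers are restarted.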

Actual results:
'openstack network agent list' shows that the OVN controller agents are not alive.
It is not possible to launch a new instance; it is created in ERROR state.

Expected results:
OVN controller agents are alive.
It is possible to launch new instances.

Additional info:

The following error is shown in the ovn-controller log:
2019-06-18T13:24:13Z|00025|physical|ERR|No tunnel endpoint found for HA chassis in HA chassis group of port cr-lrp-e18b45c0-aa77-4908-a7e3-07dba18dc73a
2019-06-18T13:24:13Z|00026|binding|INFO|Releasing lport cr-lrp-7e9e8da9-687d-459e-8ba4-c0b29098254c from this chassis.
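The two log lines above use a '|'-separated layout (timestamp, sequence number, module, level, message). As a small sketch for pulling the error-level entries out of a longer ovn-controller log, assuming every line of interest follows that same five-field layout, something like the following could be used. LogRecord, parse_ovs_log_line and errors are hypothetical names chosen for illustration.

```python
from collections import namedtuple

# Field names are illustrative labels for the five '|'-separated parts.
LogRecord = namedtuple("LogRecord", "timestamp seq module level message")


def parse_ovs_log_line(line):
    """Split one log line into its five fields; return None if it does not fit."""
    # maxsplit=4 keeps any '|' inside the message text intact
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return LogRecord(*parts)


def errors(lines):
    """Return the parsed records whose level is ERR or EMER."""
    records = (parse_ovs_log_line(line) for line in lines)
    return [rec for rec in records if rec and rec.level in ("ERR", "EMER")]
```

Applied to the two lines above, errors() would keep only the "No tunnel endpoint found" record from the physical module and drop the INFO line from binding.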


Note: all previously created instances remain running and accessible.
After restarting all ovn-controller containers, the network agents are alive and it is possible to launch instances.

Comment 2 Lucas Alvares Gomes 2019-06-25 12:10:26 UTC

*** This bug has been marked as a duplicate of bug 1720947 ***

