Bug 1989932

Summary: OVN metadata agents did not register in chassis; VMs on the affected compute node will not boot
Product: Red Hat OpenStack
Reporter: Eran Kuris <ekuris>
Component: python-networking-ovn
Assignee: Terry Wilson <twilson>
Status: CLOSED ERRATA
QA Contact: Eran Kuris <ekuris>
Severity: high
Docs Contact:
Priority: high
Version: 16.1 (Train)
CC: apevec, atragler, drosenfe, lhh, lmartins, majopela, scohen, spower, twilson
Target Milestone: z7
Keywords: Regression, Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-networking-ovn-7.3.1-1.20210714143308.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-09 20:20:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Eran Kuris 2021-08-04 11:03:59 UTC
Description of problem:
After the overcloud is rebooted, the OVN metadata agent is not displayed in the
output of openstack network agent list:
[stack@undercloud-0 ~]$ openstack network agent list --host compute-0.redhat.local
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
| ID                                   | Agent Type           | Host                   | Availability Zone | Alive | State | Binary         |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
| d836828e-bcc8-40fc-81e4-d586d5c15333 | OVN Controller agent | compute-0.redhat.local |                   | :-)   | UP    | ovn-controller |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
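
The check above can be scripted. The following is a minimal Python sketch, not part of the original report, that parses the table printed by openstack network agent list and reports hosts that have no metadata agent row. The function names are hypothetical, and the agent type string "OVN Metadata agent" is an assumption about how a healthy metadata agent would be listed; adjust it to match a working deployment. The client's -f json output option would avoid parsing the ASCII table altogether.

```python
# Sample taken from the agent list output in this report.
SAMPLE = """\
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
| ID                                   | Agent Type           | Host                   | Availability Zone | Alive | State | Binary         |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
| d836828e-bcc8-40fc-81e4-d586d5c15333 | OVN Controller agent | compute-0.redhat.local |                   | :-)   | UP    | ovn-controller |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
"""


def parse_agent_table(text):
    """Turn the ASCII table into a list of dicts keyed by column header."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")  # skip the +----+ border lines
    ]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def hosts_without_metadata_agent(agents, metadata_type="OVN Metadata agent"):
    """Return hosts that appear in the list but have no metadata agent row.

    metadata_type is an assumed agent type name, not confirmed by this report.
    """
    all_hosts = {a["Host"] for a in agents}
    covered = {a["Host"] for a in agents if a["Agent Type"] == metadata_type}
    return sorted(all_hosts - covered)


if __name__ == "__main__":
    for host in hosts_without_metadata_agent(parse_agent_table(SAMPLE)):
        print("no metadata agent listed for", host)
```
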
The metadata agent did not register itself: its unique ID was not present in the chassis table. Only one chassis row carries the neutron:ovn-metadata-id key:

()[root@controller-0 /]# ovn-sbctl list chassis_private | grep ovn-metadata-id
external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:48:41.560397+00:00", "neutron:metadata_liveness_check_at"="2021-08-04T10:48:41.565089+00:00", "neutron:ovn-metadata-id"="33256a55-7157-4cd4-8990-304043d10463", "neutron:ovn-metadata-sb-cfg"="1705"}


The metadata agents failed to register themselves with their corresponding chassis:

()[root@controller-0 /]# ovn-sbctl list chassis_private | grep ovn-metadata-sb-cfg
external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:50:08.984226+00:00", "neutron:ovn-metadata-sb-cfg"="1707"}
 external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:50:08.977937+00:00", "neutron:ovn-metadata-sb-cfg"="1707"}
 external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:50:08.991956+00:00", "neutron:metadata_liveness_check_at"="2021-08-04T10:50:08.997057+00:00", "neutron:ovn-metadata-id"="33256a55-7157-4cd4-8990-304043d10463", "neutron:ovn-metadata-sb-cfg"="1707"}
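
As a rough illustration of the comparison above, the following Python sketch (not part of the original report) parses external_ids lines as printed by ovn-sbctl list chassis_private and returns the rows that lack the neutron:ovn-metadata-id key. The helper names are hypothetical, and the regular expression assumes the simple "key"="value" layout seen in this output.

```python
import re

# Matches one "key"="value" pair as printed in the external_ids map above.
PAIR_RE = re.compile(r'"([^"]+)"="([^"]*)"')

# Sample taken from the chassis_private output in this report.
SAMPLE = """\
external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:50:08.984226+00:00", "neutron:ovn-metadata-sb-cfg"="1707"}
 external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:50:08.977937+00:00", "neutron:ovn-metadata-sb-cfg"="1707"}
 external_ids        : {"neutron:liveness_check_at"="2021-08-04T10:50:08.991956+00:00", "neutron:metadata_liveness_check_at"="2021-08-04T10:50:08.997057+00:00", "neutron:ovn-metadata-id"="33256a55-7157-4cd4-8990-304043d10463", "neutron:ovn-metadata-sb-cfg"="1707"}
"""


def parse_external_ids(line):
    """Return the key/value map found on one external_ids output line."""
    return dict(PAIR_RE.findall(line))


def missing_metadata_id(lines):
    """Return the parsed rows that have no neutron:ovn-metadata-id key."""
    rows = [
        parse_external_ids(line)
        for line in lines
        if line.strip().startswith("external_ids")
    ]
    return [row for row in rows if "neutron:ovn-metadata-id" not in row]


if __name__ == "__main__":
    unregistered = missing_metadata_id(SAMPLE.splitlines())
    print(len(unregistered), "chassis row(s) without a registered metadata agent")
```

For the sample above, two of the three rows are reported as missing the key.
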


Version-Release number of selected component (if applicable):
()[root@controller-0 /]# rpm -qa | grep ovn 
puppet-ovn-15.4.1-1.20210528102649.192ac4e.el8ost.noarch
rhosp-ovn-2.13-12.el8ost.noarch
ovn2.13-20.12.0-149.el8fdp.x86_64
rhosp-ovn-host-2.13-12.el8ost.noarch
ovn2.13-host-20.12.0-149.el8fdp.x86_64


How reproducible:

100%
Steps to Reproduce:
1. Run the deployment job.
2. Reboot all overcloud nodes.

Actual results:


Expected results:


Additional info:

Comment 8 David Rosenfeld 2021-09-17 13:33:26 UTC
DF still sees the reboot failure in the Phase 3 regression of RHOS-16.1-RHEL-8-20210916.n.0. The failure has been seen in every Phase 3 regression since RHOS-16.1-RHEL-8-20210727.n.1, except for RHOS-16.1-RHEL-8-20210903.n.0, which was an async build with a different code base.

Logs from a failing job: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/deployment/job/DFG-df-deployment-16.1-virthost-3cont_3comp_3ceph-yes_UC_SSL-yes_OC_SSL-ceph-ipv4-geneve-reboot-overcloud/50/

Comment 15 Eran Kuris 2021-10-11 10:52:49 UTC
The test passes when using RHOS-16.1-RHEL-8-20211007.n.1.

Comment 25 errata-xmlrpc 2021-12-09 20:20:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3762