Description of problem:
After the overcloud reboot, the OVN Metadata agents are no longer displayed in "openstack network agent list":

[stack@undercloud-0 ~]$ openstack network agent list --host compute-0.redhat.local
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
| ID                                   | Agent Type           | Host                   | Availability Zone | Alive | State | Binary         |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+
| d836828e-bcc8-40fc-81e4-d586d5c15333 | OVN Controller agent | compute-0.redhat.local |                   | :-)   | UP    | ovn-controller |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------+

The metadata agents did not register themselves: only one Chassis_Private record carries the agent's unique id (neutron:ovn-metadata-id):

()[root@controller-0 /]# ovn-sbctl list chassis_private | grep ovn-metadata-id
external_ids : {"neutron:liveness_check_at"="2021-08-04T10:48:41.560397+00:00", "neutron:metadata_liveness_check_at"="2021-08-04T10:48:41.565089+00:00", "neutron:ovn-metadata-id"="33256a55-7157-4cd4-8990-304043d10463", "neutron:ovn-metadata-sb-cfg"="1705"}

The other metadata agents failed to register themselves with their corresponding chassis; their external_ids contain only neutron:ovn-metadata-sb-cfg:

()[root@controller-0 /]# ovn-sbctl list chassis_private | grep ovn-metadata-sb-cfg
external_ids : {"neutron:liveness_check_at"="2021-08-04T10:50:08.984226+00:00", "neutron:ovn-metadata-sb-cfg"="1707"}
external_ids : {"neutron:liveness_check_at"="2021-08-04T10:50:08.977937+00:00", "neutron:ovn-metadata-sb-cfg"="1707"}
external_ids : {"neutron:liveness_check_at"="2021-08-04T10:50:08.991956+00:00", "neutron:metadata_liveness_check_at"="2021-08-04T10:50:08.997057+00:00", "neutron:ovn-metadata-id"="33256a55-7157-4cd4-8990-304043d10463", "neutron:ovn-metadata-sb-cfg"="1707"}

Version-Release number of selected component (if applicable):
()[root@controller-0 /]# rpm -qa | grep ovn
puppet-ovn-15.4.1-1.20210528102649.192ac4e.el8ost.noarch
rhosp-ovn-2.13-12.el8ost.noarch
ovn2.13-20.12.0-149.el8fdp.x86_64
rhosp-ovn-host-2.13-12.el8ost.noarch
ovn2.13-host-20.12.0-149.el8fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Run the deployment job.
2. Reboot all overcloud nodes.

Actual results:
Only the OVN Controller agent is listed for each compute node; the OVN Metadata agents are missing from "openstack network agent list".

Expected results:
All OVN Metadata agents re-register after the reboot and are reported as alive.

Additional info:
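A minimal check that can be scripted on a controller node (a sketch, not taken from the original report: the Chassis_Private table and the neutron:ovn-metadata-id key are those shown in the outputs above, while the loop and the ovn-sbctl --bare/--columns invocation are illustrative):

# Run in the same container/context as the ovn-sbctl commands above.
# Flags every Chassis_Private record whose external_ids lack the
# neutron:ovn-metadata-id key, i.e. chassis whose metadata agent did not
# (re-)register after the reboot.
for uuid in $(ovn-sbctl --bare --columns=_uuid list chassis_private); do
    ids=$(ovn-sbctl --bare --columns=external_ids list chassis_private "$uuid")
    if ! printf '%s\n' "$ids" | grep -q 'neutron:ovn-metadata-id'; then
        name=$(ovn-sbctl --bare --columns=name list chassis_private "$uuid")
        echo "metadata agent not registered on chassis: $name"
    fi
done

From the undercloud, the same condition shows up as a missing "OVN Metadata agent" row in "openstack network agent list --host <compute>", as in the first output above.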
DF still sees the reboot failure in the Phase 3 regression of RHOS-16.1-RHEL-8-20210916.n.0. The failure has been seen in every Phase 3 regression since RHOS-16.1-RHEL-8-20210727.n.1, except for RHOS-16.1-RHEL-8-20210903.n.0, which was an async build with a different code base. Logs from a failing job: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/deployment/job/DFG-df-deployment-16.1-virthost-3cont_3comp_3ceph-yes_UC_SSL-yes_OC_SSL-ceph-ipv4-geneve-reboot-overcloud/50/
The job passes when using RHOS-16.1-RHEL-8-20211007.n.1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3762