Hi,

Description of problem:
Ping loss occurs during the update of OSP-16.2.0 to 16.2.passed_phase2 (currently above z4, puddle RHOS-16.2-RHEL-8-20221201.n.1). The loss happens during the ovn-controller update, even when the external_ids:ovn-ofctrl-wait-before-clear parameter is used.

How reproducible:
To reproduce it, generate some traffic before running the ovn-controller update; I'm using iperf to the VM for 1 minute. Then run

    openstack overcloud external-update run -y \
        --stack qe-Cloud-0 \
        --tags ovn 2>&1

with a ping running in parallel, and you get something like:

    --- 10.0.0.183 ping statistics ---
    57 packets transmitted, 50 received, 12.2807% packet loss, time 56914ms
    rtt min/avg/max/mdev = 0.459/1.375/9.151/1.736 ms

Additional info:
The issue has been debugged and traced to the following OVN bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2158626

Until a fix for it is available, we have to make sure that ovn-monitor-all is set before the update:

    ovs-vsctl set open . external_ids:ovn-monitor-all=true

That setting became the default from 16.2.2 on, so an update from an earlier version won't have it set during the update and will experience data plane disruption while ovn-controller is being updated.
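
For anyone applying the workaround by hand, here is a minimal sketch of a pre-update check. It assumes it is run as root on each node that hosts ovn-controller, before the update is started. The helper names normalize_value and needs_monitor_all are illustrative only and not part of any tool. The script reads the current value and sets it only when it is not already true.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of OVS/OVN tooling.

# ovs-vsctl may print string values wrapped in double quotes; strip them.
normalize_value() {
    printf '%s' "$1" | tr -d '"'
}

# Succeeds (exit 0) when ovn-monitor-all still has to be set,
# fails (exit 1) when the given value is already "true".
needs_monitor_all() {
    [ "$(normalize_value "$1")" != "true" ]
}

# Only touch the database when ovs-vsctl is actually available.
if command -v ovs-vsctl >/dev/null 2>&1; then
    # --if-exists avoids an error when the key has never been set.
    current=$(ovs-vsctl --if-exists get open . external_ids:ovn-monitor-all)
    if needs_monitor_all "$current"; then
        ovs-vsctl set open . external_ids:ovn-monitor-all=true
    fi
fi
```

Checking first keeps the script safe to re-run. On nodes where the value is already true it makes no change.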
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.2.5 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:1763