Description of problem: We've seen instances where the migration initially failed because the db wasn't ready yet. This change adds some retrying to the migration to handle that situation more gracefully and efficient, as failing this a couple of times currently results in getting the pod into crashlooop backoff. A sample transient error for initial failures looks like e.G. this: F0511 04:31:57.928725 1 ovndbmanager.go:44] NBDB Upgrade failed: %!w(*fmt.wrapError=&{failed to get schema version for NBDB, stderr: "ovsdb-client: transaction returned error: {\"details\":\"get_schema request specifies database OVN_Northbound which is not yet available because it has not completed joining its cluster\",\"error\":\"database not available\"}\n", error: OVN command '/usr/bin/ovsdb-client -t 10 get-schema-version unix:/var/run/ovn/ovnnb_db.sock OVN_Northbound' failed: exit status 1 0xc00042a160}) Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069