Bug 2032541

Summary: ovnkube-master in CrashBackOffLoop: OVN_Southbound is not available
Product: OpenShift Container Platform Reporter: Matthew Booth <mbooth>
Component: Networking Assignee: Mohamed Mahmoud <mmahmoud>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED DUPLICATE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bpickard
Version: 4.9   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-12-17 14:02:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Matthew Booth 2021-12-14 17:00:19 UTC
Description of problem:
This is a CI failure during installation. Installation does not complete because networking fails to come up: ovnkube-master crash-loops because OVN_Southbound is not available.

Job:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.10-e2e-openstack-ovn/1470697722712428544.

All logs are available in Artifacts. A specific log of potential interest to start:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.10-e2e-openstack-ovn/1470697722712428544/artifacts/e2e-openstack-ovn/gather-extra/artifacts/pods/openshift-ovn-kubernetes_ovnkube-master-rhjtx_ovn-dbchecker.log

contains:
F1214 11:42:04.051986       1 ovndbmanager.go:58] SBDB Upgrade failed: %!w(*fmt.wrapError=&{failed to get schema version for NBDB, stderr: "ovsdb-client: transaction returned error: {\"details\":\"get_schema request specifies database OVN_Southbound which is not yet available because it has not completed joining its cluster\",\"error\":\"database not available\"}\n", error: OVN command '/usr/bin/ovsdb-client -t 10 get-schema-version unix:/var/run/ovn/ovnsb_db.sock OVN_Southbound' failed: exit status 1 0xc000934c40})
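
For triage, the stderr quoted in the log line above can be checked mechanically for the "database not available" condition. The following is a minimal Go sketch, assuming the stderr keeps the shape shown above (a text prefix followed by one JSON object). `parseOVSDBError` and `isDatabaseNotAvailable` are hypothetical helpers written for illustration; they are not functions from ovn-kubernetes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ovsdbError mirrors the two fields of the JSON object seen in the log above.
type ovsdbError struct {
	Details string `json:"details"`
	Kind    string `json:"error"`
}

// parseOVSDBError extracts and decodes the JSON object embedded in
// ovsdb-client stderr. Hypothetical helper: it assumes the object spans
// from the first '{' to the last '}' in the string.
func parseOVSDBError(stderr string) (ovsdbError, bool) {
	var e ovsdbError
	start := strings.Index(stderr, "{")
	end := strings.LastIndex(stderr, "}")
	if start < 0 || end < start {
		return e, false
	}
	if err := json.Unmarshal([]byte(stderr[start:end+1]), &e); err != nil {
		return e, false
	}
	return e, e.Kind != ""
}

// isDatabaseNotAvailable reports whether stderr carries the
// "database not available" error seen in this bug.
func isDatabaseNotAvailable(stderr string) bool {
	e, ok := parseOVSDBError(stderr)
	return ok && e.Kind == "database not available"
}

func main() {
	// stderr text copied from the log line in this report.
	stderr := "ovsdb-client: transaction returned error: " +
		`{"details":"get_schema request specifies database OVN_Southbound which is not yet available because it has not completed joining its cluster","error":"database not available"}` + "\n"

	fmt.Println(isDatabaseNotAvailable(stderr))
	fmt.Println(isDatabaseNotAvailable("some unrelated failure"))
}
```

A caller could use a check like this to separate the "has not completed joining its cluster" case from other `ovsdb-client` failures. Whether retrying is the right response is a separate question that this sketch does not answer.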

Version-Release number of selected component (if applicable):
4.10

How reproducible:
This job is extremely flaky, but judging by the last few failures, this does not seem to be a common cause.

Comment 2 Matthew Booth 2021-12-17 09:31:08 UTC
We'll continue to look out for it in CI, but unfortunately it's not something we can reproduce on demand. The problem does look like the one referenced, though. Do you want to just mark this one a duplicate?

Comment 4 Mohamed Mahmoud 2021-12-17 14:02:43 UTC

*** This bug has been marked as a duplicate of bug 2022144 ***