Bug 2032541 - ovnkube-master in CrashBackOffLoop: OVN_Southbound is not available
Summary: ovnkube-master in CrashBackOffLoop: OVN_Southbound is not available
Keywords:
Status: CLOSED DUPLICATE of bug 2022144
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Mohamed Mahmoud
QA Contact: Anurag Saxena
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-14 17:00 UTC by Matthew Booth
Modified: 2021-12-17 14:02 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-17 14:02:43 UTC
Target Upstream Version:
Embargoed:



Description Matthew Booth 2021-12-14 17:00:19 UTC
Description of problem:
This is a CI failure during installation. Installation doesn't complete because networking fails to come up: ovnkube-master is in CrashBackOffLoop because OVN_Southbound is not available.

Job:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.10-e2e-openstack-ovn/1470697722712428544

All logs are available in Artifacts. A specific log of potential interest to start:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-shiftstack-shiftstack-ci-main-periodic-4.10-e2e-openstack-ovn/1470697722712428544/artifacts/e2e-openstack-ovn/gather-extra/artifacts/pods/openshift-ovn-kubernetes_ovnkube-master-rhjtx_ovn-dbchecker.log

contains:
F1214 11:42:04.051986       1 ovndbmanager.go:58] SBDB Upgrade failed: %!w(*fmt.wrapError=&{failed to get schema version for NBDB, stderr: "ovsdb-client: transaction returned error: {\"details\":\"get_schema request specifies database OVN_Southbound which is not yet available because it has not completed joining its cluster\",\"error\":\"database not available\"}\n", error: OVN command '/usr/bin/ovsdb-client -t 10 get-schema-version unix:/var/run/ovn/ovnsb_db.sock OVN_Southbound' failed: exit status 1 0xc000934c40})
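
For triage, the stderr embedded in that log line can be picked apart mechanically. The following is only a minimal sketch, not code from ovn-kubernetes: the function names are hypothetical, and it assumes the ovsdb-client message keeps the "transaction returned error: <JSON>" shape seen above. It reports whether a failure is the "database has not completed joining its cluster" case rather than some other ovsdb-client error.

```python
import json

# Prefix taken from the stderr in the log line above.
PREFIX = "ovsdb-client: transaction returned error: "


def parse_ovsdb_client_error(stderr):
    """Return the JSON error object from ovsdb-client stderr, or None.

    Hypothetical helper: assumes the JSON object follows PREFIX on one line.
    """
    for line in stderr.splitlines():
        if line.startswith(PREFIX):
            try:
                payload = json.loads(line[len(PREFIX):])
            except json.JSONDecodeError:
                return None
            return payload if isinstance(payload, dict) else None
    return None


def is_db_still_joining(stderr):
    """True if the error says the database has not finished joining its cluster."""
    err = parse_ovsdb_client_error(stderr)
    if err is None:
        return False
    return (
        err.get("error") == "database not available"
        and "has not completed joining its cluster" in str(err.get("details", ""))
    )


# stderr copied from the log line above, with the Go string escaping removed.
sample = (
    "ovsdb-client: transaction returned error: "
    '{"details":"get_schema request specifies database OVN_Southbound '
    "which is not yet available because it has not completed joining its cluster\","
    '"error":"database not available"}\n'
)

print(is_db_still_joining(sample))
```

Matching on the "error" and "details" fields rather than the whole string keeps the check independent of which database name appears in the message.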

Version-Release number of selected component (if applicable):
4.10

How reproducible:
This job is extremely flaky, but judging by the last few failures, this doesn't seem to be a common cause.

Comment 2 Matthew Booth 2021-12-17 09:31:08 UTC
We'll continue to look out for it in CI, but unfortunately it's not something we can reproduce on demand. The problem does look like the one referenced, though. Do you want to just mark this one a duplicate?

Comment 4 Mohamed Mahmoud 2021-12-17 14:02:43 UTC

*** This bug has been marked as a duplicate of bug 2022144 ***

