Bug 2041830 - CI: ovn-kubernetes-master-e2e-aws-ovn-windows is broken
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Surya Seetharaman
QA Contact: Mike Fiedler
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-18 10:51 UTC by Surya Seetharaman
Modified: 2022-03-10 16:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:40:33 UTC
Target Upstream Version:
Embargoed:




Links
System ID | Private | Priority | Status | Summary | Last Updated
Github openshift ovn-kubernetes pull 913 | 0 | None | open | Bug 2041830: Fix panic in Hybrid Overlay | 2022-01-18 11:45:37 UTC
Red Hat Product Errata RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:40:44 UTC

Description Surya Seetharaman 2022-01-18 10:51:21 UTC
Description of problem:
We are seeing panics in the CI runs:

2022-01-15T19:05:10.881566498Z I0115 19:05:10.881524       1 informer.go:294] Successfully synced 'ci-op-pg859vpf-ed916-p8l86-master-2'
2022-01-15T19:05:10.887132241Z I0115 19:05:10.887098       1 master.go:429] Created hybrid overlay logical route policy for node ci-op-pg859vpf-ed916-p8l86-worker-5slgm
2022-01-15T19:05:10.887274695Z E0115 19:05:10.887239       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
2022-01-15T19:05:10.887274695Z goroutine 856 [running]:
2022-01-15T19:05:10.887274695Z k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1bd5fc0, 0x2e22800)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
2022-01-15T19:05:10.887274695Z k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
2022-01-15T19:05:10.887274695Z panic(0x1bd5fc0, 0x2e22800)
2022-01-15T19:05:10.887274695Z 	/usr/lib/golang/src/runtime/panic.go:965 +0x1b9
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/libovsdbops.findDatapathByPredicate(0x0, 0x0, 0xc0033ab360, 0x0, 0x0, 0x0)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/libovsdbops/datapath.go:17 +0xc0
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/libovsdbops.FindDatapathByExternalIDs(0x0, 0x0, 0xc002d12ba0, 0x4, 0xc001ec8ce8, 0x2c)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/libovsdbops/datapath.go:43 +0x6f
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/util.CreateMACBinding(0x0, 0x0, 0xc0029f7590, 0x2c, 0x1e69f1b, 0x12, 0xc002691508, 0x6, 0x6, 0xc00269155c, ...)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/util/ovn.go:21 +0xb9
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller.(*MasterController).setupHybridLRPolicySharedGw(0xc000cc9f00, 0xc000114758, 0x1, 0x1, 0xc002e1d410, 0x27, 0xc002691508, 0x6, 0x6, 0x0, ...)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller/master.go:433 +0xa5e
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller.(*MasterController).handleOverlayPort(0xc000cc9f00, 0xc002003b00, 0x2107510, 0xc002ca2480, 0x0, 0x7b3a227465477074)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller/master.go:277 +0xd85
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller.(*MasterController).AddNode(0xc000cc9f00, 0xc002003b00, 0xc001d1b920, 0x27)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller/master.go:330 +0x4ac
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller.NewMaster.func1(0x1e32460, 0xc002003b00, 0x27, 0x1e32460)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/hybrid-overlay/pkg/controller/master.go:71 +0x49
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/informer.(*eventHandler).syncHandler(0xc001420ae0, 0xc001d1b920, 0x27, 0xe00000002e62320, 0x5)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/informer/informer.go:335 +0x33e
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/informer.(*eventHandler).processNextWorkItem.func1(0xc001420ae0, 0x1b0fb40, 0xc0013ba610, 0x0, 0x0)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/informer/informer.go:280 +0xea
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/informer.(*eventHandler).processNextWorkItem(0xc001420ae0, 0x203000)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/informer/informer.go:297 +0x49
2022-01-15T19:05:10.887274695Z github.com/ovn-org/ovn-kubernetes/go-controller/pkg/informer.(*eventHandler).runWorker(...)
2022-01-15T19:05:10.887274695Z 	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/informer/informer.go:248
2022-01-15T19:05:10.887274695Z k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0029b6f98)

Version-Release number of selected component (if applicable):


How reproducible:
90% of CI runs:
https://prow.ci.openshift.org/job-history/gs/origin-ci-test/pr-logs/directory/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-windows

We need to backport https://github.com/ovn-org/ovn-kubernetes/pull/2720/commits/51f3d5f669595a8a9efd2e2292faefe359e84543 to fix this issue.
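
For readers unfamiliar with this failure mode: in the trace above, the leading 0x0, 0x0 arguments to CreateMACBinding, FindDatapathByExternalIDs and findDatapathByPredicate are consistent with a nil interface value (for example, an uninitialized database client) being passed down the call chain. The following is a minimal, self-contained Go sketch of that general pattern and of a defensive nil check. It is not the code from the upstream commit; datapathClient, fakeClient, findUnguarded, findGuarded and errNilClient are hypothetical names chosen for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// datapathClient is a hypothetical stand-in for a database client interface.
// It is NOT the real libovsdb client type.
type datapathClient interface {
	List() []string
}

// fakeClient is a toy implementation used only for this illustration.
type fakeClient struct{ datapaths []string }

func (f *fakeClient) List() []string { return f.datapaths }

// findUnguarded calls a method on the client without checking it first.
// With a nil interface value, the method call panics with a nil pointer
// dereference, which is the same class of runtime error seen in the trace.
func findUnguarded(c datapathClient, name string) bool {
	for _, dp := range c.List() {
		if dp == name {
			return true
		}
	}
	return false
}

// errNilClient is a hypothetical sentinel error for the guarded variant.
var errNilClient = errors.New("datapath client is nil")

// findGuarded returns an error instead of panicking when the client is nil.
func findGuarded(c datapathClient, name string) (bool, error) {
	if c == nil {
		return false, errNilClient
	}
	return findUnguarded(c, name), nil
}

// didPanic runs fn and reports whether it panicked.
func didPanic(fn func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	fn()
	return false
}

func main() {
	var c datapathClient // nil interface value, never initialized

	fmt.Println("unguarded call panicked:", didPanic(func() { findUnguarded(c, "node-1") }))

	_, err := findGuarded(c, "node-1")
	fmt.Println("guarded call error:", err)
}
```

A nil check like this only turns the panic into an error the caller can handle. Whether the actual fix guards the call, initializes the client earlier, or does something else is determined by the linked upstream commit, not by this sketch.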

Comment 5 Mike Fiedler 2022-01-26 17:46:45 UTC
Letting this soak longer in CI. General regression testing on the winc cluster was successful, but CI is showing a lot of recent failures for FAIL: TestWMCO/network/Pod_DNS_Resolution

Comment 8 Mike Fiedler 2022-02-03 13:23:34 UTC
Verified. The latest CI runs are passing ~70% of the time, and the remaining failures are different from this one.

Comment 10 errata-xmlrpc 2022-03-10 16:40:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

