Bug 1822803
| Summary: | multus-cni crash on 4.5 with nil reference in client-go | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ryan Phillips <rphillips> |
| Component: | Networking | Assignee: | Tomofumi Hayashi <tohayash> |
| Networking sub component: | multus | QA Contact: | Weibin Liang <weliang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | urgent | CC: | bbennett, dosmith |
| Version: | 4.5 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.5.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-07-13 17:26:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Ryan Phillips
2020-04-09 21:44:50 UTC
Here is the PR upstream: https://github.com/kubernetes/kubernetes/pull/90034

Here is related code from multus.go

```
// send kubernetes events
if delegate.Name != "" {
	kubeClient.Eventf(pod, v1.EventTypeNormal, "AddedInterface", "Add %s %v from %s", rt.IfName, ips, delegate.Name)
} else {
	kubeClient.Eventf(pod, v1.EventTypeNormal, "AddedInterface", "Add %s %v", rt.IfName, ips)
```

I am not familiar with multus - the above should be examined.

PR #90034 would be good to have - k8s shouldn't panic even if GetReference() is unable to proceed.

I tried the following addition to staging/src/k8s.io/client-go/tools/reference/ref_test.go but was unable to reproduce the panic:

```
func TestGetReferencePod(t *testing.T) {
	tests := []struct {
		name               string
		input              *v1.Pod
		groupVersion       schema.GroupVersion
		expectedRefVersion string
	}{
		{
			name: "Pod",
			input: &v1.Pod{
				Status: v1.PodStatus{
					ContainerStatuses: []v1.ContainerStatus{
						{
							Name: "foo",
						},
					},
				},
			},
			groupVersion: schema.GroupVersion{Group: "", Version: "v1"},
		},
	}

	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			scheme := runtime.NewScheme()
			scheme.AddKnownTypes(test.groupVersion, &v1.Pod{})
			if _, err := GetReference(scheme, test.input); err != nil {
			}
			if test.expectedRefVersion != "v1" {
				t.Errorf("expected %q, got %q", test.expectedRefVersion, "v1")
			}
		})
	}
}
```

Stepping into the GetReference() call, I saw <autogenerated>:2 when trying to step into v1.(*Pod).GetObjectKind() - though GoLand didn't show me the generated code.
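For illustration only: one way a nil check can be bypassed in Go is a typed nil pointer stored in an interface, where a plain `obj == nil` comparison is false. Whether that is what happened in this crash is not established in this report. The sketch below uses only the standard library; `Object`, `Pod`, `isNil`, `getReference` and `errNilObject` are hypothetical stand-ins, not the client-go API.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// Object is a hypothetical stand-in for an interface such as runtime.Object.
type Object interface {
	Kind() string
}

// Pod is a hypothetical stand-in for v1.Pod.
type Pod struct {
	kind string
}

// Kind dereferences the receiver, so calling it on a nil *Pod panics.
func (p *Pod) Kind() string { return p.kind }

// errNilObject is a hypothetical sentinel error for nil input.
var errNilObject = errors.New("can't reference a nil object")

// isNil reports whether obj is a nil interface or wraps a nil pointer.
func isNil(obj Object) bool {
	if obj == nil {
		return true
	}
	v := reflect.ValueOf(obj)
	return v.Kind() == reflect.Ptr && v.IsNil()
}

// getReference returns an error instead of panicking on nil input.
func getReference(obj Object) (string, error) {
	if isNil(obj) {
		return "", errNilObject
	}
	return obj.Kind(), nil
}

func main() {
	var pod *Pod         // typed nil pointer
	var obj Object = pod // interface now holds (type=*Pod, value=nil)

	fmt.Println(obj == nil) // false: the plain comparison misses the typed nil

	_, err := getReference(obj)
	fmt.Println(err) // the guarded version reports an error rather than panicking
}
```

A caller such as the event code quoted above could apply the same kind of guard to `pod` before emitting an event, again assuming a nil pod is the trigger.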
I noted that multus seems to be using go 1.12 while k8s master uses go 1.13.

I went over vendor/k8s.io/api/core/v1/generated.pb.go under multus. There were Pod methods such as Pod#ProtoMessage() but I didn't find Pod#GetObjectKind.

Looking at staging/src/k8s.io/client-go/tools/reference/ref.go from k8s, the following code:

```
var listMeta metav1.Common
objectMeta, err := meta.Accessor(obj)
if err != nil {
	listMeta, err = meta.CommonAccessor(obj)
```

is not in vendor/k8s.io/client-go/tools/reference/ref.go

It seems the k8s dependency in multus should be upgraded. However, the golang version difference seems to be a hurdle.

Issue upstream: https://github.com/intel/multus-cni/issues/481

Apparently the downstream build should be golang 1.13.
Luke Meyer from ART advises:
> ART OCP builds ignore what's in your dockerfile
> and overwrite it with golang-builder:1.13
> so downstream builds should already be 1.13
Will look into updating the Dockerfile to golang 1.13.
After chatting with Tomofumi-san: Proceed with: https://github.com/intel/multus-cni/pull/482

Tomofumi-san will add more logging in another PR to give us more clues if the crash happens again.

There is no way to verify this bz because we still do not know how to reproduce it. QE will verify this bug now and reopen it if CI testing fails in the future.

If the crash is no longer seen in the master log, can this be resolved?

I believe so. Let me check once more sometime next week.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409