Bug 1886636 - Panic in machine-config-operator
Summary: Panic in machine-config-operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: 4.7.0
Assignee: Kirsten Garrison
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1908534
Reported: 2020-10-09 01:39 UTC by Alan Chan
Modified: 2023-12-15 19:42 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:24:26 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2156 0 None closed Bug 1886636: update ctrcfg crd to make spec & containerruntimeconfig required 2021-02-10 15:57:00 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:24:54 UTC

Description Alan Chan 2020-10-09 01:39:59 UTC
Description of problem:

This is a spin-off of a case that was originally thought to be related to bz#1858026, but further investigation by the engineer suggests it may be something else.

The issue is that the machine-config-controller pod appears to keep crashing on a nil pointer dereference. As far as we know, no upgrade was involved.

Cluster version:
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.12    True        False         22d     Cluster version is 4.4.12

* Pod logs:
$ oc logs machine-config-controller-6b5f7d4658-xst2j
I1001 11:41:17.781882       1 start.go:50] Version: v4.4.0-202007060343.p0-dirty (7e7e7ff90ea1007693a3c4f712747b5c0b832226)
I1001 11:41:17.785040       1 leaderelection.go:242] attempting to acquire leader lease  openshift-machine-config-operator/machine-config-controller...
E1001 11:42:52.441009       1 event.go:319] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"machine-config-controller", GenerateName:"", Namespace:"openshift-machine-config-operator", SelfLink:"/api/v1/namespaces/openshift-machine-config-operator/configmaps/machine-config-controller", UID:"56357f20-002c-484a-a08c-19d1698f4681", ResourceVersion:"34098077", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63733294451, loc:(*time.Location)(0x2d7c560)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"machine-config-controller-6b5f7d4658-xst2j_50faa507-5ac4-4805-96ad-c8af9374f6bc\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2020-10-01T11:42:52Z\",\"renewTime\":\"2020-10-01T11:42:52Z\",\"leaderTransitions\":4732}"}, OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'no kind is registered for the type v1.ConfigMap in scheme "github.com/openshift/machine-config-operator/cmd/common/helpers.go:30"'. Will not report event: 'Normal' 'LeaderElection' 'machine-config-controller-6b5f7d4658-xst2j_50faa507-5ac4-4805-96ad-c8af9374f6bc became leader'
I1001 11:42:52.441076       1 leaderelection.go:252] successfully acquired lease openshift-machine-config-operator/machine-config-controller
I1001 11:42:52.550248       1 node_controller.go:147] Starting MachineConfigController-NodeController
I1001 11:42:52.551521       1 kubelet_config_controller.go:160] Starting MachineConfigController-KubeletConfigController
I1001 11:42:52.551669       1 container_runtime_config_controller.go:189] Starting MachineConfigController-ContainerRuntimeConfigController
E1001 11:42:52.570255       1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 276 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x19bbe60, 0x2d4ff00)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x19bbe60, 0x2d4ff00)
	/opt/rh/go-toolset-1.13/root/usr/lib/go-toolset-1.13-golang/src/runtime/panic.go:679 +0x1b2
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).syncContainerRuntimeConfig.func2(0x101000000000020, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:536 +0x2e8
k8s.io/client-go/util/retry.OnError.func1(0xc000b78a88, 0x47819c, 0x2d54600)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/util/retry/util.go:51 +0x3c
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x5f5e100, 0x0, 0x3ff0000000000000, 0x5, 0x0, 0xc00117fae8, 0x4de384, 0xc000a86340)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:292 +0x51
k8s.io/client-go/util/retry.OnError(0x5f5e100, 0x0, 0x3ff0000000000000, 0x5, 0x0, 0x1d148b8, 0xc000b78d20, 0x0, 0x203000)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/util/retry/util.go:50 +0xa6
k8s.io/client-go/util/retry.RetryOnConflict(...)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/util/retry/util.go:104
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).syncContainerRuntimeConfig(0xc0001919e0, 0xc0004adf50, 0xd, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:522 +0x7d4
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).processNextWorkItem(0xc0001919e0, 0x4a41343643394b00)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:325 +0xff
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).worker(0xc0001919e0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:309 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000704000)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000704000, 0x3b9aca00, 0x0, 0x5256445967447701, 0xc0000bb860)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000704000, 0x3b9aca00, 0xc0000bb860)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).Run
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:193 +0x2e4
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x17af668]

goroutine 276 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x105
panic(0x19bbe60, 0x2d4ff00)
	/opt/rh/go-toolset-1.13/root/usr/lib/go-toolset-1.13-golang/src/runtime/panic.go:679 +0x1b2
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).syncContainerRuntimeConfig.func2(0x101000000000020, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:536 +0x2e8
k8s.io/client-go/util/retry.OnError.func1(0xc000b78a88, 0x47819c, 0x2d54600)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/util/retry/util.go:51 +0x3c
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x5f5e100, 0x0, 0x3ff0000000000000, 0x5, 0x0, 0xc00117fae8, 0x4de384, 0xc000a86340)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:292 +0x51
k8s.io/client-go/util/retry.OnError(0x5f5e100, 0x0, 0x3ff0000000000000, 0x5, 0x0, 0x1d148b8, 0xc000b78d20, 0x0, 0x203000)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/util/retry/util.go:50 +0xa6
k8s.io/client-go/util/retry.RetryOnConflict(...)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/util/retry/util.go:104
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).syncContainerRuntimeConfig(0xc0001919e0, 0xc0004adf50, 0xd, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:522 +0x7d4
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).processNextWorkItem(0xc0001919e0, 0x4a41343643394b00)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:325 +0xff
github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).worker(0xc0001919e0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:309 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000704000)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000704000, 0x3b9aca00, 0x0, 0x5256445967447701, 0xc0000bb860)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc000704000, 0x3b9aca00, 0xc0000bb860)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config.(*Controller).Run
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/container-runtime-config/container_runtime_config_controller.go:193 +0x2e4

Comment 16 Michael Nguyen 2020-12-14 14:46:12 UTC
$ cat << EOF > bad-ctrcfg.yaml
> apiVersion: machineconfiguration.openshift.io/v1
> kind: ContainerRuntimeConfig
> metadata:
>  name: set-pids-limit
> spec:
>  machineConfigPoolSelector:
>    matchLabels:
>      pools.operator.machineconfiguration.openshift.io/worker: ""
> EOF
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-12-14-080124   True        False         30m     Cluster version is 4.7.0-0.nightly-2020-12-14-080124
$ oc create -f bad-ctrcfg.yaml
The ContainerRuntimeConfig "set-pids-limit" is invalid: spec.containerRuntimeConfig: Required value
$ oc get kubletocnfig
error: the server doesn't have a resource type "kubletocnfig"
$ oc get kubeletconfig
No resources found.
$ oc -n openshift-machine-config-operator get kubletconfig
error: the server doesn't have a resource type "kubletconfig"
$ oc create -f invalid-ctrcfg.yaml 
The ContainerRuntimeConfig "set-pids-limit" is invalid: spec: Required value
$ cat invalid-kubeletconfig.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-max-pods
$ oc create -f invalid-kubeletconfig.yaml 
The KubeletConfig "set-max-pods" is invalid: spec: Required value
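The rejections above come from the API server enforcing required fields in the CRD schema. The Go sketch below only imitates that check locally to show which manifest shapes yield "spec" versus "spec.containerRuntimeConfig" as the missing path. The helper firstMissing is hypothetical and is not part of the operator or of any Kubernetes library.

```go
package main

import (
	"fmt"
	"strings"
)

// firstMissing walks a decoded manifest along path and returns the
// dotted name of the first absent (or null) field, or "" if all exist.
func firstMissing(obj map[string]interface{}, path ...string) string {
	cur := obj
	for i, key := range path {
		val, ok := cur[key]
		if !ok || val == nil {
			return strings.Join(path[:i+1], ".")
		}
		if i == len(path)-1 {
			break
		}
		next, isMap := val.(map[string]interface{})
		if !isMap {
			// A scalar cannot contain the next path element.
			return strings.Join(path[:i+2], ".")
		}
		cur = next
	}
	return ""
}

func main() {
	// Manifest with a spec that sets only a pool selector.
	selectorOnly := map[string]interface{}{
		"metadata": map[string]interface{}{"name": "set-pids-limit"},
		"spec": map[string]interface{}{
			"machineConfigPoolSelector": map[string]interface{}{},
		},
	}
	// Manifest with no spec at all.
	noSpec := map[string]interface{}{
		"metadata": map[string]interface{}{"name": "set-pids-limit"},
	}
	for _, m := range []map[string]interface{}{selectorOnly, noSpec} {
		if p := firstMissing(m, "spec", "containerRuntimeConfig"); p != "" {
			fmt.Printf("%s: Required value\n", p)
		}
	}
}
```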

Comment 18 Yu Qi Zhang 2021-01-06 17:05:23 UTC
No doc needed. This doesn't change behaviour, just adds a better error message

Comment 21 errata-xmlrpc 2021-02-24 15:24:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

