Bug 1772680

Summary: machine-config operator's controller segfaults during 4.2.4 upgrade
Product: OpenShift Container Platform
Component: Machine Config Operator
Version: 4.2.0
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Target Release: 4.2.z
Hardware: Unspecified
OS: Unspecified
Reporter: Samuel <smoro>
Assignee: Antonio Murdaca <amurdaca>
QA Contact: Michael Nguyen <mnguyen>
CC: amurdaca, ChetRHosey, clasohm, fan-wxa, jkaur, jmalde, kgarriso, ltitov, mfuruta, rbost, rekhan, rh-container, ssadhale
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2020-01-22 10:46:40 UTC
Clone Of:
: 1775009 1775013
Bug Depends On: 1775009
Bug Blocks: 1772490, 1775013

Description Samuel 2019-11-14 20:57:29 UTC
Description of problem:

The upgrade from 4.2.2 to 4.2.4 is stuck while upgrading the machine-config operator.

Version-Release number of selected component (if applicable):

4.2.2 to 4.2.4, on bare metal

How reproducible:

Unclear

Steps to Reproduce:
1. Proceed with upgrade

Actual results:

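The upgrade is stuck: the machine-config cluster operator stays at 4.2.2 and reports Degraded, while the machine-config-controller pod keeps restarting after a nil pointer panic.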

Expected results:

...

Additional info:

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.2.4     True        False         False      4d8h
cloud-credential                           4.2.4     True        False         False      26d
cluster-autoscaler                         4.2.4     True        False         False      26d
console                                    4.2.4     True        False         False      38m
dns                                        4.2.4     True        False         False      26d
image-registry                             4.2.4     True        False         False      46m
ingress                                    4.2.4     True        False         False      4d8h
insights                                   4.2.4     True        False         False      26d
kube-apiserver                             4.2.4     True        False         False      26d
kube-controller-manager                    4.2.4     True        False         False      26d
kube-scheduler                             4.2.4     True        False         False      26d
machine-api                                4.2.4     True        False         False      26d
machine-config                             4.2.2     False       True          True       31m
marketplace                                4.2.4     True        False         False      45m
monitoring                                 4.2.4     True        False         False      37m
network                                    4.2.4     True        False         False      26d
node-tuning                                4.2.4     True        False         False      48m
openshift-apiserver                        4.2.4     True        False         False      15d
openshift-controller-manager               4.2.4     True        False         False      26d
openshift-samples                          4.2.4     True        False         False      47m
operator-lifecycle-manager                 4.2.4     True        False         False      26d
operator-lifecycle-manager-catalog         4.2.4     True        False         False      26d
operator-lifecycle-manager-packageserver   4.2.4     True        False         False      38m
service-ca                                 4.2.4     True        False         False      26d
service-catalog-apiserver                  4.2.4     True        False         False      26d
service-catalog-controller-manager         4.2.4     True        False         False      26d
storage                                    4.2.4     True        False         False      48m

$ oc describe co machine-config
Name:         machine-config
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2019-10-19T13:02:11Z
  Generation:          1
  Resource Version:    19925870
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/machine-config
  UID:                 aa5b767a-f270-11e9-a34e-525400e1605b
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-11-14T20:19:16Z
    Message:               Cluster not available for 4.2.4
    Status:                False
    Type:                  Available
    Last Transition Time:  2019-11-14T20:21:20Z
    Message:               Working towards 4.2.4
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2019-11-14T20:19:16Z
    Message:               Unable to apply 4.2.4: timed out waiting for the condition during waitForControllerConfigToBeCompleted: controllerconfig is not completed: status for ControllerConfig machine-config-controller is being reported for 2, expecting it for 3
    Reason:                MachineConfigControllerFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2019-10-19T13:03:26Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:
  Related Objects:
    Group:     
    Name:      openshift-machine-config-operator
    Resource:  namespaces
    Group:     machineconfiguration.openshift.io
    Name:      master
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:      worker
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:      machine-config-controller
    Resource:  controllerconfigs
  Versions:
    Name:     operator
    Version:  4.2.2
Events:       <none>
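
For readers following the Degraded message above ("being reported for 2, expecting it for 3"): it reads like a generation check, where the operator waits until the controller has reported status for the latest revision of the ControllerConfig. The sketch below illustrates that kind of check under stated assumptions. It assumes the two numbers are status.observedGeneration and metadata.generation; the type and function names are hypothetical and this is not the operator's actual code.

```go
package main

import "fmt"

// controllerConfig is a hypothetical, trimmed-down stand-in for the
// ControllerConfig resource; only the two counters being compared are kept.
type controllerConfig struct {
	Name               string
	Generation         int64 // assumed: metadata.generation, bumped on each spec change
	ObservedGeneration int64 // assumed: status.observedGeneration, written by the controller
}

// checkCompleted returns an error while the controller has not yet reported
// status for the latest generation of the spec.
func checkCompleted(cc controllerConfig) error {
	if cc.ObservedGeneration != cc.Generation {
		return fmt.Errorf("status for ControllerConfig %s is being reported for %d, expecting it for %d",
			cc.Name, cc.ObservedGeneration, cc.Generation)
	}
	return nil
}

func main() {
	cc := controllerConfig{Name: "machine-config-controller", Generation: 3, ObservedGeneration: 2}
	fmt.Println(checkCompleted(cc)) // controller is one generation behind

	cc.ObservedGeneration = 3
	fmt.Println(checkCompleted(cc)) // controller has caught up
}
```

Under this reading, a controller that crashes before it can update its status would leave the observed generation behind indefinitely, which would match the timeout reported here.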

$ oc get pods
NAME                                         READY   STATUS    RESTARTS   AGE
etcd-quorum-guard-6899c458b4-57bh8           1/1     Running   1          15d
etcd-quorum-guard-6899c458b4-m5xzc           1/1     Running   1          15d
etcd-quorum-guard-6899c458b4-qtk6f           1/1     Running   1          15d
machine-config-controller-746ddfd848-vw5n6   1/1     Running   6          17m
machine-config-daemon-4gdfs                  1/1     Running   0          28m
machine-config-daemon-6g8c2                  1/1     Running   0          28m
machine-config-daemon-6j6sq                  1/1     Running   0          29m
machine-config-daemon-6tkpl                  1/1     Running   0          27m
machine-config-daemon-7k5fv                  1/1     Running   0          28m
machine-config-daemon-8xmqf                  1/1     Running   0          30m
machine-config-daemon-9n986                  1/1     Running   0          29m
machine-config-daemon-fcrrc                  1/1     Running   0          30m
machine-config-daemon-lkxkg                  1/1     Running   0          29m
machine-config-daemon-nx2qh                  1/1     Running   0          27m
machine-config-daemon-pbw9t                  1/1     Running   0          27m
machine-config-operator-65dbcdb7db-q2m4m     1/1     Running   0          32m
machine-config-server-7jwzh                  1/1     Running   2          15d
machine-config-server-9zqdq                  1/1     Running   2          15d
machine-config-server-blxx9                  1/1     Running   2          15d

$ oc logs -f machine-config-controller-746ddfd848-vw5n6 -p
I1114 20:47:04.029470       1 start.go:50] Version: v4.2.4-201911050122-dirty (55bb5fc17da0c3d76e4ee6a55732f0cba93e8520)
E1114 20:48:59.705026       1 event.go:247] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"machine-config-controller", GenerateName:"", Namespace:"openshift-machine-config-operator", SelfLink:"/api/v1/namespaces/openshift-machine-config-operator/configmaps/machine-config-controller", UID:"b114d4a4-071e-11ea-a595-52540079f30f", ResourceVersion:"19925042", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63709360698, loc:(*time.Location)(0x28f62e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"machine-config-controller-746ddfd848-vw5n6_ea0cb2d6-071f-11ea-a14e-0a580a8100cd\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2019-11-14T20:48:59Z\",\"renewTime\":\"2019-11-14T20:48:59Z\",\"leaderTransitions\":4}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'no kind is registered for the type v1.ConfigMap in scheme "github.com/openshift/machine-config-operator/cmd/common/helpers.go:30"'. Will not report event: 'Normal' 'LeaderElection' 'machine-config-controller-746ddfd848-vw5n6_ea0cb2d6-071f-11ea-a14e-0a580a8100cd became leader'
E1114 20:48:59.759629       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:522
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:82
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/signal_unix.go:390
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/store.go:84
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:604
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:615
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:125
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:397
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:115
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:195
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:554
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:1333
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xe60bf5]

goroutine 257 [running]:
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x108
panic(0x15aec80, 0x28dbcd0)
	/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513 +0x1b9
github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1.(*MachineConfigPool).GetNamespace(0x0, 0x0, 0x19e98a0)
	<autogenerated>:1 +0x5
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.MetaNamespaceKeyFunc(0x1778ea0, 0x0, 0xc0005669c0, 0x1528080, 0xc000c75630, 0x12a05f200)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/store.go:84 +0x114
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.DeletionHandlingMetaNamespaceKeyFunc(0x1778ea0, 0x0, 0xc000c75630, 0x12a05f200, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261 +0x6a
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueAfter(0xc000562140, 0x0, 0x12a05f200)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:604 +0x41
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueDefault(0xc000562140, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:615 +0x44
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueDefault-fm(0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:125 +0x34
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).addNode(0xc000562140, 0x1794e00, 0xc0003672c0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:397 +0x1a1
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).addNode-fm(0x1794e00, 0xc0003672c0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:115 +0x3e
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc0003380b0, 0xc0003380c0, 0xc0003380d0, 0x1794e00, 0xc0003672c0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:195 +0x49
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0xc0009ae000, 0xc00095c190)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:554 +0x21d
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000963e18, 0x429692, 0xc00095c1c0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265 +0x51
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548 +0x79
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00099b768)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000963f68, 0xdf8475800, 0x0, 0x1582101, 0xc000766180)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc00099b768, 0xdf8475800, 0xc000766180)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc00035c500)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546 +0x8d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run-fm()
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390 +0x2a
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc000452060, 0xc0009ac000)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62

I tried deleting the machine-config-controller ConfigMap, but it gets re-created. It is unclear what is going on.
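
Reading the trace above: enqueueDefault is called from addNode with 0x0 as its pool argument, and the panic is raised in (*MachineConfigPool).GetNamespace with a nil receiver, reached through MetaNamespaceKeyFunc. That suggests a nil pool pointer was handed to the key function. Why the pool was nil is not established by this report. The sketch below reproduces only the general Go mechanism, using hypothetical stand-in types (pool, keyFunc, safeKey) rather than the real MachineConfigPool or client-go code.

```go
package main

import (
	"errors"
	"fmt"
)

// pool is a hypothetical stand-in for MachineConfigPool.
type pool struct {
	namespace, name string
}

// GetNamespace dereferences its receiver, so calling it on a nil *pool panics.
func (p *pool) GetNamespace() string { return p.namespace }

// GetName dereferences its receiver in the same way.
func (p *pool) GetName() string { return p.name }

// namespaced is the minimal accessor set a key function needs.
type namespaced interface {
	GetNamespace() string
	GetName() string
}

// keyFunc builds a "namespace/name" key (or just "name" for cluster-scoped
// objects), roughly the shape of a cache key function.
func keyFunc(obj namespaced) string {
	if ns := obj.GetNamespace(); ns != "" {
		return ns + "/" + obj.GetName()
	}
	return obj.GetName()
}

// safeKey rejects a nil pool before it reaches keyFunc.
func safeKey(p *pool) (string, error) {
	if p == nil {
		return "", errors.New("no pool to enqueue")
	}
	return keyFunc(p), nil
}

// panics reports whether f panicked.
func panics(f func()) (did bool) {
	defer func() { did = recover() != nil }()
	f()
	return false
}

func main() {
	var missing *pool // a nil pool, as enqueueDefault appears to have received

	// An interface holding a typed nil pointer is itself not nil, so a plain
	// "obj == nil" check on the interface would not catch it.
	var asIface namespaced = missing
	fmt.Println("interface is nil:", asIface == nil)

	fmt.Println("keyFunc panics:", panics(func() { keyFunc(missing) }))

	_, err := safeKey(missing)
	fmt.Println("safeKey error:", err)
}
```

The design point is that the nil check has to happen on the concrete pointer before it is converted to an interface; once wrapped, the value no longer compares equal to nil.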

Note that I have added a custom 'infra' MachineConfigPool:

$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
infra    rendered-infra-71fac7974e1cadf6d63f1de505849340    True      False      False
master   rendered-master-66ca81fa67f9bace6638464be9256570   True      False      False
worker   rendered-worker-68ddaaf29b41fed5fb4b9a3339fcb1a5   False     True       False

$ oc describe machineconfigpool worker
Name:         worker
Namespace:    
Labels:       machineconfiguration.openshift.io/mco-built-in=
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2019-10-19T13:02:17Z
  Generation:          3
  Resource Version:    19898743
  Self Link:           /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker
  UID:                 ad9f8790-f270-11e9-a34e-525400e1605b
Spec:
  Configuration:
    Name:  rendered-worker-68ddaaf29b41fed5fb4b9a3339fcb1a5
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ad9f8790-f270-11e9-a34e-525400e1605b-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Machine Config Selector:
    Match Labels:
      machineconfiguration.openshift.io/role:  worker
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/worker:  
  Paused:                              false
Status:
  Conditions:
    Last Transition Time:  2019-10-19T13:02:57Z
    Message:               
    Reason:                
    Status:                False
    Type:                  NodeDegraded
    Last Transition Time:  2019-10-19T13:02:57Z
    Message:               
    Reason:                
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2019-10-19T13:03:02Z
    Message:               
    Reason:                
    Status:                False
    Type:                  RenderDegraded
    Last Transition Time:  2019-11-01T19:54:21Z
    Message:               
    Reason:                
    Status:                False
    Type:                  Updated
    Last Transition Time:  2019-11-01T19:54:21Z
    Message:               All nodes are updating to rendered-worker-68ddaaf29b41fed5fb4b9a3339fcb1a5
    Reason:                
    Status:                True
    Type:                  Updating
  Configuration:
    Name:  rendered-worker-68ddaaf29b41fed5fb4b9a3339fcb1a5
    Source:
      API Version:            machineconfiguration.openshift.io/v1
      Kind:                   MachineConfig
      Name:                   00-worker
      API Version:            machineconfiguration.openshift.io/v1
      Kind:                   MachineConfig
      Name:                   01-worker-container-runtime
      API Version:            machineconfiguration.openshift.io/v1
      Kind:                   MachineConfig
      Name:                   01-worker-kubelet
      API Version:            machineconfiguration.openshift.io/v1
      Kind:                   MachineConfig
      Name:                   99-worker-ad9f8790-f270-11e9-a34e-525400e1605b-registries
      API Version:            machineconfiguration.openshift.io/v1
      Kind:                   MachineConfig
      Name:                   99-worker-ssh
  Degraded Machine Count:     0
  Machine Count:              5
  Observed Generation:        3
  Ready Machine Count:        4
  Unavailable Machine Count:  1
  Updated Machine Count:      5
Events:                       <none>

$ oc get machineconfig
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-infra                                                                                               2.2.0             25d
00-master                                                   d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
00-worker                                                   d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
01-infra-container-runtime                                                                             2.2.0             25d
01-infra-kubelet                                                                                       2.2.0             25d
01-master-container-runtime                                 d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
01-master-kubelet                                           d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
01-worker-container-runtime                                 d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
01-worker-kubelet                                           d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
99-infra-ad9f8790-f270-11e9-a34e-525400e1605b-registries                                               2.2.0             25d
99-infra-ssh                                                                                           2.2.0             25d
99-master-ad9d318b-f270-11e9-a34e-525400e1605b-registries   d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
99-master-ssh                                                                                          2.2.0             26d
99-worker-ad9f8790-f270-11e9-a34e-525400e1605b-registries   d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             26d
99-worker-ssh                                                                                          2.2.0             26d
rendered-infra-0506920a222781a19fff88a4196deef4             62b0b6d2a751a5f364f2e6d5c9cfe63419668777   2.2.0             25d
rendered-infra-71fac7974e1cadf6d63f1de505849340             d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             15d
rendered-master-66ca81fa67f9bace6638464be9256570            d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             15d
rendered-master-747943425e64364488e51d15e5281265            62b0b6d2a751a5f364f2e6d5c9cfe63419668777   2.2.0             26d
rendered-worker-5e70256103cc4d0ce0162430de7233a1            62b0b6d2a751a5f364f2e6d5c9cfe63419668777   2.2.0             26d
rendered-worker-68ddaaf29b41fed5fb4b9a3339fcb1a5            d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             15d

The cluster was initially deployed with 4.2.0 and was successfully upgraded to 4.2.2.

Comment 1 Samuel 2019-11-15 08:38:31 UTC
As a follow-up: within the last hour the upgrade completed, for reasons I have not identified. The controller has stopped segfaulting and is now running.


$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.4     True        False         40m     Cluster version is 4.2.4

$ oc get mc
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-infra                                                                                               2.2.0             26d
00-master                                                   55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
00-worker                                                   55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
01-infra-container-runtime                                                                             2.2.0             26d
01-infra-kubelet                                                                                       2.2.0             26d
01-master-container-runtime                                 55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
01-master-kubelet                                           55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
01-worker-container-runtime                                 55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
01-worker-kubelet                                           55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
99-infra-ad9f8790-f270-11e9-a34e-525400e1605b-registries                                               2.2.0             26d
99-infra-ssh                                                                                           2.2.0             26d
99-master-ad9d318b-f270-11e9-a34e-525400e1605b-registries   55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
99-master-ssh                                                                                          2.2.0             26d
99-worker-ad9f8790-f270-11e9-a34e-525400e1605b-registries   55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             26d
99-worker-ssh                                                                                          2.2.0             26d
rendered-infra-0506920a222781a19fff88a4196deef4             62b0b6d2a751a5f364f2e6d5c9cfe63419668777   2.2.0             26d
rendered-infra-71fac7974e1cadf6d63f1de505849340             d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             16d
rendered-infra-bed1dda9d08a9f80cd088e667c206fb4             55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             88m
rendered-master-66ca81fa67f9bace6638464be9256570            d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             16d
rendered-master-747943425e64364488e51d15e5281265            62b0b6d2a751a5f364f2e6d5c9cfe63419668777   2.2.0             26d
rendered-master-a54208d3fe789a6c2647471c1a6b2015            55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             88m
rendered-worker-5e70256103cc4d0ce0162430de7233a1            62b0b6d2a751a5f364f2e6d5c9cfe63419668777   2.2.0             26d
rendered-worker-68ddaaf29b41fed5fb4b9a3339fcb1a5            d73d5c6c95499f2820853c799366a157b0c0bd09   2.2.0             16d
rendered-worker-917c185a9e38f77f491fc863367e60fb            55bb5fc17da0c3d76e4ee6a55732f0cba93e8520   2.2.0             88m

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.2.4     True        False         False      4d20h
cloud-credential                           4.2.4     True        False         False      26d
cluster-autoscaler                         4.2.4     True        False         False      26d
console                                    4.2.4     True        False         False      80m
dns                                        4.2.4     True        False         False      26d
image-registry                             4.2.4     True        False         False      10m
ingress                                    4.2.4     True        False         False      4d20h
insights                                   4.2.4     True        False         False      26d
kube-apiserver                             4.2.4     True        False         False      26d
kube-controller-manager                    4.2.4     True        False         False      26d
kube-scheduler                             4.2.4     True        False         False      26d
machine-api                                4.2.4     True        False         False      26d
machine-config                             4.2.4     True        False         False      41m
marketplace                                4.2.4     True        False         False      49m
monitoring                                 4.2.4     True        False         False      42m
network                                    4.2.4     True        False         False      26d
node-tuning                                4.2.4     True        False         False      35m
openshift-apiserver                        4.2.4     True        False         False      45m
openshift-controller-manager               4.2.4     True        False         False      26d
openshift-samples                          4.2.4     True        False         False      12h
operator-lifecycle-manager                 4.2.4     True        False         False      26d
operator-lifecycle-manager-catalog         4.2.4     True        False         False      26d
operator-lifecycle-manager-packageserver   4.2.4     True        False         False      49m
service-ca                                 4.2.4     True        False         False      26d
service-catalog-apiserver                  4.2.4     True        False         False      26d
service-catalog-controller-manager         4.2.4     True        False         False      26d
storage                                    4.2.4     True        False         False      12h

$ oc get pods
NAME                                         READY   STATUS    RESTARTS   AGE
etcd-quorum-guard-d59cc8dd8-2qwtf            1/1     Running   0          41m
etcd-quorum-guard-d59cc8dd8-2ww5d            1/1     Running   0          41m
etcd-quorum-guard-d59cc8dd8-zqfv9            1/1     Running   0          42m
machine-config-controller-746ddfd848-fhqb2   1/1     Running   6          62m
...

$ oc logs machine-config-controller-746ddfd848-fhqb2 
I1115 07:52:09.691754       1 start.go:50] Version: v4.2.4-201911050122-dirty (55bb5fc17da0c3d76e4ee6a55732f0cba93e8520)
E1115 07:54:05.385361       1 event.go:247] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"machine-config-controller", GenerateName:"", Namespace:"openshift-machine-config-operator", SelfLink:"/api/v1/namespaces/openshift-machine-config-operator/configmaps/machine-config-controller", UID:"b114d4a4-071e-11ea-a595-52540079f30f", ResourceVersion:"20335459", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63709360698, loc:(*time.Location)(0x28f62e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"machine-config-controller-746ddfd848-fhqb2_d3ad8ddc-077c-11ea-a0e0-0a580a8200a4\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2019-11-15T07:54:05Z\",\"renewTime\":\"2019-11-15T07:54:05Z\",\"leaderTransitions\":102}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'no kind is registered for the type v1.ConfigMap in scheme "github.com/openshift/machine-config-operator/cmd/common/helpers.go:30"'. Will not report event: 'Normal' 'LeaderElection' 'machine-config-controller-746ddfd848-fhqb2_d3ad8ddc-077c-11ea-a0e0-0a580a8200a4 became leader'
E1115 07:54:05.439555       1 template_controller.go:120] couldn't get ControllerConfig on secret callback &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:""}, Status:"Failure", Message:"controllerconfig.machineconfiguration.openshift.io \"machine-config-controller\" not found", Reason:"NotFound", Details:(*v1.StatusDetails)(0xc0000d0360), Code:404}}
I1115 07:54:05.613702       1 node_controller.go:147] Starting MachineConfigController-NodeController
I1115 07:54:05.614035       1 render_controller.go:123] Starting MachineConfigController-RenderController
I1115 07:54:05.614065       1 template_controller.go:182] Starting MachineConfigController-TemplateController
I1115 07:54:05.713924       1 container_runtime_config_controller.go:189] Starting MachineConfigController-ContainerRuntimeConfigController
I1115 07:54:05.713924       1 kubelet_config_controller.go:159] Starting MachineConfigController-KubeletConfigController
I1115 07:54:05.719257       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:05.731130       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:05.745190       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:05.753925       1 container_runtime_config_controller.go:713] Applied ImageConfig cluster on MachineConfigPool master
I1115 07:54:05.770186       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:05.778868       1 container_runtime_config_controller.go:713] Applied ImageConfig cluster on MachineConfigPool worker
I1115 07:54:05.815764       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:05.926071       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:06.094310       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:06.421131       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:07.090127       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:08.411033       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1115 07:54:10.450891       1 status.go:82] Pool master: All nodes are updated with rendered-master-a54208d3fe789a6c2647471c1a6b2015
I1115 07:54:10.450983       1 status.go:82] Pool infra: All nodes are updated with rendered-infra-bed1dda9d08a9f80cd088e667c206fb4

...

The upgrade did complete, but it took 12 hours.

Is 4.2.4 safe to apply to customer clusters?

Any explanation? What could be going on?

Comment 2 Samuel 2019-11-21 10:50:32 UTC
Hi,

The machine-config-controller Pod is still crashing with segfaults. Is there anything else you need to further investigate?

$ oc get pods -n openshift-machine-config-operator
NAME                                         READY   STATUS    RESTARTS   AGE
etcd-quorum-guard-d59cc8dd8-2qwtf            1/1     Running   0          6d2h
etcd-quorum-guard-d59cc8dd8-2ww5d            1/1     Running   0          6d2h
etcd-quorum-guard-d59cc8dd8-zqfv9            1/1     Running   0          6d2h
machine-config-controller-746ddfd848-fhqb2   1/1     Running   62         6d3h
machine-config-daemon-4gdfs                  1/1     Running   1          6d14h
machine-config-daemon-6g8c2                  1/1     Running   1          6d14h
machine-config-daemon-6j6sq                  1/1     Running   1          6d14h
...
$ oc logs -n openshift-machine-config-operator -p machine-config-controller-746ddfd848-fhqb2
I1117 04:44:57.725629       1 start.go:50] Version: v4.2.4-201911050122-dirty (55bb5fc17da0c3d76e4ee6a55732f0cba93e8520)
E1117 04:46:53.468163       1 event.go:247] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"machine-config-controller", GenerateName:"", Namespace:"openshift-machine-config-operator", SelfLink:"/api/v1/namespaces/openshift-machine-config-operator/configmaps/machine-config-controller", UID:"b114d4a4-071e-11ea-a595-52540079f30f", ResourceVersion:"22123789", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63709360698, loc:(*time.Location)(0x28f62e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"machine-config-controller-746ddfd848-fhqb2_01bafee3-08f5-11ea-bfb3-0a580a8200a4\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2019-11-17T04:46:53Z\",\"renewTime\":\"2019-11-17T04:46:53Z\",\"leaderTransitions\":157}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'no kind is registered for the type v1.ConfigMap in scheme "github.com/openshift/machine-config-operator/cmd/common/helpers.go:30"'. Will not report event: 'Normal' 'LeaderElection' 'machine-config-controller-746ddfd848-fhqb2_01bafee3-08f5-11ea-bfb3-0a580a8200a4 became leader'
E1117 04:46:53.531543       1 template_controller.go:120] couldn't get ControllerConfig on secret callback &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:""}, Status:"Failure", Message:"controllerconfig.machineconfiguration.openshift.io \"machine-config-controller\" not found", Reason:"NotFound", Details:(*v1.StatusDetails)(0xc000239260), Code:404}}
...
I1118 22:02:53.411142       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:03:13.896279       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:03:54.987446       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
E1118 22:05:16.933433       1 kubelet_config_controller.go:308] GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:05:16.933687       1 kubelet_config_controller.go:309] Dropping featureconfig "cluster" out of the queue: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:06:16.939797       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:06:16.949732       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
...
I1118 22:10:42.662049       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:11:23.627181       1 kubelet_config_controller.go:303] Error syncing kubeletconfig cluster: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
E1118 22:12:45.566035       1 kubelet_config_controller.go:308] GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
I1118 22:12:45.566260       1 kubelet_config_controller.go:309] Dropping featureconfig "cluster" out of the queue: GenerateMachineConfigsforRole failed with error failed to read dir "/etc/mcc/templates/infra": open /etc/mcc/templates/infra: no such file or directory
E1118 22:13:34.578758       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:522
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:82
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/signal_unix.go:390
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/store.go:84
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:604
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:615
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:125
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:475
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:116
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:202
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:552
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:1333
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xe60bf5]

goroutine 158 [running]:
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x108
panic(0x15aec80, 0x28dbcd0)
	/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513 +0x1b9
github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1.(*MachineConfigPool).GetNamespace(0x0, 0x0, 0x19e98a0)
	<autogenerated>:1 +0x5
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.MetaNamespaceKeyFunc(0x1778ea0, 0x0, 0xc00057e900, 0x1528080, 0xc00084ece0, 0x12a05f200)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/store.go:84 +0x114
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.DeletionHandlingMetaNamespaceKeyFunc(0x1778ea0, 0x0, 0xc00084ece0, 0x12a05f200, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261 +0x6a
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueAfter(0xc0000e25a0, 0x0, 0x12a05f200)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:604 +0x41
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueDefault(0xc0000e25a0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:615 +0x44
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueDefault-fm(0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:125 +0x34
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).updateNode(0xc0000e25a0, 0x1794e00, 0xc000864000, 0x1794e00, 0xc000ce8dc0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:475 +0x84c
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).updateNode-fm(0x1794e00, 0xc000864000, 0x1794e00, 0xc000ce8dc0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:116 +0x52
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate(0xc0003beaf0, 0xc0003beb00, 0xc0003beb10, 0x1794e00, 0xc000864000, 0x1794e00, 0xc000ce8dc0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:202 +0x5d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0xc0000dc5c8, 0xc00095be00, 0xc0000d4a00)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:552 +0x18b
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000f7fe18, 0x429692, 0xc0000d4a30)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265 +0x51
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548 +0x79
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0000dc768)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000f7ff68, 0xdf8475800, 0x0, 0x1582101, 0xc00009ab40)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc0000dc768, 0xdf8475800, 0xc00009ab40)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc000360380)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546 +0x8d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run-fm()
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390 +0x2a
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc00059e1b0, 0xc0004505f0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62
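
Reading the trace above: enqueueDefault is called with a nil pool (second argument 0x0), and the key function then calls the promoted GetNamespace method on that nil *MachineConfigPool, which is where the SIGSEGV is raised. The following is a minimal, self-contained Go sketch of that failure mode and of a nil guard. It is not the operator's code and not the actual fix (that is tracked in bug 1775009). The types ObjectMeta and MachineConfigPool here are trimmed stand-ins, and keyFunc, safeKey and panics are hypothetical helper names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ObjectMeta is a trimmed stand-in for the Kubernetes object metadata type.
type ObjectMeta struct {
	Namespace string
	Name      string
}

func (m *ObjectMeta) GetNamespace() string { return m.Namespace }
func (m *ObjectMeta) GetName() string      { return m.Name }

// MachineConfigPool embeds ObjectMeta by value, so GetNamespace and GetName
// are promoted to *MachineConfigPool through a compiler-generated wrapper.
type MachineConfigPool struct {
	ObjectMeta
}

// object is the small interface the key function needs.
type object interface {
	GetNamespace() string
	GetName() string
}

// keyFunc (hypothetical) builds a "namespace/name" key, or just "name" for
// cluster-scoped objects. It dereferences obj without checking for nil.
func keyFunc(obj object) string {
	if ns := obj.GetNamespace(); ns != "" {
		return ns + "/" + obj.GetName()
	}
	return obj.GetName()
}

// safeKey (hypothetical) refuses to build a key for a nil pool instead of
// letting the promoted method call panic.
func safeKey(pool *MachineConfigPool) (string, error) {
	if pool == nil {
		return "", errors.New("no pool found; skipping enqueue")
	}
	return keyFunc(pool), nil
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	var pool *MachineConfigPool // nil, as in the trace

	// The nil pointer is wrapped in a non-nil interface value, so the call
	// reaches the promoted method and panics there.
	fmt.Println("unguarded call panics:", panics(func() { keyFunc(pool) }))

	_, err := safeKey(pool)
	fmt.Println("guarded call returns error:", err)
}
```

Whether the right response to a nil pool is to skip the enqueue, log, or retry is a design decision for the controller; the sketch only shows that the check has to happen before the object reaches the key function.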

Comment 3 Antonio Murdaca 2019-12-03 14:16:13 UTC
*** Bug 1779147 has been marked as a duplicate of this bug. ***

Comment 4 Antonio Murdaca 2019-12-04 11:34:48 UTC
*** Bug 1779546 has been marked as a duplicate of this bug. ***

Comment 5 Samuel 2019-12-12 18:22:09 UTC
Deploying 4.2.10 tonight.

Still segfaults:

$ oc logs -n openshift-machine-config-operator -f machine-config-controller-8cf5649dd-988v4
I1212 18:16:09.160139       1 start.go:50] Version: v4.2.10-201912022352-dirty (d780d197a9c5848ba786982c0c4aaa7487297046)

E1212 18:18:04.860199       1 event.go:247] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"machine-config-controller", GenerateName:"", Namespace:"openshift-machine-config-operator", SelfLink:"/api/v1/namespaces/openshift-machine-config-operator/configmaps/machine-config-controller", UID:"b114d4a4-071e-11ea-a595-52540079f30f", ResourceVersion:"49971258", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63709360698, loc:(*time.Location)(0x28f62e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"machine-config-controller-8cf5649dd-988v4_787e6475-1d0b-11ea-9c47-0a580a8101bb\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2019-12-12T18:18:04Z\",\"renewTime\":\"2019-12-12T18:18:04Z\",\"leaderTransitions\":413}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'no kind is registered for the type v1.ConfigMap in scheme "github.com/openshift/machine-config-operator/cmd/common/helpers.go:30"'. Will not report event: 'Normal' 'LeaderElection' 'machine-config-controller-8cf5649dd-988v4_787e6475-1d0b-11ea-9c47-0a580a8101bb became leader'
E1212 18:18:04.936372       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:522
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:82
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/signal_unix.go:390
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/store.go:84
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:604
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:615
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:125
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:397
/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:115
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:195
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:554
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390
/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:1333
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xe60bf5]

goroutine 158 [running]:
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x108
panic(0x15aec80, 0x28dbcd0)
	/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513 +0x1b9
github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1.(*MachineConfigPool).GetNamespace(0x0, 0x0, 0x19e98a0)
	<autogenerated>:1 +0x5
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.MetaNamespaceKeyFunc(0x1778ea0, 0x0, 0xc0000c92c0, 0x1528080, 0xc0007881f0, 0x12a05f200)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/store.go:84 +0x114
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.DeletionHandlingMetaNamespaceKeyFunc(0x1778ea0, 0x0, 0xc0007881f0, 0x12a05f200, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:261 +0x6a
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueAfter(0xc0004bc0a0, 0x0, 0x12a05f200)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:604 +0x41
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueDefault(0xc0004bc0a0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:615 +0x44
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).enqueueDefault-fm(0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:125 +0x34
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).addNode(0xc0004bc0a0, 0x1794e00, 0xc000377580)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:397 +0x1a1
github.com/openshift/machine-config-operator/pkg/controller/node.(*Controller).addNode-fm(0x1794e00, 0xc000377580)
	/go/src/github.com/openshift/machine-config-operator/pkg/controller/node/node_controller.go:115 +0x3e
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc000574790, 0xc0005747b0, 0xc000574810, 0x1794e00, 0xc000377580)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/controller.go:195 +0x49
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0xc0005a9200, 0xc0001509b0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:554 +0x21d
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0x0, 0xc000225e18, 0x429692, 0xc0001509e0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:265 +0x51
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548 +0x79
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000718f68)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000225f68, 0xdf8475800, 0x0, 0x1582101, 0xc0005702a0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc000718f68, 0xdf8475800, 0xc0005702a0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc000337180)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546 +0x8d
github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache.(*processorListener).run-fm()
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390 +0x2a
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc000512220, 0xc00067e2e0)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62

Comment 6 Kirsten Garrison 2019-12-17 18:56:14 UTC
@Samuel, please provide a must-gather from this cluster:

https://github.com/openshift/must-gather

Comment 8 Kirsten Garrison 2019-12-18 18:49:22 UTC
@Jatan @Samuel we are awaiting approvals to backport a fix from https://bugzilla.redhat.com/show_bug.cgi?id=1775009

Comment 15 Michael Nguyen 2020-01-14 21:34:30 UTC
Verified on 4.2.0-0.nightly-2020-01-13-060909. Upgraded to 4.2.0-0.nightly-2020-01-14-110551 successfully with an infra node.

Comment 17 errata-xmlrpc 2020-01-22 10:46:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0107