Description of problem:
After a cluster has been running for a while, the machine-api-controllers pod gets stuck in CrashLoopBackOff. Seen in 2 clusters.

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-14-211610

How reproducible:
Sometimes

Steps to Reproduce:
1. Run a cluster for a while.
2. Check the machine-api-controllers pod.

Actual results:
The machine-api-controllers pod is stuck in CrashLoopBackOff; the failing container is nodelink-controller.

$ oc get pod
NAME                                          READY   STATUS             RESTARTS   AGE
cluster-autoscaler-operator-fff44d57f-vrgcp   1/1     Running            0          17h
machine-api-controllers-5fb9f8f668-gw6xt      3/4     CrashLoopBackOff   46         17h
machine-api-operator-6f89c74764-khxbr         1/1     Running            1          17h

$ oc edit pod machine-api-controllers-5fb9f8f668-gw6xt
  name: nodelink-controller
  ready: false
  restartCount: 47
  state:
    waiting:
      message: Back-off 5m0s restarting failed container=nodelink-controller pod=machine-api-controllers-5fb9f8f668-gw6xt_openshift-machine-api(3d500681-c9ab-11e9-ad00-fa163ea99686)
      reason: CrashLoopBackOff

$ oc logs -f machine-api-controllers-5fb9f8f668-gw6xt -c nodelink-controller
I0829 08:47:03.348079       1 nodelink_controller.go:92] Adding internal IP "192.168.0.34" for node "preserve-groupg-4cf4r-master-1" to indexer
I0829 08:47:03.396320       1 nodelink_controller.go:188] Reconciling Node /preserve-groupg-4cf4r-worker-f6ffp
I0829 08:47:03.396606       1 nodelink_controller.go:409] Finding machine from node "preserve-groupg-4cf4r-worker-f6ffp"
I0829 08:47:03.396659       1 nodelink_controller.go:426] Finding machine from node "preserve-groupg-4cf4r-worker-f6ffp" by ProviderID
I0829 08:47:03.396721       1 nodelink_controller.go:449] Finding machine from node "preserve-groupg-4cf4r-worker-f6ffp" by IP
I0829 08:47:03.396764       1 nodelink_controller.go:454] Found internal IP for node "preserve-groupg-4cf4r-worker-f6ffp": "192.168.0.26"
E0829 08:47:03.397032       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:522
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:82
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/signal_unix.go:390
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/meta.go:135
/go/src/github.com/openshift/machine-api-operator/pkg/controller/nodelink/nodelink_controller.go:205
/go/src/github.com/openshift/machine-api-operator/pkg/controller/nodelink/nodelink_controller.go:205
/go/src/github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210
/go/src/github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:1333

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x10753f7]

goroutine 438 [running]:
github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x108
panic(0x11c6a80, 0x1fee7d0)
	/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513 +0x1b9
github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/apis/meta/v1.(*ObjectMeta).GetName(...)
	/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/meta.go:135
github.com/openshift/machine-api-operator/pkg/controller/nodelink.(*ReconcileNodeLink).findMachineFromNode(0xc000443230, 0xc0003d6580, 0xc0000c6008, 0x0, 0x0)
	/go/src/github.com/openshift/machine-api-operator/pkg/controller/nodelink/nodelink_controller.go:420 +0x267
github.com/openshift/machine-api-operator/pkg/controller/nodelink.(*ReconcileNodeLink).Reconcile(0xc000443230, 0x0, 0x0, 0xc000044e40, 0x22, 0x2001c40, 0xc0001ced50, 0x444e37, 0x8)
	/go/src/github.com/openshift/machine-api-operator/pkg/controller/nodelink/nodelink_controller.go:205 +0x324
github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000ce1e0, 0x0)
	/go/src/github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x17d
github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()
	/go/src/github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158 +0x36
github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc000514ac0)
	/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000514ac0, 0x3b9aca00, 0x0, 0x13b2201, 0xc0003d8120)
	/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc000514ac0, 0x3b9aca00, 0xc0003d8120)
	/go/src/github.com/openshift/machine-api-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
	/go/src/github.com/openshift/machine-api-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157 +0x32a

Expected results:
The machine-api-controllers pod works normally.

Additional info:
https://github.com/openshift/machine-api-operator/pull/390
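For context on the crash: the trace shows the panic originating in findMachineFromNode when (*ObjectMeta).GetName is invoked, which is consistent with a nil machine pointer coming back from the by-IP lookup and being dereferenced without a check. Below is a minimal, self-contained sketch of the nil-guard pattern that prevents this class of SIGSEGV. All names (machine, findMachineByIP, safeLookup) are hypothetical illustrations, not the actual code or the fix in the linked PR:

```go
package main

import "fmt"

// machine stands in for an API object whose GetName reads a struct field,
// mimicking metav1.ObjectMeta: calling it on a nil receiver dereferences
// nil and panics with SIGSEGV.
type machine struct{ name string }

func (m *machine) GetName() string { return m.name }

// findMachineByIP is a hypothetical lookup that returns nil when no
// machine matches the node's internal IP (a cache miss).
func findMachineByIP(ip string) *machine {
	return nil // simulate the miss observed in the logs
}

// safeLookup guards the nil case before touching the object, instead of
// calling GetName unconditionally on the lookup result.
func safeLookup(ip string) (string, bool) {
	m := findMachineByIP(ip)
	if m == nil {
		return "", false // no match: report "not found" rather than panic
	}
	return m.GetName(), true
}

func main() {
	name, ok := safeLookup("192.168.0.26")
	fmt.Printf("name=%q found=%v\n", name, ok)
}
```

With the guard in place, a node whose IP has no matching machine is simply reported as unmatched and the reconcile loop continues, rather than crashing the container and triggering CrashLoopBackOff.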
Verified on clusterversion 4.2.0-0.nightly-2019-09-02-172410. Didn't hit this issue again; marking as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922