Bug 1929794
Summary: | [WinC] panic: runtime error: invalid memory address or nil pointer dereference after windowsmachine-controller reconciling | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Aravindh Puthiyaparambil <aravindh> |
Component: | Windows Containers | Assignee: | Sebastian Soto <ssoto> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Ronnie Rasouli <rrasouli> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 4.7 | CC: | aos-bugs, mankulka, rrasouli, sgao, team-winc |
Target Milestone: | --- | ||
Target Release: | 4.7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1929579 | Environment: | |
Last Closed: | 2021-06-02 16:03:10 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1929579 | ||
Bug Blocks: |
Description
Aravindh Puthiyaparambil
2021-02-17 16:35:10 UTC
This is a blocker for the WMCO 2.0 release.

This bug has been verified and passed on 4.7.0-0.nightly-2021-02-18-110409, thanks.

Version:
OCP version: 4.7.0-0.nightly-2021-02-18-110409
WMCO commit: f1f40153af071e9778d3ada5ec6dc93e9adfaa9d

Steps:

1, WMCO is installed and the Windows nodes have been bootstrapped
# oc get nodes -l kubernetes.io/os=windows
NAME            STATUS   ROLES    AGE   VERSION
windows-k9t5g   Ready    worker   98m   v1.20.0-1030+cac2421340a449

2, Delete secret cloud-private-key and create it again with an invalid key
# oc delete secret cloud-private-key -n openshift-windows-machine-config-operator
secret "cloud-private-key" deleted
# echo "invalid" > /root/.ssh/invalid.pem
# oc create secret generic cloud-private-key --from-file=private-key.pem=/root/.ssh/invalid.pem
secret/cloud-private-key created

3, Check the WMCO log: there is no panic now, and a reasonable error message is logged
2021-02-19T13:48:49.358Z ERROR controller-runtime.controller Reconciler error {"controller": "secret_controller", "request": "openshift-windows-machine-config-operator/cloud-private-key", "error": "error generating windows-user-data secret: unable to parse private key: ssh: no key found" ....

4, Delete the invalid secret above and create it again with another, valid key
# oc delete secret cloud-private-key
secret "cloud-private-key" deleted
# oc create secret generic cloud-private-key --from-file=private-key.pem=/root/.ssh/openshift-dev.pem
secret/cloud-private-key created

5, Check the WMCO log: the Windows node provisioned with the old key is marked for deletion and the new Windows node is reconciled
# oc get nodes -l kubernetes.io/os=windows
NAME            STATUS                     ROLES    AGE    VERSION
windows-k9t5g   Ready,SchedulingDisabled   worker   111m   v1.20.0-1030+cac2421340a449
# oc get nodes -l kubernetes.io/os=windows
NAME            STATUS   ROLES    AGE     VERSION
windows-tzgf5   Ready    worker   2m43s   v1.20.0-1030+cac2421340a449
# oc logs -f deployment.apps/windows-machine-config-operator
...
2021-02-19T13:54:16.752Z DEBUG windowsmachine-controller reconciling {"namespace": "openshift-machine-api", "name": "windows-k9t5g"}
2021-02-19T13:54:16.765Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "secret_controller", "request": "openshift-windows-machine-config-operator/cloud-private-key"}
2021-02-19T13:54:16.772Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "secret_controller", "request": "openshift-windows-machine-config-operator/cloud-private-key"}
2021-02-19T13:54:16.773Z INFO windowsmachine-controller deleting machine {"name": "windows-k9t5g"}
2021-02-19T13:54:16.803Z INFO windowsmachine-controller machine has been remediated by deletion {"name": "windows-k9t5g"}
2021-02-19T13:54:16.803Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "windowsmachine-controller", "request": "openshift-machine-api/windows-k9t5g"}
2021-02-19T13:54:16.803Z DEBUG windowsmachine-controller reconciling {"namespace": "openshift-machine-api", "name": "windows-k9t5g"}
2021-02-19T13:54:16.803Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"Machine","namespace":"openshift-machine-api","name":"windows-k9t5g","uid":"63a6d9aa-715e-4fd2-9baf-2c9a0cb2beb9","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"185934"}, "reason": "MachineDeleted", "message": "Machine windows-k9t5g has been remediated by deleting the Machine object"}
2021-02-19T13:54:16.816Z INFO windowsmachine-controller deleting machine {"name": "windows-k9t5g"}
2021-02-19T13:54:16.828Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "windowsmachine-controller", "request": "openshift-machine-api/windows-k9t5g"}
2021-02-19T13:54:16.828Z DEBUG windowsmachine-controller reconciling {"namespace": "openshift-machine-api", "name": "windows-k9t5g"}
2021-02-19T13:54:16.837Z DEBUG windowsmachine-controller machine not provisioned {"phase": "Deleting"}
2021-02-19T13:54:16.855Z INFO metrics Prometheus configured {"endpoints": "windows-machine-config-operator-metrics", "port": 9182, "name": "metrics"}
2021-02-19T13:54:16.856Z DEBUG controller-runtime.controller Successfully Reconciled {"controller": "windowsmachine-controller", "request": "openshift-machine-api/windows-k9t5g"}
2021-02-19T13:54:16.856Z DEBUG windowsmachine-controller reconciling {"namespace": "openshift-machine-api", "name": "windows-tzgf5"}
...
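
The log checks in steps 3 and 5 could be scripted when re-verifying this bug. The following is only a minimal sketch, not part of WMCO or of the original verification: it assumes each log line has the shape `<timestamp> <LEVEL> <logger> <message> {<json fields>}` seen in the excerpts in this report, and the helper names `parse_line` and `summarize` are hypothetical. It flags any line containing `panic:` and collects ERROR-level entries so the "no panic, reasonable error" expectation can be checked in one pass.

```python
import json
import re

# Assumed line shape, inferred from the log excerpts in this report:
#   <timestamp> <LEVEL> <logger> <message> {<json fields>}
# The trailing JSON object is optional and may be truncated.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+"
    r"(?P<msg>[^{]*?)\s*(?P<fields>\{.*)?$"
)


def parse_line(line):
    """Split one log line into its parts; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = {}
    raw_fields = match.group("fields")
    if raw_fields:
        try:
            fields = json.loads(raw_fields)
        except json.JSONDecodeError:
            # Truncated or elided JSON: keep the entry, drop the fields.
            fields = {}
    return {
        "ts": match.group("ts"),
        "level": match.group("level"),
        "logger": match.group("logger"),
        "msg": match.group("msg"),
        "fields": fields,
    }


def summarize(lines):
    """Collect raw panic lines and parsed ERROR-level entries."""
    panics = [line for line in lines if "panic:" in line]
    parsed = (parse_line(line) for line in lines)
    errors = [entry for entry in parsed if entry and entry["level"] == "ERROR"]
    return {"panics": panics, "errors": errors}
```

A possible use is to save the output of `oc logs deployment.apps/windows-machine-config-operator` to a file, pass its lines to `summarize`, and then confirm that `panics` is empty and that the only ERROR entries are the expected private key parse failures from `secret_controller`.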