Description of problem:
Installation is not completed on baremetal (packet) for 4.3.0-0.nightly-2020-01-23-105702:

$ oc describe co kube-controller-manager
Name:         kube-controller-manager
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-01-23T12:10:55Z
  Generation:          1
  Resource Version:    10190
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/kube-controller-manager
  UID:                 0f567700-458c-4698-93a8-de6a1f6ca39c
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-01-23T12:13:11Z
    Message:               StaticPodsDegraded: nodes/master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com pods/kube-controller-manager-master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com container="cluster-policy-controller-5" is not ready
StaticPodsDegraded: nodes/master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com pods/kube-controller-manager-master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com container="cluster-policy-controller-5" is waiting: "CreateContainerError" - "container create failed: time=\"2020-01-23T12:17:20Z\" level=error msg=\"container_linux.go:346: starting container process caused \\\"exec: \\\\\\\"cluster-policy-controller\\\\\\\": executable file not found in $PATH\\\"\"\ncontainer_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\"\n"
StaticPodsDegraded: pods "kube-controller-manager-master-01.mrnd-43-no-delete-e174.qe.devcluster.openshift.com" not found
StaticPodsDegraded: pods "kube-controller-manager-master-00.mrnd-43-no-delete-e174.qe.devcluster.openshift.com" not found
    Reason:                StaticPodsDegradedError
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-01-23T12:11:06Z
    Message:               Progressing: 3 nodes are at revision 0; 0 nodes have achieved new revision 5
    Reason:                Progressing
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2020-01-23T12:10:56Z
    Message:               Available: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 5
    Reason:                AvailableZeroNodesActive
    Status:                False
    Type:                  Available
    Last Transition Time:  2020-01-23T12:10:55Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:               <nil>
  Related Objects:
    Group:     operator.openshift.io
    Name:      cluster
    Resource:  kubecontrollermanagers
    Group:
    Name:      openshift-config
    Resource:  namespaces
    Group:
    Name:      openshift-config-managed
    Resource:  namespaces
    Group:
    Name:      openshift-kube-controller-manager
    Resource:  namespaces
    Group:
    Name:      openshift-kube-controller-manager-operator
    Resource:  namespaces
  Versions:
    Name:     raw-internal
    Version:  4.3.0-0.nightly-2020-01-23-105702
    Name:     kube-controller-manager
    Version:  1.16.2
    Name:     operator
    Version:  4.3.0-0.nightly-2020-01-23-105702
Events:       <none>

$ oc describe pod kube-controller-manager-master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com -n openshift-kube-controller-manager
Name:                 kube-controller-manager-master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com
Namespace:            openshift-kube-controller-manager
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com/147.75.100.27
Start Time:           Thu, 23 Jan 2020 13:12:10 +0100
Labels:               app=kube-controller-manager
                      kube-controller-manager=true
                      revision=5
Annotations:          kubernetes.io/config.hash: 8109d0dfd71bc70c2f478dc03e54a1bc
                      kubernetes.io/config.mirror: 8109d0dfd71bc70c2f478dc03e54a1bc
                      kubernetes.io/config.seen: 2020-01-23T12:12:10.117745797Z
                      kubernetes.io/config.source: file
Status:               Pending
IP:                   147.75.100.27
IPs:
  IP:  147.75.100.27
Init Containers:
  wait-for-host-port:
    Container ID:  cri-o://fa8fc90cdef30ec586fbd4171ba3092c53b5b7601c27e4e320895fdc0c76a5b6
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/timeout
      30
      /bin/bash
      -c
    Args:
      echo -n "Waiting for port :10257 to be released."
      while [ -n "$(lsof -ni :10257)" ]; do
        echo -n "."
        sleep 1
      done
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 23 Jan 2020 13:12:10 +0100
      Finished:     Thu, 23 Jan 2020 13:12:11 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
  wait-for-cpc-host-port:
    Container ID:  cri-o://d4747e66934c5e47340d8dd2165a9910578234ccc547b807c4b6c1de5ad094c3
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/timeout
      30
      /bin/bash
      -c
    Args:
      echo -n "Waiting for port :10357 to be released."
      while [ -n "$(lsof -ni :10357)" ]; do
        echo -n "."
        sleep 1
      done
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 23 Jan 2020 13:12:11 +0100
      Finished:     Thu, 23 Jan 2020 13:12:12 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Containers:
  kube-controller-manager-5:
    Container ID:  cri-o://779aa6240b286dad0dc4c71ba5d54c1a4b959dfe1f08747db57cd11227cfc8c8
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976
    Port:          10257/TCP
    Host Port:     10257/TCP
    Command:
      /bin/bash
      -ec
    Args:
      if [ -f /etc/kubernetes/static-pod-certs/configmaps/trusted-ca-bundle/ca-bundle.crt ]; then
        echo "Copying system trust bundle"
        cp -f /etc/kubernetes/static-pod-certs/configmaps/trusted-ca-bundle/ca-bundle.crt /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
      fi
      exec hyperkube kube-controller-manager --openshift-config=/etc/kubernetes/static-pod-resources/configmaps/config/config.yaml \
        --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig \
        --authentication-kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig \
        --authorization-kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/controller-manager-kubeconfig/kubeconfig \
        --client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/client-ca/ca-bundle.crt \
        --requestheader-client-ca-file=/etc/kubernetes/static-pod-certs/configmaps/aggregator-client-ca/ca-bundle.crt -v=2 --tls-cert-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.crt --tls-private-key-file=/etc/kubernetes/static-pod-resources/secrets/serving-cert/tls.key
    State:          Running
      Started:      Thu, 23 Jan 2020 13:12:13 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     200Mi
    Liveness:     http-get https://:10257/healthz delay=45s timeout=10s period=10s #success=1 #failure=3
    Readiness:    http-get https://:10257/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/kubernetes/static-pod-certs from cert-dir (rw)
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
  cluster-policy-controller-5:
    Container ID:
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bf0eb67346e3c291acb83b1cc25e330f4ccc4561a89e3f936609992692848222
    Image ID:
    Port:          10357/TCP
    Host Port:     10357/TCP
    Command:
      cluster-policy-controller
      start
    Args:
      --config=/etc/kubernetes/static-pod-resources/configmaps/cluster-policy-controller-config/config.yaml
    State:          Waiting
      Reason:       CreateContainerError
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     200Mi
    Liveness:     http-get https://:10357/healthz delay=45s timeout=10s period=10s #success=1 #failure=3
    Readiness:    http-get https://:10357/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/kubernetes/static-pod-certs from cert-dir (rw)
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
  kube-controller-manager-cert-syncer-5:
    Container ID:  cri-o://a0013b7a22f10364dbf85110bc85237c7906066448d5d92038c893b588d0dbd1
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b76f8f4582d98133993414af5208d0a271f635542f7c553f1f3b49404d3efbb5
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b76f8f4582d98133993414af5208d0a271f635542f7c553f1f3b49404d3efbb5
    Port:          <none>
    Host Port:     <none>
    Command:
      cluster-kube-controller-manager-operator
      cert-syncer
    Args:
      --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-controller-cert-syncer-kubeconfig/kubeconfig
      --namespace=$(POD_NAMESPACE)
      --destination-dir=/etc/kubernetes/static-pod-certs
    State:          Running
      Started:      Thu, 23 Jan 2020 13:12:18 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     10m
      memory:  50Mi
    Environment:
      POD_NAME:       kube-controller-manager-master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com (v1:metadata.name)
      POD_NAMESPACE:  openshift-kube-controller-manager (v1:metadata.namespace)
    Mounts:
      /etc/kubernetes/static-pod-certs from cert-dir (rw)
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
Conditions:
  Type             Status
  Initialized      True
  Ready            False
  ContainersReady  False
  PodScheduled     True
Volumes:
  resource-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/static-pod-resources/kube-controller-manager-pod-5
    HostPathType:
  cert-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/static-pod-resources/kube-controller-manager-certs
    HostPathType:
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:
Events:
  Type     Reason   Age                   From                                                                   Message
  ----     ------   ----                  ----                                                                   -------
  Normal   Pulled   6m17s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976" already present on machine
  Normal   Created  6m17s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Created container wait-for-host-port
  Normal   Started  6m17s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Started container wait-for-host-port
  Normal   Pulled   6m16s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976" already present on machine
  Normal   Created  6m16s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Created container wait-for-cpc-host-port
  Normal   Started  6m16s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Started container wait-for-cpc-host-port
  Normal   Pulled   6m15s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce10fd84985a2522699e9d1152485e45c4eddb019f6a9b45707116757e115976" already present on machine
  Normal   Created  6m14s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Created container kube-controller-manager-5
  Normal   Started  6m14s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Started container kube-controller-manager-5
  Normal   Pulling  6m14s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bf0eb67346e3c291acb83b1cc25e330f4ccc4561a89e3f936609992692848222"
  Normal   Created  6m10s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Created container kube-controller-manager-cert-syncer-5
  Warning  Failed   6m10s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Error: container create failed: time="2020-01-23T12:12:17Z" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"cluster-policy-controller\": executable file not found in $PATH"
  Normal   Pulled   6m10s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b76f8f4582d98133993414af5208d0a271f635542f7c553f1f3b49404d3efbb5" already present on machine
  Normal   Pulled   6m10s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bf0eb67346e3c291acb83b1cc25e330f4ccc4561a89e3f936609992692848222"
  Normal   Started  6m9s                  kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Started container kube-controller-manager-cert-syncer-5
  Warning  Failed   6m8s                  kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Error: container create failed: time="2020-01-23T12:12:19Z" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"cluster-policy-controller\": executable file not found in $PATH"
  Warning  Failed   6m7s                  kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Error: container create failed: time="2020-01-23T12:12:20Z" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"cluster-policy-controller\": executable file not found in $PATH"
  Warning  Failed   5m55s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Error: container create failed: time="2020-01-23T12:12:32Z" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"cluster-policy-controller\": executable file not found in $PATH"
  Warning  Failed   5m40s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Error: container create failed: time="2020-01-23T12:12:47Z" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"cluster-policy-controller\": executable file not found in $PATH"
  Warning  Failed   5m28s                 kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Error: container create failed: time="2020-01-23T12:12:59Z" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"cluster-policy-controller\": executable file not found in $PATH"
  Normal   Pulled   67s (x25 over 6m9s)   kubelet, master-02.mrnd-43-no-delete-e174.qe.devcluster.openshift.com  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bf0eb67346e3c291acb83b1cc25e330f4ccc4561a89e3f936609992692848222" already present on machine

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:

Steps to Reproduce:
1. Install 4.3.0-0.nightly-2020-01-23-105702.
2. Wait for "Waiting up to 30m0s for bootstrapping to complete...".
3. oc get pods -n openshift-kube-controller-manager
4. kube-controller-manager-master-xxxxx.qe.devcluster.openshift.com is in CreateContainerError.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
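The Failed events in the pod description above all carry the same runtime error, with the quotes escaped to different depths depending on where the message is printed. As a rough triage aid, the sketch below pulls the missing binary name out of such a message. The names MISSING_EXEC, missing_executables and EVENT are hypothetical helpers made up for illustration, not part of oc or any OpenShift tooling, and the pattern is only assumed to cover the message shape seen in this report.

```python
import re

# Assumed pattern: the runtime error text as it appears in the events above.
# Any run of backslashes before the quotes is tolerated, because the same
# message shows up escaped to different depths.
MISSING_EXEC = re.compile(
    r'exec: \\*"(?P<name>[^"\\]+)\\*": executable file not found in \$PATH'
)


def missing_executables(message):
    """Return the sorted, de-duplicated executable names reported as missing."""
    return sorted({m.group("name") for m in MISSING_EXEC.finditer(message)})


# One Failed event message copied from the pod description (raw strings, so
# the backslashes are kept exactly as shown in the report).
EVENT = (
    r'Error: container create failed: time="2020-01-23T12:12:17Z" level=error '
    r'msg="container_linux.go:346: starting container process caused '
    r'\"exec: \\\"cluster-policy-controller\\\": executable file not found in $PATH\"" '
    r'container_linux.go:346: starting container process caused '
    r'"exec: \"cluster-policy-controller\": executable file not found in $PATH"'
)

if __name__ == "__main__":
    print(missing_executables(EVENT))
```

Run against the event text from this report, the helper reports a single missing binary, cluster-policy-controller, which matches the Command of the container stuck in CreateContainerError.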
Reassigned to kube-controller-manager, as it is also failing on OSP IPI with the same version.
The same has happened on the most recent AWS UPI jobs as well:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-upi-4.3/727
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-upi-4.3/726
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-upi-4.3/725
Can't reproduce the issue with the latest payload, 4.3.0-0.nightly-2020-02-02-175954:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-upi-4.3/772
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0391