Bug 1798549
| Summary: | oc debug node/foo does not fail quickly on errimagepull | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | David Eads <deads> |
| Component: | oc | Assignee: | Sally <somalley> |
| Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.4 | CC: | aos-bugs, jokerman, maszulik, mfojtik |
| Target Milestone: | --- | | |
| Target Release: | 4.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-04 11:33:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
When the image doesn't exist, the image pull fails and will not recover. The `oc debug node` command should fail quickly when this happens, rather than hanging for minutes on end. The debug pod just sits in `ImagePullBackOff`:

```
apiVersion: v1
kind: Pod
metadata:
  annotations:
    debug.openshift.io/source-container: container-00
    debug.openshift.io/source-resource: /v1, Resource=nodes/ip-10-0-134-191.us-east-2.compute.internal
  creationTimestamp: "2020-02-05T15:03:02Z"
  name: ip-10-0-134-191us-east-2computeinternal-debug
  namespace: default
  resourceVersion: "74856"
  selfLink: /api/v1/namespaces/default/pods/ip-10-0-134-191us-east-2computeinternal-debug
  uid: b46d4d07-796f-4078-81f6-74da492e3157
spec:
  containers:
  - command:
    - /bin/sh
    image: registry.redhat.io/rhel7/support-tools
    imagePullPolicy: Always
    name: container-00
    resources: {}
    securityContext:
      privileged: true
      runAsUser: 0
    stdin: true
    stdinOnce: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    tty: true
    volumeMounts:
    - mountPath: /host
      name: host
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-8nmzh
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  hostPID: true
  imagePullSecrets:
  - name: default-dockercfg-7n99x
  nodeName: ip-10-0-134-191.us-east-2.compute.internal
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - hostPath:
      path: /
      type: Directory
    name: host
  - name: default-token-8nmzh
    secret:
      defaultMode: 420
      secretName: default-token-8nmzh
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-02-05T15:03:02Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-02-05T15:03:02Z"
    message: 'containers with unready status: [container-00]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-02-05T15:03:02Z"
    message: 'containers with unready status: [container-00]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-02-05T15:03:02Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: registry.redhat.io/rhel7/support-tools
    imageID: ""
    lastState: {}
    name: container-00
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        message: Back-off pulling image "registry.redhat.io/rhel7/support-tools"
        reason: ImagePullBackOff
  hostIP: 10.0.134.191
  phase: Pending
  podIP: 10.0.134.191
  podIPs:
  - ip: 10.0.134.191
  qosClass: BestEffort
  startTime: "2020-02-05T15:03:02Z"
```

After some number of these failure events, we should simply fail and let the user decide what to do:

```
2m12s   Normal    Pulling   pod/ip-10-0-134-191us-east-2computeinternal-debug   spec.containers{container-00}   kubelet, ip-10-0-134-191.us-east-2.compute.internal   Pulling image "registry.redhat.io/rhel7/support-tools"   3m31s   4   ip-10-0-134-191us-east-2computeinternal-debug.15f089cd312e1d75
2m12s   Warning   Failed    pod/ip-10-0-134-191us-east-2computeinternal-debug   spec.containers{container-00}   kubelet, ip-10-0-134-191.us-east-2.compute.internal   Failed to pull image "registry.redhat.io/rhel7/support-tools": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication   3m31s   4   ip-10-0-134-191us-east-2computeinternal-debug.15f089cd44414d30
2m12s   Warning   Failed    pod/ip-10-0-134-191us-east-2computeinternal-debug   spec.containers{container-00}   kubelet, ip-10-0-134-191.us-east-2.compute.internal   Error: ErrImagePull   3m31s   4   ip-10-0-134-191us-east-2computeinternal-debug.15f089cd4441cde1
104s    Normal    BackOff   pod/ip-10-0-134-191us-east-2computeinternal-debug   spec.containers{container-00}   kubelet, ip-10-0-134-191.us-east-2.compute.internal   Back-off pulling image "registry.redhat.io/rhel7/support-tools"   3m30s   6   ip-10-0-134-191us-east-2computeinternal-debug.15f089cd6560d425
89s     Warning   Failed    pod/ip-10-0-134-191us-east-2computeinternal-debug   spec.containers{container-00}   kubelet, ip-10-0-134-191.us-east-2.compute.internal   Error: ImagePullBackOff   3m30s   7   ip-10-0-134-191us-east-2computeinternal-debug.15f089cd65611568
```
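The terminal state is already visible in `status.containerStatuses[].state.waiting.reason` above. Here is a minimal sketch, assuming the `k8s.io/api` Go types, of the kind of client-side check that would let the wait bail out; the helper name `imagePullFatal` is hypothetical and this is not the actual oc code:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// imagePullFatal reports whether a container in the pod is stuck in a
// waiting state the kubelet will not recover from on its own.
// Hypothetical helper; the real fix lives in openshift/oc.
func imagePullFatal(pod *corev1.Pod) (bool, string) {
	for _, cs := range pod.Status.ContainerStatuses {
		if w := cs.State.Waiting; w != nil {
			switch w.Reason {
			case "ErrImagePull", "ImagePullBackOff":
				return true, w.Message
			}
		}
	}
	return false, ""
}

func main() {
	// Reconstructed from the pod status shown in this report.
	pod := &corev1.Pod{
		Status: corev1.PodStatus{
			Phase: corev1.PodPending,
			ContainerStatuses: []corev1.ContainerStatus{{
				Name: "container-00",
				State: corev1.ContainerState{
					Waiting: &corev1.ContainerStateWaiting{
						Reason:  "ImagePullBackOff",
						Message: `Back-off pulling image "registry.redhat.io/rhel7/support-tools"`,
					},
				},
			}},
		},
	}
	if fatal, msg := imagePullFatal(pod); fatal {
		fmt.Printf("error: %s\n", msg)
	}
}
```

Run against the status above, this prints the same one-line error that the fixed client emits in the verification below.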
Opened https://github.com/openshift/oc/pull/277 for multiple BZs, including this one.

Confirmed with the latest oc client; can't reproduce the issue. New:

```
[root@dhcp-140-138 ~]# oc version -o yaml
clientVersion:
  buildDate: "2020-02-14T07:28:29Z"
  compiler: gc
  gitCommit: 5d7a12f03389b03b651f963cb5ee8ddfa9cff559
  gitTreeState: clean
  gitVersion: v4.4.0
  goVersion: go1.13.4
  major: ""
  minor: ""
  platform: linux/amd64
[root@dhcp-140-138 ~]# oc debug node/yinzho-xxxx
Starting pod/yinzho-xxx ...
To use host binaries, run `chroot /host`
Removing debug pod ...
error: Back-off pulling image "registry.redhat.io/rhel7/support-tools"
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
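For context on the shape of such a fix: `oc` has to wait for the debug pod to reach `Running` before attaching, and with the Kubernetes watch helpers a wait condition can abort early by returning an error. Below is a hedged sketch, not the actual PR 277 code: `debugPodRunning` is a hypothetical condition, while `watchtools.UntilWithoutRetry` and `watch.NewFake` are real client-go/apimachinery helpers used here to drive it with a synthetic event.

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/watch"
	watchtools "k8s.io/client-go/tools/watch"
)

// debugPodRunning is a hypothetical watch condition: the wait succeeds
// once the pod is Running, and aborts with an error as soon as a
// container is stuck in ErrImagePull/ImagePullBackOff.
func debugPodRunning(event watch.Event) (bool, error) {
	pod, ok := event.Object.(*corev1.Pod)
	if !ok {
		return false, nil
	}
	for _, cs := range pod.Status.ContainerStatuses {
		if w := cs.State.Waiting; w != nil &&
			(w.Reason == "ErrImagePull" || w.Reason == "ImagePullBackOff") {
			return false, fmt.Errorf("%s", w.Message)
		}
	}
	return pod.Status.Phase == corev1.PodRunning, nil
}

func main() {
	// Feed the condition a synthetic event reproducing the stuck pod.
	fw := watch.NewFake()
	go fw.Modify(&corev1.Pod{Status: corev1.PodStatus{
		Phase: corev1.PodPending,
		ContainerStatuses: []corev1.ContainerStatus{{
			State: corev1.ContainerState{Waiting: &corev1.ContainerStateWaiting{
				Reason:  "ImagePullBackOff",
				Message: `Back-off pulling image "registry.redhat.io/rhel7/support-tools"`,
			}},
		}},
	}})

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if _, err := watchtools.UntilWithoutRetry(ctx, fw, debugPodRunning); err != nil {
		fmt.Println("error:", err) // fails immediately, not after minutes of backoff
	}
}
```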