Description of problem:
NFS volume recycling fails in OCP 3.10 because the recycler pod pulls the wrong image tag, v1.10.0.

Version-Release number of selected component (if applicable):
oc v3.10.12
openshift v3.10.12
kubernetes v1.10.0+b81c8f8

How reproducible:
Always

Steps to Reproduce:
1. Create a PV with persistentVolumeReclaimPolicy=Recycle
2. Create a PVC using the PV created above
3. Delete the PVC
4. Check the PV status

Actual results:
PV status is Released

Expected results:
PV status is Available

Master Log:

Additional info:
# oc describe pod recycler-for-nfs-4i6sh -n openshift-infra
Name:         recycler-for-nfs-4i6sh
Namespace:    openshift-infra
Node:         preserve-share1-310-nrr-1/172.16.120.97
Start Time:   Thu, 05 Jul 2018 06:44:59 +0000
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:
Containers:
  pv-recycler:
    Container ID:
    Image:         registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v1.10.0
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/openshift-recycle
    Args:
      /scrub
    State:          Waiting
      Reason:       ErrImagePull
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /scrub from vol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from pv-recycler-controller-token-g4vqz (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  vol:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    172.31.98.192
    Path:      /
    ReadOnly:  false
  pv-recycler-controller-token-g4vqz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  pv-recycler-controller-token-g4vqz
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason          Age                From                                Message
  ----     ------          ----               ----                                -------
  Normal   Scheduled       1m                 default-scheduler                   Successfully assigned recycler-for-nfs-4i6sh to preserve-share1-310-nrr-1
  Normal   Pulling         1m                 kubelet, preserve-share1-310-nrr-1  pulling image "registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v1.10.0"
  Warning  Failed          1m                 kubelet, preserve-share1-310-nrr-1  Failed to pull image "registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v1.10.0": rpc error: code = Unknown desc = Error: image openshift3/ose-recycler:v1.10.0 not found
  Warning  Failed          1m                 kubelet, preserve-share1-310-nrr-1  Error: ErrImagePull
  Normal   SandboxChanged  50s (x22 over 1m)  kubelet, preserve-share1-310-nrr-1  Pod sandbox changed, it will be killed and re-created.
It's similar to https://bugzilla.redhat.com/show_bug.cgi?id=1550372
Not an image issue (the image component is for reporting bugs against images that OpenShift delivers, such as Jenkins). What is responsible for creating the recycler pod? That component is the one at fault, since it set the image field in the pod to the wrong tag (it looks like it picked up the k8s version instead of the OpenShift version).
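To illustrate the suspected failure mode, here is a minimal sketch, not the actual origin code. It assumes the recycler image name is built by filling a format template with a component name and a version string. The template placeholders, the function name expandImage, and the template itself are hypothetical; only the registry, repository, and the two tags come from this report.

```go
package main

import (
	"fmt"
	"strings"
)

// expandImage fills a hypothetical image format template by substituting
// the ${component} and ${version} placeholders. This is an illustration
// only and is not taken from the OpenShift source.
func expandImage(format, component, version string) string {
	r := strings.NewReplacer("${component}", component, "${version}", version)
	return r.Replace(format)
}

func main() {
	// Assumed template; registry and repository are copied from the pod
	// description in this report.
	const format = "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}"

	// Passing the Kubernetes version yields the tag that failed to pull.
	fmt.Println(expandImage(format, "recycler", "v1.10.0"))

	// Passing the OpenShift version yields a tag that exists in the registry.
	fmt.Println(expandImage(format, "recycler", "v3.10.0"))
}
```

If the image name is built this way, the fix belongs wherever the version argument is chosen, not in the registry.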
I have added a tag for this in registry.reg-aws:

[jenkins@buildvm tbielawa]$ docker pull registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v3.10.0
Trying to pull repository registry.reg-aws.openshift.com:443/openshift3/ose-recycler ...
v3.10.0: Pulling from registry.reg-aws.openshift.com:443/openshift3/ose-recycler
e0f71f706c2a: Pull complete
121ab4741000: Pull complete
c2b08539d917: Pull complete
d2b14e4bd9b9: Pull complete
026373b5e80b: Pull complete
Digest: sha256:3441bba5c697d8ce227eb63ed42fa769cd3323a68aa786b4a1cf9596843ca0cf
[jenkins@buildvm tbielawa]$ docker tag registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v3.10.0 registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v1.10.0
[jenkins@buildvm tbielawa]$ docker push registry.reg-aws.openshift.com:443/openshift3/ose-recycler:v1.10.0
The push refers to a repository [registry.reg-aws.openshift.com:443/openshift3/ose-recycler]
aed1ec382fb8: Layer already exists
4311d0a86068: Layer already exists
d594557633f0: Layer already exists
d6a4dd6ace1f: Layer already exists
f4fa6c253d2f: Layer already exists
v1.10.0: digest: sha256:3441bba5c697d8ce227eb63ed42fa769cd3323a68aa786b4a1cf9596843ca0cf size: 1368

There is an ongoing email discussion about fixing this error at the source.
Also, I'm not sure what to do about the RHCC part of this (as was handled in the referenced ticket). We haven't pushed 3.10 images to the public yet, so I can't ask RCM to add a v1.10.0 tag as a workaround for anything. What's the greater effect of this?
PR against origin since ose PR was closed - https://github.com/openshift/origin/pull/20609
Verified on 3.10.44, NFS recycler works as expected. It pulls the correct image now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2660