New build is available at https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=21597045
(In reply to Urvashi Mohnani from comment #1)
> New build is available at
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=21597045
Hi, are we missing some important containers/storage changes in this build?
One node got this error for a pod:
Type     Reason   Age                From                                            Message
----     ------   ----               ----                                            -------
Warning  Failed   10m                kubelet, qe-wjiang-311-node-registry-router-1   Failed to pull image "brewregistry.stage.redhat.io/openshift3/ose-node:v3.11": rpc error: code = Unknown desc = Error writing blob: error storing blob to file "/var/tmp/storage729637586/1": unexpected EOF
Warning  Failed   10m                kubelet, qe-wjiang-311-node-registry-router-1   Error: ErrImagePull
Normal   BackOff  10m                kubelet, qe-wjiang-311-node-registry-router-1   Back-off pulling image "brewregistry.stage.redhat.io/openshift3/ose-node:v3.11"
Warning  Failed   10m                kubelet, qe-wjiang-311-node-registry-router-1   Error: ImagePullBackOff
Normal   Pulling  10m (x2 over 16m)  kubelet, qe-wjiang-311-node-registry-router-1   pulling image "brewregistry.stage.redhat.io/openshift3/ose-node:v3.11"
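Events in this format usually come from describing the pod, e.g. (pod and namespace names here are placeholders):
# oc describe pod <pod-name> -n <namespace>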
Nope, nothing in containers/storage changed; we just cherry-picked the symlink fixes onto the containers/storage version already being used by cri-o 1.11.
Are you seeing this error on multiple pods? Did it eventually fix itself, or was it stuck in this state? Did you try killing the pod and letting it start up again?
My customer who is hitting the issue "rebuilds" the node to fix it (removing and reinstalling components) -- but that is a very heavy-handed workaround and not ideal. Curious if anyone has a less intrusive workaround.
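For anyone wanting something lighter to try first, a sketch: delete the pod and let its controller recreate it, drop any partially pulled copy of the image, and clear leftover blob staging directories from failed pulls. Untested here; the image ref and /var/tmp path are taken from the error above, and the pod/namespace names are placeholders:
# oc delete pod <pod-name> -n <namespace>
# crictl rmi brewregistry.stage.redhat.io/openshift3/ose-node:v3.11
# rm -rf /var/tmp/storage*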
So the "storing-the-layer-blob-to-a-file" logic comes from containers/image and not containers/storage.
If this issue keeps happening, please open another BZ for it. It shouldn't block this one, as the symlink fixes went into containers/storage.
Do you mean if it requires rebuilding (i.e., if it does not resolve by deleting pods, etc.)?
@Steven yeah, does deleting the pod resolve the issue? Also, how often is the customer seeing this happen?
If possible, can I get the cri-o and kubelet logs from the cluster as well?
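Something along these lines on the affected node should capture both (the service names are the usual ones for a 3.11 node; adjust if the install differs):
# journalctl -u crio --since "1 hour ago" > crio.log
# journalctl -u atomic-openshift-node --since "1 hour ago" > node.log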
Checked with 1.11.14 and rebooted the whole cluster 5 times without hitting this issue, so moving to VERIFIED.
# oc get nodes -o wide
NAME                                   STATUS   ROLES     AGE   VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                      KERNEL-VERSION               CONTAINER-RUNTIME
qe-wjiang-311-master-etcd-1            Ready    master    48m   v1.11.0+d4cacc0   10.0.76.16    <none>        Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.12.1.el7.x86_64   cri-o://1.11.14-1.rhaos3.11.gitd56660e.el7
qe-wjiang-311-node-1                   Ready    compute   45m   v1.11.0+d4cacc0   10.0.77.60    <none>        Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.12.1.el7.x86_64   cri-o://1.11.14-1.rhaos3.11.gitd56660e.el7
qe-wjiang-311-node-registry-router-1   Ready    <none>    45m   v1.11.0+d4cacc0   10.0.76.72    <none>        Red Hat Enterprise Linux Server 7.6 (Maipo)   3.10.0-957.12.1.el7.x86_64   cri-o://1.11.14-1.rhaos3.11.gitd56660e.el7
For the "Error writing blob: error storing blob to file" issue, I tried 2 times, but not met this.
Will keep an eye on that, and open bug once I met that again.
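For anyone else trying to reproduce, a crude retry loop on the node works (crictl ships alongside cri-o; the image ref is the one from the original error):
# for i in 1 2 3 4 5; do crictl pull brewregistry.stage.redhat.io/openshift3/ose-node:v3.11; done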
Hm, well the issue seems to occur with new pods, so this might not apply. I opened a new bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.