Approximately 55 days ago image pruning stopped working on api.ci, probably due to either a transient failure or hitting some limit. The current error is:

  oc logs jobs/image-pruner-clayton-debug
  Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get images.image.openshift.io)

We have 200k images on the cluster, and it looks like the images call times out trying to load them all into memory. When I ran a paged call (locally in oc get) it took several minutes and we eventually hit the compaction window, so I was unable to complete the listing. I am testing locally to see the size. The cluster needs to be able to prune, and we will need to take action to get it back under the threshold. We then need to ensure that this failure mode doesn't happen in the future.
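For reference, the paged call mentioned above relies on the Kubernetes API's chunked list support: the client passes limit and continue query parameters, and the server answers 410 Gone once the continue token refers to a resource version that has been compacted away. The following is a minimal sketch of that loop, not the pruner's actual code. The fetch_page callable, the ContinueExpired exception, and the default limit of 500 are hypothetical names and values chosen for illustration.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

# Hypothetical page fetcher (an assumption, not a real client API): takes
# (limit, continue_token) and returns (HTTP status, parsed JSON body).
# A real implementation would issue something like
#   GET /apis/image.openshift.io/v1/images?limit=<limit>&continue=<token>
FetchPage = Callable[[int, Optional[str]], Tuple[int, Dict]]


class ContinueExpired(Exception):
    """Raised when the server answers 410 Gone for an expired continue token."""


def list_in_chunks(fetch_page: FetchPage, limit: int = 500) -> Iterator[Dict]:
    """Yield items page by page instead of loading the whole list at once."""
    token: Optional[str] = None
    while True:
        status, body = fetch_page(limit, token)
        if status == 410:
            # The listing took longer than the compaction window allows.
            raise ContinueExpired(body.get("message", "continue token expired"))
        if status != 200:
            raise RuntimeError(f"unexpected status {status}")
        yield from body.get("items", [])
        # An absent or empty continue token marks the last page.
        token = body.get("metadata", {}).get("continue") or None
        if token is None:
            return
```

A caller that catches ContinueExpired has to restart the listing from the beginning, which is why a full walk that approaches the compaction interval cannot be relied on to finish.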
Testing on the API server directly, the images list was 39M in JSON and took about 3m20s to retrieve. Compaction is set to 5m, so we are close to the "unable to read all images before compaction window" condition.
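To make "close" concrete, here is a rough back-of-the-envelope sketch using the figures above (3m20s retrieval, 5m compaction, 200k images). It assumes retrieval time scales linearly with the number of images, which is an assumption for illustration and was not measured; the function name is hypothetical.

```python
def compaction_headroom(list_seconds: float, window_seconds: float,
                        object_count: int) -> tuple[float, int]:
    """Return (fraction of the window used, estimated object count at the limit).

    Assumes list time grows linearly with object count (an assumption).
    """
    fraction_used = list_seconds / window_seconds
    max_objects = int(object_count * window_seconds / list_seconds)
    return fraction_used, max_objects


list_seconds = 3 * 60 + 20   # 3m20s, as measured on the API server
window_seconds = 5 * 60      # 5m compaction interval
fraction_used, max_objects = compaction_headroom(list_seconds, window_seconds, 200_000)
print(f"window used: {fraction_used:.0%}, estimated limit: {max_objects} images")
```

Under that linear assumption a full list already uses about two thirds of the window, and roughly 300k images would exhaust it, before accounting for any client-side slowdown.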
Verified this in:

  ./oc version
  oc v3.11.141
  kubernetes v1.11.0+d4cacc0
  features: Basic-Auth GSSAPI Kerberos SPNEGO

  Server https://ec2-54-80-207-203.compute-1.amazonaws.com:443
  openshift v3.11.141
  kubernetes v1.11.0+d4cacc0

Could prune 200K images without timeout in a pod.

  cat 20kimageprune-1.log | grep ImageStreamLi
  I0826 11:57:35.595904 148 request.go:897] Response Body: {"kind":"ImageStreamList","apiVersion":"image.openshift.io/v1","metadata":{"selfLink":"/apis/image.openshift.io/v1/imagestreams","resourceVersion":"246416"},"items":[{"metadata":{"name":"nodejs-mongodb-example","namespace":"install-test","selfLink":"/apis/image.openshift.io/v1/namespaces/install-test/imagestreams/nodejs-mongodb-example","uid":"0ffb7d02-c7d5-11e9-b89e-0e8918b91460","resourceVersion":"5912","generation":1,"creationTimestamp":"2019-08-26T07:42:32Z","labels":{"app":"nodejs-mongodb-example","template":"nodejs-mongodb-example"},"annotations":{"description":"Keeps track of changes in the application image","openshift.io/generated-by":"OpenShiftNewApp"}},"spec":{"lookupPolicy":{"local":false}},"status":{"dockerImageRepository":"docker-registry.default.svc:5000/install-test/nodejs-mongodb-example","tags":[{"tag":"latest","items":[{"created":"2019-08-26T07:43:18Z","dockerImageReference":"docker-registry.default.svc:5000/install-test/nodejs-mongodb-example@sha256:e2059198fbc704c5bbd3b672482d0f6fada954e5 [truncated 300040 chars]
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2580