+++ This bug was initially created as a clone of Bug #1702346 +++

Approximately 55 days ago image pruning stopped working on api.ci, probably due to either a transient failure or hitting a certain limit. The current error is:

oc logs jobs/image-pruner-clayton-debug
Error from server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get images.image.openshift.io)

We have 200k images on the cluster, and it looks like the images call times out trying to load them all into memory. When I ran a paged call (locally in oc get) it took several minutes and we eventually hit the compaction window, so I was unable to finish it. Testing locally to see size.

The cluster needs to be able to prune, and we will need to take action to get it back under the threshold. We then need to ensure that this failure mode doesn't happen in the future.

--- Additional comment from Clayton Coleman on 2019-04-23 17:33:29 UTC ---

Testing on the API server directly, the images list was 39M in JSON and took about 3m20s to retrieve. Compaction is set to 5m, so we are close to being unable to read all images before the compaction window.
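To illustrate the paged-call failure described above, here is a minimal sketch of chunked listing with a `limit` and a continue token, and of what happens when compaction invalidates the token partway through. `FakeImageAPI`, `list_page`, and `ContinueExpired` are hypothetical stand-ins written for this example; they are not the oc or API server implementation, and the real server reports a stale token as a 410 Gone response rather than a Python exception.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


class ContinueExpired(Exception):
    """Stands in for the API server's 410 Gone reply to a stale continue token."""


@dataclass
class FakeImageAPI:
    """Hypothetical in-memory stand-in for the images list endpoint."""
    names: List[str]
    compacted: bool = False  # set to True to simulate compaction between pages

    def list_page(self, limit: int, cont: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
        # A continue token refers to an old snapshot; once that snapshot
        # has been compacted away the token can no longer be honoured.
        if cont is not None and self.compacted:
            raise ContinueExpired("410 Gone: continue token is too old")
        start = int(cont) if cont else 0
        end = min(start + limit, len(self.names))
        next_token = str(end) if end < len(self.names) else None
        return self.names[start:end], next_token


def list_all(api: FakeImageAPI, limit: int) -> Iterator[str]:
    """Yield every image name, fetching at most `limit` items per request."""
    cont: Optional[str] = None
    while True:
        page, cont = api.list_page(limit, cont)
        yield from page
        if cont is None:
            return


if __name__ == "__main__":
    api = FakeImageAPI([f"sha256:{i:04d}" for i in range(2500)])
    print(len(list(list_all(api, limit=500))))  # full listing succeeds

    it = list_all(api, limit=500)
    next(it)              # first page has been fetched
    api.compacted = True  # compaction happens before the next page
    try:
        for _ in it:
            pass
    except ContinueExpired as exc:
        print("listing aborted:", exc)
```

Paging keeps each response small, but the listing as a whole still has to finish before the snapshot behind the continue token is compacted; a caller that hits the expiry has to restart from the first page.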
PR: https://github.com/openshift/origin/pull/22655
Backport request to 3.11: https://bugzilla.redhat.com/show_bug.cgi?id=1702346
Pruning 200K images succeeded in a pod with version 4.2 (4.2.0-0.nightly-2019-06-25-003324).

$ ./oc version
Client Version: version.Info{Major:"4", Minor:"2+", GitVersion:"v4.2.0-201906241832+7a0a2f2-dirty", GitCommit:"7a0a2f2", GitTreeState:"dirty", BuildDate:"2019-06-24T23:20:08Z", GoVersion:"go1.12.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.0+952fea3", GitCommit:"952fea3", GitTreeState:"clean", BuildDate:"2019-06-24T23:20:31Z", GoVersion:"go1.12.6", Compiler:"gc", Platform:"linux/amd64"}

$ time ./oc get images | wc -l
200210

real 2m55.400s
user 0m14.086s
sys 0m3.886s

$ ./oc adm prune images --registry-url=default-route-openshift-image-registry.apps.xiuwang-42-largeimages.qe.devcluster.openshift.com --certificate-authority=ca.crt --all --loglevel=8 2>> 20kimageprune-2.log >> 20kimageprune-2.log
===========================snip==============================
I0625 08:14:38.409367 174 round_trippers.go:423] Request Headers:
I0625 08:14:38.409374 174 round_trippers.go:426] Accept: application/json, */*
I0625 08:14:38.409381 174 round_trippers.go:426] User-Agent: oc/v1.14.0+7a0a2f2 (linux/amd64) kubernetes/7a0a2f2
I0625 08:14:38.409388 174 round_trippers.go:426] Authorization: Bearer 9QZf_Y30gHVBa1FW6eqdp7124c1i7nr_fxlytnV6o88
I0625 08:14:39.945956 174 round_trippers.go:441] Response Status: 200 OK in 1536 milliseconds
I0625 08:14:39.945993 174 round_trippers.go:444] Response Headers:
I0625 08:14:39.945999 174 round_trippers.go:447] Content-Type: application/json
I0625 08:14:39.946012 174 round_trippers.go:447] Date: Tue, 25 Jun 2019 08:14:39 GMT
I0625 08:14:39.946016 174 round_trippers.go:447] Audit-Id: f8b1c280-c595-457f-b2dc-8c9cdefaeb56
I0625 08:14:39.946021 174 round_trippers.go:447] Cache-Control: no-store
I0625 08:14:39.946027 174 round_trippers.go:447] Cache-Control: no-store
I0625 08:14:39.996312 174 request.go:942] Response Body:
{"kind":"ImageStreamList","apiVersion":"image.openshift.io/v1","metadata":{"selfLink":"/apis/image.openshift.io/v1/imagestreams","resourceVersion":"272039"},"items":[{"metadata":{"name":"apicast-gateway","namespace":"openshift","selfLink":"/apis/image.openshift.io/v1/namespaces/openshift/imagestreams/apicast-gateway","uid":"23493dea-96fd-11e9-825f-0a580a820014","resourceVersion":"8161","generation":2,"creationTimestamp":"2019-06-25T03:55:57Z","labels":{"samples.operator.openshift.io/managed":"true"},"annotations":{"openshift.io/display-name":"3scale APIcast API Gateway","openshift.io/image.dockerRepositoryCheck":"2019-06-25T03:56:14Z","samples.operator.openshift.io/version":"4.2.0-0.nightly-2019-06-25-003324"}},"spec":{"lookupPolicy":{"local":false},"tags":[{"name":"2.1.0.GA","annotations":{"description":"3scale's APIcast is an NGINX based API gateway used to integrate your internal and external API services with 3scale's API Management Platform. It supports OpenID connect to integrate with external Identity [truncated 283646 chars]
I0625 08:14:40.001577 174 prune.go:277] Creating image pruner with keepYoungerThan=1h0m0s, keepTagRevisions=3, pruneOverSizeLimit=<nil>, allImages=true
I0625 08:14:40.001604 174 prune.go:356] Adding image "sha256:0089883f8e4387618946cd24378a447b8cf7e5dfaa146b94acab27fc5e170a14" to graph
I0625 08:14:40.001842 174 prune.go:378] Adding image layer "sha256:26e5ed6899dbf4b1e93e0898255e8aaf43465cecd3a24910f26edb5d43dafa3c" to graph
===================================snip=====================================
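For a rough sense of how close a full image listing sits to the 5m compaction window mentioned in this bug, the two reported timings can be compared as below. This is only a back-of-the-envelope sketch: the `headroom` helper is hypothetical, the two measurements come from different clusters, the 5m window is the value reported for api.ci, the `wc -l` count may include a header line, and the extrapolation assumes listing time grows linearly with image count, which may not hold in practice.

```python
COMPACTION_WINDOW_S = 5 * 60  # compaction interval reported in this bug


def headroom(list_seconds: float, image_count: int) -> dict:
    """Compare one full-list timing against the compaction window."""
    rate = image_count / list_seconds
    return {
        "images_per_second": rate,
        "seconds_to_spare": COMPACTION_WINDOW_S - list_seconds,
        # Assumption: linear scaling of list time with image count.
        "images_at_window": int(rate * COMPACTION_WINDOW_S),
    }


# "about 3m20s" for roughly 200k images, measured on the API server directly
api_server = headroom(3 * 60 + 20, 200_000)

# "real 2m55.400s" for 200210 lines from `oc get images | wc -l`
verification = headroom(2 * 60 + 55.4, 200_210)

if __name__ == "__main__":
    for label, result in (("api server", api_server), ("verification", verification)):
        print(label, result)
```

Under these assumptions both runs finish with roughly two minutes or less to spare, which is consistent with the concern that the cluster was close to being unable to read all images within the window.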
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922