Description of problem:

The problem occurs when an image is being pushed while `oc adm prune images` is running. It looks like `oc adm prune images` can delete blobs of images that are still being uploaded, since they are not yet referenced by a manifest (the manifest is uploaded last). The push operation finishes successfully, but pull operations on the affected images fail with "unexpected EOF".

The registry happily serves the missing blobs with HTTP status 200 and size 0. The Docker client, however, downloads the manifest first and expects each blob to have its original size, hence the "unexpected EOF" error.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Log into the internal Docker registry
2. Run `oc adm prune images` in a loop
3. oc new-project test
4. oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git
5. oc scale dc ruby-ex --replicas=0
6. Wait for the build to complete
7. Delete any Docker images of the `test/ruby-ex` and `centos/ruby-22-centos7` repositories using `docker rmi` to ensure they are pulled from the registry
8. docker pull 172.30.1.1:5000/test/ruby-ex

The pull operation will now fail with a high probability. Should it succeed, delete the project and start again at step 3. In case of failure, you will find the manifest present in the registry but one or more blobs missing.

In most cases there is no easy way for end users to recover from the issue. The registry reports the blobs as already present since they are listed in the manifest. Therefore they will not be uploaded again by future push operations.

Actual results:
Fails to pull the image.

Expected results:
The image should be pulled without any issue.

Additional info:
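To illustrate the size mismatch described above, here is a minimal Python sketch that compares the layer sizes recorded in a manifest with the sizes the registry actually serves. It assumes a schema 2 style manifest with a `layers` list of `digest`/`size` entries. The function name `find_truncated_blobs` and the `served_sizes` mapping (for example, collected from the Content-Length of blob requests) are hypothetical and not part of any OpenShift or Docker tooling.

```python
from typing import Dict, List


def find_truncated_blobs(manifest: dict, served_sizes: Dict[str, int]) -> List[str]:
    """Return digests whose served size differs from the size in the manifest.

    served_sizes is a hypothetical mapping of digest -> number of bytes the
    registry returned for that blob; a digest that was never probed counts
    as a mismatch.
    """
    bad = []
    for layer in manifest.get("layers", []):
        digest = layer["digest"]
        expected = layer["size"]
        actual = served_sizes.get(digest)
        if actual != expected:
            bad.append(digest)
    return bad


# Illustrative data only: the second blob is served with size 0.
manifest = {
    "layers": [
        {"digest": "sha256:aaa", "size": 1024},
        {"digest": "sha256:bbb", "size": 2048},
    ]
}
served = {"sha256:aaa": 1024, "sha256:bbb": 0}

print(find_truncated_blobs(manifest, served))  # ['sha256:bbb']
```

A blob reported here is one the client would hit "unexpected EOF" on, because the manifest promises more bytes than the registry delivers.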
@Jaspreet I suspect this is a duplicate of bug 1410434. Could you please verify there is (or isn't) a panic [1] in docker daemon when the pull fails with EOF? [1] https://bugzilla.redhat.com/show_bug.cgi?id=1410434#c4
I was able to reproduce but with a different error. I did it like this:

1. ran the following command in a loop:

    oadm prune images --confirm --keep-tag-revisions=1 \
        --keep-younger-than=1m --token=${token}

2. pushed the docker.io/golang:1.4.3 image (A) to 172.30.30.30:5000/pjoe/golang
3. built a new image (B) based on (A), adding a new data layer
4. pushed it again as 172.30.30.30:5000/pjoe/golang. The push succeeded. This shifted the original golang image to the second position in the history of the golang:latest imagestreamtag, making it a candidate for pruning.
5. removed all golang images from my docker daemon
6. pulled the golang image (B) back from the integrated registry:

    docker pull 172.30.30.30:5000/pjoe/golang
    Using default tag: latest
    Trying to pull repository 172.30.30.30:5000/pjoe/golang ...
    latest: Pulling from 172.30.30.30:5000/pjoe/golang
    7268d8f794c4: Downloading
    a3ed95caeb02: Download complete
    d9a49bc2b1b0: Download complete
    b965864d2d45: Download complete
    bad3f2daf720: Downloading
    61db11059a7f: Downloading
    25f4a8c55a9b: Downloading
    d45512a784a3: Downloading
    1bff6f993ec9: Download complete
    unknown blob

The pruner started producing the following output slightly before step 4 finished:

I0111 11:16:47.291762 26846 prune.go:257] Creating image pruner with keepYoungerThan=1m0s, keepTagRevisions=1, pruneOverSizeLimit=<nil>, allImages=false
I0111 11:16:47.291982 26846 prune.go:384] Unable to find image "sha256:61a80f918027452d07b988ad8add8f5fcc9780e45cda63c15ff866adc51b678e" in graph (from tag="stg", revision=0, dockerImageReference=registry.ops.openshift.com/ops/oso-rhel7-zagg-web@sha256:61a80f918027452d07b988ad8add8f5fcc9780e45cda63c15ff866adc51b678e ) - skipping
I0111 11:16:47.292014 26846 prune.go:384] Unable to find image "sha256:13f460af40143865d32b44ef1e3fbf44034d4beacc67a4487cd0ef96361e9aa6" in graph (from tag="latest", revision=0,
dockerImageReference=172.30.30.30:5000/pjoe/golang@sha256:13f460af40143865d32b44ef1e3fbf44034d4beacc67a4487cd0ef96361e9aa6) - skipping
I0111 11:16:47.292327 26846 prune.go:835] Using registry: 172.30.30.30:5000

Deleting references from image streams to images ...
STREAM       IMAGE                                                                    TAGS
pjoe/golang  sha256:89247599498eb346ec7c331d6fda457df2adc659b1db4f4ca69145c198689bab  latest

Deleting registry repository layer links ...
REPO         LAYER LINK
pjoe/golang  sha256:7268d8f794c449e593d3a48f62e7e22b7c3a4b6e615caaf9494ec3cb2d48f503
pjoe/golang  sha256:61db11059a7f7b24e125090c65f70b544236ee47090e1deadc5962969092f776
pjoe/golang  sha256:25f4a8c55a9b41b38beed2e3e9e0d43e76655173450a9d4302bd0de73628878d
pjoe/golang  sha256:b965864d2d455f06e4ad8165d12456219dcaeed2e49b0f13ada623aa00d9e822
pjoe/golang  sha256:bad3f2daf720952bee23d5dc4baf526bfaac8f0629de7db640058c3d8f632c3e
pjoe/golang  sha256:d45512a784a33b701cb3b02b025dec57ccedeb84e8fc2d907d8cf9ade1801559
pjoe/golang  sha256:d9a49bc2b1b0cdba4093d4ef5d276883a81a3141f05bdb46eb8bacb5b5d94acf

Deleting registry layer blobs ...
BLOB
sha256:7268d8f794c449e593d3a48f62e7e22b7c3a4b6e615caaf9494ec3cb2d48f503
sha256:61db11059a7f7b24e125090c65f70b544236ee47090e1deadc5962969092f776
sha256:25f4a8c55a9b41b38beed2e3e9e0d43e76655173450a9d4302bd0de73628878d
sha256:b965864d2d455f06e4ad8165d12456219dcaeed2e49b0f13ada623aa00d9e822
sha256:bad3f2daf720952bee23d5dc4baf526bfaac8f0629de7db640058c3d8f632c3e
sha256:d45512a784a33b701cb3b02b025dec57ccedeb84e8fc2d907d8cf9ade1801559
sha256:d9a49bc2b1b0cdba4093d4ef5d276883a81a3141f05bdb46eb8bacb5b5d94acf

Deleting registry repository manifest data ...
W0111 11:18:02.125464 26846 prune.go:1029] Unable to prune layer http://172.30.30.30:5000/v2/pjoe/golang/manifests/sha256:89247599498eb346ec7c331d6fda457df2adc659b1db4f4ca69145c198689bab, returned 404 Not Found
REPO         IMAGE
pjoe/golang  sha256:89247599498eb346ec7c331d6fda457df2adc659b1db4f4ca69145c198689bab

Deleting images from server ...
IMAGE
sha256:89247599498eb346ec7c331d6fda457df2adc659b1db4f4ca69145c198689bab

The pruner stopped executing after step 4 completed.

What happened under the hood:

1. registry received all the blobs of image B
2. pruner collected all the images (it found just A, not B)
3. registry received the manifest and created image B in etcd
4. registry updated the golang image stream
   - it shifted image A to index 1
   - it inserted image B at index 0
5. pruner collected all the image streams
   - the golang:latest istag references both images A and B
6. pruner marked image A as a candidate for pruning because
   - it's older than the threshold
   - it occurs at index 1 in the revision history of the golang:latest istag, which is above the threshold
7. pruner removed image A and all its layers

Removing image A is correct. Removing its layers is not, though, since they are also referenced by B. B, however, is not known to the pruner.

This could be solved by fetching images that were not found while processing the image streams. I'll open a PR.

Nevertheless, it may be unrelated to the customer's issue.
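The sequence above can be modelled with a small Python sketch. This is not the pruner's actual code; the function `layers_to_delete`, its arguments and the layer names are made up for illustration. It assumes the pruner holds a snapshot of images taken before image B existed, while the image stream references were collected afterwards. Passing a `fetch_image` callback models the proposed fix of fetching images that were not found in the snapshot.

```python
from typing import Callable, Dict, List, Optional, Set


def layers_to_delete(
    snapshot: Dict[str, Set[str]],
    stream_refs: List[str],
    candidates: Set[str],
    fetch_image: Optional[Callable[[str], Optional[Set[str]]]] = None,
) -> Set[str]:
    """Return the layers that would be removed together with the candidates.

    snapshot:    image -> layers, as collected before the push finished
    stream_refs: images referenced by image streams (collected later)
    candidates:  images selected for pruning
    fetch_image: hypothetical lookup for images missing from the snapshot
    """
    images = dict(snapshot)
    for ref in stream_refs:
        if ref not in images and fetch_image is not None:
            layers = fetch_image(ref)
            if layers is not None:
                images[ref] = layers

    # Layers of every known image that is not being pruned must be kept.
    kept: Set[str] = set()
    for image, layers in images.items():
        if image not in candidates:
            kept |= layers

    doomed: Set[str] = set()
    for image in candidates:
        doomed |= images.get(image, set())
    return doomed - kept


# Image B is built on top of A, so it shares all of A's layers.
image_a = {"l1", "l2", "l3"}
image_b = {"l1", "l2", "l3", "l4"}
snapshot = {"A": image_a}   # B was not in etcd yet when images were listed
stream_refs = ["B", "A"]    # golang:latest history: B at index 0, A at index 1
server = {"A": image_a, "B": image_b}

print(sorted(layers_to_delete(snapshot, stream_refs, {"A"})))              # ['l1', 'l2', 'l3']
print(sorted(layers_to_delete(snapshot, stream_refs, {"A"}, server.get)))  # []
```

Without the lookup, the layers shared with B are deleted, which matches the "unknown blob" failure on pull. With the lookup, B is known to the graph and its layers are kept.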
Trello card: https://trello.com/c/3ZmBKDZZ/437-8-registry-refactor-registry-layer-and-signature-pruning
Hello,

We need to keep this bugzilla open until the fix is officially released.

Regards,
Jaspreet
Sorry for the delay. I didn't make any progress on this, and the pruning rework is still in the queue. As agreed on IRC, we will provide a temporary solution. I'll write a bash script that will be able to prune the no-longer-referenced blobs from the registry storage in read-only mode.

Note that the script won't address any of the 404 errors like:

W0512 09:17:50.392748 96693 prune.go:972] Unable to prune layer http://172.30.128.186:5000/v2/zis-dev/angebot/blobs/sha256:...

since those need to wait for the rework. The script will only be able to delete the blobs that the pruning command is not able to prune, so it will only reduce the size occupied by the registry storage.
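The core of such a cleanup is a set difference between the blobs present in storage and the blobs still referenced by manifests. The Python sketch below only illustrates that idea; it is not the bash script mentioned above, it deletes nothing, and the function name `unreferenced_blobs` and the sample digests are hypothetical. It assumes the lists of stored and referenced digests have already been gathered by some other means.

```python
from typing import Dict, Iterable, List, Set


def unreferenced_blobs(stored: Iterable[str], manifests: Dict[str, List[str]]) -> List[str]:
    """Return stored blob digests that no manifest references.

    stored:    digests found in the registry storage
    manifests: manifest digest -> digests of the layers it lists
    """
    # Manifests are stored as blobs too, so they count as referenced.
    referenced: Set[str] = set(manifests)
    for layers in manifests.values():
        referenced.update(layers)
    return sorted(set(stored) - referenced)


# Illustrative data only.
stored = ["sha256:m1", "sha256:l1", "sha256:l2", "sha256:orphan"]
manifests = {"sha256:m1": ["sha256:l1", "sha256:l2"]}

print(unreferenced_blobs(stored, manifests))  # ['sha256:orphan']
```

Reporting candidates instead of removing them keeps the sketch safe to run. Any real removal would need the registry to be quiescent, otherwise blobs of an in-flight push would look unreferenced, which is the race this bug is about.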
Backport PR for 3.3: https://github.com/openshift/ose/pull/802
The PR has been merged.
The build is not ready for testing yet; changing the status to MODIFIED.
(In reply to Michal Minar from comment #35)

1.
"sha256:bad8bb8186a329f4c29d96f0be2348cf75533df34c41bf8448d7732d3e86efda" in the graph
I0722 02:15:10.959332 40540 imagepruner.go:430] Unable to find image

2.
Failing on "pathwayapi/salesforcedev to remove references to image sha256:c265689de055fa1a0fca8bebf613a1120c4cbd6afafb1378dc93a2f8c7678f65:"
***********************
error updating image stream pathwayapi/salesforcedev to remove references to image sha256:c265689de055fa1a0fca8bebf613a1120c4cbd6afafb1378dc93a2f8c7678f65: imagestreams "salesforcedev" cannot be updated: the object has been modified; please apply your changes to the latest version and try again
error updating image stream pathwayapi/salesforcedev to remove references to image
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1828
(In reply to hgomes from comment #37)
> "sha256:bad8bb8186a329f4c29d96f0be2348cf75533df34c41bf8448d7732d3e86efda" in the graph
> I0722 02:15:10.959332 40540 imagepruner.go:430] Unable to find image

Should be easy to fix. The dangling references to deleted images can be safely removed from image streams. The warning should still be printed the first time the pruner hits them. Next time the pruner runs, these messages shall be gone.

> 2.
> Failing on "pathwayapi/salesforcedev to remove references to image sha256:c265689de055fa1a0fca8bebf613a1120c4cbd6afafb1378dc93a2f8c7678f65:"
> ***********************
> error updating image stream pathwayapi/salesforcedev to remove references to image sha256:c265689de055fa1a0fca8bebf613a1120c4cbd6afafb1378dc93a2f8c7678f65: imagestreams "salesforcedev" cannot be updated: the object has been modified; please apply your changes to the latest version and try again
> error updating image stream pathwayapi/salesforcedev to remove references to image

Already addressed by [1]. It just needs to be back-ported.

[1] https://github.com/openshift/origin/pull/15899

Could you please open a separate bugzilla where we can address the remaining issues?
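The "object has been modified" error is an optimistic-concurrency conflict: the image stream changed between the pruner's read and its write. The usual remedy is to re-read the object and retry the update. The Python sketch below shows that general pattern against an in-memory stand-in for the API. It does not claim to reproduce what the referenced pull request does, and `FakeStreamStore`, `RacyStore`, `Conflict` and `remove_image_refs` are hypothetical names.

```python
from typing import List, Tuple


class Conflict(Exception):
    """Raised when an update is based on a stale version of the object."""


class FakeStreamStore:
    """In-memory stand-in for an image stream with a resource version."""

    def __init__(self, tags: List[str]) -> None:
        self.version = 1
        self.tags = list(tags)

    def get(self) -> Tuple[int, List[str]]:
        return self.version, list(self.tags)

    def update(self, version: int, tags: List[str]) -> None:
        if version != self.version:
            raise Conflict("the object has been modified")
        self.tags = list(tags)
        self.version += 1


class RacyStore(FakeStreamStore):
    """Injects one concurrent write just before the first update lands."""

    def __init__(self, tags: List[str]) -> None:
        super().__init__(tags)
        self.raced = False

    def update(self, version: int, tags: List[str]) -> None:
        if not self.raced:
            self.raced = True
            # Simulate a push that adds a new image between read and write.
            super().update(self.version, ["sha256:new"] + self.tags)
        super().update(version, tags)


def remove_image_refs(store: FakeStreamStore, image: str, max_attempts: int = 5) -> int:
    """Drop references to image, re-reading on conflict. Returns attempts used."""
    for attempt in range(1, max_attempts + 1):
        version, tags = store.get()
        remaining = [t for t in tags if t != image]
        try:
            store.update(version, remaining)
            return attempt
        except Conflict:
            continue  # someone else changed the stream; read it again
    raise Conflict("gave up after %d attempts" % max_attempts)


store = RacyStore(["sha256:c265", "sha256:keep"])
attempts = remove_image_refs(store, "sha256:c265")
print(attempts, store.tags)  # 2 ['sha256:new', 'sha256:keep']
```

Because each retry starts from a fresh read, the concurrently added reference survives and only the dangling one is removed.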
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days