Description of problem:
When using oc new-app to create a new build, the build leaves incomplete multipart uploads behind in S3 (and on other storage backends), even though the build itself completes successfully. The incomplete uploads are never cleaned up and currently have to be deleted manually. Once too many incomplete uploads accumulate in the S3 storage, docker push stops working with an HTTP 500 error.

Version-Release number of selected component (if applicable):
OpenShift v3.9.51

How reproducible:
Reproducible with 'oc new-app --name e2e https://github.com/appuio/endtoend-docker-helloworld.git -n test'
It also occurs with the RHEL image registry.access.redhat.com/rhscl/httpd-24-rhel7:latest, but not always.

Actual results:
Incomplete uploads can be seen in the S3 storage and have to be cleaned up manually.

Expected results:
There should not be any incomplete multipart uploads of the image left in the S3 storage.
Retested this and created a BuildConfig directly, with the same result. There is no difference whether ImageStreams are used or not.

apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: e2e-12
  namespace: test
spec:
  output:
    to:
      kind: DockerImage
      name: 'docker-registry.default.svc:5000/test/e2e-12'
  source:
    git:
      uri: 'https://github.com/appuio/endtoend-docker-helloworld.git'
    type: Git
  strategy:
    dockerStrategy:
      from:
        kind: ImageStreamTag
        name: 'httpd-24-centos7:latest'
    type: Docker

$ ./mc ls --recursive --incomplete bucket/registry
[2018-12-28 17:19:52 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/18550046-da65-4189-b96a-23b6998efe6a/data
[2018-12-28 17:19:56 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/1f89f3a9-9fda-4a84-b4a2-6f45ba4084c2/data
[2018-12-28 17:19:54 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/4b67e7af-ab05-40dc-9e02-ecb79bff2422/data
[2018-12-28 17:19:51 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/688ffddb-4633-49cb-84e5-85a4218b404d/data
[2018-12-28 17:19:52 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/6931175f-c3d4-4ca4-8e43-11dff2ec175e/data
[2018-12-28 17:19:51 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/768b3be9-a6df-4cf5-8dc3-60342705cfa1/data
[2018-12-28 17:19:51 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/7e0a3fba-eaeb-4363-a066-2036547ea6d1/data
[2018-12-28 17:19:50 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/8e91ccb4-90b2-4774-a8b5-8c0ff30ee0b9/data
[2018-12-28 17:19:53 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/916da2d6-5779-45f2-8ff3-94e458c4ccc8/data
[2018-12-28 17:19:49 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/a0697689-e978-4221-bc3e-4f1a4f6d477b/data
[2018-12-28 17:19:51 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/a726b6f0-ba4a-4cbc-aceb-176101a3b32b/data
[2018-12-28 17:19:51 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/cc87268e-1b2b-4b23-8134-b91a33f65cfe/data
[2018-12-28 17:19:52 CET]  0B registry/docker/registry/v2/repositories/test/e2e-12/_uploads/e442d3dd-70bd-494b-811f-3d91752eb45c/data
There is a setting that can be enabled on the S3 bucket to abort incomplete multipart uploads (you can read more about it here: https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html#mpu-abort-incomplete-mpu-lifecycle-config) We recommend that you set this option on the S3 bucket that is being used for the registry. We have opened an issue on the github repository to enable this option by default on S3 buckets that the registry operator creates. https://github.com/openshift/cluster-image-registry-operator/issues/128
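For reference, on AWS (or any S3 implementation that honors lifecycle configuration) the rule can be applied with the aws CLI roughly as follows. This is a sketch only: the bucket name is a placeholder, and non-AWS backends such as ECS or Ceph may or may not support the AbortIncompleteMultipartUpload action.

# Sketch: replace <registry-bucket> with the bucket backing the registry.
# Aborts any multipart upload that has not completed within one day.
aws s3api put-bucket-lifecycle-configuration \
  --bucket <registry-bucket> \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "abort-incomplete-mpu",
      "Filter": {},
      "Status": "Enabled",
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 1}
    }]
  }'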
The problem with an S3-bucket-based cleanup is that it has to be implemented differently for each S3 API. In this case it is a Dell EMC ECS storage; Ceph S3 is another commonly used backend. So this should definitely be handled by the client side of the docker-registry.
If those storage backends attempt to emulate the S3 API and its features, then it is up to them to support those features correctly and fully, or up to the user to configure the cleanup of incomplete multipart uploads. Not every storage type (GCS/Azure/filesystem) supports multipart uploads, so such uploads would be dealt with appropriately by the respective driver.
I have seen something similar on other file-based storage backends (Gluster) as well, so I assume this is a general issue that occurs from time to time. At the very least, a "dockerregistry -prune=delete" run should remove those leftovers, but it does not (see the sketch below).
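For reference, hard pruning can be run from inside the registry pod roughly like this. It is a sketch, assuming the stock 3.x setup (registry in the "default" project under the "docker-registry" deploymentconfig); the documentation also expects the registry to be put into read-only maintenance mode first.

# Sketch, assuming the default 3.x registry deployment in the "default" project.
# -prune=check only reports what would be removed; -prune=delete actually deletes.
registry_pod=$(oc -n default get pods -l deploymentconfig=docker-registry \
  -o jsonpath='{.items[0].metadata.name}')
oc -n default exec -it "$registry_pod" -- /usr/bin/dockerregistry -prune=delete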
https://github.com/openshift/image-registry/pull/143
I did configure a lifecycle configuration on the S3 bucket, but it doesn't delete the incomplete multipart uploads fast enough, so I still have to delete them manually:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>lifecycle-v2-abortmpu-per-day</ID>
    <Filter/>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>

So thanks for the PR! Hopefully this will resolve the issue.
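Until the fix lands, the manual cleanup can be scripted with the same mc client used above; --incomplete restricts the removal to unfinished multipart uploads. The path below reuses the bucket/prefix from the earlier listing and is otherwise a sketch.

# Sketch: removes only incomplete multipart uploads under the repository's _uploads prefix.
./mc rm --recursive --incomplete --force \
  bucket/registry/docker/registry/v2/repositories/test/e2e-12/_uploads/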
https://github.com/openshift/image-registry/pull/151 (waiting to be merged)
Merged.
Created attachment 1533936 [details] Registry log from OCP v3.11.59
Thanks Oleg! I can reproduce with your steps and will verify this bug as below.

Verified with the following version:
openshift v3.9.68
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0331