Description of problem:
Cannot push images to the image registry when KMS encryption is enabled on AlibabaCloud.

Version-Release number of selected component (if applicable):
4.10-fc.4

How reproducible:
100%

Steps to Reproduce:
1. Install an AlibabaCloud cluster using IPI.
2. Enable KMS encryption:
   $ oc patch config.image/cluster -p '{"spec":{"storage":{"oss":{"encryption":{"method":"KMS","kms":{"keyID":"invaildid"}}}}}}' --type=merge
3. Start to build an application:
   $ oc start-build cakephp-mysql-persistent
4. Check the build pod logs:
   Warning: Push failed, retrying in 5s ...
   Getting image source signatures
   Copying blob sha256:adffa69631469a649556cee5b8456f184928818064aac82106bd08bd62e51d4e
   Copying blob sha256:0661f10c38ccb1007a5937fd652f834283d016642264a0e031028979fcfb2dbf
   Copying blob sha256:4c32485d4fd9c56cf118ed8afae041324b21b551a1ff7e149fd18b83379264b7
   Copying blob sha256:26f1167feaf74177f9054bf26ac8775a4b188f25914e23bda9574ef2a759cce4
   Copying blob sha256:f6f866a828fc0c9ca4d7df5ec592a1cd2dd80bbbac29bf101626e84cf4f49304
   Copying blob sha256:362566a15abbd2bcd1b49f9d27a3e22855af316e1a49092a0d1b885bcbb9be4a
   Warning: Push failed, retrying in 5s ...
   Getting image source signatures
   Copying blob sha256:adffa69631469a649556cee5b8456f184928818064aac82106bd08bd62e51d4e
   Copying blob sha256:0661f10c38ccb1007a5937fd652f834283d016642264a0e031028979fcfb2dbf
   Copying blob sha256:4c32485d4fd9c56cf118ed8afae041324b21b551a1ff7e149fd18b83379264b7
   Copying blob sha256:26f1167feaf74177f9054bf26ac8775a4b188f25914e23bda9574ef2a759cce4
   Copying blob sha256:362566a15abbd2bcd1b49f9d27a3e22855af316e1a49092a0d1b885bcbb9be4a
   Copying blob sha256:f6f866a828fc0c9ca4d7df5ec592a1cd2dd80bbbac29bf101626e84cf4f49304
   Warning: Push failed, retrying in 5s ...
   Registry server Address:
   Registry server User Name: serviceaccount
   Registry server Email: serviceaccount
   Registry server Password: <<non-empty>>
   error: build error: Failed to push image: writing blob: uploading layer to https://image-registry.openshift-image-registry.svc:5000/v2/test-persistent/cakephp-mysql-persistent/blobs/uploads/bfb172f4-9cb3-4d1a-a4ee-a8d9806e12ac?_state=ukfym1aJ4XH0C97fOYcJ1TFQWWwVl_PJ1TOjPgw4QxB7Ik5hbWUiOiJ0ZXN0LXBlcnNpc3RlbnQvY2FrZXBocC1teXNxbC1wZXJzaXN0ZW50IiwiVVVJRCI6ImJmYjE3MmY0LTljYjMtNGQxYS1hNGVlLWE4ZDk4MDZlMTJhYyIsIk9mZnNldCI6MTMyNzAyNzAsIlN0YXJ0ZWRBdCI6IjIwMjItMDEtMzBUMDc6NTc6MjdaIn0%3D&digest=sha256%3A2fd27d11cca9c99da262d41c1a9084c1944681b6d8fa154cfba220affc06aea6: received unexpected HTTP status: 500 Internal Server Error

Actual results:
Push failed

Additional info:
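For anyone scripting the reproduction, here is a minimal sketch of how the merge patch from step 2 could be generated for an arbitrary key ID. The helper names `kms_patch` and `oc_patch_command` are hypothetical and not part of any OpenShift tooling. The sketch only builds the command string and does not run `oc`.

```python
import json
import shlex


def kms_patch(key_id: str) -> str:
    """Return the JSON merge patch that switches OSS encryption to KMS."""
    patch = {
        "spec": {
            "storage": {
                "oss": {
                    "encryption": {
                        "method": "KMS",
                        "kms": {"keyID": key_id},
                    }
                }
            }
        }
    }
    # Compact separators keep the payload on one line, as in the report.
    return json.dumps(patch, separators=(",", ":"))


def oc_patch_command(key_id: str) -> str:
    """Build (but do not execute) the oc patch command line (hypothetical helper)."""
    return (
        "oc patch config.image/cluster -p "
        + shlex.quote(kms_patch(key_id))
        + " --type=merge"
    )


if __name__ == "__main__":
    # "invaildid" is the literal key ID used in the steps to reproduce.
    print(oc_patch_command("invaildid"))
```

Generating the payload with `json.dumps` avoids miscounting the closing braces when the patch is typed by hand.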
Moving to the image registry component; "storage" (PVs/PVCs) is not involved here (I hope).
I investigated this issue and was able to reproduce the error.

Steps followed:
1. Built a cluster.
2. Modified the registry to use KMS:
   oc patch config.image/cluster -p '{"spec":{"storage":{"oss":{"encryption":{"method":"KMS","kms":{"keyID":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx"}}}}}}' --type=merge
   From the registry operator log:
   I0203 14:40:21.552477 1 controller.go:325] object changed: *v1.Config, Name=cluster (status=true): changed:status.conditions.0.lastTransitionTime={"2022-02-02T17:51:46Z" -> "2022-02-03T14:40:20Z"}, added:status.conditions.0.message="User supplied OSS bucket exists and is accessible", changed:status.conditions.3.message={"Default AES256 encryption was successfully enabled on the OSS bucket" -> "Default KMS encryption was successfully enabled on the OSS bucket"}, changed:status.observedGeneration={"2.000000" -> "3.000000"}, added:status.storage.oss.encryption.kms.keyID="be41ecd4-4124-4e85-b84c-135e5cbab113", added:status.storage.oss.encryption.method="KMS"
3. Ran the following new-app command:
   oc new-app rails-postgresql-example
4. Observed the error in the build logs:
   error: build error: Failed to push image: writing blob: uploading layer to https://image-registry.openshift-image-registry.svc:5000/v2/test/rails-postgresql-example/blobs/uploads/b6e471e2-2123-40cf-94aa-6638a3c99d04?_state=_snTdTNtFWHYiGmOibOoneZyxQ1rlNnffuvcvsriHht7Ik5hbWUiOiJ0ZXN0L3JhaWxzLXBvc3RncmVzcWwtZXhhbXBsZSIsIlVVSUQiOiJiNmU0NzFlMi0yMTIzLTQwY2YtOTRhYS02NjM4YTNjOTlkMDQiLCJPZmZzZXQiOjgxNDYzODE5LCJTdGFydGVkQXQiOiIyMDIyLTAyLTAzVDE0OjQ1OjUzWiJ9&digest=sha256%3A5dcbdc60ea6b60326f98e2b49d6ebcb7771df4b70c6297ddf2d7dede6692df6e: received unexpected HTTP status: 500 Internal Server Error
5. Observed the error in the registry logs:
   oc logs image-registry-57695cd768-dxt82 -n openshift-image-registry
   2022/02/03 14:43:39 PUT https://test-fhpzc-image-registry-us-east-1-tmtrddpdirxylmjoqtwwpmxsff.oss-us-east-1-internal.aliyuncs.com/docker/registry/v2/blobs/sha256/b8/b836421e9efdc85fc56e6325a4d20b9a9de98ede20369efb1a9bcbe64b50ec17/data
   ...
   time="2022-02-03T14:43:39.607520392Z" level=error msg="Failed for move from docker/registry/v2/repositories/openshift/postgresql/_uploads/7fdaa0f7-7b5b-4291-acd6-8b7f788672c3/data to docker/registry/v2/blobs/sha256/b8/b836421e9efdc85fc56e6325a4d20b9a9de98ede20369efb1a9bcbe64b50ec17/data: Aliyun API Error: RequestId: 61FBEA1B250B6235371F0D99 Status Code: 403 Code: AccessDenied Message: This request is forbidden by kms."
   time="2022-02-03T14:43:39.607617531Z" level=error msg="Background mirroring failed: error committing to storage: oss: Aliyun API Error: RequestId: 61FBEA1B250B6235371F0D99 Status Code: 403 Code: AccessDenied Message: This request is forbidden by kms." go.version=go1.17.2 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=6f1621fc-46a6-40ce-b48d-18ac99584d8c http.request.method=GET http.request.remoteaddr="10.129.2.1:19861" http.request.uri="/v2/openshift/postgresql/blobs/sha256:b836421e9efdc85fc56e6325a4d20b9a9de98ede20369efb1a9bcbe64b50ec17" http.request.useragent="cri-o/1.23.0-108.rhaos4.10.gitb15fee5.el8 go/go1.17.2 os/linux arch/amd64" openshift.auth.user="system:serviceaccount:test:default" vars.digest="sha256:b836421e9efdc85fc56e6325a4d20b9a9de98ede20369efb1a9bcbe64b50ec17" vars.name=openshift/postgresql

A quick search revealed that the OSS storage user is missing permissions (https://partners-intl.aliyun.com/help/en/doc-detail/185803.htm), which led me to the following link:
https://partners-intl.aliyun.com/help/en/doc-detail/31871.htm?spm=a2c63.p38356.0.0.13af433feW5LNK#concept-lqm-fkd-5db

This shows that our policy is missing the following permissions when using KMS:
   "kms:GenerateDataKey", "kms:ListKeys", "kms:ListAlias", "kms:ListAliasByKeyId", "kms:DescribeKey", "kms:Decrypt"

I added the above to our test-fhpzc-openshift-image-registry-installer-cloud-credentials-policy-policy and tested the build again:
   oc start-build rails-postgresql-example
   oc logs rails-postgresql-example-2-build -f
   ...
   Pushing image image-registry.openshift-image-registry.svc:5000/test/rails-postgresql-example:latest ...
   Getting image source signatures
   Copying blob sha256:f2ce12b8c9209c395661c7915021144b64203ee2ed0b49c121c956fd7487ae0a
   Copying blob sha256:79a56ba04a301eb949644bca29f18b1879b6f305091ef1eb8068a0f5828db863
   Copying blob sha256:ab58807b008b365083a50c35a358edffabb90602ed6a1e4786adf1b0bba2a512
   Copying blob sha256:5dcbdc60ea6b60326f98e2b49d6ebcb7771df4b70c6297ddf2d7dede6692df6e
   Copying blob sha256:8671113e1c57d3106acaef2383f9bbfe1c45a26eacb03ec82786a494e15956c3
   Copying blob sha256:aad543859364662ddb264ad5752fd9449d47410b9efa0278463c0a9c578b79c6
   Copying config sha256:ac82b94433112582279c07061ae6237bac965ce94629f0e3a1b6419f7b657f7b
   Writing manifest to image destination
   Storing signatures
   Successfully pushed image-registry.openshift-image-registry.svc:5000/test/rails-postgresql-example@sha256:e66d4b750b80f1be12deca66da42595847e82b9fd658add6ac2feaa2878ccfae
   Push successful

I have created a PR here for the fix: https://github.com/openshift/cluster-image-registry-operator/pull/751
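The permission gap described above can be checked mechanically. The following is a rough sketch, not the operator's actual logic. It assumes the policy is available as a JSON document with a `Statement` list whose entries carry `Effect` and `Action` fields, and the `example_policy` content is made up for illustration. Wildcard actions such as `kms:*` are not expanded.

```python
# The six KMS actions listed in this comment as missing from the policy.
REQUIRED_KMS_ACTIONS = [
    "kms:GenerateDataKey",
    "kms:ListKeys",
    "kms:ListAlias",
    "kms:ListAliasByKeyId",
    "kms:DescribeKey",
    "kms:Decrypt",
]


def allowed_actions(policy: dict) -> set:
    """Collect every action named in an Allow statement (assumed layout)."""
    allowed = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        allowed.update(actions)
    return allowed


def missing_kms_actions(policy: dict) -> list:
    """Return the required KMS actions the policy does not explicitly allow."""
    allowed = allowed_actions(policy)
    return [action for action in REQUIRED_KMS_ACTIONS if action not in allowed]


# Hypothetical policy with OSS access only, for illustration.
example_policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["oss:PutObject", "oss:GetObject"],
            "Resource": "*",
        }
    ],
}

if __name__ == "__main__":
    for action in missing_kms_actions(example_policy):
        print("missing:", action)
```

Running this against the credentials policy before enabling KMS would flag the gap up front, before a push fails with a 500.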
Tested on a 4.11.0-0.nightly-2022-02-08-180554 cluster.

Scenario 1: Configure KMS encryption with a valid key ID.
   $ oc patch config.image/cluster -p '{"spec":{"storage":{"oss":{"encryption":{"method":"KMS","kms":{"keyID":"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx"}}}}}}' --type=merge
   Could push and pull images from the internal registry.

Scenario 2: Configure KMS encryption with an invalid key ID.
   Can't push images to the internal registry, but the registry reports a clear error:
   "Failed for move from docker/registry/v2/repositories/wxj/httpd-ex/_uploads/94586e7b-d7e7-491a-9647-853b63cffdeb/data to docker/registry/v2/blobs/sha256/dd/dda696772fce431c72db5797aab0e2e6cb2c7768796712c8ae6f95ab8af7b47e/data: Aliyun API Error: RequestId: 6203351E716D4D3939117C1B Status Code: 400 Code: InvalidParameter Message: The specified parameter KMS keyId is not valid."

Scenario 3: Disable the KMS key.
   Can't push images to the internal registry, but the registry reports a clear error:
   time="2022-02-09T06:15:49.285806811Z" level=error msg="Failed for move from docker/registry/v2/repositories/wxj/httpd-ex/_uploads/f516a79f-63de-4401-a711-039e106b5693/data to docker/registry/v2/blobs/sha256/7b/7b9ccd1cbf5f58d676f1c0882fd6b09e615d122d701a64acdecdcf7db8a10a9e/data: Aliyun API Error: RequestId: 62035C15C7A059383388FA4B Status Code: 409 Code: KeyDisabled Message: The request was rejected because the key state is Disabled."
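The three failure modes seen in this bug (403 AccessDenied, 400 InvalidParameter, 409 KeyDisabled) all share the same "Aliyun API Error" text layout in the registry log. Below is a small sketch for pulling the fields out of such lines when triaging. The regular expression is inferred only from the log lines quoted in this report, and `AliyunError` and `parse_aliyun_error` are hypothetical names, so treat it as an assumption rather than a documented format.

```python
import re
from typing import NamedTuple, Optional


class AliyunError(NamedTuple):
    request_id: str
    status: int
    code: str
    message: str


# Layout inferred from the registry log lines quoted in this bug.
PATTERN = re.compile(
    r'Aliyun API Error: RequestId: (?P<request_id>\S+) '
    r'Status Code: (?P<status>\d+) '
    r'Code: (?P<code>\S+) '
    r'Message: (?P<message>[^"]*)'
)


def parse_aliyun_error(line: str) -> Optional[AliyunError]:
    """Extract the first Aliyun API error from a log line, if present."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return AliyunError(
        request_id=match["request_id"],
        status=int(match["status"]),
        code=match["code"],
        message=match["message"].strip(),
    )


# Error fragments copied from the comments in this bug.
SAMPLES = [
    'Aliyun API Error: RequestId: 61FBEA1B250B6235371F0D99 Status Code: 403 '
    'Code: AccessDenied Message: This request is forbidden by kms."',
    'Aliyun API Error: RequestId: 6203351E716D4D3939117C1B Status Code: 400 '
    'Code: InvalidParameter Message: The specified parameter KMS keyId is not valid."',
    'Aliyun API Error: RequestId: 62035C15C7A059383388FA4B Status Code: 409 '
    'Code: KeyDisabled Message: The request was rejected because the key state is Disabled."',
]

if __name__ == "__main__":
    for sample in SAMPLES:
        err = parse_aliyun_error(sample)
        print(err.status, err.code, "-", err.message)
```

Grouping on `code` separates a permissions problem (AccessDenied) from a configuration problem (InvalidParameter, KeyDisabled), which is the distinction the scenarios above verify.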
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069