Bug 2060362

Summary: Openshift registry starts to segfault after S3 storage configuration
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Image Registry
Assignee: Oleg Bulatov <obulatov>
Status: CLOSED ERRATA
QA Contact: Keenon Lee <jitli>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.6.z
CC: aos-bugs, fgleizes, llopezmo, obulatov, xiuwang
Target Milestone: ---
Keywords: Reopened
Target Release: 4.10.z
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-04-21 13:16:01 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1976782, 2068433, 2074015
Bug Blocks: 2060363

Comment 3 XiuJuan Wang 2022-04-01 06:20:20 UTC
After configuring the registry to use Ceph RGW, I hit a 403 error; this is blocked by https://bugzilla.redhat.com/show_bug.cgi?id=2068433

time="2022-04-01T06:16:49.900908996Z" level=error msg="response completed with error" err.code=unknown err.detail="s3aws: AccessDenied: \n\tstatus code: 403, request id: tx0000000000000000002c0-00624698d1-89fc-ocs-storagecluster-cephobjectstore, host id: " err.message="unknown error" go.version=go1.17.5 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=99d5be30-4de1-442e-b785-554f81066883 http.request.method=POST http.request.remoteaddr="10.129.2.16:40036" http.request.uri=/v2/default/httpd-ex/blobs/uploads/ http.request.useragent="containers/5.19.1 (github.com/containers/image)" http.response.contenttype="application/json; charset=utf-8" http.response.duration=50.836091ms http.response.status=500 http.response.written=123 openshift.auth.user="system:serviceaccount:default:builder" vars.name=default/httpd-ex
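
Note that the log line above carries two different status codes: the S3 backend answered 403 (inside err.detail), while the registry returned 500 to the client (http.response.status). A minimal sketch for pulling such fields out of registry logs follows. The helper name logfmt_field is hypothetical (it is not part of oc or the registry), and it assumes a grep that supports -o and -E.

```shell
# Hypothetical helper: print every "key=value" pair for one key from
# logfmt-style lines read on stdin. Handles bare and double-quoted values;
# it does not handle escaped quotes inside a quoted value.
logfmt_field() {
    # Escape dots so "err.detail" matches literally in the regex.
    key=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -oE "${key}=(\"[^\"]*\"|[^[:space:]]+)"
}
```

For example, piping the registry pod log through `logfmt_field http.response.status` and `logfmt_field err.detail` shows the client-facing status next to the backend error.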

Comment 5 Keenon Lee 2022-04-11 06:18:01 UTC
Steps to Reproduce:

1. Install a vSphere cluster with 3 workers, each with 10 CPUs and 24G of memory.
Install the ODF operator and create a StorageSystem.

Create an OBC named "jitli-object-bucket" using Ceph RGW.
Get the Object Bucket Claim data.
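
The OBC manifest itself is not included in this report. A minimal sketch of what it could look like, assuming the ODF default RGW storage class name ocs-storagecluster-ceph-rgw and a generated bucket name prefixed with the claim name (both are assumptions); obc_manifest is a hypothetical helper:

```shell
# Hypothetical helper: print an ObjectBucketClaim manifest on stdout.
# $1 = claim name
# $2 = storage class (assumed default: ocs-storagecluster-ceph-rgw)
obc_manifest() {
    name=$1
    storage_class=${2:-ocs-storagecluster-ceph-rgw}
    cat <<EOF
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ${name}
spec:
  generateBucketName: ${name}
  storageClassName: ${storage_class}
EOF
}
```

It could be applied with `obc_manifest jitli-object-bucket | oc apply -n <namespace> -f -`, where the namespace is whichever one the claim should live in (not stated in this report).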

Expose the Ceph RGW service.
redhat@jitli:~$ oc expose svc rook-ceph-rgw-ocs-storagecluster-cephobjectstore --hostname=rook-ceph-rgw-ocs-storagecluster-openshift-storage.apps.jitlivs411a.qe.devcluster.openshift.com -n openshift-storage
route.route.openshift.io/rook-ceph-rgw-ocs-storagecluster-cephobjectstore exposed

Get the Object Bucket Claim data and create the secret.
redhat@jitli:~$ export AWS_ACCESS_KEY_ID=
redhat@jitli:~$ export AWS_SECRET_ACCESS_KEY=
redhat@jitli:~$ oc create secret generic image-registry-private-configuration-user --from-literal=REGISTRY_STORAGE_S3_ACCESSKEY=${AWS_ACCESS_KEY_ID} --from-literal=REGISTRY_STORAGE_S3_SECRETKEY=${AWS_SECRET_ACCESS_KEY} --namespace openshift-image-registry
secret/image-registry-private-configuration-user created
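
The key values are left blank in the transcript above. One way to read them, sketched under the assumption that the OBC controller created a Secret with the same name as the claim, holding base64-encoded AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY entries; obc_secret_value is a hypothetical helper and the namespace is not stated in this report:

```shell
# Hypothetical helper: print one decoded value from the Secret that
# accompanies an ObjectBucketClaim.
# $1 = OBC/Secret name, $2 = namespace, $3 = key inside .data
# Assumes a base64 that accepts -d (GNU coreutils and recent macOS).
obc_secret_value() {
    oc get secret "$1" -n "$2" -o "jsonpath={.data.$3}" | base64 -d
}
```

With that in place, `export AWS_ACCESS_KEY_ID=$(obc_secret_value jitli-object-bucket <namespace> AWS_ACCESS_KEY_ID)` and the matching line for AWS_SECRET_ACCESS_KEY would populate the variables used by the `oc create secret` command.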


2. Check the S3 API with the AWS CLI.
redhat@jitli:~$ aws s3 --no-verify-ssl --endpoint http://rook-ceph-rgw-ocs-storagecluster-openshift-storage.apps.jitlivs411a.qe.devcluster.openshift.com ls
2022-04-11 13:32:57 jitli-object-bucket-5ec98c5e-7b17-4045-9e3f-88109de770ff

redhat@jitli:~$ aws s3 --no-verify-ssl --endpoint http://rook-ceph-rgw-ocs-storagecluster-openshift-storage.apps.jitlivs411a.qe.devcluster.openshift.com cp ./bb.json s3://jitli-object-bucket-5ec98c5e-7b17-4045-9e3f-88109de770ff/bb.json
upload: ./bb.json to s3://jitli-object-bucket-5ec98c5e-7b17-4045-9e3f-88109de770ff/bb.json

redhat@jitli:~$ oc edit config.image
spec:
  ...
  storage:
    managementState: Unmanaged
    s3:
      bucket: jitli-object-bucket-5ec98c5e-7b17-4045-9e3f-88109de770ff
      region: us-east-1
      regionEndpoint: http://rook-ceph-rgw-ocs-storagecluster-openshift-storage.apps.jitlivs411a.qe.devcluster.openshift.com
      virtualHostedStyle: false
  ...
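
The same storage settings can also be applied without an interactive editor. A minimal sketch, assuming the registry config object is configs.imageregistry.operator.openshift.io/cluster; registry_s3_patch is a hypothetical helper that only builds the merge patch and does not JSON-escape its arguments:

```shell
# Hypothetical helper: print a JSON merge patch that sets the S3 storage
# fields shown in the spec above.
# $1 = bucket, $2 = regionEndpoint, $3 = region (default: us-east-1)
registry_s3_patch() {
    bucket=$1
    endpoint=$2
    region=${3:-us-east-1}
    printf '{"spec":{"storage":{"managementState":"Unmanaged","s3":{"bucket":"%s","region":"%s","regionEndpoint":"%s","virtualHostedStyle":false}}}}\n' \
        "$bucket" "$region" "$endpoint"
}
```

The patch could then be applied with `oc patch configs.imageregistry.operator.openshift.io/cluster --type merge -p "$(registry_s3_patch <bucket> <regionEndpoint>)"`.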
redhat@jitli:~$ oc new-app httpd~http://github.com/openshift/httpd-ex.git -n default  --name='jitli'
redhat@jitli:~$ oc get builds -n default
NAME      TYPE     FROM          STATUS     STARTED          DURATION
jitli-1   Source   Git@753f06d   Complete   58 seconds ago   42s

redhat@jitli:~$ oc logs -f build/jitli-1 -n default
...
Successfully pushed image-registry.openshift-image-registry.svc:5000/default/jitli@sha256:623c0f14b5439f9d5afdfbe1d92076c93a787bc8f4bbfade73b0c447c287651e
Push successful

Comment 6 XiuJuan Wang 2022-04-11 06:25:47 UTC
Verified on a 4.10.0-0.nightly-2022-04-07-042325 cluster; see the detailed steps in comment #5.

Comment 11 errata-xmlrpc 2022-04-21 13:16:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.10 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1356