Bug 2065552 - [AWS] Failed to install cluster on AWS ap-southeast-3 region due to image-registry panic error
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Oleg Bulatov
QA Contact: Keenon Lee
URL:
Whiteboard:
Depends On:
Blocks: 2110963
 
Reported: 2022-03-18 07:59 UTC by Yunfei Jiang
Modified: 2022-08-10 10:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The image registry and its operator use an old AWS SDK.
Consequence: They do not know about the ap-southeast-3 region.
Fix: Bump the AWS SDK.
Result: The registry can be configured to use ap-southeast-3.
Clone Of:
Environment:
Last Closed: 2022-08-10 10:54:40 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-image-registry-operator pull 771 | 0 | None | open | Bug 2065552: Bump aws-sdk-go to 1.44.4 | 2022-05-09 13:15:21 UTC
Github | openshift image-registry pull 326 | 0 | None | Merged | Bug 2065552: Bump aws-sdk-go to 1.44.4 | 2022-05-09 13:15:21 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:55:14 UTC

Description Yunfei Jiang 2022-03-18 07:59:17 UTC
Installing an IPI cluster in the new AWS region ap-southeast-3 fails because the image registry panics:

oc logs -n openshift-image-registry image-registry-7745c97c84-w2fm6
time="2022-03-18T07:04:04.329871661Z" level=info msg="start registry" distribution_version=v2.7.1+unknown go.version=go1.17.5 openshift_version=4.10.0-202203111548.p0.g9fb7451.assembly.stream-9fb7451
time="2022-03-18T07:04:04.330259828Z" level=info msg="caching project quota objects with TTL 1m0s" go.version=go1.17.5
panic: Invalid region provided: ap-southeast-3

goroutine 1 [running]:
github.com/docker/distribution/registry/handlers.NewApp({0x20252a8, 0xc00012c000}, 0xc000535880)
    /go/src/github.com/openshift/image-registry/vendor/github.com/docker/distribution/registry/handlers/app.go:127 +0x2829
github.com/openshift/image-registry/pkg/dockerregistry/server/supermiddleware.NewApp({0x20252a8, 0xc00012c000}, 0x0, {0x2041e98, 0xc000579dd0})
    /go/src/github.com/openshift/image-registry/pkg/dockerregistry/server/supermiddleware/app.go:96 +0xb9
github.com/openshift/image-registry/pkg/dockerregistry/server.NewApp({0x20252a8, 0xc00012c000}, {0x2009750, 0xc00031c1a0}, 0xc000535880, 0xc0001b8dc0, {0x0, 0x0})
    /go/src/github.com/openshift/image-registry/pkg/dockerregistry/server/app.go:138 +0x466
github.com/openshift/image-registry/pkg/cmd/dockerregistry.NewServer({0x20252a8, 0xc00012c000}, 0xc000535880, 0xc0001b8dc0)
    /go/src/github.com/openshift/image-registry/pkg/cmd/dockerregistry/dockerregistry.go:210 +0x36a
github.com/openshift/image-registry/pkg/cmd/dockerregistry.Execute({0x1ff2660, 0xc00031c010})
    /go/src/github.com/openshift/image-registry/pkg/cmd/dockerregistry/dockerregistry.go:164 +0x889
main.main()
    /go/src/github.com/openshift/image-registry/cmd/dockerregistry/main.go:93 +0x496


Version-Release number of the following components: 
4.10.5

 
How reproducible: 
Always 
 

Steps to Reproduce: 
1. Create install-config.yaml and update it as:
platform:
  aws:
    region: ap-southeast-3
    serviceEndpoints:
    - name: ec2
      url: https://ec2.ap-southeast-3.amazonaws.com
2. Create an IPI cluster.

Actual results: 
panic: Invalid region provided: ap-southeast-3

Expected results:
The image-registry operator is available and the cluster installs successfully.


Additional info:
1. This issue is similar to Bug 1882199.
2. serviceEndpoints is specified as a workaround for Bug 2065510.

Comment 1 Okky Hendriansyah Tri Firgantoro 2022-03-20 02:21:11 UTC
Hi,

I got past the image registry panic by also specifying a serviceEndpoints entry for S3. The panic is probably because openshift-installer (via its vendored aws-sdk) has no knowledge of the ap-southeast-3 region, which was not yet GA during the development of openshift-install 4.10.

I have put the details of my install-config.yaml in [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2065510
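
For readers who do not want to follow the link, a minimal sketch of what this workaround might look like, extending the install-config.yaml fragment from the description with an S3 entry. The S3 URL is an assumption based on the regionEndpoint value quoted in comment 2; the exact file used here is in [1] and may differ.

```yaml
# Fragment of install-config.yaml (sketch, not the commenter's actual file).
platform:
  aws:
    region: ap-southeast-3
    serviceEndpoints:
    # From the original description (workaround for Bug 2065510).
    - name: ec2
      url: https://ec2.ap-southeast-3.amazonaws.com
    # Assumed S3 entry; URL taken from the regionEndpoint in comment 2.
    - name: s3
      url: https://s3.ap-southeast-3.amazonaws.com
```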

Comment 2 XiuJuan Wang 2022-03-22 07:38:21 UTC
The workaround of adding serviceEndpoints for the ap-southeast-3 region in install-config.yaml works for image-registry.

Without `regionEndpoint: https://s3.ap-southeast-3.amazonaws.com` in the image registry configuration, the registry hits the panic from comment #0.
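
As a rough illustration of where that setting lives, the fragment below assumes the operator's cluster-scoped Config resource (configs.imageregistry.operator.openshift.io/cluster) with S3 storage. Only the regionEndpoint value comes from this comment; the surrounding fields are a sketch and other storage fields, such as the bucket, are left out.

```yaml
# Sketch of the image registry operator config with an explicit S3 endpoint.
apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
spec:
  storage:
    s3:
      region: ap-southeast-3
      # With an explicit endpoint set, the registry did not hit the
      # "Invalid region provided" panic on the affected builds.
      regionEndpoint: https://s3.ap-southeast-3.amazonaws.com
```

When the install-config.yaml workaround is used, this value should already be set, which matches the observation above that the workaround works for image-registry.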

Comment 3 Keenon Lee 2022-05-05 07:49:19 UTC
Works well in ap-southeast-3
redhat@jitli:~/work/src/test/2074050/test$ oc get node -l node-role.kubernetes.io/worker -o=jsonpath='{.items[*].metadata.labels.topology\.kubernetes\.io\/zone}'
ap-southeast-3a ap-southeast-3b ap-southeast-3c


redhat@jitli:~/work/src/test/2074050/test$ oc get pods -n openshift-image-registry
NAME                                              READY   STATUS    RESTARTS   AGE
cluster-image-registry-operator-975868bd5-mzftg   1/1     Running   0          34m
image-registry-5d7bc4499c-6srvt                   1/1     Running   0          21m
image-registry-5d7bc4499c-r9l62                   1/1     Running   0          21m
node-ca-5tbsp                                     1/1     Running   0          21m
node-ca-69xw9                                     1/1     Running   0          21m
node-ca-b24tn                                     1/1     Running   0          17m
node-ca-bnvmt                                     1/1     Running   0          21m
node-ca-n4v7z                                     1/1     Running   0          17m
node-ca-sfkzr                                     1/1     Running   0          17m


https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/ginkgo-test-vm/19619/testReport/

lgtm

Done

Comment 10 errata-xmlrpc 2022-08-10 10:54:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

