Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1798879

Summary: docker registry pods don't start on a newly installed OCP 3.11 cluster on region eu-north-1
Product: OpenShift Container Platform Reporter: Jose Ignacio Jerez <jjerezro>
Component: Image Registry Assignee: Oleg Bulatov <obulatov>
Status: CLOSED DUPLICATE QA Contact: Wenjing Zheng <wzheng>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.11.0 CC: adam.kaplan, aos-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-02-06 18:06:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Inventory file none

Description Jose Ignacio Jerez 2020-02-06 08:29:47 UTC
Created attachment 1658119
Inventory file

Description of problem:

The internal image registry pods don't start on a newly installed cluster on AWS. The pod logs show the probable cause:

panic: Invalid region provided: eu-north-1

The issue appears to come from the S3 bucket used as the registry's storage backend, which is created in the eu-north-1 (Stockholm) region along with the rest of the cluster components.

The issue could be related to https://github.com/docker/distribution/issues/2870

The workaround is simple enough: create an S3 bucket in another supported region, modify the registry-config secret to point to that bucket, then roll out a new deployment of the docker-registry deployment config.

Version-Release number of selected component (if applicable):

OCP 3.11.161

How reproducible:

Deploy an OCP 3.11.161 cluster in AWS region eu-north-1 and use an S3 bucket as backend storage for the registry. The deployment completes without errors, but the docker-registry pods fail to start.


Actual results:

$ oc get pods -n default
NAME                       READY     STATUS             RESTARTS   AGE
docker-registry-1-deploy   0/1       Error              0          18h
docker-registry-2-deploy   0/1       Error              0          16h
docker-registry-3-deploy   0/1       Error              0          15h
docker-registry-4-deploy   1/1       Running            0          7m
docker-registry-4-l9xfw    0/1       CrashLoopBackOff   6          6m
docker-registry-4-lx8jd    0/1       CrashLoopBackOff   6          6m
docker-registry-4-vdhpp    0/1       CrashLoopBackOff   6          6m


Additional info:

The inventory file used to deploy the cluster is pretty simple and is attached to this bug.


registry-config secret contents:

version: 0.1
log:
  level: info
http:
  addr: :5000
storage:
  delete:
    enabled: true
  cache:
    blobdescriptor: inmemory
  s3:
    accesskey: XXX
    secretkey: XXX
    region: eu-north-1
    bucket: XXX
    encrypt: False
    secure: true
    v4auth: true
    rootdirectory: /registry
    chunksize: "26214400"
auth:
  openshift:
    realm: openshift
middleware:
  registry:
  - name: openshift
  repository:
  - name: openshift
    options:
      pullthrough: true
      acceptschema2: true
      enforcequota: true
  storage:
  - name: openshift

Comment 1 Adam Kaplan 2020-02-06 18:06:07 UTC
OCP v3.11 uses an older version of the AWS SDK, which means that it is not aware of newer AWS regions like eu-north-1. The workaround for new regions is to provide the s3 endpoint directly in your ansible inventory file via the `openshift_hosted_registry_storage_s3_regionendpoint` parameter [1]. For eu-north-1 this would be https://s3.eu-north-1.amazonaws.com [2].
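
A minimal sketch of that workaround, assuming the bucket lives in the standard AWS commercial partition, where the regional S3 endpoint follows the https://s3.<region>.amazonaws.com pattern. The helper name s3_endpoint_for_region is hypothetical and only used for illustration; the two inventory variables are the ones from the 3.11 inventory documentation [1]. The script does not touch the cluster, it only prints the lines to add to the [OSEv3:vars] section of the inventory file.

```shell
#!/bin/sh
# Hypothetical helper: derive the regional S3 endpoint from a region name.
# Assumption: standard commercial partition; other partitions
# (for example the China regions) use a different domain suffix.
s3_endpoint_for_region() {
    printf 'https://s3.%s.amazonaws.com' "$1"
}

region="eu-north-1"

# Print the inventory lines for the [OSEv3:vars] section.
printf 'openshift_hosted_registry_storage_s3_region=%s\n' "$region"
printf 'openshift_hosted_registry_storage_s3_regionendpoint=%s\n' \
    "$(s3_endpoint_for_region "$region")"
```

With these variables set, the rendered registry configuration should carry a regionendpoint entry next to region under storage.s3, so the registry talks to the endpoint given instead of relying on the SDK's built-in region list.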

I do not recommend choosing a different region to host the registry's storage, as this would likely lead to significant bandwidth charges as image content crosses AWS regions.

We recently upgraded OCP 4.4 to support the latest AWS regions [3]. As 3.11 is a non-current version and this is not a Critical/High priority issue [3], per our life cycle support policy we do not intend to backport the AWS SDK upgrade [4].

[1] https://docs.openshift.com/container-platform/3.11/install/configuring_inventory_file.html#advanced-install-registry-storage
[2] https://docs.aws.amazon.com/general/latest/gr/rande.html
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1787604
[4] https://access.redhat.com/support/policy/updates/openshift_noncurrent

*** This bug has been marked as a duplicate of bug 1787604 ***

Comment 2 Jose Ignacio Jerez 2020-02-10 08:47:01 UTC
I have tested the recommended workaround by using the variable openshift_hosted_registry_storage_s3_regionendpoint and it works perfectly.

Thanks for the help, Adam.