Bug 1798879 - docker registry pods don't start on a newly installed OCP 3.11 cluster on region eu-north-1
Summary: docker registry pods don't start on a newly installed OCP 3.11 cluster on region eu-north-1
Keywords:
Status: CLOSED DUPLICATE of bug 1787604
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 3.11.0
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Oleg Bulatov
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-02-06 08:29 UTC by Jose Ignacio Jerez
Modified: 2020-02-10 08:47 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-06 18:06:07 UTC
Target Upstream Version:
Embargoed:


Attachments
Inventory file (2.09 KB, text/plain)
2020-02-06 08:29 UTC, Jose Ignacio Jerez

Description Jose Ignacio Jerez 2020-02-06 08:29:47 UTC
Created attachment 1658119 [details]
Inventory file

Description of problem:

Internal image registry pods don't start on a newly installed cluster on AWS. The log from the pods shows the probable cause:

panic: Invalid region provided: eu-north-1

Apparently the issue comes from the S3 bucket used as the registry's storage backend, which is created in the eu-north-1 (Stockholm) region, like the rest of the cluster components.

The issue could be related to https://github.com/docker/distribution/issues/2870

The workaround is simple enough: create an S3 bucket in another supported region, modify the registry-config secret to point to that bucket, then roll out a new deployment of the docker-registry deployment config, as sketched below.
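
A rough sketch of those steps (assuming the default project used by the stock installer, and that the secret stores the registry configuration under a config.yml key, as in a stock 3.11 install):

$ oc get secret registry-config -n default -o jsonpath='{.data.config\.yml}' | base64 -d > config.yml
# edit the storage.s3 section of config.yml to point at the new bucket/region
$ oc create secret generic registry-config --from-file=config.yml=config.yml -n default --dry-run -o yaml | oc replace -f -
$ oc rollout latest dc/docker-registry -n default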

Version-Release number of selected component (if applicable):

OCP 3.11.161

How reproducible:

Deploy an OCP 3.11.161 cluster in the AWS region eu-north-1 and use an S3 bucket as backend storage for the registry. The deployment completes without errors, but the docker-registry pods fail to start.


Actual results:

$ oc get pods -n default
NAME                       READY     STATUS             RESTARTS   AGE
docker-registry-1-deploy   0/1       Error              0          18h
docker-registry-2-deploy   0/1       Error              0          16h
docker-registry-3-deploy   0/1       Error              0          15h
docker-registry-4-deploy   1/1       Running            0          7m
docker-registry-4-l9xfw    0/1       CrashLoopBackOff   6          6m
docker-registry-4-lx8jd    0/1       CrashLoopBackOff   6          6m
docker-registry-4-vdhpp    0/1       CrashLoopBackOff   6          6m
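
The panic quoted above shows up in the logs of the crash-looping pods, e.g. (a sketch, using one of the pod names from the listing):

$ oc logs docker-registry-4-l9xfw -n default
...
panic: Invalid region provided: eu-north-1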


Additional info:

The inventory file used to deploy the cluster is pretty simple; it is attached to this bugzilla.


registry-config secret contents:

version: 0.1
log:
  level: info
http:
  addr: :5000
storage:
  delete:
    enabled: true
  cache:
    blobdescriptor: inmemory
  s3:
    accesskey: XXX
    secretkey: XXX
    region: eu-north-1
    bucket: XXX
    encrypt: False
    secure: true
    v4auth: true
    rootdirectory: /registry
    chunksize: "26214400"
auth:
  openshift:
    realm: openshift
middleware:
  registry:
  - name: openshift
  repository:
  - name: openshift
    options:
      pullthrough: true
      acceptschema2: true
      enforcequota: true
  storage:
  - name: openshift

Comment 1 Adam Kaplan 2020-02-06 18:06:07 UTC
OCP v3.11 uses an older version of the AWS SDK, which means that it is not aware of newer AWS regions like eu-north-1. The workaround for new regions is to provide the s3 endpoint directly in your ansible inventory file via the `openshift_hosted_registry_storage_s3_regionendpoint` parameter [1]. For eu-north-1 this would be https://s3.eu-north-1.amazonaws.com [2].
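
For illustration, the relevant inventory lines could look roughly like the following (bucket name and credentials are placeholders; the variable names follow the documented openshift_hosted_registry_storage_s3_* pattern [1], and the endpoint is the one noted above):

openshift_hosted_registry_storage_kind=object
openshift_hosted_registry_storage_provider=s3
openshift_hosted_registry_storage_s3_accesskey=XXX
openshift_hosted_registry_storage_s3_secretkey=XXX
openshift_hosted_registry_storage_s3_bucket=my-registry-bucket
openshift_hosted_registry_storage_s3_region=eu-north-1
openshift_hosted_registry_storage_s3_regionendpoint=https://s3.eu-north-1.amazonaws.com

This should surface as a regionendpoint entry in the s3 section of the registry's config.yml, alongside the region and bucket values shown in the description.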

I do not recommend choosing a different region to host the registry's storage, as this would likely lead to significant bandwidth charges as image content crosses AWS regions.

We recently upgraded OCP 4.4 to support the latest AWS regions [3]. As 3.11 is a non-current version and this is not a Critical/High priority issue, per our life cycle support policy we do not intend to backport the AWS SDK upgrade [4].

[1] https://docs.openshift.com/container-platform/3.11/install/configuring_inventory_file.html#advanced-install-registry-storage
[2] https://docs.aws.amazon.com/general/latest/gr/rande.html
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1787604
[4] https://access.redhat.com/support/policy/updates/openshift_noncurrent

*** This bug has been marked as a duplicate of bug 1787604 ***

Comment 2 Jose Ignacio Jerez 2020-02-10 08:47:01 UTC
I have tested the recommended workaround using the openshift_hosted_registry_storage_s3_regionendpoint variable, and it works perfectly.

Thanks for the help, Adam.

