Bug 1810461 - Unable to pull images from swift-backed internal image registry: x509 error with self-signed OSP16 [NEEDINFO]
Keywords:
Status: NEW
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 4.4
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: low
Target Milestone: ---
Target Release: 4.6.0
Assignee: Vikram Goyal
QA Contact: Xiaoli Tian
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Duplicates: 1816042
Depends On:
Blocks:
Reported: 2020-03-05 10:13 UTC by Robert Sandoval
Modified: 2020-07-31 07:01 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
cjanisze: needinfo? (racedoro)


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 5234981 | None | None | None | 2020-07-20 10:02:33 UTC

Description Robert Sandoval 2020-03-05 10:13:08 UTC
Description of problem:

Deployed OCP 4.4 on OSP16 and tried to deploy an app using the s2i builder.
The image registry is backed by Swift.

App built fine and was pushed into the internal registry. The deployment failed with

 x509: certificate signed by unknown authority

on trying to pull the image from the internal registry

Version-Release number of selected component (if applicable):


How reproducible:
With OCP4.4 on OSP16 use an s2i builder to deploy a sample app

Steps to Reproduce:
1.
2.
3.

Actual results:

Error: ImagePullBackOff


Expected results:

Application to deploy normally


Additional info:

The workaround is to edit config.imageregistry/cluster and set spec.disableRedirects = true.

This makes the client pull the image layers through the image registry instead of following redirect links directly to Swift. The Swift CA needs to be added to worker nodes.

Comment 1 Oleg Bulatov 2020-03-23 13:20:45 UTC
*** Bug 1816042 has been marked as a duplicate of this bug. ***

Comment 2 Wenjing Zheng 2020-03-24 01:13:08 UTC
This issue only happens on self-signed OSP16. QE has tested with OSP16 + Kuryr and did not see the issue.

Comment 3 Wenjing Zheng 2020-03-24 07:01:33 UTC
Hi Oleg, this bug is targeted to 4.5. Will we fix it in 4.4 too?

Comment 4 Oleg Bulatov 2020-03-24 10:18:22 UTC
Eventually we may backport it to 4.4, but it's not our highest priority. The workaround is simple: changing one field on the config.imageregistry object.

Comment 6 Chris Janiszewski 2020-05-22 18:10:27 UTC
I just hit this in my environment and there is a typo in the first comment (workaround) as well as the release notes. The variable should be: 
spec.disableRedirect: true

and not
spec.disableRedirects: true

(no s at the end)

Also, for anyone who hits this issue, the CLI command to add that parameter is:
oc edit configs.imageregistry.operator.openshift.io/cluster

Comment 7 Adam Kaplan 2020-05-29 12:30:45 UTC
Moving this to Docs.

The Swift service needs to use a trusted certificate - either one signed by a globally trusted CA, or a CA that has been added to the cluster trust store [1]. The current docs do not mention this [2].

[1] https://docs.openshift.com/container-platform/4.4/networking/configuring-a-custom-pki.html
[2] https://docs.openshift.com/container-platform/4.4/installing/installing_openstack/installing-openstack-installer-custom.html#installation-osp-enabling-swift_installing-openstack-installer-custom
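
For reference, a minimal sketch of adding a CA to the cluster trust store along the lines of [1]. The config map name swift-ca and the path ./swift-ca.pem are placeholders, not values from this report, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: CA_FILE and CM_NAME are hypothetical placeholders.
CA_FILE="${CA_FILE:-./swift-ca.pem}"
CM_NAME="${CM_NAME:-swift-ca}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Merge patch that points the cluster-wide proxy object at the config map.
trusted_ca_patch() {
  printf '{"spec":{"trustedCA":{"name":"%s"}}}' "$1"
}

# The CA bundle is stored under the key ca-bundle.crt in openshift-config.
run oc create configmap "$CM_NAME" --from-file=ca-bundle.crt="$CA_FILE" -n openshift-config
run oc patch proxy/cluster --type=merge --patch="$(trusted_ca_patch "$CM_NAME")"
```

Run it with DRY_RUN=0 only after reviewing the printed commands, since the second one changes the cluster-wide proxy configuration.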

Comment 8 Chris Janiszewski 2020-05-29 16:14:14 UTC
Adam, can you clarify why you think this is a Documentation issue? I have trusted certs injected in both clouds.yaml and install-config.yaml and I am still getting this error when deploying apps.

error: build error: After retrying 2 times, Pull image still failed due to error: while pulling "docker://image-registry.openshift-image-registry.svc:5000/openshift/python@sha256:cc03f354f2a298de72f0d9dcb39a82178c996faf033321c56f8c4756b0cd3a90" as "image-registry.openshift-image-registry.svc:5000/openshift/python@sha256:cc03f354f2a298de72f0d9dcb39a82178c996faf033321c56f8c4756b0cd3a90": Error parsing image configuration: Get https://10.9.65.100:13808/swift/v1/AUTH_1b80108965b748b7aeff8b6ec2017129/ocpra-nt6wn-image-registry-ilbncpaljsphjuurgihkefoehsjnxouadfx/files/docker/registry/v2/blobs/sha256/7d/7ddee4f67b8369db4795540eadab163e17442a380ee339891069fbf676753a09/data?temp_url_sig=e839524d45193f91872ec6ef5c7c78836a6fd046&temp_url_expires=1590769493: x509: certificate signed by unknown authority


(shiftstack) [stack@chrisj-undercloud-osp13 ~]$ cat clouds.yaml | grep cacert
    cacert: /home/stack/ssl/overcloud.pem

(shiftstack) [stack@chrisj-undercloud-osp13 ~]$ cat ocpra-config/install-config.yaml | grep -A2 additionalTrustBundle
additionalTrustBundle: |
    -----BEGIN CERTIFICATE-----
    MIIF4TCCA8mgAwIBAgIJANsI/G7mHc83MA0GCSqGSIb3DQEBCwUAMIGFMQswCQYD
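
One way to confirm that the CA referenced in clouds.yaml is the one that signed the Swift endpoint's certificate is to check the chain with openssl. This is a sketch under assumptions: it runs from a host that can reach the endpoint in the error above, the URL carries an explicit port, and it only prints the command unless DRY_RUN=0 is set. Depending on the OpenSSL build, the system trust store may be consulted in addition to the CA file.

```shell
#!/usr/bin/env bash
# Sketch only: defaults are taken from the error message and clouds.yaml above.
SWIFT_URL="${SWIFT_URL:-https://10.9.65.100:13808/swift/v1/}"
CA_FILE="${CA_FILE:-/home/stack/ssl/overcloud.pem}"
DRY_RUN="${DRY_RUN:-1}"

# Extract host:port from an https URL (assumes the port is spelled out).
endpoint_of() {
  rest="${1#https://}"
  printf '%s' "${rest%%/*}"
}

ENDPOINT="$(endpoint_of "$SWIFT_URL")"

if [ "$DRY_RUN" = "1" ]; then
  echo "openssl s_client -connect $ENDPOINT -CAfile $CA_FILE"
else
  # "Verify return code: 0 (ok)" means the presented chain was accepted.
  openssl s_client -connect "$ENDPOINT" -CAfile "$CA_FILE" </dev/null 2>/dev/null \
    | grep 'Verify return code'
fi
```

A non-zero verify return code here would suggest the bundle does not match the certificate Swift presents, which is a different problem from the cluster clients not trusting it.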

Comment 9 Adam Kaplan 2020-06-25 12:18:07 UTC
@Chris apologies for not getting back to you on this.

I'm moving this back to the Image Registry per your comment. It appears that the installation should have added the CA to the global trust bundle, and the image registry should be able to pick it up.

Comment 10 Chris Janiszewski 2020-06-25 13:39:11 UTC
Thank you. I appreciate you looking into it again.

Comment 11 Oleg Bulatov 2020-06-29 11:03:18 UTC
Lowering severity as workaround is trivial.

> the image registry should be able to pick it up.

The problem is that the client (buildah, I guess) doesn't trust storage (Swift).

Either the registry proxies traffic to storage through itself (i.e. spec.disableRedirect is true), in which case the registry requires much more resources, especially network bandwidth.

Or clients must trust the storage certificates.

The registry operator doesn't know if clients trust this certificate, so it doesn't know if redirects should be disabled. The operator's expectation is that the object storage is world accessible, like S3, GCS, or Azure Blob Storage.

So it is either a day 2 operation (tuning the registry for clients that don't trust storage) or WONTFIX.

Comment 12 Anil Dhingra 2020-07-21 03:20:31 UTC
Hi,

Is this bug fixed in 4.5 GA, or is it on the radar for a later release? We have another instance of self-signed OSP16 reported.

Comment 13 Wenjing Zheng 2020-07-21 04:56:27 UTC
(In reply to Anil Dhingra from comment #12)
> Hi,
> 
> Is this bug fixed in 4.5 GA, or is it on the radar for a later release? We
> have another instance of self-signed OSP16 reported.

We are using this workaround to fix this error:
$ oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"disableRedirect":true}}'
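
To check that the workaround took effect, something like the following sketch may help. It assumes the registry runs as deployment/image-registry in the openshift-image-registry namespace and that the operator rolls out new pods after the change. By default it only prints the commands; set DRY_RUN=0 to run them against a cluster.

```shell
#!/usr/bin/env bash
DRY_RUN="${DRY_RUN:-1}"

# Succeeds only when the reported field value is the literal "true".
redirect_disabled() {
  [ "$1" = "true" ]
}

if [ "$DRY_RUN" = "1" ]; then
  echo "oc get configs.imageregistry.operator.openshift.io cluster -o jsonpath={.spec.disableRedirect}"
  echo "oc -n openshift-image-registry rollout status deployment/image-registry"
else
  value="$(oc get configs.imageregistry.operator.openshift.io cluster -o jsonpath='{.spec.disableRedirect}')"
  if redirect_disabled "$value"; then
    echo "disableRedirect is set"
    # Wait for the updated registry pods before retrying the build.
    oc -n openshift-image-registry rollout status deployment/image-registry
  else
    echo "disableRedirect is not set (got: '${value}')" >&2
  fi
fi
```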

