Bug 1578269

Summary: Error handling when upstream certificates are not trusted needs to be improved
Product: OpenShift Container Platform Reporter: Simon Reber <sreber>
Component: Image Registry Assignee: Oleg Bulatov <obulatov>
Status: CLOSED ERRATA QA Contact: Dongbo Yan <dyan>
Severity: low Docs Contact:
Priority: medium    
Version: 3.7.1 CC: aos-bugs
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Feature: the registry tells the client why it cannot pull the manifest from the remote registry. Reason: without this information, it is harder to understand what is going on. Result: the registry can send non-standard errors that carry additional information.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-07-30 19:15:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Simon Reber 2018-05-15 07:19:50 UTC
Description of problem:

Customer ran into the following issue.

$ oc import-image master/jenkins-master-base --from=upstream-registry.faraway.example.intra:443/openshift-jenkins/master/jenkins-master-base:latest --confirm -n test | head -10
The import completed successfully.

Name:                   jenkins-master-base
Namespace:              test
Created:                5 hours ago
Labels:                 <none>
Annotations:            openshift.io/image.dockerRepositoryCheck=2018-05-09T13:58:35Z
Docker Pull Spec:       docker-registry.default.svc:5000/test/jenkins-master-base
Image Lookup:           local=false
Unique Images:          1

$ docker pull docker-registry-default.openshift.example.intra/test/jenkins-master-base
Using default tag: latest
Trying to pull repository docker-registry-default.openshift.example.intra/test/jenkins-master-base ...
error parsing HTTP 404 response body: json: cannot unmarshal number AB2344... into Go struct field Error.detail of type float64: "{\"errors\":[{\"code\":\"MANIFEST_UNKNOWN\",\"message\":\"manifest unknown\",\"detail\":{\"Op\":\"Get\",\"URL\":\"https://upstream-registry.faraway.example.intra:443/v2/\",\"Err\":{\"Cert\":{\"Raw\":\"123...",\"RawTBSCertificate\":\"456HGB...",\"RawSubject\":\"hu32DSa...\",\"RawIssuer\":\"hu32DSa...\",\"Signature\":\"3242zaSDJUSIHI...\",\"SignatureAlgorithm\":4,\"PublicKeyAlgorithm\":1,\"PublicKey\":{\"N\":32456347GHGGSD...,\"E\":45436},\"Version\":3,\"SerialNumber\":324235345,\"Issuer\":{\"Country\":[\"XC\"],\"Organization\":[\"EXAMLE\",\"Foo Bar\"],\"OrganizationalUnit\":[\"PKI\"],\"Locality\":null,\"Province\":null,\"StreetAddress\":null,\"PostalCode\":null,\"SerialNumber\":\"\",\"CommonName\":\"Root-CA 2016\",\"Names\":[{\"Type\":[2,5,4,3],\"Value\":\"Root-CA 2016\"},{\"Type\":[2,5,4,11],\"Value\":\"PKI\"},{\"Type\":[2,5,4,10],\"Value\":\"Foo Bar\"},{\"Type\":[2,5,4,10],\"Value\":\"EXAMPLE\"},{\"Type\":[2,5,4,6],\"Value\":\"XC\"}],\"ExtraNames\":null},\"Subject\":{\"Country\":[\"XC\"],\"Organization\":[\"EXAMPLE\",\"Foo Bar\"],\"OrganizationalUnit\":[\"PKI\"],\"Locality\":null,\"Province\":null,\"StreetAddress\":null,\"PostalCode\":null,\"SerialNumber\":\"\",\"CommonName\":\"Root-CA 2016\",\"Names\":[{\"Type\":[2,5,4,3],\"Value\":\"Root-CA 2016\"},{\"Type\":[2,5,4,11],\"Value\":\"PKI\"},{\"Type\":[2,5,4,10],\"Value\":\"EXAMPLE\"},{\"Type\":[2,5,4,10],\"Value\":\"Foo 
Bar\"},{\"Type\":[2,5,4,6],\"Value\":\"XC\"}],\"ExtraNames\":null},\"NotBefore\":\"2016-05-23T11:31:28Z\",\"NotAfter\":\"2023-05-23T11:31:28Z\",\"KeyUsage\":99,\"Extensions\":[{\"Id\":[2,5,29,14],\"Critical\":false,\"Value\":\"Hfuisdf...\"},{\"Id\":[2,5,29,19],\"Critical\":true,\"Value\":\"54huiweftHUI...\"},{\"Id\":[2,5,29,35],\"Critical\":false,\"Value\":\"fddSFUHfudsifdhfds...\"},{\"Id\":[2,5,29,15],\"Critical\":true,\"Value\":\"dfgdfgd453sd...\"}],\"ExtraExtensions\":null,\"UnhandledCriticalExtensions\":null,\"ExtKeyUsage\":null,\"UnknownExtKeyUsage\":null,\"BasicConstraintsValid\":true,\"IsCA\":true,\"MaxPathLen\":-1,\"MaxPathLenZero\":false,\"SubjectKeyId\":\"dghdfiogho38FHFO...\",\"AuthorityKeyId\":\"dghSDFHU385...\",\"OCSPServer\":null,\"IssuingCertificateURL\":null,\"DNSNames\":null,\"EmailAddresses\":null,\"IPAddresses\":null,\"PermittedDNSDomainsCritical\":false,\"PermittedDNSDomains\":null,\"ExcludedDNSDomains\":null,\"CRLDistributionPoints\":null,\"PolicyIdentifiers\":null}}}}]}\n"

The problem was that the certificates of upstream-registry.faraway.example.intra were missing from the trust store of docker-registry-default.openshift.example.intra.

Following the steps below resolved the problem.

Add the certificate of the upstream registry upstream-registry.faraway.example.intra to the trust store of the Red Hat OpenShift Container Platform registry docker-registry-default.openshift.example.intra, so that it trusts the external registries that pullthrough should work against. The certificates need to be placed in the /etc/pki/tls/certs directory on the pod. The certificates can be mounted using a configuration map or secret. It is important to replace the entire /etc/pki/tls/certs directory with all its current content plus the newly required certificates.
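Because the mount replaces the whole directory, the existing files and the new certificate have to end up in one place before they are packaged into the configuration map or secret. A minimal sketch of that staging step, assuming the current content of /etc/pki/tls/certs has already been copied out of the pod; the function name stage_cert_dir and all file names are hypothetical and only for illustration:

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def stage_cert_dir(existing_dir: Path, extra_certs: Iterable[Path], out_dir: Path) -> List[str]:
    """Copy the current trust store content plus new certificates into out_dir.

    Hypothetical helper: out_dir is what would later be turned into the
    configuration map or secret mounted over /etc/pki/tls/certs.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    # Keep everything that is already in the directory. shutil.copy follows
    # symlinks, so a linked bundle is stored as a regular file.
    for src in sorted(existing_dir.iterdir()):
        if src.is_file():
            shutil.copy(src, out_dir / src.name)
            staged.append(src.name)
    # Add the upstream registry certificates, refusing to overwrite silently.
    for cert in extra_certs:
        target = out_dir / cert.name
        if target.exists():
            raise FileExistsError(f"{cert.name} already exists in the trust store")
        shutil.copy(cert, target)
        staged.append(cert.name)
    return sorted(staged)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        current = root / "certs"
        current.mkdir()
        (current / "ca-bundle.crt").write_text("existing bundle")
        upstream = root / "upstream-ca.crt"
        upstream.write_text("upstream CA")
        print(stage_cert_dir(current, [upstream], root / "staged"))
```

The sketch fails on a name clash rather than overwriting, since losing one of the existing files would break trust for registries that worked before.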

The problem is that the error message reported by the registry is rather cryptic and hard to understand. It would therefore be nice if we could catch such a situation and provide more suitable information to the customer, to ease troubleshooting and allow them to address the problem quickly.
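To illustrate what more suitable information could look like, here is a rough sketch that reduces an error body of the shape shown above to one readable line. It is not the registry's or the docker client's code. The function name summarize_registry_error is hypothetical, and SAMPLE is a hand-shortened stand-in for the real response, whose detail field carried the whole serialized certificate:

```python
import json
from typing import List


def summarize_registry_error(body: str) -> List[str]:
    """Return one readable line per entry in a registry error response body."""
    lines = []
    for err in json.loads(body).get("errors", []):
        line = f"{err.get('code', 'UNKNOWN')}: {err.get('message', '')}"
        detail = err.get("detail")
        if isinstance(detail, dict) and "URL" in detail:
            # detail looks like a serialized request error (Op, URL, Err).
            line += f" (while contacting {detail['URL']})"
            inner = detail.get("Err")
            if isinstance(inner, dict) and "Cert" in inner:
                # A certificate embedded in the error points at a TLS trust problem.
                line += "; the upstream certificate was rejected, check the registry trust store"
        elif isinstance(detail, str) and detail:
            line += f" ({detail})"
        lines.append(line)
    return lines


# Shortened, assumed sample; the certificate fields are reduced to one entry.
SAMPLE = json.dumps({
    "errors": [{
        "code": "MANIFEST_UNKNOWN",
        "message": "manifest unknown",
        "detail": {
            "Op": "Get",
            "URL": "https://upstream-registry.faraway.example.intra:443/v2/",
            "Err": {"Cert": {"Version": 3}},
        },
    }]
})

if __name__ == "__main__":
    for line in summarize_registry_error(SAMPLE):
        print(line)
```

The point of the sketch is only that the operation, the URL and the kind of failure are enough for troubleshooting; the raw certificate dump adds nothing for the reader.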

Version-Release number of selected component (if applicable):

 - registry.access.redhat.com/openshift3/ose-docker-registry:v3.7.44-3 

How reproducible:

 - Always

Steps to Reproduce:
1. As mentioned in the problem description: the certificate of the upstream
   registry must be missing from the trust store of the Red Hat OpenShift Container Platform registry. Importing the ImageStream and running `docker pull ...` then causes the behavior seen.

Actual results:

A rather cryptic, hard-to-understand error is reported.

Expected results:

The error provides a hint about the root cause to ease troubleshooting.

Additional info:

Comment 1 Oleg Bulatov 2018-06-05 13:54:23 UTC
https://github.com/openshift/image-registry/pull/86

Comment 3 Dongbo Yan 2018-06-11 11:13:13 UTC
Verified
openshift v3.10.0-0.64.0
kubernetes v1.10.0+b81c8f8

Reproduce steps:
1. Create an external secure registry
2. Add the certificate of the external secure registry to the OpenShift cluster, then import an image from the external registry
3. Remove the certificate, then docker pull the imported image from the OpenShift internal registry

Actual results:
# docker pull docker-registry.default.svc:5000/dyan/busy2
Using default tag: latest
Trying to pull repository docker-registry.default.svc:5000/dyan/busy2 ... 
unknown: unable to pull manifest from dyan-registry.usersys.redhat.com/test/busybox:latest: Get https://dyan-registry.usersys.redhat.com/v2/: x509: certificate signed by unknown authority
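
With the error now arriving as plain text, tooling can pick out the upstream image and the cause directly. A small sketch, assuming the message keeps the "unable to pull manifest from IMAGE: CAUSE" shape seen above; the regular expression and the function name explain_pull_error are hypothetical and not part of docker or the registry:

```python
import re
from typing import Dict, Optional

# Assumed message shape, taken from the single output line above.
PULL_ERROR = re.compile(r"unable to pull manifest from (?P<image>\S+): (?P<cause>.+)$")


def explain_pull_error(line: str) -> Optional[Dict[str, str]]:
    """Split a pullthrough error line into upstream image, cause and a hint."""
    match = PULL_ERROR.search(line.strip())
    if match is None:
        return None
    info = {"image": match.group("image"), "cause": match.group("cause")}
    if "x509: certificate signed by unknown authority" in info["cause"]:
        info["hint"] = "add the upstream CA to the registry trust store"
    return info


if __name__ == "__main__":
    sample = (
        "unknown: unable to pull manifest from "
        "dyan-registry.usersys.redhat.com/test/busybox:latest: "
        "Get https://dyan-registry.usersys.redhat.com/v2/: "
        "x509: certificate signed by unknown authority"
    )
    print(explain_pull_error(sample))
```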


Additional info:
After adding the certificate to the registry pod via a secret, the image could be pulled successfully:

# docker pull docker-registry.default.svc:5000/dyan/busy2
Using default tag: latest
Trying to pull repository docker-registry.default.svc:5000/dyan/busy2 ... 
latest: Pulling from docker-registry.default.svc:5000/dyan/busy2
07a152489297: Pull complete 
Digest: sha256:74f634b1bc1bd74535d5209589734efbd44a25f4e2dc96d78784576a3eb5b335
Status: Downloaded newer image for docker-registry.default.svc:5000/dyan/busy2:latest

Comment 5 errata-xmlrpc 2018-07-30 19:15:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816