Bug 1704874 - Builds with additional trusted CA stuck in Pending state
Summary: Builds with additional trusted CA stuck in Pending state
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.1.0
Assignee: Adam Kaplan
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Depends On: 1703941
Blocks:
Reported: 2019-04-30 17:36 UTC by Adam Kaplan
Modified: 2019-06-04 10:48 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1703941
Environment:
Last Closed: 2019-06-04 10:48:18 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:0758  0        None      None    None     2019-06-04 10:48:24 UTC

Description Adam Kaplan 2019-04-30 17:36:39 UTC
Similar root cause as Bug #1703941 - an API change to the service signing CA injection annotation broke builds.

+++ This bug was initially created as a clone of Bug #1703941 +++

Description of problem:
The error below appears when trying to import a CA file by adding additionalTrustedCA to the image.config.openshift.io cluster resource:
I0429 07:29:46.019269       1 apiserver.go:151] reading image import ca path: /var/run/configmaps/image-import-ca/..2019_04_29_07_29_35.208155455/docker-registry-default.apps.0429-1ow.qe.rhcloud.com, incoming err: <nil>
I0429 07:29:46.019407       1 apiserver.go:151] reading image import ca path: /var/run/configmaps/image-import-ca/docker-registry-default.apps.0429-1ow.qe.rhcloud.com, incoming err: <nil>
I0429 07:29:46.019412       1 apiserver.go:156] skipping dir or symlink: /var/run/configmaps/image-import-ca/docker-registry-default.apps.0429-1ow.qe.rhcloud.com

Version-Release number of selected component (if applicable):
4.1.0-0.nightly-2019-04-28-064010

How reproducible:
Always

Steps to Reproduce:
1. $ oc create configmap registry-config --from-file=docker-registry-default.apps.0429-1ow.qe.rhcloud.com=ca.crt -n openshift-config
2. $ oc edit image.config.openshift.io cluster
spec:
  additionalTrustedCA:
    name: registry-config
3. Watch apiserver pods:
$ oc get pods -n openshift-apiserver
$ oc logs pods/apiserver-lpv8m -n openshift-apiserver | grep docker-registry

Actual results:
The CA cannot be imported, and errors appear in the API server pod.

Expected results:
The CA should be imported.

Additional info:

--- Additional comment from Oleg Bulatov on 2019-04-29 12:00:04 UTC ---

It seems it successfully imported `/var/run/configmaps/image-import-ca/..2019_04_29_07_29_35.208155455/docker-registry-default.apps.0429-1ow.qe.rhcloud.com`.

Have you checked that the api server gets x509 errors when it imports from this registry?

--- Additional comment from Ben Parees on 2019-04-29 13:32:30 UTC ---

The message is informational: it indicates which CA was read and that no error ("nil") was encountered in the process.

I think this is working correctly unless, as Oleg asked, the import is not actually succeeding.  Moving to QA to verify.
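
For readers puzzled by the "skipping dir or symlink" line, here is a minimal Python sketch of why a walk over a ConfigMap volume would read the file under the timestamped directory and skip the top-level entry. It is not the apiserver's code: the function names are hypothetical, and the layout (a `..data` symlink pointing at a timestamped directory, with each key exposed as a top-level symlink) is assumed from how ConfigMap volumes are usually laid out.

```python
import os
import tempfile


def read_regular_files(root):
    """Walk root and read regular files only; symlinks are recorded as skipped."""
    found, skipped = {}, []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                skipped.append(path)  # e.g. <root>/<key> -> ..data/<key>
                continue
            with open(path, "rb") as f:
                found[path] = f.read()
    return found, skipped


def make_configmap_style_dir(root, key, data):
    """Recreate the assumed ConfigMap volume layout with placeholder data."""
    versioned = "..2019_04_29_07_29_35.208155455"
    os.mkdir(os.path.join(root, versioned))
    with open(os.path.join(root, versioned, key), "wb") as f:
        f.write(data)
    os.symlink(versioned, os.path.join(root, "..data"))
    os.symlink(os.path.join("..data", key), os.path.join(root, key))


KEY = "docker-registry-default.apps.0429-1ow.qe.rhcloud.com"
with tempfile.TemporaryDirectory() as root:
    make_configmap_style_dir(root, KEY, b"placeholder certificate bytes")
    found, skipped = read_regular_files(root)
    found_names = sorted(os.path.relpath(p, root) for p in found)
    skipped_names = sorted(os.path.relpath(p, root) for p in skipped)

print("read:", found_names)
print("skipped:", skipped_names)
```

Under these assumptions the certificate is read exactly once, from the timestamped directory, and the skipped top-level path is only a symlink to the same file.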

--- Additional comment from Wenjing Zheng on 2019-04-30 07:47:15 UTC ---

OK, it seems the cluster has picked up the CA file, since the image can be imported even though the error still appears in the api pod.

However, it is not synced to the image-registry pod, since I cannot see it there:
$ oc rsh image-registry-77dc78779f-kngmp
sh-4.2$ ls /etc/pki/ca-trust/source/anchors
service-ca.crt

So the build and pod cannot run, failing with an x509 error:
  Warning  Failed     14s                kubelet, ip-172-31-146-161.eu-west-2.compute.internal  Failed to pull image "image-registry.openshift-image-registry.svc:5000/openshift-image-registry/myimage@sha256:8d750876687d9fb0adf46020e38cd43165b906208d2e81088b3a13213e751df3": rpc error: code = Unknown desc = Error reading manifest sha256:8d750876687d9fb0adf46020e38cd43165b906208d2e81088b3a13213e751df3 in image-registry.openshift-image-registry.svc:5000/openshift-image-registry/myimage: unknown: unable to pull manifest from docker-registry-default.apps.0430-usw.qe.rhcloud.com/test/myimage:latest: Get https://docker-registry-default.apps.0430-usw.qe.rhcloud.com/v2/: x509: certificate signed by unknown authority

The build is pending with the error below:
Events:
  Type		Reason		Age			From							Message
  ----		------		----			----							-------
  Normal	Scheduled	65s			default-scheduler					Successfully assigned openshift-image-registry/ruby-hello-world-2-build to ip-172-31-146-161.eu-west-2.compute.internal
  Warning	FailedMount	1s (x8 over 65s)	kubelet, ip-172-31-146-161.eu-west-2.compute.internal	MountVolume.SetUp failed for volume "build-ca-bundles" : configmap references non-existent config key: docker-registry-default.apps.0430-usw.qe.rhcloud.com
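
The FailedMount event above means the "build-ca-bundles" volume references a key that the ConfigMap does not contain. A minimal Python sketch of that check follows. The function name and the sample objects are hypothetical and are not the actual build pod or ConfigMap from this cluster.

```python
def missing_configmap_keys(configmap_data, volume_items):
    """Return the keys a volume projects that the ConfigMap does not contain."""
    return [item["key"] for item in volume_items if item["key"] not in configmap_data]


# Hypothetical stand-ins for the ConfigMap data and the volume's `items` list.
configmap_data = {"service-ca.crt": "placeholder"}
volume_items = [
    {"key": "service-ca.crt", "path": "service-ca.crt"},
    {
        "key": "docker-registry-default.apps.0430-usw.qe.rhcloud.com",
        "path": "docker-registry-default.apps.0430-usw.qe.rhcloud.com",
    },
]

missing = missing_configmap_keys(configmap_data, volume_items)
print(missing)
```

A non-empty result corresponds to the situation in the event: the pod spec names a key that was never written into the ConfigMap, so the volume cannot be set up and the pod stays Pending.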

--- Additional comment from Wenjing Zheng on 2019-04-30 09:20 UTC ---



--- Additional comment from Wenjing Zheng on 2019-04-30 09:21 UTC ---



--- Additional comment from Wenjing Zheng on 2019-04-30 09:21 UTC ---



--- Additional comment from Wenjing Zheng on 2019-04-30 09:24 UTC ---



--- Additional comment from Adam Kaplan on 2019-04-30 11:51:08 UTC ---

@Wenjing Please provide the following:
1. The build and build pod YAML
2. The ConfigMaps in the build pod's namespace.

I'm concerned about the build pod not starting because of a non-existent ConfigMap key.

@Oleg does the registry need a copy of the trusted certs to do pull-through?

Comment 3 Adam Kaplan 2019-05-02 14:53:56 UTC
PR for cluster-openshift-controller-manager-operator: https://github.com/openshift/cluster-openshift-controller-manager-operator/pull/96

Comment 4 Adam Kaplan 2019-05-03 01:19:33 UTC
PR to fix the build controller: https://github.com/openshift/origin/pull/22743

Comment 6 Wenjing Zheng 2019-05-05 09:09:15 UTC
Verified with the version below:
$ oc rsh image-registry-799fd5b5cf-b9x8v
sh-4.2$ ls /etc/pki/ca-trust/source/anchors
docker-registry-default.apps.0505-387.qe.rhcloud.com  image-registry.openshift-image-registry.svc..5000  image-registry.openshift-image-registry.svc.cluster.local..5000

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-04-210601   True        False         85m     Cluster version is 4.1.0-0.nightly-2019-05-04-210601
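
A note on the file names in the anchors listing above: the `..5000` suffix appears to stand in for `:5000`, since ConfigMap keys cannot contain a colon. The Python sketch below shows that mapping. The helper names are hypothetical, and it assumes a plain `host` or `host:port` registry name (no IPv6 literals).

```python
def registry_to_ca_key(registry):
    """Map 'host:port' to 'host..port'; names without a port are unchanged."""
    host, sep, port = registry.rpartition(":")
    return f"{host}..{port}" if sep else registry


def ca_key_to_registry(key):
    """Inverse mapping: 'host..port' back to 'host:port'."""
    host, sep, port = key.rpartition("..")
    return f"{host}:{port}" if sep and port.isdigit() else key


for name in (
    "image-registry.openshift-image-registry.svc:5000",
    "docker-registry-default.apps.0505-387.qe.rhcloud.com",
):
    print(name, "->", registry_to_ca_key(name))
```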

Comment 8 errata-xmlrpc 2019-06-04 10:48:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

