Bug 1565736 - cluster registries have unnecessary environment values
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Image Registry
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.10.0
Assignee: Alexey Gladkov
QA Contact: Dongbo Yan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-10 15:59 UTC by Gabor Burges
Modified: 2018-07-30 19:13 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Clones: 1570845 1570859
Environment:
Last Closed: 2018-07-30 19:12:35 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:1816 None None None 2018-07-30 19:13:02 UTC

Description Gabor Burges 2018-04-10 15:59:44 UTC
Description of problem:
After the upgrade to OpenShift 3.7, the registry failed to communicate with the pods, leaving builds in Error or ImagePullBackOff state.

After the old env variables were cleared from the registry DCs, it became clear which token the registry actually picks up and uses.


Version-Release number of selected component (if applicable):
oc version
oc v3.7.23
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift v3.7.23
kubernetes v1.7.6+a08f5eeb62


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
Builds fail at the registry because a mismatch between the env vars and the secrets causes the registry to pick up incorrect variables.


Expected results:


Additional info:

Comment 2 Ben Parees 2018-04-10 16:57:48 UTC
the openshift ansible installer for OCP needs to be updated to properly migrate existing registry deployments by doing the following steps (in addition, the ops installer needs to start using the OCP installer).  The fix will need to be backported to v3.7+.



1) VERIFY: docker-registry is using the ‘registry’ service account
RUN: oc get dc -n default docker-registry -o json | jq ".spec.template.spec.serviceAccount"

	IF NOT, ensure the registry service account exists:
	oc create serviceaccount -n default registry
AND set the service account for the docker-registry deployment to the registry service account.


2) VERIFY registry SA can update imagestreams in all namespaces
RUN: oc policy who-can update imagestreams --all-namespaces
VERIFY: system:serviceaccount:default:registry appears in list of allowed users:
# oc policy who-can update imagestreams --all-namespaces | grep system:serviceaccount:default:registry


IF NOT, grant the system:registry role and reverify:
oc create clusterrolebinding system:registry \
  --clusterrole=system:registry \
  --serviceaccount=default:registry

3) remove legacy env variables from the registry deploymentconfig if present:
	`oc edit -n default dc docker-registry`
OPENSHIFT_MASTER
OPENSHIFT_CA_DATA
OPENSHIFT_CERT_DATA
OPENSHIFT_INSECURE
KUBERNETES_MASTER
OPENSHIFT_CERT_FILE
OPENSHIFT_CA_FILE
BEARER_TOKEN
BEARER_TOKEN_FILE
OPENSHIFT_KEY_FILE
OPENSHIFT_KEY_DATA
(from https://github.com/openshift/origin/blob/master/pkg/client/cmd/clientcmd.go#L161-L202)
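
Step 3 can also be done without an interactive `oc edit`. The following is only a sketch, not the installer's implementation: it assumes the deploymentconfig is `docker-registry` in the `default` namespace, as in the steps above, and relies on the `oc set env` convention that a trailing dash after a name removes that variable. The function names are made up for illustration.

```shell
# Legacy client variables from step 3, each listed once.
LEGACY_VARS="OPENSHIFT_MASTER OPENSHIFT_CA_DATA OPENSHIFT_CERT_DATA
OPENSHIFT_INSECURE KUBERNETES_MASTER OPENSHIFT_CERT_FILE OPENSHIFT_CA_FILE
BEARER_TOKEN BEARER_TOKEN_FILE OPENSHIFT_KEY_FILE OPENSHIFT_KEY_DATA"

# Print one "NAME-" argument per variable; the trailing dash tells
# `oc set env` to remove the variable instead of setting it.
legacy_env_removal_args() {
  for name in $LEGACY_VARS; do
    printf '%s-\n' "$name"
  done
}

# Hypothetical helper: remove the variables from the registry
# deploymentconfig. It is only defined here, not called.
remove_legacy_env() {
  # Word splitting of the generated argument list is intended.
  oc set env -n default dc/docker-registry $(legacy_env_removal_args)
}
```

Calling `remove_legacy_env` while logged in with sufficient privileges would then issue a single `oc set env` call that drops every listed variable that is present.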

Comment 4 Ben Parees 2018-04-10 21:12:04 UTC
Jordan made the sensible suggestion that we should be able to just create/update all these things w/o checking anything:

1) create the registry SA
2) grant it all the right permissions
3) delete the env vars from the DC (the upgrade playbook already edits the DC to update the image tag to the new version).

So hopefully this isn't *that* terrible to implement.
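
As a rough illustration of the "just create/update everything" approach, the sequence might look like the sketch below. This is not the code from the eventual openshift-ansible change. It assumes the `default` namespace and the `docker-registry` deploymentconfig from comment 2, also points the deploymentconfig at the `registry` service account (step 1 of comment 2), and ignores any error from the create step as a simplification. `OC` and `migrate_registry` are hypothetical names introduced here so the sequence can be dry-run with `OC=echo`.

```shell
# OC is a hypothetical override hook: set OC=echo to print the commands
# instead of running them against a cluster.
OC="${OC:-oc}"

migrate_registry() {
  # 1) Create the registry service account. Any failure (typically
  #    "already exists") is ignored here, which is a simplification.
  "$OC" create serviceaccount registry -n default 2>/dev/null || true

  # 2) Grant the system:registry cluster role to that service account.
  "$OC" adm policy add-cluster-role-to-user system:registry -z registry -n default

  # Point the deploymentconfig at the service account.
  "$OC" patch dc/docker-registry -n default \
    -p '{"spec":{"template":{"spec":{"serviceAccountName":"registry"}}}}'

  # 3) Remove the legacy client variables; a trailing dash means "unset".
  "$OC" set env dc/docker-registry -n default \
    OPENSHIFT_MASTER- OPENSHIFT_CA_DATA- OPENSHIFT_CERT_DATA- \
    OPENSHIFT_INSECURE- KUBERNETES_MASTER- OPENSHIFT_CERT_FILE- \
    OPENSHIFT_CA_FILE- BEARER_TOKEN- BEARER_TOKEN_FILE- \
    OPENSHIFT_KEY_FILE- OPENSHIFT_KEY_DATA-
}
```

Running `OC=echo; migrate_registry` prints the four commands without touching a cluster, which is a cheap way to review them before running them for real.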

Comment 5 Alexey Gladkov 2018-04-23 09:19:08 UTC
The fix has merged:

https://github.com/openshift/openshift-ansible/pull/8020

Comment 6 Alexey Gladkov 2018-04-23 09:23:00 UTC
This fix will be applied when upgrading to v3.10. openshift-ansible does not have upgrade scripts for v3.8. Should I try to create them?

Comment 7 Matthew Barnes 2018-05-04 12:50:34 UTC
(In reply to Ben Parees from comment #2)
> the openshift ansible installer for OCP needs to be updated to properly
> migrate existing registry deployments by doing the following steps (in
> addition, the ops installer needs to start using the OCP installer).

I see the fix was integrated into the upgrade_control_plane.yml playbook, which the ops installer already calls [1].  So Operations will pick this up as it's backported.  Thanks!


[1] https://github.com/openshift/openshift-ansible-ops/blob/prod/playbooks/release/bin/cicd_operations.sh#L356

Comment 9 Dongbo Yan 2018-05-24 02:32:04 UTC
Verified
openshift v3.10.0-0.50.0
kubernetes v1.10.0+b81c8f8

Builds succeed after upgrading the cluster from 3.9 to 3.10, and the registry pod has no unnecessary env vars.
# oc describe po/docker-registry-2-8m52l
    
    Environment:
      REGISTRY_HTTP_ADDR:                                     :5000
      REGISTRY_HTTP_NET:                                      tcp
      REGISTRY_HTTP_SECRET:                                   IFu5DeOwZxq5jQ75kjYqYKZhD4kXiZaK+UZ1poEAa+o=
      REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA:  false
      REGISTRY_HTTP_TLS_KEY:                                  /etc/secrets/registry.key
      REGISTRY_OPENSHIFT_SERVER_ADDR:                         docker-registry.default.svc:5000
      REGISTRY_CONFIGURATION_PATH:                            /etc/registry/config.yml
      REGISTRY_HTTP_TLS_CERTIFICATE:                          /etc/secrets/registry.crt
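
A check like this one could be scripted rather than read off `oc describe`. The sketch below is an illustration only, not part of the QA procedure recorded here. It assumes the environment is supplied as `NAME=value` lines, the form printed by `oc set env --list`, and the function names are hypothetical.

```shell
# Legacy variable names from comment 2, as an extended-regex alternation.
LEGACY_NAMES='OPENSHIFT_MASTER|OPENSHIFT_CA_DATA|OPENSHIFT_CERT_DATA|OPENSHIFT_INSECURE|KUBERNETES_MASTER|OPENSHIFT_CERT_FILE|OPENSHIFT_CA_FILE|BEARER_TOKEN|BEARER_TOKEN_FILE|OPENSHIFT_KEY_FILE|OPENSHIFT_KEY_DATA'

# Read NAME=value lines on stdin and print the names of any legacy
# variables found; all other lines (including "#" headers) are ignored.
find_legacy_env() {
  sed -n -E "s/^(${LEGACY_NAMES})=.*/\1/p"
}

# Return 0 when no legacy variable is present, 1 otherwise, listing the
# leftovers on stderr.
registry_env_is_clean() {
  leftover="$(find_legacy_env)"
  if [ -n "$leftover" ]; then
    printf 'legacy variables still set:\n%s\n' "$leftover" >&2
    return 1
  fi
  return 0
}
```

A possible invocation against a live cluster would be `oc set env dc/docker-registry -n default --list | registry_env_is_clean`.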

Comment 11 errata-xmlrpc 2018-07-30 19:12:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

