Bug 1488059 - ClusterRegistry health check fails after installation
Summary: ClusterRegistry health check fails after installation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Unknown
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: ---
Target Release: 3.6.z
Assignee: Luke Meyer
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-04 08:41 UTC by Marko Myllynen
Modified: 2017-10-25 13:06 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The ClusterRegistry diagnostic compares the registry address that ImageStreams receive by default against the known registry service. It compared IPs, but as of 3.6 the ImageStream gets a cluster hostname for the registry instead of an IP.
Consequence: The diagnostic reports a false error condition because the service IP is not the same string as the hostname.
Fix: The diagnostic now checks whether either the hostname or the IP form of the registry address matches.
Result: The diagnostic should again report correctly against both old and new deployments.
Clone Of:
Environment:
Last Closed: 2017-10-25 13:06:40 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:3049 | 0 | normal | SHIPPED_LIVE | OpenShift Container Platform 3.6, 3.5, and 3.4 bug fix and enhancement update | 2017-10-25 15:57:15 UTC

Description Marko Myllynen 2017-09-04 08:41:35 UTC
Description of problem:
After a fresh OCP 3.6 installation, oc adm diagnostics complains:

[Note] Running diagnostic: ClusterRegistry
       Description: Check that there is a working Docker registry
       
ERROR: [DClu1019 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:343]
       Diagnostics created a test ImageStream and compared the registry IP
       it received to the registry IP available via the docker-registry service.
       
       docker-registry      : 172.30.22.245:5000
       ImageStream registry : docker-registry.default.svc:5000
       
       They do not match, which probably means that an administrator re-created
       the docker-registry service but the master has cached the old service
       IP address. Builds or deployments that use ImageStreams with the wrong
       docker-registry IP will fail under this condition.
       
       To resolve this issue, restarting the master (to clear the cache) should
       be sufficient. Existing ImageStreams may need to be re-created.

Chris Pitman explains:

The issue isn't that the registry has been restarted; it is that someone deleted the registry service and then recreated it. The IP of the registry is cached, so if the service is recreated it gets a new IP address that no longer matches the cached one. The simple fix is to either (1) not delete the registry service (there is no reason to) or (2) specify the service IP when recreating the registry service so that it uses the same IP.

Since this happens right after installation, the installer seems to be triggering this error, so I am filing against the installer initially.

Version-Release number of the following components:
# rpm -q openshift-ansible
openshift-ansible-3.6.173.0.5-3.git.0.522a92a.el7.noarch
# rpm -q ansible
ansible-2.2.3.0-1.el7.noarch
# ansible --version
ansible 2.2.3.0
  config file = /root/.ansible.cfg
  configured module search path = Default w/o overrides

Comment 1 Luke Meyer 2017-09-05 17:19:19 UTC
I need to do some work to confirm this, but I suspect that the ImageStream used to get its registry as an IP and now gets it as a hostname, which of course is not the same string as the service IP. The diagnostic was intended to detect the situation cpitman explained. If I'm right, however, that may no longer be a problem (the registry URL won't change just because the service is re-created with a different IP), and the diagnostic could be retired.
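
To illustrate the suspected mismatch, here is a minimal Go sketch using the two addresses from the error output above. It is not the code from registry.go; hostIsIP is a hypothetical helper introduced only for this example. It shows that the two values differ as strings and that only one of them carries an IP literal as its host part.

```go
package main

import (
	"fmt"
	"net"
)

// hostIsIP is a hypothetical helper (not part of origin). It reports whether
// the host part of a "host:port" address is an IP literal rather than a name.
func hostIsIP(addr string) (bool, error) {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false, err
	}
	return net.ParseIP(host) != nil, nil
}

func main() {
	// Values copied from the diagnostic output in the description.
	serviceAddr := "172.30.22.245:5000"
	imageStreamAddr := "docker-registry.default.svc:5000"

	// A plain string comparison can never succeed once one side is a hostname.
	fmt.Println("equal as strings:", serviceAddr == imageStreamAddr)

	for _, addr := range []string{serviceAddr, imageStreamAddr} {
		isIP, err := hostIsIP(addr)
		if err != nil {
			fmt.Println("cannot parse", addr, ":", err)
			continue
		}
		fmt.Printf("%s uses an IP literal: %v\n", addr, isIP)
	}
}
```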

Comment 2 Luke Meyer 2017-09-07 13:43:38 UTC
https://github.com/openshift/origin/pull/16188

Comment 3 openshift-github-bot 2017-09-07 20:16:44 UTC
Commits pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/5bbe044b3449f9cd2da4f6434d884688a61ae67b
ClusterRegistry diagnostic: fix address mismatch

The registry for imagestreams was previously recorded as an IP, which
could change if the registry service were re-created.  Now it is a
cluster hostname, which should be unchanging even if re-created. Just
ensure it is the right hostname.

fixes bug 1488059
https://bugzilla.redhat.com/show_bug.cgi?id=1488059

https://github.com/openshift/origin/commit/5c677dcb72783bc8bad3de0bd3094075c3d5712d
Merge pull request #16188 from sosiouxme/20170906-oadm-diag-clusterregistry-bz1488059

Automatic merge from submit-queue (batch tested with PRs 14825, 15756, 16178, 16188, 16189)

ClusterRegistry diagnostic: fix address mismatch

The registry for imagestreams was previously recorded as an IP, which
could change if the registry service were re-created.  Now it is a
cluster hostname, which should be unchanging even if re-created. Just
ensure it is the right hostname.

fix bug 1488059
https://bugzilla.redhat.com/show_bug.cgi?id=1488059
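
The Doc Text for this bug describes the fixed check as accepting either the hostname or the IP form of the registry address. A rough Go sketch of that idea follows. It is not the code merged in the pull request: registryMatches and its parameters are hypothetical, and the "<name>.<namespace>.svc" hostname form is assumed from the docker-registry.default.svc value in the error output.

```go
package main

import (
	"fmt"
	"net"
)

// registryMatches is a hypothetical helper (not the merged origin code).
// It accepts the ImageStream's registry address if it equals either the
// service's cluster IP form or its assumed "<name>.<namespace>.svc" form.
func registryMatches(imageStreamRegistry, clusterIP, name, namespace, port string) bool {
	byIP := net.JoinHostPort(clusterIP, port)
	byHost := net.JoinHostPort(name+"."+namespace+".svc", port)
	return imageStreamRegistry == byIP || imageStreamRegistry == byHost
}

func main() {
	// Service details taken from the diagnostic output in the description.
	const (
		clusterIP = "172.30.22.245"
		name      = "docker-registry"
		namespace = "default"
		port      = "5000"
	)

	// New-style ImageStream address (hostname), as seen on 3.6.
	fmt.Println(registryMatches("docker-registry.default.svc:5000", clusterIP, name, namespace, port))
	// Old-style ImageStream address (IP), as seen on earlier deployments.
	fmt.Println(registryMatches("172.30.22.245:5000", clusterIP, name, namespace, port))
	// A stale IP from a re-created service should still be flagged.
	fmt.Println(registryMatches("172.30.99.1:5000", clusterIP, name, namespace, port))
}
```

Accepting both forms keeps the check meaningful for older deployments, where a cached stale IP is still a real problem, while no longer flagging a 3.6 cluster where the ImageStream carries the hostname.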

Comment 9 Johnny Liu 2017-10-13 06:12:49 UTC
Verified this bug with openshift v3.6.173.0.49; result: PASS.


# oadm diagnostics ClusterRegistry
[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/root/.kube/config'
Info:  Using context for cluster-admin access: 'default/host-8-241-49-host-centralci-eng-rdu2-redhat-com:8443/system:admin'

[Note] Running diagnostic: ClusterRegistry
       Description: Check that there is a working Docker registry
       
[Note] Summary of diagnostics execution (version v3.6.173.0.49):
[Note] Completed with no errors or warnings seen.

Comment 11 errata-xmlrpc 2017-10-25 13:06:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3049

