Bug 1367690 - metrics deployer mode=refresh fails with 'validating the internal hawkular-metrics certificate against the route destination CA'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Matt Wringe
QA Contact: chunchen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-17 08:57 UTC by Peng Li
Modified: 2016-09-30 02:16 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-27 09:44:34 UTC
Target Upstream Version:


Attachments
sample log for metrics deployer log (26.74 KB, text/plain)
2016-08-17 08:57 UTC, Peng Li


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1933 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.3 Release Advisory 2016-09-27 13:24:36 UTC

Description Peng Li 2016-08-17 08:57:48 UTC
Created attachment 1191496
sample log for metrics deployer log

Description of problem:
metrics deployer mode=refresh fails with 'validating the internal hawkular-metrics certificate against the route destination CA'
 
Version-Release number of selected component (if applicable):
[peng@dhcp-0-123-nay-redhat-com 33]$ oc version
oc v3.3.0.19
kubernetes v1.3.0+507d3a7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://host-8-172-83.host.centralci.eng.rdu2.redhat.com:8443
openshift v3.3.0.21
kubernetes v1.3.0+507d3a7

metrics-deployer  "3.3.0": "f776b79db884c4b8291722a2cdc845cbc641362b610c11fbed6a866514df4a58",

How reproducible:
sometimes

Steps to Reproduce:
1. Deploy the metrics components in the 'openshift-infra' project with MODE=deploy. [1]
2. Change MODE to 'refresh' and run the deployer again. [2]
3. After the deployer finishes, check the pod status and log. [3]
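
The two invocations in [1] and [2] below are identical except for MODE. A minimal sketch of a wrapper that builds the parameter list once, so that only the mode changes between runs. The function name `deployer_params` is hypothetical, and the values are copied from this report's environment:

```shell
#!/bin/sh
# Hypothetical helper: prints the comma-separated value passed to -p in this
# report, with MODE taken from the first argument.
deployer_params() {
    mode=${1:-}
    case "$mode" in
        deploy|refresh) ;;   # the only two modes exercised in this report
        *) echo "unsupported mode for this sketch: $mode" >&2; return 1 ;;
    esac
    # printf reuses the '%s' format for every argument, so the pieces are
    # concatenated without separators.
    printf '%s' \
        "IMAGE_PREFIX=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/," \
        "IMAGE_VERSION=3.3.0," \
        "MASTER_URL=https://host-8-172-83.host.centralci.eng.rdu2.redhat.com:8443," \
        "HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.0817-4og.qe.rhcloud.com," \
        "MODE=$mode," \
        "USE_PERSISTENT_STORAGE=false," \
        "CASSANDRA_NODES=1," \
        "CASSANDRA_PV_SIZE=10," \
        "USER_WRITE_ACCESS=false"
    printf '\n'
}
```

With this in place, step 2 would be run as `oc new-app metrics-deployer-template -p "$(deployer_params refresh)"`.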

[1]
oc new-app metrics-deployer-template -p IMAGE_PREFIX=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/,IMAGE_VERSION=3.3.0,MASTER_URL=https://host-8-172-83.host.centralci.eng.rdu2.redhat.com:8443,HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.0817-4og.qe.rhcloud.com,MODE=deploy,USE_PERSISTENT_STORAGE=false,CASSANDRA_NODES=1,CASSANDRA_PV_SIZE=10,USER_WRITE_ACCESS=false

[peng@dhcp-0-123-nay-redhat-com 33]$ oc get pod
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-8p7zt   1/1       Running     0          3m
hawkular-metrics-yv2rr       1/1       Running     0          3m
heapster-4b51w               1/1       Running     0          3m
metrics-deployer-yq72d       0/1       Completed   0          3m

[2]
oc new-app metrics-deployer-template -p IMAGE_PREFIX=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/,IMAGE_VERSION=3.3.0,MASTER_URL=https://host-8-172-83.host.centralci.eng.rdu2.redhat.com:8443,HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.0817-4og.qe.rhcloud.com,MODE=refresh,USE_PERSISTENT_STORAGE=false,CASSANDRA_NODES=1,CASSANDRA_PV_SIZE=10,USER_WRITE_ACCESS=false

[3]
[peng@dhcp-0-123-nay-redhat-com 33]$ oc get pod
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-8fx96   1/1       Running   0          35m
hawkular-metrics-jeyoi       1/1       Running   0          35m
heapster-h9wwd               1/1       Running   0          35m
metrics-deployer-139ii       1/1       Error     0          35m
[peng@dhcp-0-123-nay-redhat-com 33]$ oc logs metrics-deployer-139ii 
(...)
--- validate_deployment_artifacts ---
======== ERROR =========
validate_deployment_artifacts: 
---
There was an error while validating the internal hawkular-metrics certificate against the route destination CA:
stdin: CN = hawkular-metrics
error 20 at 0 depth lookup:unable to get local issuer certificate
This will prevent proper functioning of the route.
========================
--- validate_deployed_project ---

VALIDATION FAILED
(...)


Actual results:
The metrics-deployer-***** pod shows status 'Error', and hawkular-metrics cannot be accessed; requests to it return error 503.

Expected results:
The metrics-deployer-***** pod shows status 'Completed'.
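
The difference between the actual and expected results can be checked mechanically from the `oc get pod` table. A small sketch, assuming the column layout shown in this report (NAME, READY, STATUS, ...); the function name `check_deployer_pods` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pod` output on stdin and fails if any
# metrics-deployer pod has a STATUS other than Completed. A deployer pod that
# is still Running is reported too, so run this after the deployer finishes.
check_deployer_pods() {
    awk '
        $1 ~ /^metrics-deployer-/ {
            seen = 1
            if ($3 != "Completed") { print $1 ": " $3; bad = 1 }
        }
        END {
            if (!seen) { print "no metrics-deployer pod found"; exit 2 }
            exit (bad ? 1 : 0)
        }
    '
}
```

Typical use would be `oc get pod | check_deployer_pods`, which prints each offending pod with its status and returns a non-zero exit code.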

Additional info:

Comment 1 Matt Wringe 2016-08-17 23:09:08 UTC
I have looked into this issue and it's more than just an issue with the validator misbehaving. I should have a fix in place tomorrow to resolve this.

Comment 3 Peng Li 2016-08-19 05:56:00 UTC
Bug is verified: ran the deployer in 'refresh' mode several times and no error was observed.

[1]
oc new-app metrics-deployer-template -p IMAGE_PREFIX=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/,IMAGE_VERSION=3.3.0,MASTER_URL=https://host-8-172-83.host.centralci.eng.rdu2.redhat.com:8443,HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.0817-4og.qe.rhcloud.com,MODE=refresh,USE_PERSISTENT_STORAGE=false,CASSANDRA_NODES=1,CASSANDRA_PV_SIZE=10,USER_WRITE_ACCESS=false


[2]
[peng@dhcp-0-123-nay-redhat-com 33]$ oc describe pod metrics-deployer-dqmw6
(...)
Containers:
  deployer:
    Container ID:	docker://22ea8e6346f5460bcab321caa1e4331c8403d0a7eba5722337c2b47035cd6231
    Image:		brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/metrics-deployer:3.3.0
    Image ID:		docker://sha256:d2564383e350e470496628b7e79247f2f2442b768ea5f3d70e37ed5a65208e09
(...)

[3]
[peng@dhcp-0-123-nay-redhat-com 33]$ oc get pod
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-dnj4n   1/1       Running     0          6m
hawkular-metrics-ok7sv       1/1       Running     0          6m
heapster-0rpm2               1/1       Running     0          6m
metrics-deployer-ltqjc       0/1       Completed   0          7m

[4]
oc logs metrics-deployer-ltqjc
(...)
VALIDATION SUCCEEDED
validate_nodes_accessible: ok
validate_deployment_artifacts: ok
validate_deployed_project:
Success!
(...)
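
When the refresh is repeated several times, the verdict can be pulled out of each deployer log instead of reading it by eye. A sketch that relies only on the `VALIDATION SUCCEEDED` and `VALIDATION FAILED` marker lines quoted in this report; the function name `validation_verdict` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reads a metrics-deployer log on stdin and prints
# "succeeded", "failed", or "unknown" depending on which marker line appears.
# If both markers appear, the last one wins.
validation_verdict() {
    awk '
        /VALIDATION SUCCEEDED/ { verdict = "succeeded" }
        /VALIDATION FAILED/    { verdict = "failed" }
        END { print (verdict == "" ? "unknown" : verdict) }
    '
}
```

For example, `oc logs metrics-deployer-ltqjc | validation_verdict` would summarise the log shown in [4].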

Comment 5 errata-xmlrpc 2016-09-27 09:44:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933

Comment 6 Matt Wringe 2016-09-27 14:30:17 UTC
Did not affect a released version.

