Bug 1500464

Summary: 3.5.1 White spaces in the cert prevents Origin Metrics from starting
Product: OpenShift Container Platform
Reporter: Juraci Paixão Kröhling <jcosta>
Component: Hawkular
Assignee: Juraci Paixão Kröhling <jcosta>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.5.1
CC: aos-bugs, cbucur, erich, erjones, hgomes, jcantril, jcosta, juzhao, mwringe, pweil, snegrea, stwalter
Target Milestone: ---   
Target Release: 3.5.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
When a certificate within the chain at `serviceaccount/ca.crt`, or any certificate within the provided truststore file, contains whitespace after the `BEGIN CERTIFICATE` declaration, the Java keytool rejects the certificate with an error, causing Origin Metrics to fail to start. As a workaround, Origin Metrics now attempts to remove the spaces before feeding the certificate to keytool, but admins should still make sure their certificates don't contain such spaces.
Story Points: ---
Clone Of: 1471251
Environment:
Last Closed: 2017-12-07 07:12:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1471251, 1500471, 1503450    
Bug Blocks:    
Attachments:
Description Flags
log file none
hawkular-metrics pod log none
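
The workaround described in the Doc Text (removing the spaces before the certificate reaches keytool) can be illustrated with a minimal Python sketch. This is not the code shipped in Origin Metrics; the function name `strip_pem_boundary_whitespace` and the regular expression are assumptions made for illustration only.

```python
import re

# Hypothetical pattern: a PEM boundary line such as
# "-----BEGIN CERTIFICATE-----" followed by trailing spaces or tabs.
_PEM_BOUNDARY = re.compile(r"^(-----(?:BEGIN|END) [A-Z0-9 ]+-----)[ \t]+$")


def strip_pem_boundary_whitespace(pem_text: str) -> str:
    """Return pem_text with trailing whitespace removed from BEGIN/END lines.

    Lines that are not padded boundary lines are passed through unchanged.
    """
    cleaned = []
    for line in pem_text.splitlines():
        match = _PEM_BOUNDARY.match(line)
        # Keep only the boundary marker itself when padding was found.
        cleaned.append(match.group(1) if match else line)
    return "\n".join(cleaned) + "\n"
```

A caller would read the certificate bundle, pass its text through this function, and hand the cleaned text to whatever imports it into the truststore.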

Comment 3 Junqi Zhao 2017-10-16 13:19:24 UTC
Created attachment 1339227 [details]
log file

Comment 8 Junqi Zhao 2017-10-18 02:47:37 UTC
Tested with metrics-hawkular-metrics:3.5.0-47

Steps:
1. In /etc/origin/master/ca-bundle.crt, change the header line to "-----BEGIN CERTIFICATE-----  " (two trailing spaces).
2. Restart server and deploy metrics 3.5

The metrics pods are in Running status:
# oc get po
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-kxv85   1/1       Running   0          1h
hawkular-metrics-vdllm       1/1       Running   0          1h
heapster-rlnzb               1/1       Running   0          1h

Sanity testing shows that metrics works well, but I can see warnings like the following in the hawkular-metrics pod logs. I think these messages are normal; what do you think?
*********************************************************************
# oc logs hawkular-metrics-vdllm | grep -e error -e Exception
Could not start Jolokia agent: java.lang.IllegalStateException: Cannot use keystore for https communication: java.security.cert.CertificateException: Could not parse certificate: java.io.IOException: Illegal header: -----BEGIN CERTIFICATE-----  
2017-10-18 01:11:18,679 WARN  [org.hawkular.alerts.engine.impl.CassCluster] (ServerService Thread Pool -- 74) Could not connect to Cassandra cluster - assuming is not up yet. Cause: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: hawkular-cassandra/172.30.42.134:9042 (com.datastax.driver.core.exceptions.TransportException: [hawkular-cassandra/172.30.42.134:9042] Cannot connect))
2017-10-18 01:11:24,713 WARN  [org.hawkular.alerts.engine.impl.CassCluster] (ServerService Thread Pool -- 74) Could not connect to Cassandra cluster - assuming is not up yet. Cause: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: hawkular-cassandra/172.30.42.134:9042 (com.datastax.driver.core.exceptions.TransportException: [hawkular-cassandra/172.30.42.134:9042] Cannot connect))
2017-10-18 01:11:27,723 WARN  [org.hawkular.alerts.engine.impl.CassCluster] (ServerService Thread Pool -- 74) Could not connect to Cassandra cluster - assuming is not up yet. Cause: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: hawkular-cassandra/172.30.42.134:9042 (com.datastax.driver.core.exceptions.TransportException: [hawkular-cassandra/172.30.42.134:9042] Cannot connect))
2017-10-18 01:24:30,102 WARN  [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-49) RESTEASY002142: Multiple resource methods match request "GET /". Selecting one. Matching methods: [public javax.ws.rs.core.Response org.hawkular.metrics.api.jaxrs.handler.BaseHandler.baseJSON(), public void org.hawkular.metrics.api.jaxrs.handler.BaseHandler.baseHTML(javax.servlet.ServletContext) throws java.lang.Exception]
2017-10-18 01:24:46,261 WARN  [org.jboss.resteasy.resteasy_jaxrs.i18n] (default task-5) RESTEASY002142: Multiple resource methods match request "GET /". Selecting one. Matching methods: [public javax.ws.rs.core.Response org.hawkular.metrics.api.jaxrs.handler.BaseHandler.baseJSON(), public void org.hawkular.metrics.api.jaxrs.handler.BaseHandler.baseHTML(javax.servlet.ServletContext) throws java.lang.Exception]
****************************************************************************
The full hawkular-metrics pod log is attached.

Comment 9 Junqi Zhao 2017-10-18 02:48:14 UTC
Created attachment 1339962 [details]
hawkular-metrics pod log

Comment 10 Junqi Zhao 2017-10-18 03:42:13 UTC
# openshift version
openshift v3.5.5.31.36
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Comment 11 Juraci Paixão Kröhling 2017-10-18 08:07:02 UTC
Except for the first one, it doesn't look like the messages are related to this BZ.

BZ 1503462 was created to document the Jolokia issue and is closed as "WONTFIX". If there is demand, we can come up with a workaround.
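
For triaging a log like the one in comment 8, the certificate-related message can be separated from the unrelated Cassandra and RESTEasy warnings with a simple filter. The sketch below is only an illustration: the function name `certificate_parse_errors` is hypothetical, and the marker strings are taken from the Jolokia message quoted in comment 8.

```python
from typing import List

# Substrings copied from the Jolokia error quoted in comment 8.
_CERT_MARKERS = ("Could not parse certificate", "Illegal header")


def certificate_parse_errors(log_text: str) -> List[str]:
    """Return the log lines that mention a certificate parsing failure."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in _CERT_MARKERS)
    ]
```

Applied to the excerpt in comment 8, only the "Could not start Jolokia agent" line would be returned.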

Comment 12 Junqi Zhao 2017-10-18 12:06:45 UTC
Verification steps:
1. In /etc/origin/master/ca-bundle.crt, change the header line to "-----BEGIN CERTIFICATE-----  " (two trailing spaces).
2. Restart server and deploy metrics 3.4
3. # oc rsh ${HAWKULAR_METRICS_PODS}
   sh-4.2$ cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

   # oc rsh ${HAWKULAR_CASSANDRA_PODS}
   sh-4.2$ cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

   # oc rsh ${HEAPSTER_PODS}
   sh-4.2$ cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

/var/run/secrets/kubernetes.io/serviceaccount/ca.crt is identical to /etc/origin/master/ca-bundle.crt; all have one trailing space: "-----BEGIN CERTIFICATE----- "

4. Run sanity tests on Metrics; it works well.
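
The manual inspection in step 3 could also be scripted. The following Python sketch reports which PEM boundary lines carry trailing whitespace; the function name `find_padded_pem_headers` is hypothetical and the check is an assumption about what "such spaces" covers (trailing spaces or tabs on `-----BEGIN ...-----` / `-----END ...-----` lines).

```python
from typing import List


def find_padded_pem_headers(pem_text: str) -> List[int]:
    """Return 1-based line numbers of PEM boundary lines with trailing whitespace."""
    padded = []
    for number, line in enumerate(pem_text.splitlines(), start=1):
        stripped = line.rstrip(" \t")
        # A boundary line starts and ends with five dashes once padding is removed.
        is_boundary = stripped.startswith("-----") and stripped.endswith("-----")
        if is_boundary and stripped != line:
            padded.append(number)
    return padded
```

An admin could run this against the contents of /etc/origin/master/ca-bundle.crt, or against /var/run/secrets/kubernetes.io/serviceaccount/ca.crt inside a pod, and fix any reported lines.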

env:
# openshift version
openshift v3.5.5.31.36
kubernetes v1.5.2+43a9be4
etcd 3.1.0

image:
metrics-hawkular-metrics:3.5.0-47

Comment 13 Junqi Zhao 2017-10-18 12:07:42 UTC
(In reply to Junqi Zhao from comment #12)
> 2. Restart server and deploy metrics 3.4

This should read:
2. Restart server and deploy metrics 3.5

Comment 16 errata-xmlrpc 2017-12-07 07:12:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3389