Description of problem: The customer has metrics components 3.4.1 running on an OCP 3.4.0 cluster and encounters the error message [0] when trying to view pod metrics from the Metrics tab for the application.

[0] Metrics are not available. An error occurred getting metrics for container <CONTAINER> from https://hawkular-metrics.$(HOSTNAME)/hawkular/metrics

The customer has provided screenshots showing the errors from the browser's Network and Console inspection sections. I will be uploading those along with the pod logs shortly.
From the logs it appears that there is a problem authenticating the user. To determine why OpenShift is rejecting the connection, we need to increase the logging level for the Hawkular Metrics authentication filter.

You will need to edit the 'hawkular-metrics' rc (e.g. 'oc edit rc hawkular-metrics') and add the following under the 'env' section:

- name: ADDITIONAL_LOGGING
  value: org.hawkular.openshift.auth=DEBUG

After you have made that change, save the rc, then scale your Hawkular Metrics instances down to 0 and back up again. After this has been running for a few minutes, try to access the console page again and then attach the Hawkular Metrics logs to this bugzilla.
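The scale-down and scale-up step could be scripted roughly as follows. This is only a sketch: the `run` wrapper, the `DRY_RUN` switch and the replica count of 1 are assumptions for illustration, not part of the instructions above. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
# Sketch: restart Hawkular Metrics by scaling its rc to 0 and back up.
RC_NAME="${RC_NAME:-hawkular-metrics}"   # rc name taken from the comment above
REPLICAS="${REPLICAS:-1}"                # assumption: one replica before the restart

# run: print the command when DRY_RUN=1 (default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

restart_hawkular_metrics() {
  run oc scale rc "$RC_NAME" --replicas=0 &&
  run oc scale rc "$RC_NAME" --replicas="$REPLICAS"
}
```

Calling `restart_hawkular_metrics` prints the two `oc scale` commands; `DRY_RUN=0 restart_hawkular_metrics` would run them against the current project.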
@Matt,

With debug mode enabled for hawkular-metrics, errors are thrown continuously in the hawkular-metrics pod, but the Hawkular Metrics route can be accessed and CPU and memory usage are shown in the UI.

- name: ADDITIONAL_LOGGING
  value: org.hawkular.openshift.auth=DEBUG,io.undertow=DEBUG

About the CA chain: I used rsh to get into the hawkular-metrics pod and executed the following command:

keytool -list -v -keystore /opt/hawkular/auth/hawkular-metrics.truststore -alias kubernetes-master -storepass `cat /secrets/hawkular-metrics.truststore.password`

Output is:

Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Alias name: kubernetes-master
Creation date: May 25, 2017
Entry type: trustedCertEntry

Owner: CN=openshift-signer@1495620339
Issuer: CN=openshift-signer@1495620339
Serial number: 1
Valid from: Wed May 24 10:05:38 UTC 2017 until: Mon May 23 10:05:39 UTC 2022
Certificate fingerprints:
	 MD5: 4B:13:7D:35:0F:44:92:82:97:C7:E4:2D:6A:7E:A7:54
	 SHA1: 2F:BC:C9:7F:D2:A1:F6:FF:FF:CC:D4:65:AF:97:6F:DA:23:EC:2C:17
	 SHA256: 24:EB:0A:3C:73:BD:0C:DE:83:73:88:CD:79:4D:17:7B:1C:48:F9:CC:DB:E2:98:2F:74:2B:66:C5:A4:DD:6A:EF
	 Signature algorithm name: SHA256withRSA
	 Version: 3

Extensions:

#1: ObjectId: 2.5.29.19 Criticality=true
BasicConstraints:[
  CA:true
  PathLen:2147483647
]

#2: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
  DigitalSignature
  Key_Encipherment
  Key_CertSign
]
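In the listing above, Owner and Issuer are identical, so this entry is a single self-signed certificate. A small helper along the following lines could make that check explicit when reading `keytool -list -v` output. The function name `is_self_signed` is hypothetical, and the parsing assumes the `Owner:` and `Issuer:` lines start at column one, as they do in the output shown.

```shell
# is_self_signed: read `keytool -list -v` output on stdin and succeed when the
# first Owner and the first Issuer lines name the same subject.
is_self_signed() {
  awk '
    /^Owner: /  && owner == ""  { owner  = substr($0, 8) }   # text after "Owner: "
    /^Issuer: / && issuer == "" { issuer = substr($0, 9) }   # text after "Issuer: "
    END { exit !(owner != "" && owner == issuer) }
  '
}
```

For example, piping the keytool command from the comment above into `is_self_signed && echo "self-signed"` would report the kubernetes-master entry as self-signed.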
Created attachment 1282132 [details] hawkular metric pod log
(In reply to Junqi Zhao from comment #29)
> @Matt,
>
> use debug mode for hawkular-metrics, it throws out error in hawkular-metrics
> pod continuously, but hawkular metrics route could be accessed and CPU,
> memory usage could be shown in UI.
>
> - name: ADDITIONAL_LOGGING
>   value: org.hawkular.openshift.auth=DEBUG,io.undertow=DEBUG
>
> About the CA chain, I rshed to hawkular-metrics pod, and execute the
> following commands:
> keytool -list -v -keystore /opt/hawkular/auth/hawkular-metrics.truststore
> -alias kubernetes-master -storepass `cat
> /secrets/hawkular-metrics.truststore.password`
>
> Output is
>
> Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
> Alias name: kubernetes-master
> Creation date: May 25, 2017
> Entry type: trustedCertEntry
>
> Owner: CN=openshift-signer@1495620339
> Issuer: CN=openshift-signer@1495620339
> Serial number: 1
> Valid from: Wed May 24 10:05:38 UTC 2017 until: Mon May 23 10:05:39 UTC 2022
> Certificate fingerprints:
> 	 MD5: 4B:13:7D:35:0F:44:92:82:97:C7:E4:2D:6A:7E:A7:54
> 	 SHA1: 2F:BC:C9:7F:D2:A1:F6:FF:FF:CC:D4:65:AF:97:6F:DA:23:EC:2C:17
> 	 SHA256: 24:EB:0A:3C:73:BD:0C:DE:83:73:88:CD:79:4D:17:7B:1C:48:F9:CC:DB:E2:98:2F:74:2B:66:C5:A4:DD:6A:EF
> 	 Signature algorithm name: SHA256withRSA
> 	 Version: 3
>
> Extensions:
>
> #1: ObjectId: 2.5.29.19 Criticality=true
> BasicConstraints:[
>   CA:true
>   PathLen:2147483647
> ]
>
> #2: ObjectId: 2.5.29.15 Criticality=true
> KeyUsage [
>   DigitalSignature
>   Key_Encipherment
>   Key_CertSign
> ]

Sorry, what are you testing here exactly?

You need to configure your OpenShift cluster so that its ca.crt doesn't contain just one certificate, but a chain of certificates (e.g. what the customer is currently using).

You should not be configuring anything in the Hawkular Metrics pods at all, and you shouldn't need to modify the Hawkular logging levels.
Once you have your OpenShift cluster configured to use a custom CA certificate which contains a chain of certificates, you need to verify that it fails with the previous releases and then verify that it works with the updated version mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1447463#c28
@Matt,

Do you know how to configure a CA certificate which contains a chain of certificates? We will try to verify it today.
(In reply to Junqi Zhao from comment #32)
> @Matt,
>
> Do you know how to configure a CA certificate which contains chain of
> certificate? We will try to verify it today.

It can be tricky to set up. You need to create a CA chain (e.g. CA_A signs CA_B signs CA_C), which is not always easy to do. Once you have this CA, you need to set up your OpenShift cluster to use it.

You may want to talk with the security team to figure out how to do all of this.
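For illustration only, a chain of the shape described above (CA_A signs CA_B signs CA_C) can be produced with plain openssl roughly as follows. This is a sketch, not the procedure used for this bug: the function names, file names, RSA 2048 keys and 30-day validity are all assumptions, and it does not cover configuring the OpenShift cluster to use the result.

```shell
# Sketch only: build a three-level CA chain (CA_A signs CA_B signs CA_C).

# new_key_and_csr DIR NAME: create DIR/NAME.key and DIR/NAME.csr with CN=NAME.
new_key_and_csr() {
  openssl req -new -newkey rsa:2048 -nodes -subj "/CN=$2" \
    -keyout "$1/$2.key" -out "$1/$2.csr" 2>/dev/null
}

# make_ca_chain DIR: write CA_A.pem, CA_B.pem, CA_C.pem and chain.pem into DIR.
make_ca_chain() {
  dir="$1"
  mkdir -p "$dir" || return 1
  # Extensions that mark every certificate in the chain as a CA.
  printf 'basicConstraints=critical,CA:TRUE\nkeyUsage=critical,keyCertSign,cRLSign\n' \
    > "$dir/ca.ext" || return 1

  for name in CA_A CA_B CA_C; do
    new_key_and_csr "$dir" "$name" || return 1
  done

  # CA_A is the self-signed root.
  openssl x509 -req -in "$dir/CA_A.csr" -signkey "$dir/CA_A.key" -days 30 \
    -extfile "$dir/ca.ext" -out "$dir/CA_A.pem" 2>/dev/null || return 1
  # CA_A signs CA_B.
  openssl x509 -req -in "$dir/CA_B.csr" -CA "$dir/CA_A.pem" -CAkey "$dir/CA_A.key" \
    -CAcreateserial -days 30 -extfile "$dir/ca.ext" -out "$dir/CA_B.pem" 2>/dev/null || return 1
  # CA_B signs CA_C.
  openssl x509 -req -in "$dir/CA_C.csr" -CA "$dir/CA_B.pem" -CAkey "$dir/CA_B.key" \
    -CAcreateserial -days 30 -extfile "$dir/ca.ext" -out "$dir/CA_C.pem" 2>/dev/null || return 1

  # Bundle the three certificates into one PEM file, root first.
  cat "$dir/CA_A.pem" "$dir/CA_B.pem" "$dir/CA_C.pem" > "$dir/chain.pem"
}
```

After `make_ca_chain ./chain`, the chain can be checked with `openssl verify -CAfile ./chain/CA_A.pem -untrusted ./chain/CA_B.pem ./chain/CA_C.pem`.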
I followed Jordan's instructions from https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/authentication/request/x509/testdata/generate.sh

The script requires the cfssl tool, which I tried to install from https://github.com/cloudflare/cfssl

After the installation, running generate.sh fails because the cfssl command cannot be found. I will have to use another method to generate the cert chain.
I tried another, more complicated method to create the cert chain, but it failed. I will try to solve this issue on May 31.
@Matt,

The issue is not reproduced with metrics-hawkular-metrics:3.4.1-19.

Verification steps:
1. Use the following script to generate the CA cert chain client-valid.pem:
   https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/authentication/request/x509/testdata/generate.sh
2. Create the custom cert:
   openshift admin ca create-server-cert --key=a.key --cert=a.crt --hostnames='$hawkular_route' --signer-cert=/etc/origin/master/ca.crt --signer-key=/etc/origin/master/ca.key --signer-serial=/etc/origin/master/ca.serial.txt
   cat a.crt a.key /etc/origin/master/ca.crt > hm.pem
   cat /etc/origin/master/ca.crt client-valid.pem > hm-ca.cert
3. Use the custom cert and CA file to deploy metrics:
   oc secrets new metrics-deployer hawkular-metrics.pem=hm.pem hawkular-metrics-ca.cert=hm-ca.cert

https://hawkular-metrics.$(HOSTNAME)/hawkular/metrics could be accessed.

Do you think there is something wrong with these steps?
I did not reproduce this issue with metrics-hawkular-metrics:3.4.1-19.

Steps:
1. Used the following script to generate the CA cert chain client-valid.pem:
   https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/authentication/request/x509/testdata/generate.sh
2. Configured custom certificates according to https://docs.openshift.com/container-platform/3.5/install_config/certificate_customization.html
3. Deployed metrics 3.4.1 on OCP 3.4.0.

https://hawkular-metrics.$(HOSTNAME)/hawkular/metrics could be accessed.

# openshift version
openshift v3.4.0.39
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

# docker images | grep metrics
metrics-hawkular-metrics   3.4.1      2d882c0c14c2   3 days ago    1.282 GB
metrics-heapster           3.4.1      92abaf97a255   3 days ago    318 MB
metrics-cassandra          3.4.1      8caa095870e6   3 days ago    545.1 MB
metrics-deployer           3.4.1      f16d60ab5198   3 days ago    893 MB
metrics-hawkular-metrics   3.4.1-19   38c362f874c5   13 days ago   1.261 GB

The latest metrics-hawkular-metrics version is 3.4.1-22.
(In reply to Junqi Zhao from comment #40)
> I did not reproduce this issue with metrics-hawkular-metrics:3.4.1-19
>
> Steps:
> 1.Use the following script to generate CA cert chain client-valid.pem
>
> https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/authentication/request/x509/testdata/generate.sh
>
> 2. Configured Custom Certificates according to
> https://docs.openshift.com/container-platform/3.5/install_config/certificate_customization.html
>
> 3. Deployed metrics 3.4.1 on OCP 3.4.0
>
> https://hawkular-metrics.$(HOSTNAME)/hawkular/metrics could be accessed.

It's not about accessing this endpoint. Are metrics being shown in the console?
Yes, metrics are shown in the console. I checked:
1. The Overview page
2. The Metrics tab under each pod
Sorry, the cert file I used is not a chain of certificates; it contains just one certificate. I will re-test using a certificate file that contains a chain.
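A quick way to tell whether a PEM file holds one certificate or several is to count its BEGIN CERTIFICATE markers. The helper below is a minimal sketch; the function name `count_certs` is hypothetical, and the count says nothing about whether the certificates actually form a valid chain.

```shell
# count_certs FILE: print the number of PEM certificates found in FILE.
count_certs() {
  # grep -c prints 0 (and returns non-zero) when nothing matches; keep the output.
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1" || true
}
```

For example, `count_certs /etc/origin/master/ca-bundle.crt` printing 1 would mean the bundle holds only a single certificate.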
Testing steps:
1. Edit master/ca-bundle.crt: add one cert (root.pem) above the existing content and one cert (intermediate.pem) below it. root.pem and intermediate.pem were generated by https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/authentication/request/x509/testdata/generate.sh
2. Use images from registry.access.redhat.com/openshift3 to deploy metrics 3.4.1; the metrics-hawkular-metrics version is 3.4.1-18. Check whether metrics can be viewed in the console: yes, they can be viewed.
3. Remove metrics from the environment and use images from the brew registry to deploy metrics 3.4.1; the metrics-hawkular-metrics version is 3.4.1-24. Check whether metrics can be viewed in the console: yes, they can be viewed.

Note: /root/Deploy_metrics_3.4.1.sh is used to deploy metrics; parameter values can be changed according to different configurations.
The issue was reproduced using images from registry.access.redhat.com/openshift3/, and it was fixed using images from the brew registry.

Reproduction steps:
1. Add the example certificate before and after the content of /etc/origin/master/ca-bundle.crt.
2. Restart the server and deploy metrics 3.4.1 using images from registry.access.redhat.com/openshift3/.
3. oc exec -it ${HAWKULAR_METRICS_PODS}; cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
   /var/run/secrets/kubernetes.io/serviceaccount/ca.crt is the same as /etc/origin/master/ca-bundle.crt.
4. Log in to the web console: metrics cannot be viewed.

Verifying that the issue is fixed with the brew images:
1. Remove the metrics 3.4.1 deployment from the reproduction steps.
2. Deploy metrics 3.4.1 using images from the brew registry.
3. Check the web console: metrics can be viewed.
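The comparison in step 3 can be done byte for byte instead of by eye. The sketch below assumes the pod's ca.crt has first been saved to a local file; the function name `same_ca_bundle` and the local file name pod-ca.crt are hypothetical.

```shell
# same_ca_bundle POD_COPY MASTER_COPY: succeed when both files are byte-identical.
same_ca_bundle() {
  cmp -s "$1" "$2"
}

# Possible usage on the master (not executed here):
#   oc exec "$HAWKULAR_METRICS_POD" -- cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt > pod-ca.crt
#   same_ca_bundle pod-ca.crt /etc/origin/master/ca-bundle.crt && echo "identical"
```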
Created attachment 1287489 [details] metrics can not be viewed when reproducing this issue
Created attachment 1287490 [details] reproduce steps of BZ#1447463
Created attachment 1287491 [details] metrics can be viewed with the image fix
The issue is fixed.

Verification steps:
1. Add the example certificate in Comment 53 before and after the content of /etc/origin/master/ca-bundle.crt.
2. Restart the server and deploy metrics 3.4.1 using images from the brew registry.
3. oc rsh ${HAWKULAR_METRICS_PODS}; cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
   /var/run/secrets/kubernetes.io/serviceaccount/ca.crt is the same as /etc/origin/master/ca-bundle.crt.
4. Log in to the web console: metrics can be viewed.

Testing env:
# openshift version
openshift v3.4.1.42
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Images from the brew registry:
# docker images | grep metrics
metrics-deployer           3.4.1   bc7efb9c7533   34 hours ago   864.2 MB
metrics-hawkular-metrics   3.4.1   4b7a8114d05c   34 hours ago   1.262 GB
metrics-cassandra          3.4.1   8bb36cbbb82d   34 hours ago   572.1 MB
metrics-heapster           3.4.1   d0cd20203416   34 hours ago   318 MB
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1640