Created attachment 1274060 [details]
metrics ansible deploy log

Description of problem:
hawkular-metrics pod is CrashLoopBackOff after metrics 3.6.0 was deployed. Error "openssl: command not found" in hawkular-metrics pod's log.

# oc get po
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-kf3jx   1/1       Running            0          16m
hawkular-metrics-ct9xw       0/1       CrashLoopBackOff   7          16m
heapster-xwwlw               0/1       Running            1          16m

# oc logs hawkular-metrics-ct9xw
2017-04-26 00:53:26 Starting Hawkular Metrics
/opt/hawkular/scripts/hawkular-metrics-wrapper.sh: line 49: openssl: command not found
/opt/hawkular/scripts/hawkular-metrics-wrapper.sh: line 53: openssl: command not found
The service account has read permissions for its project. Proceeding
/opt/hawkular/scripts/hawkular-metrics-wrapper.sh: line 104: openssl: command not found
Creating the Hawkular Metrics keystore from the Secret's cert data
Failed to create a PKCS12 certificate file with the service-specific certificate. Aborting.

# openssl version
OpenSSL 1.0.1e-fips 11 Feb 2013

Version-Release number of selected component (if applicable):
# oc version
oc v3.6.49
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

# rpm -qa | grep openshift-ansible
openshift-ansible-docs-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-lookup-plugins-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-callback-plugins-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-roles-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-filter-plugins-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-playbooks-3.6.37-1.git.0.e19f6d8.el7.noarch

# docker images | grep metrics
registry.ops.openshift.com/openshift3/metrics-hawkular-metrics   3.6.0   12f3f49d713a   6 days ago    1.293 GB
registry.ops.openshift.com/openshift3/metrics-cassandra          3.6.0   fe1b71caa3bf   7 days ago    545.2 MB
registry.ops.openshift.com/openshift3/metrics-heapster           3.6.0   0fa183f8e8ff   2 weeks ago   273.8 MB

How reproducible:
Always

Steps to Reproduce:
1. Deploy metrics 3.6.0 stacks on OCP 3.6.0 by running ansible scripts
2.
3.

Actual results:
hawkular-metrics pod is CrashLoopBackOff

Expected results:
All metrics pod are running well and without errors

Additional info:
Attached ansible inventory file, running log
Created attachment 1274061 [details] metrics ansible inventory file
Metrics is deployed by: ansible-playbook -vvv -i ${INVENTORY_FILE} playbooks/byo/openshift-cluster/openshift-metrics.yml
Looks like the Alpha 0 Dockerfile for Hawkular Metrics didn't include OpenSSL: https://github.com/openshift/origin-metrics/blob/v3.6.0-alpha.0/hawkular-metrics/Dockerfile#L80 This is present for Alpha 1 though: https://github.com/openshift/origin-metrics/blob/v3.6.0-alpha.1/hawkular-metrics/Dockerfile#L80
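A missing binary like this can be caught with an explicit PATH check before the startup script reaches the first openssl call. The following is only a minimal sketch, not the code of hawkular-metrics-wrapper.sh; the helper name require_cmd is hypothetical.

```shell
# Hypothetical helper (not part of the wrapper script): report whether a
# command is available on PATH. Returns 0 if found; otherwise prints a
# message to stderr and returns 1.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "required command not found: $1" >&2
    return 1
}

# Example: fail early with a clear message instead of hitting
# "openssl: command not found" partway through startup.
if ! require_cmd openssl; then
    echo "openssl is missing from this image" >&2
fi
```

Assuming the image ships a shell, the same check could be run against the image directly, for example with docker run --rm --entrypoint sh IMAGE -c 'command -v openssl', where IMAGE is a placeholder for the image reference.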
Sorry, I had this fixed in our brew build a few days ago but never got around to verifying it. I have since confirmed that it works for me now. Can you please retry?
@mwringe, retested with images tagged v3.6 from the brew registry; the issue is fixed. The metrics pods are running fine and the metrics statistics are visible on the web console. Please feel free to change back to ON_QA for closure.

Images tested with:
openshift3/metrics-cassandra          58aedf976616
openshift3/metrics-hawkular-metrics   a2d906e06f22
openshift3/metrics-heapster           99ceffab1a79

# oc get po
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-l1xwt   1/1       Running   0          7m
hawkular-metrics-h2q7j       1/1       Running   0          7m
heapster-d9nbv               1/1       Running   0          7m
# openshift version
openshift v3.6.63
kubernetes v1.6.1+5115d708d7
etcd 3.1.0
@mwringe, the issue is resolved and this scenario passed testing per comment #8. Please feel free to change back to ON_QA for closure.
Set to verified according to comment #8.
*** Bug 1451909 has been marked as a duplicate of this bug. ***
Hi. I still see this problem with the image pulled from brew.
(In reply to Jaroslav Henner from comment #25)
> Hi. I still see this problem with the image pulled from brew.

Please set openshift_metrics_image_version=v3.6 in the inventory file rather than 3.6.0. The images with the 3.6.0 tag are not the latest; the images with the v3.6 tag are.

We no longer see this issue in our functional testing. If you still find it in your performance testing, please open a new defect.
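To confirm which tag an inventory actually requests, the variable can be read back from the file. This is a rough sketch that assumes an INI-style inventory with the variable on its own line (for example under [OSEv3:vars]); the helper name metrics_image_version is hypothetical.

```shell
# Hypothetical helper: print the value of openshift_metrics_image_version
# from an INI-style Ansible inventory. Commented-out lines are ignored and
# only the first match is printed. Prints nothing if the variable is unset.
metrics_image_version() {
    sed -n 's/^[[:space:]]*openshift_metrics_image_version[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1" | head -n 1
}

# Example usage, assuming INVENTORY_FILE points at the inventory:
#   metrics_image_version "${INVENTORY_FILE}"
```

If this prints 3.6.0, the deployment pulls the older image described in this bug; v3.6 selects the latest 3.6 image.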
(In reply to Junqi Zhao from comment #26)
> (In reply to Jaroslav Henner from comment #25)
> > Hi. I still see this problem with the image pulled from brew.
>
> Please set openshift_metrics_image_version=v3.6 in inventory file, do not
> set 3.6.0, since images with 3.6.0 tag are not the latest, images with v3.6
> tag are the latest.
>
> We don't have this issue in our functional testing now, if you still find
> this issue in your performance testing, please open one defect.

Thanks. It seems it worked. Should the same image version be used with OSES 3.5?
(In reply to Jaroslav Henner from comment #27)
> (In reply to Junqi Zhao from comment #26)
> Thanks. It seems it worked. Should the same image version be used with OSES
> 3.5?

For OCP 3.5, the 'v3.5' tag points to the latest 3.5 image. I believe the '3.5.0' tag also aliases to the latest 3.5 image (at least for now; if there is ever a 3.5.1 release, the '3.5.1' tag will instead point to the latest version).

There are also specific tags available for 3.5, so anyone who does not want to automatically pull in the latest images can specify an exact image tag (e.g. 3.5.0-28).

The container catalog shows the tag options which can be used: https://access.redhat.com/containers/#/search/openshift3%252Fmetrics
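The three tag forms mentioned above can be told apart by shape alone. The sketch below is illustrative only: the function name classify_tag and the labels floating, release, and pinned are my own, not registry terminology, and the patterns are deliberately loose.

```shell
# Hypothetical helper: classify an image tag by its shape.
#   vX.Y       -> floating (tracks the latest image of that minor version)
#   X.Y.Z-N    -> pinned   (an exact image build, e.g. 3.5.0-28)
#   X.Y.Z      -> release  (alias for a release, may move as described above)
# The pinned pattern must come before the release pattern, because the
# release pattern would otherwise also match tags with a -N suffix.
classify_tag() {
    case "$1" in
        v[0-9]*.[0-9]*)                echo "floating" ;;
        [0-9]*.[0-9]*.[0-9]*-[0-9]*)   echo "pinned" ;;
        [0-9]*.[0-9]*.[0-9]*)          echo "release" ;;
        *)                             echo "unknown" ;;
    esac
}
```

A deployment that must not pick up new images automatically would want a tag that classifies as pinned.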