Bug 1445568 - hawkular-metrics pod is CrashLoopBackOff after metrics 3.6.0 was deployed
Status: VERIFIED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Metrics
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.6.z
Assigned To: John Sanda
QA Contact: Junqi Zhao
Whiteboard: aos-scalability-36
Duplicates: 1451909
Reported: 2017-04-25 21:30 EDT by Junqi Zhao
Modified: 2018-02-28 11:07 EST

Type: Bug

Attachments
metrics ansible deploy log (884.12 KB, text/plain), 2017-04-25 21:30 EDT, Junqi Zhao
metrics ansible inventory file (553 bytes, text/plain), 2017-04-25 21:33 EDT, Junqi Zhao

Description Junqi Zhao 2017-04-25 21:30:32 EDT
Created attachment 1274060
metrics ansible deploy log

Description of problem:
The hawkular-metrics pod goes into CrashLoopBackOff after metrics 3.6.0 is deployed. The pod's log shows the error "openssl: command not found".

# oc get po
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-kf3jx   1/1       Running            0          16m
hawkular-metrics-ct9xw       0/1       CrashLoopBackOff   7          16m
heapster-xwwlw               0/1       Running            1          16m

# oc logs hawkular-metrics-ct9xw
2017-04-26 00:53:26 Starting Hawkular Metrics
/opt/hawkular/scripts/hawkular-metrics-wrapper.sh: line 49: openssl: command not found
/opt/hawkular/scripts/hawkular-metrics-wrapper.sh: line 53: openssl: command not found
The service account has read permissions for its project. Proceeding
/opt/hawkular/scripts/hawkular-metrics-wrapper.sh: line 104: openssl: command not found
Creating the Hawkular Metrics keystore from the Secret's cert data
Failed to create a PKCS12 certificate file with the service-specific certificate. Aborting.

# openssl version
OpenSSL 1.0.1e-fips 11 Feb 2013

Version-Release number of selected component (if applicable):
# oc version
oc v3.6.49
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

# rpm -qa | grep openshift-ansible
openshift-ansible-docs-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-lookup-plugins-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-callback-plugins-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-roles-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-filter-plugins-3.6.37-1.git.0.e19f6d8.el7.noarch
openshift-ansible-playbooks-3.6.37-1.git.0.e19f6d8.el7.noarch

# docker images | grep metrics
registry.ops.openshift.com/openshift3/metrics-hawkular-metrics   3.6.0               12f3f49d713a        6 days ago          1.293 GB
registry.ops.openshift.com/openshift3/metrics-cassandra          3.6.0               fe1b71caa3bf        7 days ago          545.2 MB
registry.ops.openshift.com/openshift3/metrics-heapster           3.6.0               0fa183f8e8ff        2 weeks ago         273.8 MB

How reproducible:
Always

Steps to Reproduce:
1. Deploy the metrics 3.6.0 stack on OCP 3.6.0 by running the ansible playbook

Actual results:
The hawkular-metrics pod is in CrashLoopBackOff.

Expected results:
All metrics pods are running without errors.

Additional info:
Attached the ansible inventory file and the deploy log.
Comment 1 Junqi Zhao 2017-04-25 21:33 EDT
Created attachment 1274061
metrics ansible inventory file
Comment 2 Junqi Zhao 2017-04-25 21:34:52 EDT
Metrics is deployed by:

ansible-playbook -vvv -i ${INVENTORY_FILE} playbooks/byo/openshift-cluster/openshift-metrics.yml
Comment 3 Juraci Paixão Kröhling 2017-05-02 06:00:07 EDT
Looks like the Alpha 0 Dockerfile for Hawkular Metrics didn't include OpenSSL:

https://github.com/openshift/origin-metrics/blob/v3.6.0-alpha.0/hawkular-metrics/Dockerfile#L80

This is present for Alpha 1 though:

https://github.com/openshift/origin-metrics/blob/v3.6.0-alpha.1/hawkular-metrics/Dockerfile#L80
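
One way to confirm whether a given image build ships the binary is to look for it on the PATH inside the container. The following is only a minimal sketch under stated assumptions: `missing_cmds` is a hypothetical helper name, it is not part of hawkular-metrics-wrapper.sh, and it relies on nothing beyond the POSIX `command -v` builtin.

```shell
# Hypothetical helper (not taken from hawkular-metrics-wrapper.sh):
# print each named command that cannot be found on PATH.
# Returns non-zero if at least one command is missing.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}
```

Calling `missing_cmds openssl` from a shell started inside the image should print `openssl` for a build affected by this bug and print nothing for a fixed one. A startup script could run the same check first and abort with a clear message instead of failing later at the keystore step.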
Comment 4 Matt Wringe 2017-05-02 10:52:26 EDT
Sorry, I had this fixed in our brew build a few days ago but never got around to verifying it. I have since confirmed that it works for me now. Can you please retry?
Comment 8 Xia Zhao 2017-05-04 01:44:12 EDT
@mwringe,

Retested with images tagged v3.6 from the brew registry; the issue is fixed. The metrics pods are running fine and the metrics statistics are visible on the web console. Please feel free to change back to ON_QA for closure.

Images tested with:
openshift3/metrics-cassandra    58aedf976616
openshift3/metrics-hawkular-metrics    a2d906e06f22
openshift3/metrics-heapster    99ceffab1a79

# oc get po
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-l1xwt   1/1       Running   0          7m
hawkular-metrics-h2q7j       1/1       Running   0          7m
heapster-d9nbv               1/1       Running   0          7m
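
The check above can also be scripted instead of read by eye. This is a rough sketch, assuming the column layout shown in the `oc get po` listings in this bug (NAME, READY, STATUS, ...); `not_ready_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc get po` output on stdin and print the
# name of every pod that is not Running or whose READY count is short.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")
        if ($3 != "Running" || ready[1] != ready[2]) print $1
    }'
}
```

Used as `oc get po | not_ready_pods`, it prints nothing for the listing in this comment. For the listing in the description it would print hawkular-metrics-ct9xw and heapster-xwwlw, since heapster was Running but still 0/1 ready.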
Comment 9 Xia Zhao 2017-05-04 01:44:40 EDT
# openshift version
openshift v3.6.63
kubernetes v1.6.1+5115d708d7
etcd 3.1.0
Comment 10 Xia Zhao 2017-05-09 02:08:41 EDT
@mwringe,
The issue is resolved and this scenario passed testing per comment #8. Please feel free to change back to ON_QA for closure.
Comment 11 Xia Zhao 2017-05-09 22:07:08 EDT
Set to verified according to comment #8.
Comment 12 Mike Fiedler 2017-05-17 16:19:17 EDT
*** Bug 1451909 has been marked as a duplicate of this bug. ***
Comment 25 Jaroslav Henner 2017-08-02 14:08:37 EDT
Hi. I still see this problem with the image pulled from brew.
Comment 26 Junqi Zhao 2017-08-02 21:44:55 EDT
(In reply to Jaroslav Henner from comment #25)
> Hi. I still see this problem with the image pulled from brew.


Please set openshift_metrics_image_version=v3.6 in the inventory file rather than 3.6.0. Images with the 3.6.0 tag are not the latest; images with the v3.6 tag are.

We no longer see this issue in our functional testing. If you still hit it in your performance testing, please open a new defect.
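
To see which value an inventory file actually carries before running the playbook, something like the sketch below may help. It assumes the variable sits on a line of its own in the form `name=value`, as in an INI-style vars section, and it would miss a value given inline on a host line. `metrics_image_version` is a hypothetical helper name.

```shell
# Hypothetical helper: print the value assigned to
# openshift_metrics_image_version in an INI-style inventory file.
# If the variable appears more than once, the last assignment is printed.
metrics_image_version() {
    sed -n 's/^[[:space:]]*openshift_metrics_image_version[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}
```

For example, `[ "$(metrics_image_version "$INVENTORY_FILE")" = "v3.6" ] || echo "unexpected metrics image version"` flags an inventory that still asks for 3.6.0 or does not set the variable at all.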
Comment 27 Jaroslav Henner 2017-08-08 01:53:55 EDT
(In reply to Junqi Zhao from comment #26)

Thanks. It seems it worked. Should the same image version be used with OSES 3.5?
Comment 28 Matt Wringe 2017-08-08 10:21:16 EDT
(In reply to Jaroslav Henner from comment #27)
> (In reply to Junqi Zhao from comment #26)
> Thanks. It seems it worked. Should the same image version be used with OSES
> 3.5?

For OCP 3.5, the 'v3.5' tag will be for the latest 3.5 image.

I believe the '3.5.0' tag also aliases to the latest 3.5 image (at least for now, but if there is ever a 3.5.1 release then the '3.5.1' tag will instead point to the latest version).

There are also specific tags available for 3.5, so anyone who doesn't want to pull in the latest images automatically can specify an exact image tag (e.g. 3.5.0-28).

The container catalog will show the tag options which can be used: https://access.redhat.com/containers/#/search/openshift3%252Fmetrics
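
The three kinds of tag mentioned in this comment can be told apart by shape alone. The sketch below is illustrative only: `tag_kind` and its labels are hypothetical, and the patterns are inferred from the examples given here (v3.5, 3.5.0, 3.5.0-28), not from any published tagging policy.

```shell
# Hypothetical helper: label an image tag by the scheme described above.
tag_kind() {
    case "$1" in
        v[0-9]*)    echo "floating" ;;  # e.g. v3.5: latest image of that minor release
        *-[0-9]*)   echo "pinned" ;;    # e.g. 3.5.0-28: one exact build
        [0-9]*.*.*) echo "release" ;;   # e.g. 3.5.0: alias that may move later
        *)          echo "unknown" ;;
    esac
}
```

A deployment that must not change underneath you would want a tag for which `tag_kind` prints `pinned`; the other two kinds can start resolving to a newer image over time.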
