Bug 1608216 - Should change timeoutSeconds for hawkular-cassandra readiness check to a bigger value
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.10.0
Hardware: Unspecified Unspecified
Priority: medium Severity: medium
Target Milestone: ---
Target Release: 3.11.0
Assigned To: John Sanda
QA Contact: Junqi Zhao
Keywords: Regression
Depends On:
Blocks:
Reported: 2018-07-25 02:40 EDT by Junqi Zhao
Modified: 2018-10-11 03:22 EDT (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-11 03:22:06 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID                              Priority  Status  Summary  Last Updated
Red Hat Product Errata RHBA-2018:2652   None      None    None     2018-10-11 03:22 EDT

Description Junqi Zhao 2018-07-25 02:40:11 EDT
Description of problem:
This bug comes from Bug 1607984. The default timeoutSeconds for the hawkular-cassandra readiness check is 1 second. If the readiness check takes longer than 1 second to return a response, the metrics pods cannot start up. In this environment, each of the nodetool invocations timed below takes about 1.5 seconds.

# oc get pod -n openshift-infra
NAME                            READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-njcbq      0/1       Running   0          1h
hawkular-metrics-642hp          0/1       Running   8          1h
hawkular-metrics-schema-4k4hj   1/1       Running   0          1h
heapster-lmc8m                  0/1       Running   9          1h


# oc rsh hawkular-cassandra-1-njcbq
sh-4.2$ time nodetool status
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.131.0.64  103.11 KB  256          100.0%            df669d60-a338-4057-a4c2-00cf92b6291b  rack1


real    0m1.499s
user    0m2.417s
sys    0m0.187s



sh-4.2$ time nodetool help 
Picked up JAVA_TOOL_OPTIONS: -Duser.home=/home/jboss -Duser.name=jboss
<--snip >

See 'nodetool help <command>' for more information on a specific command.


real    0m1.626s
user    0m1.807s
sys    0m0.133s
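
The two timings above can be compared against the probe timeout with a small sketch. It assumes the `real` values use the `XmY.YYYs` format that bash `time` prints. The helper names `parse_real_seconds` and `exceeds_timeout` are hypothetical, and treating the readiness script as taking about as long as these nodetool calls is an assumption rather than something measured in this report.

```python
import re


def parse_real_seconds(value: str) -> float:
    """Convert a bash `time` duration such as '0m1.499s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def exceeds_timeout(real: str, timeout_seconds: int) -> bool:
    """Return True when the measured wall-clock time is longer than the timeout."""
    return parse_real_seconds(real) > timeout_seconds


# 'real' values copied from the shell session above.
timings = {"nodetool status": "0m1.499s", "nodetool help": "0m1.626s"}

for command, real in timings.items():
    for timeout in (1, 10):  # default timeout vs. the proposed value
        verdict = "exceeds" if exceeds_timeout(real, timeout) else "fits within"
        print(f"{command}: {real} {verdict} timeoutSeconds={timeout}")
```

Both measurements exceed a 1-second timeout and fit comfortably within 10 seconds.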



After raising the timeout in roles/openshift_metrics/templates/hawkular_cassandra_rc.j2 by adding timeoutSeconds: 10 to the readiness probe, metrics works well.

        readinessProbe:
          exec:
            command:
            - "/opt/apache-cassandra/bin/cassandra-docker-ready.sh"
          timeoutSeconds: 10
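
One way to confirm the value on a deployed replication controller is to inspect its JSON, for example as produced by `oc get rc <name> -n openshift-infra -o json`. The following is a minimal sketch under stated assumptions: the function name `readiness_timeouts`, the sample manifest, and the container name in it are hypothetical and trimmed to the relevant fields. The fallback of 1 reflects the Kubernetes default for `timeoutSeconds` when the field is unset.

```python
import json

# Kubernetes applies a 1-second probe timeout when timeoutSeconds is omitted.
DEFAULT_TIMEOUT_SECONDS = 1


def readiness_timeouts(manifest: dict) -> dict:
    """Map each container that has a readinessProbe to its effective timeout."""
    containers = manifest["spec"]["template"]["spec"]["containers"]
    return {
        c["name"]: c["readinessProbe"].get("timeoutSeconds", DEFAULT_TIMEOUT_SECONDS)
        for c in containers
        if "readinessProbe" in c
    }


# Hypothetical, trimmed replication controller manifest for illustration only.
SAMPLE = """
{
  "kind": "ReplicationController",
  "spec": {"template": {"spec": {"containers": [
    {
      "name": "hawkular-cassandra-1",
      "readinessProbe": {
        "exec": {"command": ["/opt/apache-cassandra/bin/cassandra-docker-ready.sh"]},
        "timeoutSeconds": 10
      }
    }
  ]}}}
}
"""

print(readiness_timeouts(json.loads(SAMPLE)))  # {'hawkular-cassandra-1': 10}
```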


Version-Release number of selected component (if applicable):
# rpm -qa | grep openshift-ansible
openshift-ansible-roles-3.10.14-1.git.273.a64b86b.el7.noarch
openshift-ansible-playbooks-3.10.14-1.git.273.a64b86b.el7.noarch
openshift-ansible-3.10.14-1.git.273.a64b86b.el7.noarch
openshift-ansible-docs-3.10.14-1.git.273.a64b86b.el7.noarch

openshift3-metrics-cassandra-v3.10.14-7
metrics-hawkular-metrics-v3.10.14-7
metrics-schema-installer-v3.10.14-7
metrics-heapster-v3.10.14-8


How reproducible:
Always

Steps to Reproduce:
1. Deploy metrics

Actual results:
Metrics pods cannot start up.

Expected results:
Metrics pods start up.

Additional info:
Comment 1 Ruben Vargas Palma 2018-08-03 12:53:56 EDT
I've sent a PR:

https://github.com/openshift/openshift-ansible/pull/9417

It is already merged, so I'll move this to MODIFIED.
Comment 3 Junqi Zhao 2018-08-23 03:26:22 EDT
The issue is fixed: timeoutSeconds for the hawkular-cassandra readiness check is now 10s.

# rpm -qa | grep ansible
ansible-2.6.3-1.el7ae.noarch
openshift-ansible-playbooks-3.11.0-0.20.0.git.0.ec6d8caNone.noarch
openshift-ansible-roles-3.11.0-0.20.0.git.0.ec6d8caNone.noarch
openshift-ansible-3.11.0-0.20.0.git.0.ec6d8caNone.noarch
openshift-ansible-docs-3.11.0-0.20.0.git.0.ec6d8caNone.noarch
Comment 5 errata-xmlrpc 2018-10-11 03:22:06 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
