1361061 – Failed to show metrics on web console due to the clocks between machine running the console and the machine running Hawkular Metrics are out of sync

Bug 1361061 - Failed to show metrics on web console due to the clocks between machine running the console and the machine running Hawkular Metrics are out of sync

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Hawkular
Sub Component:
Version:	3.2.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Samuel Padgett
QA Contact:	chunchen
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-07-28 09:39 UTC by chunchen
Modified:	2017-03-08 18:26 UTC (History)
CC List:	7 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: The web console previously used the client's clock to calculate the start time for displaying metrics. Consequence: If the client's clock was more than one hour faster than the server clock, an error would occur when opening the metrics tab in the web console. Fix: The web console now uses the server time for calculating start and end times for metrics. Result: Metrics display properly even if the client clock is out of sync with the server.
Clone Of:
Environment:
Last Closed:	2016-09-27 09:41:55 UTC
Target Upstream Version:
Embargoed:

Attachments
console screenshot 2 (128.69 KB, image/png) 2016-08-01 02:09 UTC, chunchen	no flags	Details
console screenshot 1 (193.87 KB, image/png) 2016-08-01 02:09 UTC, chunchen	no flags	Details

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2016:1933	0	normal	SHIPPED_LIVE	Red Hat OpenShift Container Platform 3.3 Release Advisory	2016-09-27 13:24:36 UTC

Description chunchen 2016-07-28 09:39:40 UTC

Description of problem:
The web console fails to show metrics. The page responds with:

'{"errorMsg":"Range end must be strictly greater than start"}'

Version-Release number of selected component (if applicable):
openshift v3.2.1.4-1-g1864c8f
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

brew-pulp-docker01.web.qa...com:8888/openshift3/metrics-hawkular-metrics 3.2.1 219e26f45297
brew-pulp-docker01.web.qa...com:8888/openshift3/metrics-cassandra 3.2.1 afeae5fccd3f
brew-pulp-docker01.web.qa...com:8888/openshift3/metrics-heapster 3.2.1 eac7eb4e46c4

How reproducible:
Always

Steps to Reproduce:
1. Login to OpenShift server and use openshift-infra project
oc project openshift-infra

2. Deploy metrics stack
oc create serviceaccount metrics-deployer

oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:openshift-infra:heapster

oc policy add-role-to-user edit system:serviceaccount:openshift-infra:metrics-deployer

oc secrets new metrics-deployer nothing=/dev/null

oc new-app metrics-deployer-template -p IMAGE_PREFIX=brew-pulp-docker01.web.qa...com:8888/openshift3/,IMAGE_VERSION=3.2.1,CASSANDRA_PV_SIZE=10,CASSANDRA_NODES=1,MASTER_URL=https://openshift-221...com:8443,HAWKULAR_METRICS_HOSTNAME=hawkular-metrics.0725-6cj.qe.rhcloud.com,USE_PERSISTENT_STORAGE=false,MODE=deploy

3. Check the pods
oc get pods

4. Try to show the pod metrics on web console

Actual results:
at step 3:
NAME                         READY     STATUS      RESTARTS   AGE
hawkular-cassandra-1-z42ez   1/1       Running     0          1h
hawkular-metrics-8tzy6       1/1       Running     16         1h
heapster-p6oe0               1/1       Running     14         1h
metrics-deployer-437pk       0/1       Completed   0          1h

at step 4: the metrics cannot be shown on the console; please refer to the web console screenshots in the attachments.

Expected results:
Should show metrics on web console

Additional info:

Comment 1 Matt Wringe 2016-07-29 14:37:37 UTC

Where is this error message showing up exactly?

In one of the metric component logs? In the browser's console logs?

There are also no screenshots attached, as mentioned in the bz.

Comment 2 chunchen 2016-08-01 02:09:12 UTC

Created attachment 1186221 [details]
console screenshot  2

Comment 3 chunchen 2016-08-01 02:09:33 UTC

Created attachment 1186222 [details]
console screenshot  1

Comment 4 chunchen 2016-08-01 02:11:06 UTC

Sorry for forgetting to add the attachments; they are attached now.

Comment 5 Matt Wringe 2016-08-02 14:13:21 UTC

Ah, thank you. I think I know the problem now.

The browser is using its own clock to send the 'start' time in that request, and the end time is not specified. When the end value is not specified, Hawkular Metrics uses its own system clock to generate this value.

What is most likely happening here is that the clocks between the machine running the console and the machine running Hawkular Metrics are out of sync and the end value (generated by Hawkular Metrics) is before the start value (generated by the console).

Can you please verify whether the clocks on these systems are indeed off by a large margin?
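
The failure mode described in this comment can be sketched as follows. This is only an illustration in Python, not the actual console or Hawkular Metrics code: `validate_range`, `query`, the one-hour window, and the fixed `SERVER_NOW` value are assumptions made for the example. Only the error string is taken from this report.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed "server time" so the example is deterministic.
SERVER_NOW = datetime(2016, 7, 28, 9, 39, tzinfo=timezone.utc)
# Assumed width of the metrics window requested by the console.
WINDOW = timedelta(hours=1)


def to_ms(moment: datetime) -> int:
    """Convert a datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def validate_range(start_ms: int, end_ms: int) -> None:
    """Stand-in for the server-side check that rejects the request."""
    if end_ms <= start_ms:
        raise ValueError("Range end must be strictly greater than start")


def query(client_skew: timedelta) -> str:
    """Simulate one request: 'start' from the client clock, 'end' from the server."""
    client_now = SERVER_NOW + client_skew
    start_ms = to_ms(client_now - WINDOW)  # computed by the browser
    end_ms = to_ms(SERVER_NOW)             # defaulted by the server
    try:
        validate_range(start_ms, end_ms)
        return "ok"
    except ValueError as err:
        return str(err)


if __name__ == "__main__":
    print(query(timedelta(0)))         # clocks in sync
    print(query(timedelta(hours=2)))   # client clock two hours ahead
```

With the clocks in sync the range is valid. Once the client clock runs ahead of the server by at least the window width, the client-computed start is no longer before the server-generated end, and the request is rejected.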

Comment 6 Matt Wringe 2016-08-02 14:14:14 UTC

The upstream issue for this is https://issues.jboss.org/browse/HWKMETRICS-358

Once that is done, the console can use relative timestamps and this type of issue cannot occur.
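
A rough sketch of why relative timestamps avoid the problem: the client sends only an offset, and the server resolves both ends of the range against its own clock. The function name `resolve_range` and its minute-based parameters are hypothetical and do not reflect the actual Hawkular Metrics request syntax.

```python
MS_PER_MINUTE = 60_000


def resolve_range(server_now_ms: int,
                  start_offset_min: int,
                  end_offset_min: int = 0) -> tuple[int, int]:
    """Resolve relative offsets (minutes before 'now') using only the server clock."""
    start_ms = server_now_ms - start_offset_min * MS_PER_MINUTE
    end_ms = server_now_ms - end_offset_min * MS_PER_MINUTE
    if end_ms <= start_ms:
        raise ValueError("Range end must be strictly greater than start")
    return start_ms, end_ms


if __name__ == "__main__":
    # The client asks for "the last 60 minutes"; its own clock is never consulted.
    print(resolve_range(server_now_ms=1_469_698_740_000, start_offset_min=60))
```

Because the client clock never enters the calculation, any skew between the two machines has no effect on the validity of the range.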

Comment 7 chunchen 2016-08-03 06:41:28 UTC

Yes, the issue can be reproduced when the clocks on these systems are out of sync, even when I tested with metrics images from the V1 registry (brew-pulp-docker01.web.prod...com:8888), so it is not related to the V2 registry. I am removing the keywords from the bug title. Given this root cause, the bug is no longer a test blocker.

Comment 8 Matt Wringe 2016-08-16 21:33:37 UTC

The metrics containers now support relative timestamps.

The only other piece is for the console in OSE 3.3 to be updated to use it. It's already working in Origin with relative timestamps.

Comment 9 Samuel Padgett 2016-08-18 12:51:23 UTC

Should be part of

https://github.com/openshift/ose/commit/29daeae51244ddb205706958023504c014092541

Comment 10 Troy Dawson 2016-08-18 19:20:32 UTC

This has been merged into ose and is in OSE v3.3.0.22 or newer.

Comment 19 chunchen 2016-08-22 04:03:42 UTC

It's fixed; I checked with the latest hawkular 3.3.0 image (70c30be69f7d), so I am marking it as verified.

Comment 21 errata-xmlrpc 2016-09-27 09:41:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933
