Bug 1344935 - "Can not sample data, oVirt Engine is not updating the statistics" shown when dwh_sampling and engine DwhHeartBeatInterval do not match
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: 4.0.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.0.2
Target Release: ---
Assignee: Shirly Radco
QA Contact: Lukas Svaty
URL:
Whiteboard:
Depends On:
Blocks: 1349309
Reported: 2016-06-12 20:43 UTC by mlehrer
Modified: 2020-03-30 01:48 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-12 14:26:47 UTC
oVirt Team: Metrics
Embargoed:
rule-engine: ovirt-4.0.z+
ykaul: exception+
rule-engine: planning_ack+
rule-engine: devel_ack+
lsvaty: testing_ack+



Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 61234 | 0 | 'None' | MERGED | core: Change default value of DwhHeartBeatInterval | 2021-01-07 20:02:58 UTC
oVirt gerrit | 61265 | 0 | 'None' | MERGED | core: Change default value of DwhHeartBeatInterval | 2021-01-07 20:02:56 UTC
oVirt gerrit | 61296 | 0 | 'None' | MERGED | core: Change default value of DwhHeartBeatInterval | 2021-01-07 20:02:58 UTC

Description mlehrer 2016-06-12 20:43:51 UTC
Description of problem:

When the DwhHeartBeatInterval value set with engine-config does not match the dwh sampling interval, the following error appears in /var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log:

OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704

Version-Release number of selected component (if applicable):


How reproducible:
each time

Steps to Reproduce:
1. set dwh_interval
2. leave DwhHeartBeatInterval at its default value
3. start dwh and ovirt-engine

Actual results:
"Can not sample data, oVirt Engine is not updating the statistics" shown in ovirt-engine-dwhd.log

Expected results:
No error message and successful sample
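
To illustrate why mismatched intervals can produce this warning, here is a minimal sketch of an idealised timeline. It assumes heartbeats land exactly at multiples of the heartbeat interval and samples at multiples of the sampling interval, with no jitter; the function name and the example values are hypothetical and are not taken from the dwh code.

```python
def missed_samples(heartbeat_interval: int, sampling_interval: int, horizon: int) -> int:
    """Count sampling ticks that see no new heartbeat since the previous tick.

    Idealised model (an assumption): heartbeats occur at multiples of
    heartbeat_interval, samples at multiples of sampling_interval, both in
    seconds, starting from t = 0.
    """
    missed = 0
    last_seen = 0  # heartbeat timestamp observed by the previous sample
    for t in range(sampling_interval, horizon + 1, sampling_interval):
        # Most recent heartbeat at or before this sampling tick.
        latest = (t // heartbeat_interval) * heartbeat_interval
        if latest <= last_seen:
            missed += 1  # nothing new since the last sample
        last_seen = latest
    return missed


if __name__ == "__main__":
    # Hypothetical values: heartbeat slower than sampling, then faster.
    print(missed_samples(30, 20, 120))
    print(missed_samples(15, 20, 120))
```

In this model a heartbeat interval longer than the sampling interval leaves some samples without a fresh heartbeat, while a shorter one does not. Real timing jitter is not modelled here.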

Additional info:

Comment 4 Lukas Svaty 2016-07-29 08:36:44 UTC
The value was set to 15 (DwhHeartBeatInterval: 15 version: general).

However, dwh aggregation is set to 20 seconds, so we still see error messages:

2016-07-29 10:27:25|D0zvDW|KQjdOI|9jYJcr|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2016-07-29 10:27:25 Statistics sync ended. Duration: 5174 milliseconds 
2016-07-29 10:27:40|D0zvDW|9jYJcr|KQjdOI|19468|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.0|Default||end|success|20004
2016-07-29 10:27:40|s1EU5Z|9jYJcr|KQjdOI|19468|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.0|Default||begin||
2016-07-29 10:27:50|s1EU5Z|KQjdOI|9jYJcr|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
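
To measure how often the warning recurs in a log like the one above, a small script along these lines may help. It is a sketch that assumes each warning line starts with a `YYYY-MM-DD HH:MM:SS` timestamp followed by `|`, as in the excerpt; the function names are hypothetical.

```python
from datetime import datetime

WARNING_TEXT = "Can not sample data"


def warning_times(log_lines):
    """Return the timestamps of lines that contain the sampling warning."""
    times = []
    for line in log_lines:
        if WARNING_TEXT not in line:
            continue
        stamp = line.split("|", 1)[0].strip()
        try:
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
        except ValueError:
            # Line has no leading timestamp (e.g. a pasted fragment); skip it.
            continue
    return times


def gaps_seconds(times):
    """Seconds between consecutive warnings."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    # Path as given in the bug description.
    with open("/var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log") as handle:
        print(gaps_seconds(warning_times(handle)))
```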

Comment 5 Red Hat Bugzilla Rules Engine 2016-07-29 08:36:49 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 6 Yaniv Lavi 2016-07-31 13:18:32 UTC
Should this be ON_QA?

Comment 7 Lukas Svaty 2016-08-10 12:10:47 UTC
after service ovirt-engine restart

2016-08-10 12:06:00|dIHYsy|mv7tbQ|eet6ZB|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704

it is displayed only on service ovirt-engine-dwhd restart

tested in ovirt-engine-dwh-4.0.2-1.el7ev.noarch

How is the target release set to 4.0.2.2 when no such package has been built yet and the bug has been ON_QA since the end of July? Am I missing something here?

Comment 8 Red Hat Bugzilla Rules Engine 2016-08-10 12:10:51 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 9 Lukas Svaty 2016-08-11 10:41:48 UTC
retested this on latest version:
ovirt-engine-dwh-setup-4.0.2-1.el7ev.noarch
ovirt-engine-setup-4.0.2.6-0.1.el7ev.noarch

working, and error message is gone even after dwh daemon restart

needinfo solved via irc

Comment 10 Yaniv Kaul 2016-08-24 10:55:36 UTC
We are seeing it happening in RHEV.TLV setup, which has:
ovirt-engine-setup-4.0.2.7-0.1.el7ev.noarch
ovirt-engine-dwh-setup-4.0.2-1.el7ev.noarch
ovirt-engine-dwh-4.0.2-1.el7ev.noarch

Comment 11 Lukas Svaty 2016-08-24 11:03:20 UTC
Is it possible that update of engine does not restart dwh service, thus when I restarted it manually it reloaded heartbeat -> error is gone?

In this case it would only affect 4.0 minor version upgrade.

Comment 12 Yaniv Kaul 2016-08-24 14:24:03 UTC
(In reply to Lukas Svaty from comment #11)
> Is it possible that update of engine does not restart dwh service, thus when
> I restarted it manually it reloaded heartbeat -> error is gone?
> 
> In this case it would only affect 4.0 minor version upgrade.

We upgraded from RC1 to RC2, but perhaps. More investigation is needed around the RHEV.TLV use case.

Comment 13 Shirly Radco 2017-03-27 12:30:17 UTC
I changed the error sent to the engine log so that it is sent only if the heartbeat has not updated for at least a minute. Bug #1371111.

The error will still be logged to the dwh log, since the dwh does not sample if the heartbeat did not update.
The heartbeat should update every 15 seconds, but for some reason it sometimes takes longer than 20 seconds to update.
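
The behaviour described in this comment can be sketched roughly as follows. This is an illustration only, not the dwh implementation: the function, the tuple fields and the way staleness is measured are assumptions; only the one-minute threshold and the "no sample without a heartbeat update" rule come from the comment.

```python
from datetime import datetime, timedelta
from typing import NamedTuple

# "at least a minute", per the comment above.
ENGINE_LOG_THRESHOLD = timedelta(minutes=1)


class HeartbeatCheck(NamedTuple):
    can_sample: bool   # heartbeat moved since the previous sample
    warn_dwh: bool     # write the warning to the dwh log
    warn_engine: bool  # also send the error to the engine log


def check_heartbeat(last_heartbeat: datetime,
                    previous_heartbeat: datetime,
                    now: datetime) -> HeartbeatCheck:
    """Decide whether to sample and where to log (hypothetical helper)."""
    if last_heartbeat > previous_heartbeat:
        return HeartbeatCheck(True, False, False)
    # Heartbeat did not update: skip the sample and warn in the dwh log.
    # Escalate to the engine log only once it has been stale for a minute.
    stale_for = now - last_heartbeat
    return HeartbeatCheck(False, True, stale_for >= ENGINE_LOG_THRESHOLD)


if __name__ == "__main__":
    t0 = datetime(2017, 3, 27, 12, 0, 0)
    # Heartbeat is 20 s old and unchanged: dwh warning only.
    print(check_heartbeat(t0, t0, t0 + timedelta(seconds=20)))
```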

Comment 14 Muhammad Aidilfitri 2020-03-30 01:40:55 UTC
Hello all,

I have encountered a similar issue in the oVirt 4.4 pre-release version. Below are the logs with debug enabled. Does anyone have any idea about this?


ovirtEngineDbDriverClass|org.postgresql.Driver
ovirtEngineHistoryDbJdbcConnection|jdbc:postgresql://SGOVIRTPSQL01.gebgd.org:5432/ovirt_engine_history?sslfactory=org.postgresql.ssl.NonValidatingFactory
hoursToKeepDaily|0
hoursToKeepHourly|720
ovirtEngineDbPassword|***********
runDeleteTime|3
ovirtEngineDbJdbcConnection|jdbc:postgresql://SGOVIRTPSQL01.gebgd.org:5432/engine?sslfactory=org.postgresql.ssl.NonValidatingFactory
runInterleave|60
limitRows|limit 1000
ovirtEngineHistoryDbUser|ovirt_engine_history
ovirtEngineDbUser|engine
deleteIncrement|10
timeBetweenErrorEvents|300000
hoursToKeepSamples|24
deleteMultiplier|1000
lastErrorSent|2011-07-03 12:46:47.000000
etlVersion|4.4.0
dwhAggregationDebug|true
dwhUuid|7ec3b9e7-fada-4212-a2f5-e73375af27a5
ovirtEngineHistoryDbDriverClass|org.postgresql.Driver
ovirtEngineHistoryDbPassword|***********
2020-03-30 09:23:48|EfvHCb|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:23:50 Statistics sync ended. Duration: 1099 milliseconds
2020-03-30 09:24:48|EfvHCb|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:24:48|EwAWST|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:24:48|EwAWST|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:24:48 Statistics sync ended. Duration: 36 milliseconds
2020-03-30 09:25:48|EwAWST|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:25:48|wEfkgz|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:25:53|wEfkgz|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:25:53 Statistics sync ended. Duration: 5024 milliseconds
2020-03-30 09:26:48|wEfkgz|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:26:48|Rx4YUP|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:26:49 Statistics sync ended. Duration: 574 milliseconds
2020-03-30 09:27:48|Rx4YUP|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:27:48|eQG4D1|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:27:48|eQG4D1|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:27:48 Statistics sync ended. Duration: 22 milliseconds
2020-03-30 09:28:48|eQG4D1|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:28:48|98VLM8|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:28:53|98VLM8|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:28:54 Statistics sync ended. Duration: 5023 milliseconds
2020-03-30 09:29:48|98VLM8|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:29:48|QDgtSy|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:29:49 Statistics sync ended. Duration: 569 milliseconds
2020-03-30 09:30:48|QDgtSy|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:30:48|5yfMco|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:30:49|5yfMco|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:30:49 Statistics sync ended. Duration: 21 milliseconds
2020-03-30 09:31:48|5yfMco|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:31:49|hczJ72|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:31:54|hczJ72|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:31:54 Statistics sync ended. Duration: 5021 milliseconds
2020-03-30 09:32:49|hczJ72|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60002
2020-03-30 09:32:49|7p3Eso|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:32:49 Statistics sync ended. Duration: 589 milliseconds
2020-03-30 09:33:49|7p3Eso|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:33:49|bPtyJQ|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:33:49|bPtyJQ|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:33:49 Statistics sync ended. Duration: 21 milliseconds
2020-03-30 09:34:49|bPtyJQ|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60000
2020-03-30 09:34:49|I58T7d|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:34:54|I58T7d|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:34:54 Statistics sync ended. Duration: 5025 milliseconds
2020-03-30 09:35:49|I58T7d|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:35:49|NuDobu|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:35:49 Statistics sync ended. Duration: 538 milliseconds
2020-03-30 09:36:49|NuDobu|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:36:49|4qzfiX|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:36:49|4qzfiX|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:36:49 Statistics sync ended. Duration: 31 milliseconds
2020-03-30 09:37:49|4qzfiX|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:37:49|cOdq5d|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:37:54|cOdq5d|ZM82yo|r2xagP|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
2020-03-30 09:37:54 Statistics sync ended. Duration: 5019 milliseconds
2020-03-30 09:38:49|cOdq5d|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||end|success|60001
2020-03-30 09:38:49|OZ4QNe|r2xagP|ZM82yo|6531|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|_FvEy8LzqEeCaj-T1n0SCFw|4.4|Default||begin||
2020-03-30 09:38:49 Statistics sync ended. Duration: 536 milliseconds
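
In the log above, the runs that logged the warning report a sync duration of either roughly 20 to 40 ms or roughly 5 s, while runs without it take about 0.5 to 1.1 s. A short script can pair each reported duration with whether the warning appeared in the same run. This is a sketch that assumes the line layout shown in the excerpt (a `|begin||` line opens a run, a `Statistics sync ended` line reports its duration); the function name is hypothetical.

```python
import re

DURATION_RE = re.compile(r"Statistics sync ended\. Duration: (\d+) milliseconds")
WARNING_TEXT = "Can not sample data"


def durations_with_warning_flag(log_lines):
    """Return a list of (duration_ms, warned) tuples, one per sync line."""
    results = []
    warned = False
    for line in log_lines:
        if line.rstrip().endswith("|begin||"):
            warned = False  # a new run starts
        elif WARNING_TEXT in line:
            warned = True
        else:
            match = DURATION_RE.search(line)
            if match:
                results.append((int(match.group(1)), warned))
                warned = False
    return results


if __name__ == "__main__":
    # Path as given in the bug description.
    with open("/var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log") as handle:
        for duration_ms, warned in durations_with_warning_flag(handle):
            print(duration_ms, "warning" if warned else "ok")
```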

