Bug 1430666 - Engine does not update the dwh heartbeat in the specified interval causing dwh error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.1.2
Target Release: 4.1.2
Assignee: Eli Mesika
QA Contact: Lucie Leistnerova
URL:
Whiteboard:
Depends On: 1371111
Blocks: 1373456
Reported: 2017-03-09 09:22 UTC by Shirly Radco
Modified: 2021-05-01 16:47 UTC (History)
CC List: 9 users

Fixed In Version:
Clone Of: 1371111
Environment:
Last Closed: 2017-05-23 08:14:21 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: planning_ack+
mperina: devel_ack+
pstehlik: testing_ack+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 74661 | 0 | master | MERGED | engine: adding transaction bounderies to heart beat logging | 2017-04-19 14:24:52 UTC
oVirt gerrit | 75686 | 0 | ovirt-engine-4.1 | MERGED | engine: adding transaction bounderies to heart beat logging | 2017-04-20 13:58:27 UTC

Description Shirly Radco 2017-03-09 09:22:27 UTC
+++ This bug was initially created as a clone of Bug #1371111 +++

Description of problem:
Engine Heartbeat should update every 15 seconds, but in some cases it may take longer.
If it takes longer than 20 seconds the dwh will not collect the samples.
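
To make the timing in the description concrete, here is a minimal sketch of the freshness check it implies. The 15-second interval and 20-second limit come from the description; the function name and the exact comparison are hypothetical, and the real dwh check may differ in detail.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the bug description.
HEARTBEAT_INTERVAL = timedelta(seconds=15)  # how often the engine should update
STALE_THRESHOLD = timedelta(seconds=20)     # beyond this, dwh skips the sample


def should_collect_sample(last_heartbeat: datetime, now: datetime) -> bool:
    """Return True when the heartbeat is fresh enough for dwh to sample.

    Hypothetical helper for illustration only; not the actual dwh code.
    """
    return now - last_heartbeat <= STALE_THRESHOLD


now = datetime(2017, 3, 9, 9, 22, 27, tzinfo=timezone.utc)
# Heartbeat written on schedule, 15 seconds ago.
on_time = should_collect_sample(now - HEARTBEAT_INTERVAL, now)
# Heartbeat delayed under load, last written 25 seconds ago.
delayed = should_collect_sample(now - timedelta(seconds=25), now)
print(on_time, delayed)
```

With only 5 seconds of slack between the interval and the limit, a modest delay on a loaded engine is enough to lose a sample.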

Version-Release number of selected component (if applicable):
4.0.2

How reproducible:


Steps to Reproduce:
1. Put the engine machine under load, with dwh installed.
2. Check whether data is actually updated in the sampling tables during the day.

Actual results:
DWH heartbeat does not update every 15 seconds. This causes missing samples.

Expected results:
DWH heartbeat should update every 15 seconds.

Additional info:

--- Additional comment from Yaniv Kaul on 2016-09-15 04:56:47 EDT ---

Is this being worked on? Is this on track for 4.0.5?

--- Additional comment from Shirly Radco on 2016-09-18 01:04:28 EDT ---

(In reply to Yaniv Kaul from comment #1)
> Is this being worked on? Is this on track for 4.0.5?

Please see Eli's comments for the bug on engine.
We currently don't see this issue in the rhev-tlv logs.
This seems to be a temporary issue. He asked to add logging to the engine, in order to determine what is responsible for that delay.

--- Additional comment from Yaniv Kaul on 2016-09-18 03:24:38 EDT ---

(In reply to Shirly Radco from comment #2)
> (In reply to Yaniv Kaul from comment #1)
> > Is this being worked on? Is this on track for 4.0.5?
> 
> Please see Eli's comments for the bug on engine.
> We currently don't see this issue in the rhev-tlv logs.
> This seems to be a temporary issue. He asked to add logging to the engine,
> in order to determine what is responsible for that delay.

So if it's not going to 4.0.5 (perhaps the engine additional logs are), please postpone.

--- Additional comment from Oved Ourfali on 2016-10-30 10:36:25 EDT ---

Eli, let's check if we can reproduce and understand why it happens. 
Or suggest workarounds.

--- Additional comment from Eli Mesika on 2016-11-09 05:54:54 EST ---

Can we have an engine log with DEBUG messages attached, so we can check which part of the code is responsible for that delay?
I added DEBUG messages to figure out what's going on in patch https://gerrit.ovirt.org/#/c/64139/

--- Additional comment from Shirly Radco on 2016-11-29 05:55 EST ---



--- Additional comment from Shirly Radco on 2016-11-29 05:59:46 EST ---

Please use this link due to file size.
engine.log: https://drive.google.com/open?id=0B8qzHycX6vljVlg5dVYzMHVGMkk

--- Additional comment from Shirly Radco on 2017-03-08 08:22:04 EST ---



--- Additional comment from Oved Ourfali on 2017-03-08 08:46:18 EST ---

I don't see the added print at all.
There are some DEBUG prints, but only AAA ones.

I sat with Shirly and we agreed we'll implement logic similar to what she described in the description. In parallel, we need debug logs wherever it reproduces.

Comment 5 Lucie Leistnerova 2017-05-12 10:25:41 UTC
If the engine is not able to write to dwh_history_timekeeping, the heartbeat does not end; it waits. No error occurs.

verified in ovirt-engine-4.1.2.1-0.1.el7.noarch
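
For readers unfamiliar with what "transaction boundaries" around the heartbeat write means, here is a rough sketch of the general idea. It is not the merged patch (the engine is Java, and its code is not shown in this report); it uses an in-memory SQLite database, and the column names, the 'heartBeat' row name, and the update_heartbeat helper are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone

# In-memory stand-in for the engine database; column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dwh_history_timekeeping ("
    "var_name TEXT PRIMARY KEY, var_datetime TEXT)"
)
conn.execute("INSERT INTO dwh_history_timekeeping VALUES ('heartBeat', NULL)")
conn.commit()


def update_heartbeat(db: sqlite3.Connection, now: datetime) -> None:
    """Write the heartbeat inside its own transaction (hypothetical helper).

    'with db:' commits on success and rolls back on error, so the new
    timestamp becomes visible as soon as this call returns.
    """
    with db:
        db.execute(
            "UPDATE dwh_history_timekeeping SET var_datetime = ? "
            "WHERE var_name = 'heartBeat'",
            (now.isoformat(),),
        )


stamp = datetime(2017, 5, 12, 10, 25, 41, tzinfo=timezone.utc)
update_heartbeat(conn, stamp)
stored = conn.execute(
    "SELECT var_datetime FROM dwh_history_timekeeping "
    "WHERE var_name = 'heartBeat'"
).fetchone()[0]
print(stored)
```

Scoping the write to its own short transaction means the commit does not depend on other work finishing first, which is one plausible reading of the patch title.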

