Bug 1430666

Summary: Engine does not update the dwh heartbeat in the specified interval causing dwh error
Product: [oVirt] ovirt-engine
Reporter: Shirly Radco <sradco>
Component: BLL.Infra
Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE
QA Contact: Lucie Leistnerova <lleistne>
Severity: medium
Docs Contact:
Priority: medium
Version: ---
CC: bugs, emesika, lveyde, mburman, mperina, oourfali, pstehlik, sradco, trefex
Target Milestone: ovirt-4.1.2
Flags: rule-engine: ovirt-4.1+, rule-engine: planning_ack+, mperina: devel_ack+, pstehlik: testing_ack+
Target Release: 4.1.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1371111
Environment:
Last Closed: 2017-05-23 08:14:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1371111
Bug Blocks: 1373456

Description Shirly Radco 2017-03-09 09:22:27 UTC
+++ This bug was initially created as a clone of Bug #1371111 +++

Description of problem:
Engine Heartbeat should update every 15 seconds, but in some cases it may take longer.
If it takes longer than 20 seconds the dwh will not collect the samples.
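
The timing rule described above can be sketched as a small freshness check. This is only an illustration under stated assumptions, not the actual engine or DWH code: the function name `should_collect_sample` and the constant names are hypothetical, and only the 15-second and 20-second values come from this report.

```python
from datetime import datetime, timedelta

# Values taken from the bug description; the names are hypothetical.
HEARTBEAT_INTERVAL = timedelta(seconds=15)  # how often the engine should update
STALE_AFTER = timedelta(seconds=20)         # beyond this, DWH skips the sample


def should_collect_sample(last_heartbeat: datetime, now: datetime) -> bool:
    """Return True if the last heartbeat is recent enough to collect a sample."""
    return now - last_heartbeat <= STALE_AFTER
```

With these values, a heartbeat that arrives on schedule leaves only a 5-second margin before a sample is skipped, which is why even a short delay on a loaded engine machine can cause missing samples.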

Version-Release number of selected component (if applicable):
4.0.2

How reproducible:


Steps to Reproduce:
1. Put load on the engine machine with dwh installed.
2. Check whether data is actually updated in the sampling tables during the day.
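
One way to check step 2 is to look for gaps between consecutive sample timestamps. The sketch below assumes the timestamps have already been exported from a sampling table by some other means; the function name `find_gaps` and the choice of threshold are hypothetical and not part of oVirt.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap):
    """Return (previous, next) pairs whose spacing exceeds max_gap.

    timestamps: iterable of datetime values, in any order.
    max_gap: timedelta; the largest spacing considered normal (an assumption
             the caller must choose for the table being checked).
    """
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]
```

An empty result means no spacing exceeded the chosen threshold; each returned pair marks a period in which samples appear to be missing.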

Actual results:
The DWH heartbeat is not updated every 15 seconds. This causes missing samples.

Expected results:
The DWH heartbeat should be updated every 15 seconds.

Additional info:

--- Additional comment from Yaniv Kaul on 2016-09-15 04:56:47 EDT ---

Is this being worked on? Is this on track for 4.0.5?

--- Additional comment from Shirly Radco on 2016-09-18 01:04:28 EDT ---

(In reply to Yaniv Kaul from comment #1)
> Is this being worked on? Is this on track for 4.0.5?

Please see Eli's comments for the bug on engine.
We currently don't see this issue in the rhev-tlv logs.
This seems to be a temporary issue. He asked to add logging to the engine in order to determine what is responsible for that delay.

--- Additional comment from Yaniv Kaul on 2016-09-18 03:24:38 EDT ---

(In reply to Shirly Radco from comment #2)
> (In reply to Yaniv Kaul from comment #1)
> > Is this being worked on? Is this on track for 4.0.5?
> 
> Please see Eli's comments for the bug on engine.
> We currently don't see this issue in the rhev-tlv logs.
> This seems to be a temporary issue. He asked to add logging to the engine
> in order to determine what is responsible for that delay.

So if it's not going into 4.0.5 (perhaps the additional engine logs are), please postpone.

--- Additional comment from Oved Ourfali on 2016-10-30 10:36:25 EDT ---

Eli, let's check if we can reproduce and understand why it happens. 
Or suggest workarounds.

--- Additional comment from Eli Mesika on 2016-11-09 05:54:54 EST ---

Can we have an engine log with DEBUG messages attached, so we can check which part of the code is responsible for the delay?
I added DEBUG messages to figure out what's going on in patch https://gerrit.ovirt.org/#/c/64139/

--- Additional comment from Shirly Radco on 2016-11-29 05:55 EST ---



--- Additional comment from Shirly Radco on 2016-11-29 05:59:46 EST ---

Please use this link due to file size.
engine.log: https://drive.google.com/open?id=0B8qzHycX6vljVlg5dVYzMHVGMkk

--- Additional comment from Shirly Radco on 2017-03-08 08:22:04 EST ---



--- Additional comment from Oved Ourfali on 2017-03-08 08:46:18 EST ---

I don't see the added print at all.
There are some DEBUG prints, but only AAA ones.

I sat with Shirly and we agreed to implement logic similar to what she described in the description. In parallel, we need debug logs from wherever it reproduces.

Comment 5 Lucie Leistnerova 2017-05-12 10:25:41 UTC
If the engine is not able to write to dwh_history_timekeeping, the heartbeat does not end; it waits. No error occurs.

Verified in ovirt-engine-4.1.2.1-0.1.el7.noarch