+++ This bug was initially created as a clone of Bug #1371111 +++

Description of problem:
Engine Heartbeat should update every 15 seconds, but in some cases it may take longer than 20 seconds. If it takes longer than 20 seconds, the DWH will alert "Can not sample data, oVirt Engine is not updating the statistics".

Version-Release number of selected component (if applicable):
4.0.2

How reproducible:

Steps to Reproduce:
1. Put load on the engine machine that has DWH installed.

Actual results:
Heartbeat does not update every 15 seconds and can take longer than 20 seconds.

Expected results:
Heartbeat should update every 15 seconds.

Additional info:

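The two numbers in the description leave only 5 seconds of slack between the expected update interval and the alert. A minimal sketch of that relationship follows. The class and method names are hypothetical and this is not the actual DWH check; only the 15-second and 20-second values come from the report.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; the real DWH sampling code is not shown in this bug.
final class HeartbeatStalenessCheck {
    // Both values are taken from the bug description.
    static final Duration EXPECTED_INTERVAL = Duration.ofSeconds(15);
    static final Duration ALERT_THRESHOLD = Duration.ofSeconds(20);

    // True when the last heartbeat is more than 20 seconds older than "now".
    static boolean isStale(Instant lastHeartbeat, Instant now) {
        return Duration.between(lastHeartbeat, now).compareTo(ALERT_THRESHOLD) > 0;
    }

    // How much delay a single heartbeat can absorb before the alert fires.
    static Duration slack() {
        return ALERT_THRESHOLD.minus(EXPECTED_INTERVAL);
    }

    public static void main(String[] args) {
        Instant last = Instant.parse("2016-10-13T10:47:18Z");
        System.out.println("on time stale? " + isStale(last, last.plusSeconds(15)));
        System.out.println("delayed stale? " + isStale(last, last.plusSeconds(21)));
        System.out.println("slack seconds: " + slack().getSeconds());
    }
}
```

Under these assumptions, any scheduler or database delay of more than 5 seconds on a single update would be enough to trigger the alert.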
This is a normal scheduler job in DwhHeartBeat.java.
I checked the engine log in rhev-tlv today and did not find this issue.
Anyway, there are two options here: either it is a scheduler issue or it is a DB issue.
I recommend adding some logging to determine which of the two is responsible for the delay.

The code is:

    public void engineIsRunningNotification() {
        try {
            // TODO : add logging here
            heartBeatVar.setDateTime(new Date());
            dwhHistoryTimekeepingDao.save(heartBeatVar);
            // TODO : add logging here
        } catch (Exception ex) {
            log.error("Error updating DWH Heart Beat: {}", ex.getMessage());
            log.debug("Exception", ex);
        }
    }

See my TODO comments inside the code. They will at least let us track the log and see the invocation time as well as the time the change was saved to the database.

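One way the two TODO points could be filled in is sketched below. It is an assumption-laden illustration, not the patch that was merged: `HeartbeatStore` is a hypothetical stand-in for `dwhHistoryTimekeepingDao`, `java.util.logging` replaces the engine's logger so the example is self-contained, and the elapsed-time figure in the message is my addition.

```java
import java.util.Date;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; names other than engineIsRunningNotification are not from the engine code.
final class TimedHeartbeat {
    // Stand-in for the DAO that persists the heartbeat timestamp.
    interface HeartbeatStore {
        void save(Date timestamp) throws Exception;
    }

    private static final Logger log = Logger.getLogger(TimedHeartbeat.class.getName());
    private final HeartbeatStore store;

    TimedHeartbeat(HeartbeatStore store) {
        this.store = store;
    }

    // Returns the time spent saving, in milliseconds, or -1 if the save failed.
    long engineIsRunningNotification() {
        long startNanos = System.nanoTime();
        // Logged at invocation: a late entry here points at the scheduler.
        log.fine("Heart Beat - Start");
        try {
            store.save(new Date());
            long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;
            // Logged after the save: a large gap from Start points at the database.
            log.fine("Heart Beat - End (save took " + elapsedMs + " ms)");
            return elapsedMs;
        } catch (Exception ex) {
            log.log(Level.SEVERE, "Error updating Heart Beat: " + ex.getMessage());
            log.log(Level.FINE, "Exception", ex);
            return -1;
        }
    }
}
```

With both messages in place, the gap between consecutive Start entries measures scheduler delay, and the gap between a Start and its End measures the database write.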
Reducing severity, as it will also be treated via Bug #1371111. I agree with Eli that logging will indeed help analyze the issue here.
(In reply to Eli Mesika from comment #1)
> This is a normal scheduler job in DwhHeartBeat.java
> I checked the engine log in rhev-tlv today and did not found this issue

It happens several times a day in RHEV.TLV.

We have added additional logging to find out what causes the issue.
(In reply to Martin Perina from comment #4)
> We have added additional logging to find out what causes the issue/

Please note that the new log messages are DEBUG messages, so you have to turn on DEBUG logging mode to see them.

Verified in ovirt-engine-4.0.5-0.1.el7ev.noarch

2016-10-13 10:47:18,627 INFO [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (ServerService Thread Pool -- 67) [] Initializing DWH Heart Beat
2016-10-13 10:47:18,628 INFO [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (ServerService Thread Pool -- 67) [] DWH Heart Beat initialized
2016-10-13 10:47:18,628 INFO [org.ovirt.engine.core.bll.InitBackendServicesOnStartupBean] (ServerService Thread Pool -- 67) [] Start org.ovirt.engine.core.bll.dwh.DwhHeartBeat@31cb7143
2016-10-13 10:47:18,650 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler4) [] DWH Heart Beat - Start
2016-10-13 10:47:18,696 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler4) [] DWH Heart Beat - End
2016-10-13 10:47:33,697 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler8) [] DWH Heart Beat - Start
2016-10-13 10:47:33,703 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler8) [] DWH Heart Beat - End
2016-10-13 10:47:48,705 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler2) [] DWH Heart Beat - Start
2016-10-13 10:47:48,712 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler2) [] DWH Heart Beat - End
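For longer logs, the gaps between consecutive "DWH Heart Beat - Start" entries can be computed rather than read by eye. The following is a rough sketch, assuming each entry sits on its own line and begins with the 23-character timestamp seen above; the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for reading engine.log excerpts; not part of ovirt-engine.
final class HeartbeatIntervals {
    // Matches timestamps such as "2016-10-13 10:47:18,650".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");
    private static final int TS_LENGTH = 23;
    private static final String START_MARKER = "DWH Heart Beat - Start";

    // Returns the time between each pair of consecutive Start entries, in log order.
    static List<Duration> startIntervals(List<String> logLines) {
        List<Duration> intervals = new ArrayList<>();
        LocalDateTime previous = null;
        for (String line : logLines) {
            if (line.length() < TS_LENGTH || !line.contains(START_MARKER)) {
                continue; // not a heartbeat Start entry
            }
            LocalDateTime current = LocalDateTime.parse(line.substring(0, TS_LENGTH), TS);
            if (previous != null) {
                intervals.add(Duration.between(previous, current));
            }
            previous = current;
        }
        return intervals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "2016-10-13 10:47:18,650 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler4) [] DWH Heart Beat - Start",
                "2016-10-13 10:47:33,697 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler8) [] DWH Heart Beat - Start",
                "2016-10-13 10:47:48,705 DEBUG [org.ovirt.engine.core.bll.dwh.DwhHeartBeat] (DefaultQuartzScheduler2) [] DWH Heart Beat - Start");
        for (Duration d : startIntervals(lines)) {
            System.out.println(d.toMillis() + " ms");
        }
    }
}
```

For the three Start entries in the log above, the gaps are 15.047 s and 15.008 s, both well under the 20-second alert threshold.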