Created attachment 1292606 [details]
Firefox

Description of problem:
On my F25 laptop, with Wayland or Gnome Classic, with Firefox (54.0 (64-bit)) or Chrome (60.0.3112.40 (Official Build) beta (64-bit)) I can't see the bottom squares for utilization.

Version-Release number of selected component (if applicable):
4.1.2.1-0.1.el7

How reproducible:
Always, I'm testing against internal RHEV.TLV setup.
Also, there is no status information for the clusters on the status cards. The different statuses should all sum to the total count of objects in the card's title. The dashboard_data JSON pull does not include status info for clusters, and the heatMapData is empty. It looks like the problem is in the DashboardDataServlet/DB tier.
My mistake on the N/A on the cluster status card - that is expected. The heat maps are not being populated because no data is being sent from the server. No errors are happening in the SQL queries used for the heatmap data. I looked on the TLV server and this is in the dwhd log (/var/log/ovirt-engine-dwh/ovirt-engine-dwhd.log):

2017-06-29 05:51:15|Mu5B5g|QaRG6l|mHAkli|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704

The message repeats about once a minute from the beginning of the log file on 2017-06-11. I lack the credentials to triage the RHEV.TLV setup further.
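For anyone who wants to confirm how long the warning has been repeating, here is a rough Python sketch that pulls the timestamps of that message out of the dwhd log. It only assumes what the single line above shows: the timestamp is the text before the first `|` and has the form `YYYY-MM-DD HH:MM:SS`. The function name is made up for this example and is not part of ovirt-engine-dwh.

```python
from datetime import datetime

# Text taken from the warning quoted above.
STALE_STATS_MARKER = "oVirt Engine is not updating the statistics"


def stale_stats_timestamps(lines):
    """Return the timestamps of all 'not updating the statistics' warnings.

    Assumption: the timestamp is the first '|'-separated field, as in the
    one log line quoted in this bug. Lines that do not match are skipped.
    """
    stamps = []
    for line in lines:
        if STALE_STATS_MARKER not in line:
            continue
        first_field = line.split("|", 1)[0].strip()
        try:
            stamps.append(datetime.strptime(first_field, "%Y-%m-%d %H:%M:%S"))
        except ValueError:
            # Not a timestamp in the expected format; ignore the line.
            continue
    return stamps
```

Passing an open log file to `stale_stats_timestamps` and looking at the first and last entries of the result gives the period over which the engine statistics were reported as stale.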
Hourly aggregation stopped at 2017-06-13 18:00:00+03. I see that dwh was restarted at that time. The log errors indicate that there are issues with the engine db connection. The heartbeat does not update every 15 seconds as required. I restarted the service and will check whether the hourly job starts aggregating again.
Moving to metrics (is there no DWH oVirt team any more?) and assigning to Shirly. Shirly, please decide whether to close this, or whether there is an issue that requires investigating.
Is that on track to 4.1.5?
(In reply to Yaniv Kaul from comment #5)
> Is that on track to 4.1.5?

Is that on track for 4.1.6?
(In reply to Yaniv Kaul from comment #6)
> Is that on track for 4.1.6?

Ping?
(In reply to Yaniv Kaul from comment #7)
> Ping?

Moved to 4.1.7...
(In reply to Yaniv Kaul from comment #8)
> Moved to 4.1.7...

This bug recurs once in a while for customers. The lastHourAgg timestamp is set to an hour that is a few hours ahead of the current time. This causes the dwh to try to aggregate hours that don't yet have samples, and the daily aggregation to aggregate over empty hours.

I could not locate the issue. I know that in rhev-tlv there was a power outage that caused this. I tried to reproduce it but could not.

I can try to create a workaround by comparing the timestamp we plan to write with the current time before updating the db, and not updating if the timestamp is not before the current hour.

Please let me know if this is acceptable.
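To make the proposed check concrete, here is a minimal Python sketch of the comparison. It is an illustration only, not the actual dwh code: the function names and the idea of passing in the currently stored value are hypothetical, and only the rule itself (do not update unless the new timestamp is before the current hour) comes from the proposal.

```python
from datetime import datetime, timezone


def start_of_hour(ts):
    """Truncate a datetime to the beginning of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def guarded_last_hour_agg(candidate, stored, now):
    """Return the value that should be kept as lastHourAgg.

    The candidate is accepted only when it lies strictly before the start
    of the current hour. Otherwise the stored value is returned unchanged,
    so the aggregation marker never moves into an hour that has no
    complete set of samples yet.
    """
    if candidate < start_of_hour(now):
        return candidate
    return stored
```

With `now` at 18:20, a candidate of 17:00 would be written, while candidates of 18:00 or 21:00 would be rejected and the stored value kept. The cost is one comparison per update, so it should not add measurable overhead by itself.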
(In reply to Shirly Radco from comment #9)
> I can try to create a workaround for it by comparing the timestamp we plan
> to update to current time before updating the db and not update if it is not
> before the current hour.
>
> Please let me know if this is acceptable.

Yes, unless it causes a major performance issue.
Verified in:
ovirt-engine-4.1.7.4-0.1.el7.noarch
ovirt-engine-dwh-4.1.8-1.el7ev.noarch

I used a freshly installed engine with a host, storage and a VM. After an hour, the utilization heatmap blocks appeared. See the attached screenshot.
Created attachment 1343241 [details] screen: utilization heatmap blocks visible