Bug 1373460 - Disk I/O Metrics empty in RHEV
Summary: Disk I/O Metrics empty in RHEV
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: C&U Capacity and Utilization
Version: 5.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: GA
Target Release: 5.8.0
Assignee: Boriso
QA Contact: Tasos Papaioannou
URL:
Whiteboard: rhevm
Duplicates: 1322094
Depends On:
Blocks: 1404391
Reported: 2016-09-06 10:52 UTC by Victor Estival
Modified: 2019-07-23 15:30 UTC (History)

Fixed In Version: 5.8.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1404391 (view as bug list)
Environment:
Last Closed: 2017-06-12 17:35:03 UTC
Category: ---
Cloudforms Team: RHEVM
Target Upstream Version:
Embargoed:


Attachments
Screenshots to show the problem (915.40 KB, application/zip)
2016-09-06 10:52 UTC, Victor Estival
overt-engine-dwhd.log (65.60 KB, text/plain)
2016-09-14 06:51 UTC, Wolfram Richter
ovirt-engine-dwhd.log after restart and running for a day (65.67 KB, text/plain)
2016-09-14 19:56 UTC, Wolfram Richter
New CFME screenshot showing missing CPU data (469.97 KB, image/png)
2016-09-15 08:27 UTC, Wolfram Richter
Screenshot of dashboard showing CPU consumption data (413.55 KB, image/png)
2016-09-15 13:50 UTC, Wolfram Richter
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png (1.59 MB, image/png)
2016-11-03 15:16 UTC, Ilanit Stein
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png (1.60 MB, image/png)
2016-11-03 15:17 UTC, Ilanit Stein
Screenshot_CFME-5.7.0.9_RHV-4.0.png (116.03 KB, image/png)
2016-11-03 15:21 UTC, Ilanit Stein
Screenshot_RHV-3.6.8.png (151.33 KB, image/png)
2016-11-03 15:22 UTC, Ilanit Stein
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png (115.08 KB, image/png)
2016-11-03 15:22 UTC, Ilanit Stein
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png (102.18 KB, image/png)
2016-11-03 15:23 UTC, Ilanit Stein

Description Victor Estival 2016-09-06 10:52:18 UTC
Created attachment 1198171 [details]
Screenshots to show the problem

Description of problem: 
After successfully adding RHEV as a provider in CloudForms, no data is shown for CPU or Disk I/O, even though RHEV itself detects the load.

We tried intervals of 1 day and of 1 hour; in both cases the graph line stays flat.

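We generated the load in the VM with this script:
#!/bin/bash
# e.g. for 3 processors
for i in 1 2 3 ; do
  # Busy-loop for 10 minutes; '&' starts one background worker per processor
  perl -e '$z=time()+(10*60); while (time()<$z) { $j++; $j *= 1.1 for (1..9999); }' &
done
# Block until all background workers have finished
wait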

Version-Release number of selected component (if applicable):
RHEV version 3.6.6.2-0.1.el6, CFME version 5.6.1.2


How reproducible:
Add a RHEV provider to CFME and configure C&U


Steps to Reproduce:
1.
2.
3.

Actual results:
Neither CPU nor Disk activity is shown in CFME


Expected results: 
Activity shown in CFME


Additional info:

Comment 2 Shirly Radco 2016-09-11 05:10:38 UTC
Hi,

Please attach ovirt-engine-dwhd.log.
Please check that the guest agent is running on the hosts,
and also check whether the statistics tables (samples, hourly and daily) in the ovirt_engine_history database contain up-to-date data.

Comment 3 Wolfram Richter 2016-09-14 06:51:05 UTC
Created attachment 1200734 [details]
overt-engine-dwhd.log

I'm not the original reporter, but have access to the same environment (xxx.hailstorm3.coe.muc.redhat.com).

Comment 4 Shirly Radco 2016-09-14 07:11:13 UTC
I see I/O errors and "Broken pipe" errors. The DWH cannot connect to the engine DB to collect data.
Please try restarting the DWH.
If the errors persist, you will need to check the configuration for connecting to the engine DB.

Comment 5 Wolfram Richter 2016-09-14 07:58:25 UTC
I restarted ovirt-engine-dwhd on the rhevm. So far I see no more exceptions in the log, but also no C&U CPU data in CloudForms yet. I will monitor this for a couple of hours and post my findings.

What I find odd is that the memory and network IO graphs are actually available. If the connection to the engine DB were the problem, wouldn't that also affect these graphs?

Comment 6 Shirly Radco 2016-09-14 08:28:17 UTC
(In reply to Wolfram Richter from comment #5)
> I restarted the ovirt-engine-dwhd on the rhevm. So far I can see no more
> exceptions in the log, but also not C&U CPU data in CloudForms yet. I will
> monitor this for a couple of hours and post my findings.
> 
> What I find odd is that the memory and network IO graphs are actually
> available - if the connection to the engine DB would be problematic,
> wouldn't that also affect these graphs?

Please check whether the data in the daily and hourly tables of the ovirt_engine_history database (vm_daily_history, vm_hourly_history) is up to date:
Hourly - until the hour before last
Daily - until the day before last.

Please confirm that the guest agent is running on all of the VMs.
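
The freshness rule above (hourly data up to the hour before last, daily data up to the day before last) can be expressed as a small check. This is only a sketch in Python: the function names are hypothetical and not part of CFME or the DWH, and it assumes the newest history_datetime value has already been read from vm_hourly_history or vm_daily_history.

```python
from datetime import date, datetime, timedelta, timezone


def hourly_is_current(latest: datetime, now: datetime) -> bool:
    """True if the newest hourly row reaches the hour before last."""
    # Start of the hour that is currently in progress.
    current_hour = now.replace(minute=0, second=0, microsecond=0)
    # The last complete hour starts 1h earlier, the one before it 2h earlier.
    return latest >= current_hour - timedelta(hours=2)


def daily_is_current(latest_day: date, now: datetime) -> bool:
    """True if the newest daily row reaches the day before last."""
    return latest_day >= now.date() - timedelta(days=2)


if __name__ == "__main__":
    # Illustrative values only: a check run at 21:42 local time (UTC+2).
    tz = timezone(timedelta(hours=2))
    now = datetime(2016, 9, 14, 21, 42, tzinfo=tz)
    print(hourly_is_current(datetime(2016, 9, 14, 19, 0, tzinfo=tz), now))
    print(daily_is_current(date(2016, 9, 12), now))
```

If either check fails, the aggregation in the DWH is likely lagging, which would point at the DWH side rather than at CFME.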

Comment 7 Wolfram Richter 2016-09-14 19:42:46 UTC
The dwh values look good to me:

ovirt_engine_history=# select history_datetime,vm_id,cpu_usage_percent,max_cpu_usage from vm_hourly_history order by history_datetime desc limit 10;
    history_datetime    |                vm_id                 | cpu_usage_percent | max_cpu_usage
------------------------+--------------------------------------+-------------------+---------------
 2016-09-14 19:00:00+02 | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 5 |            31
 2016-09-14 19:00:00+02 | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-14 19:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            88
 2016-09-14 18:00:00+02 | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-14 18:00:00+02 | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 5 |            22
 2016-09-14 18:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                46 |            94
 2016-09-14 17:00:00+02 | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-14 17:00:00+02 | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 6 |            34
 2016-09-14 17:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                52 |            99
 2016-09-14 16:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                63 |            98
(10 rows)

ovirt_engine_history=# select history_datetime,vm_id,cpu_usage_percent,max_cpu_usage from vm_daily_history order by history_datetime desc limit 10;
 history_datetime |                vm_id                 | cpu_usage_percent | max_cpu_usage
------------------+--------------------------------------+-------------------+---------------
 2016-09-12       | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-12       | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 4 |             7
 2016-09-12       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            66
 2016-09-11       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                50 |            66
 2016-09-11       | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-11       | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 4 |             5
 2016-09-10       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            65
 2016-09-10       | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-10       | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 4 |             5
 2016-09-09       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            65
(10 rows)

ovirt_engine_history=#


The guest agent is running:

Wolframs-MBP-6:ansible wolfram$ ssh root.coe.muc.redhat.com systemctl status ovirt-guest-agent
root.coe.muc.redhat.com's password:
/etc/profile.d/lang.sh: line 19: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
● ovirt-guest-agent.service - oVirt Guest Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-guest-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-08-30 15:22:16 CEST; 2 weeks 1 days ago
 Main PID: 951 (python)
   CGroup: /system.slice/ovirt-guest-agent.service
           └─951 /usr/bin/python /usr/share/ovirt-guest-agent/ovirt-guest-agent.py

Aug 30 15:22:15 localhost.localdomain systemd[1]: Starting oVirt Guest Agent...
Aug 30 15:22:16 localhost.localdomain systemd[1]: Started oVirt Guest Agent.
Wolframs-MBP-6:ansible wolfram$

Comment 8 Wolfram Richter 2016-09-14 19:56:35 UTC
Created attachment 1200963 [details]
ovirt-engine-dwhd.log after restart and running for a day

No more exceptions are being logged

Comment 9 Shirly Radco 2016-09-15 07:43:33 UTC
(In reply to Wolfram Richter from comment #8)
> Created attachment 1200963 [details]
> ovirt-engine-dwhd.log after restart and running for a day
> 
> No more exceptions are being logged

Are CPU and Disk activity still missing in CFME?

Comment 10 Wolfram Richter 2016-09-15 08:27:11 UTC
CPU and Disk IO data are still missing.

Comment 11 Wolfram Richter 2016-09-15 08:27:58 UTC
Created attachment 1201159 [details]
New CFME screenshot showing missing CPU data

Comment 12 Wolfram Richter 2016-09-15 13:50:06 UTC
Created attachment 1201247 [details]
Screenshot of dashboard showing CPU consumption data

I was surprised to see that the RHEV VMs are actually ranked on the Dashboard. This indicates to me that CF is in fact able to see the CPU consumption data, although it is not displayed when viewing the utilization data.

Comment 13 Shirly Radco 2016-09-18 12:31:56 UTC
Can you please see who can check why the data is not collected from RHEV?

Comment 14 Victor Estival 2016-09-27 11:34:03 UTC
So I guess the problem is that the data is collected but not shown.

This environment is inside the Red Hat VPN. Please tell us to whom we should send the IPs, URLs, users and passwords.

Comment 15 Victor Estival 2016-10-07 19:16:01 UTC
Hello

Do you have any update?

Comment 16 Oved Ourfali 2016-10-31 10:01:51 UTC
Shirly, can you take a look at it?
I guess Boris can assist.

Comment 17 Ilanit Stein 2016-11-03 15:16:38 UTC
Created attachment 1217052 [details]
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png

Comment 18 Ilanit Stein 2016-11-03 15:17:34 UTC
Created attachment 1217054 [details]
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png

Comment 19 Ilanit Stein 2016-11-03 15:21:48 UTC
Created attachment 1217057 [details]
Screenshot_CFME-5.7.0.9_RHV-4.0.png

Comment 20 Ilanit Stein 2016-11-03 15:22:14 UTC
Created attachment 1217058 [details]
Screenshot_RHV-3.6.8.png

Comment 21 Ilanit Stein 2016-11-03 15:22:40 UTC
Created attachment 1217059 [details]
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png

Comment 22 Ilanit Stein 2016-11-03 15:23:10 UTC
Created attachment 1217060 [details]
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png

Comment 23 Ilanit Stein 2016-11-03 15:25:31 UTC
Tested the versions below.
On each, I observed a RHEL-7.2 VM while running the load script
from the bug description on that VM:

1. CFME-5.7.0.9-1 & RHV-4.0.4
Attachment "Screenshot_CFME-5.7.0.9_RHV-4.0.png"

All graphs, CPU, Memory, Disk IO, Network IO seem to reflect consumption changes.

2. CFME-5.6.2.2-1 & RHV-3.6.8-0.1
Attachment "Screenshot_CFME-5.6.2.2_RHV-3.6.8.png"
Attachment "Screenshot_RHV-3.6.8.png"

CPU, Memory, Network IO seem to reflect consumption changes.
Disk IO graph was constantly 0.
 
3. CFME-5.7.0.9-1 & RHV-3.6.8-0.1
Attachment "Screenshot_CFME-5.7.0.9_RHV-3.6.8.png"

CPU, Memory, Network IO seem to reflect consumption changes.
Disk IO graph was constantly 0.

Comment 25 CFME Bot 2016-11-28 16:40:32 UTC
New commit detected on ManageIQ/ovirt_metrics/master:
https://github.com/ManageIQ/ovirt_metrics/commit/bbed99cfdbff383dc1caa28ced7ba8e75a633663

commit bbed99cfdbff383dc1caa28ced7ba8e75a633663
Author:     borod108 <bodnopoz>
AuthorDate: Tue Nov 22 16:02:43 2016 +0200
Commit:     borod108 <bodnopoz>
CommitDate: Thu Nov 24 11:29:05 2016 +0200

    Get the disks_ids in a way that would work for 3.6
    
    When doing query_vm_disk_realtime_metrics use vm_device_history instead
    of disks_vm_map to get the disks ids since this table has the right data
    in version 3.6 of Ovirt DWH and in version 4.0
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1373460

 lib/ovirt_metrics.rb                          |  9 +++++--
 lib/ovirt_metrics/models/vm_device_history.rb | 10 ++++++++
 spec/ovirt_metrics_spec.rb                    | 35 +++++++++++++++++++++++----
 3 files changed, 47 insertions(+), 7 deletions(-)
 create mode 100644 lib/ovirt_metrics/models/vm_device_history.rb
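
To illustrate the idea in the commit message, here is a minimal sketch of looking up a VM's disk ids from vm_device_history. It is not the gem's code: ovirt_metrics is written in Ruby against the PostgreSQL history database, whereas this sketch uses Python with an in-memory SQLite table as a stand-in. The column names (vm_id, device_id, type), the 'disk' type value and the helper names are assumptions made for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the vm_device_history table (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_device_history (vm_id TEXT, device_id TEXT, type TEXT)"
    )
    conn.executemany(
        "INSERT INTO vm_device_history VALUES (?, ?, ?)",
        [
            ("vm-1", "disk-a", "disk"),
            ("vm-1", "disk-b", "disk"),
            ("vm-1", "nic-a", "interface"),  # non-disk device, must be skipped
            ("vm-2", "disk-c", "disk"),
        ],
    )
    return conn


def disk_ids_for_vm(conn: sqlite3.Connection, vm_id: str) -> list:
    """Return the ids of all disk devices recorded for the given VM."""
    rows = conn.execute(
        "SELECT DISTINCT device_id FROM vm_device_history "
        "WHERE vm_id = ? AND type = 'disk' ORDER BY device_id",
        (vm_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(disk_ids_for_vm(build_demo_db(), "vm-1"))
```

The resulting ids would then be used to select the per-disk realtime samples. As the commit message states, the reason for the change is that vm_device_history carries the right data in both DWH 3.6 and 4.0, while disks_vm_map does not in 3.6.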

Comment 27 Dave Johnson 2016-12-06 16:52:27 UTC
Please assess the impact of this issue and update the severity accordingly.  Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity for a reminder on each severity's definition.

Comment 28 Satoe Imaishi 2016-12-13 17:45:07 UTC
ovirt_metrics version bump PR: https://github.com/ManageIQ/manageiq/pull/13148

Comment 30 Ilanit Stein 2017-02-07 12:17:04 UTC
*** Bug 1322094 has been marked as a duplicate of this bug. ***

Comment 31 Tasos Papaioannou 2017-05-04 15:58:39 UTC
Verified on 5.8.0.13-rc2 for RHEV 3.6 and RHV 4.1.

Comment 32 CFME Bot 2019-07-23 15:30:29 UTC
New commit detected on ManageIQ/ovirt_metrics/master:

https://github.com/ManageIQ/ovirt_metrics/commit/ca8395331efc3251f77458c1c1118a5e8916016d
commit ca8395331efc3251f77458c1c1118a5e8916016d
Author:     Roberto Ciatti <gekorob.github.com>
AuthorDate: Mon Jul 22 12:30:54 2019 -0400
Commit:     Roberto Ciatti <gekorob.github.com>
CommitDate: Mon Jul 22 12:30:54 2019 -0400

    Remove test and assignment of legacy adapter

    The legacy adapter layer is causing connection endpoint validation
    failure for newer postgresql versions.

    This should solve:
    https://bugzilla.redhat.com/show_bug.cgi?id=1373460

 lib/ovirt_metrics.rb | 9 +-
 1 file changed, 1 insertion(+), 8 deletions(-)

