Bug 1373460

Summary: Disk I/O Metrics empty in RHEV
Product: Red Hat CloudForms Management Engine
Reporter: Victor Estival <vestival>
Component: C&U Capacity and Utilization
Assignee: Boriso <bodnopoz>
Status: CLOSED CURRENTRELEASE
QA Contact: Tasos Papaioannou <tpapaioa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 5.6.0
CC: akrzos, cpelland, gblomqui, istein, jhardy, obarenbo, simaishi, sradco, tpapaioa, vestival, wrichter
Target Milestone: GA
Keywords: TestOnly
Target Release: 5.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: rhevm
Fixed In Version: 5.8.0.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1404391
Environment:
Last Closed: 2017-06-12 17:35:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: RHEVM
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1404391
Attachments:
Description                                                Flags
Screenshots to show the problem                            none
overt-engine-dwhd.log                                      none
ovirt-engine-dwhd.log after restart and running for a day  none
New CFME screenshot showing missing CPU data               none
Screenshot of dashboard showing CPU consumption data       none
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png                      none
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png                      none
Screenshot_CFME-5.7.0.9_RHV-4.0.png                        none
Screenshot_RHV-3.6.8.png                                   none
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png                      none
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png                      none

Description Victor Estival 2016-09-06 10:52:18 UTC
Created attachment 1198171 [details]
Screenshots to show the problem

Description of problem: 
After successfully adding RHEV as a provider in CloudForms, no information is shown for CPU or Disk I/O, although RHEV itself detects the load.

We tried intervals of 1 day and 1 hour, and the line remains flat in both cases.

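We generated the load in the VM with this script:
#!/bin/bash
# e.g. for 3 processors: start one busy loop per CPU in the background
for i in 1 2 3 ; do
perl -e '$z=time()+(10*60); while (time()<$z) { $j++; $j *= 1.1 for (1..9999); }' &
done
# block until all background loops have finished (10 minutes each)
wait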

Version-Release number of selected component (if applicable):
RHEV version 3.6.6.2-0.1.el6, CFME version 5.6.1.2


How reproducible:
Add a RHEV provider to CFME and configure C&U


Steps to Reproduce:
1.
2.
3.

Actual results:
Neither CPU nor Disk activity is shown in CFME


Expected results: 
Activity shown in CFME


Additional info:

Comment 2 Shirly Radco 2016-09-11 05:10:38 UTC
Hi,

Please attach ovirt-engine-dwhd.log.
Please check that the guest agent is running on the hosts, 
and also check whether the statistics tables (samples, hourly and daily) in the ovirt_engine_history database contain up-to-date data.

Comment 3 Wolfram Richter 2016-09-14 06:51:05 UTC
Created attachment 1200734 [details]
overt-engine-dwhd.log

I'm not the original reporter, but have access to the same environment (xxx.hailstorm3.coe.muc.redhat.com).

Comment 4 Shirly Radco 2016-09-14 07:11:13 UTC
I see I/O errors and "Broken pipe". The dwh is unable to connect to the engine db to collect data.
Please try restarting dwh.
If errors persist you will need to check configurations for connecting to engine db.

Comment 5 Wolfram Richter 2016-09-14 07:58:25 UTC
I restarted the ovirt-engine-dwhd on the rhevm. So far I can see no more exceptions in the log, but also no C&U CPU data in CloudForms yet. I will monitor this for a couple of hours and post my findings.

What I find odd is that the memory and network IO graphs are actually available - if the connection to the engine DB were problematic, wouldn't that also affect these graphs?

Comment 6 Shirly Radco 2016-09-14 08:28:17 UTC
(In reply to Wolfram Richter from comment #5)
> I restarted the ovirt-engine-dwhd on the rhevm. So far I can see no more
> exceptions in the log, but also no C&U CPU data in CloudForms yet. I will
> monitor this for a couple of hours and post my findings.
> 
> What I find odd is that the memory and network IO graphs are actually
> available - if the connection to the engine DB were problematic,
> wouldn't that also affect these graphs?

Please check whether the data in the daily/hourly tables (vm_daily_history, vm_hourly_history) of the ovirt_engine_history database is up to date:
Hourly - until the hour before last
Daily - until the day before last.

Please confirm that the guest agent is running on all of the VMs.
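
A minimal sketch of the freshness rule above, assuming "the hour before last" means the start of the current hour minus two hours and "the day before last" means today minus two days. That reading and the helper names are assumptions for illustration; they are not part of DWH or CFME. The sample values are the newest rows reported in comment 7, checked against the time that comment was posted.

```python
from datetime import date, datetime, timedelta, timezone


def hour_before_last(now):
    """Start of the hour two hours before the current hour (assumed reading)."""
    return now.replace(minute=0, second=0, microsecond=0) - timedelta(hours=2)


def day_before_last(now):
    """Calendar date two days before today (assumed reading)."""
    return now.date() - timedelta(days=2)


def hourly_is_current(latest_row, now):
    """True if the newest hourly row is at least as recent as expected."""
    return latest_row >= hour_before_last(now)


def daily_is_current(latest_row, now):
    """True if the newest daily row is at least as recent as expected."""
    return latest_row >= day_before_last(now)


# Comment 7 was posted at 2016-09-14 19:42:46 UTC; the DWH rows use +02.
CEST = timezone(timedelta(hours=2))
now = datetime(2016, 9, 14, 19, 42, 46, tzinfo=timezone.utc).astimezone(CEST)

latest_hourly = datetime(2016, 9, 14, 19, 0, tzinfo=CEST)  # newest vm_hourly_history row
latest_daily = date(2016, 9, 12)                           # newest vm_daily_history row

print(hourly_is_current(latest_hourly, now))  # True
print(daily_is_current(latest_daily, now))    # True
```

Under this reading, both tables in comment 7 count as up to date, which matches the conclusion there that the dwh values look good.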

Comment 7 Wolfram Richter 2016-09-14 19:42:46 UTC
The dwh values look good to me:

ovirt_engine_history=# select history_datetime,vm_id,cpu_usage_percent,max_cpu_usage from vm_hourly_history order by history_datetime desc limit 10;
    history_datetime    |                vm_id                 | cpu_usage_percent | max_cpu_usage
------------------------+--------------------------------------+-------------------+---------------
 2016-09-14 19:00:00+02 | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 5 |            31
 2016-09-14 19:00:00+02 | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-14 19:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            88
 2016-09-14 18:00:00+02 | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-14 18:00:00+02 | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 5 |            22
 2016-09-14 18:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                46 |            94
 2016-09-14 17:00:00+02 | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-14 17:00:00+02 | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 6 |            34
 2016-09-14 17:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                52 |            99
 2016-09-14 16:00:00+02 | 2aefe222-b288-4475-99b0-88f791c6cb54 |                63 |            98
(10 rows)

ovirt_engine_history=# select history_datetime,vm_id,cpu_usage_percent,max_cpu_usage from vm_daily_history order by history_datetime desc limit 10;
 history_datetime |                vm_id                 | cpu_usage_percent | max_cpu_usage
------------------+--------------------------------------+-------------------+---------------
 2016-09-12       | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-12       | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 4 |             7
 2016-09-12       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            66
 2016-09-11       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                50 |            66
 2016-09-11       | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-11       | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 4 |             5
 2016-09-10       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            65
 2016-09-10       | d96d84b2-113e-4527-90e3-60345956fe84 |                 0 |             0
 2016-09-10       | 14c3172b-963e-42e9-b970-3f1b48d13bd0 |                 4 |             5
 2016-09-09       | 2aefe222-b288-4475-99b0-88f791c6cb54 |                51 |            65
(10 rows)

ovirt_engine_history=#


The guest agent is running:

Wolframs-MBP-6:ansible wolfram$ ssh root.coe.muc.redhat.com systemctl status ovirt-guest-agent
root.coe.muc.redhat.com's password:
/etc/profile.d/lang.sh: line 19: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
● ovirt-guest-agent.service - oVirt Guest Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-guest-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-08-30 15:22:16 CEST; 2 weeks 1 days ago
 Main PID: 951 (python)
   CGroup: /system.slice/ovirt-guest-agent.service
           └─951 /usr/bin/python /usr/share/ovirt-guest-agent/ovirt-guest-agent.py

Aug 30 15:22:15 localhost.localdomain systemd[1]: Starting oVirt Guest Agent...
Aug 30 15:22:16 localhost.localdomain systemd[1]: Started oVirt Guest Agent.
Wolframs-MBP-6:ansible wolfram$

Comment 8 Wolfram Richter 2016-09-14 19:56:35 UTC
Created attachment 1200963 [details]
ovirt-engine-dwhd.log after restart and running for a day

No more exceptions are being logged

Comment 9 Shirly Radco 2016-09-15 07:43:33 UTC
(In reply to Wolfram Richter from comment #8)
> Created attachment 1200963 [details]
> ovirt-engine-dwhd.log after restart and running for a day
> 
> No more exceptions are being logged

Are CPU and Disk activity still missing in CFME?

Comment 10 Wolfram Richter 2016-09-15 08:27:11 UTC
CPU and Disk IO data are still missing.

Comment 11 Wolfram Richter 2016-09-15 08:27:58 UTC
Created attachment 1201159 [details]
New CFME screenshot showing missing CPU data

Comment 12 Wolfram Richter 2016-09-15 13:50:06 UTC
Created attachment 1201247 [details]
Screenshot of dashboard showing CPU consumption data

I was surprised to see that on the Dashboard, the RHEV VMs are actually ranked, which indicates to me that CF is in fact able to see the CPU consumption data, although it is not displayed when viewing the utilization data.

Comment 13 Shirly Radco 2016-09-18 12:31:56 UTC
Can you please see who can check why the data is not collected from RHEV?

Comment 14 Victor Estival 2016-09-27 11:34:03 UTC
So I guess the problem is that the data is collected but not shown.

This environment is in the Red Hat VPN; please tell us to whom we should send the IPs, URLs, users and passwords.

Comment 15 Victor Estival 2016-10-07 19:16:01 UTC
Hello

Do you have any update?

Comment 16 Oved Ourfali 2016-10-31 10:01:51 UTC
Shirly, can you take a look at it?
I guess Boris can assist.

Comment 17 Ilanit Stein 2016-11-03 15:16:38 UTC
Created attachment 1217052 [details]
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png

Comment 18 Ilanit Stein 2016-11-03 15:17:34 UTC
Created attachment 1217054 [details]
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png

Comment 19 Ilanit Stein 2016-11-03 15:21:48 UTC
Created attachment 1217057 [details]
Screenshot_CFME-5.7.0.9_RHV-4.0.png

Comment 20 Ilanit Stein 2016-11-03 15:22:14 UTC
Created attachment 1217058 [details]
Screenshot_RHV-3.6.8.png

Comment 21 Ilanit Stein 2016-11-03 15:22:40 UTC
Created attachment 1217059 [details]
Screenshot_CFME-5.6.2.2_RHV-3.6.8.png

Comment 22 Ilanit Stein 2016-11-03 15:23:10 UTC
Created attachment 1217060 [details]
Screenshot_CFME-5.7.0.9_RHV-3.6.8.png

Comment 23 Ilanit Stein 2016-11-03 15:25:31 UTC
Tested the versions below.
On each, I observed a RHEL-7.2 VM and ran the load
mentioned in the bug description on that VM:

1. CFME-5.7.0.9-1 & RHV-4.0.4
Attachment "Screenshot_CFME-5.7.0.9_RHV-4.0.png"

All graphs, CPU, Memory, Disk IO, Network IO seem to reflect consumption changes.

2. CFME-5.6.2.2-1 & RHV-3.6.8-0.1
Attachment "Screenshot_CFME-5.6.2.2_RHV-3.6.8.png"
Attachment "Screenshot_RHV-3.6.8.png"

CPU, Memory, Network IO seem to reflect consumption changes.
Disk IO graph was constantly 0.
 
3. CFME-5.7.0.9-1 & RHV-3.6.8-0.1
Attachment "Screenshot_CFME-5.7.0.9_RHV-3.6.8.png"

CPU, Memory, Network IO seem to reflect consumption changes.
Disk IO graph was constantly 0.

Comment 25 CFME Bot 2016-11-28 16:40:32 UTC
New commit detected on ManageIQ/ovirt_metrics/master:
https://github.com/ManageIQ/ovirt_metrics/commit/bbed99cfdbff383dc1caa28ced7ba8e75a633663

commit bbed99cfdbff383dc1caa28ced7ba8e75a633663
Author:     borod108 <bodnopoz>
AuthorDate: Tue Nov 22 16:02:43 2016 +0200
Commit:     borod108 <bodnopoz>
CommitDate: Thu Nov 24 11:29:05 2016 +0200

    Get the disks_ids in a way that would work for 3.6
    
    When doing query_vm_disk_realtime_metrics use vm_device_history instead
    of disks_vm_map to get the disks ids since this table has the right data
    in version 3.6 of Ovirt DWH and in version 4.0
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1373460

 lib/ovirt_metrics.rb                          |  9 +++++--
 lib/ovirt_metrics/models/vm_device_history.rb | 10 ++++++++
 spec/ovirt_metrics_spec.rb                    | 35 +++++++++++++++++++++++----
 3 files changed, 47 insertions(+), 7 deletions(-)
 create mode 100644 lib/ovirt_metrics/models/vm_device_history.rb
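
To illustrate the kind of lookup this commit message describes, here is a rough, self-contained sketch in Python against an in-memory SQLite table. It is not the ovirt_metrics code (which is Ruby). The column names (vm_id, device_id, type, delete_date), the filter conditions, the helper name and all ids are assumptions made for illustration only.

```python
import sqlite3


def disk_ids_for_vm(conn, vm_id):
    """Return distinct ids of disk devices still attached to the given VM.

    Hypothetical helper: reads from vm_device_history rather than
    disks_vm_map, as the commit message describes.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT device_id
        FROM vm_device_history
        WHERE vm_id = ? AND type = 'disk' AND delete_date IS NULL
        ORDER BY device_id
        """,
        (vm_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Toy stand-in for the DWH table; the schema is assumed, not copied from oVirt.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_device_history ("
    "vm_id TEXT, device_id TEXT, type TEXT, delete_date TEXT)"
)
conn.executemany(
    "INSERT INTO vm_device_history VALUES (?, ?, ?, ?)",
    [
        ("vm-1", "disk-a", "disk", None),
        ("vm-1", "disk-a", "disk", None),          # repeated history row
        ("vm-1", "disk-b", "disk", None),
        ("vm-1", "nic-a", "interface", None),      # not a disk
        ("vm-1", "disk-c", "disk", "2016-11-01"),  # detached disk
        ("vm-2", "disk-d", "disk", None),
    ],
)

print(disk_ids_for_vm(conn, "vm-1"))  # ['disk-a', 'disk-b']
```

The resulting ids would then be used to select the matching disk metric rows. That step is omitted here because the commit message gives no details about it.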

Comment 27 Dave Johnson 2016-12-06 16:52:27 UTC
Please assess the impact of this issue and update the severity accordingly.  Please refer to https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity for a reminder on each severity's definition.

Comment 28 Satoe Imaishi 2016-12-13 17:45:07 UTC
ovirt_metrics version bump PR: https://github.com/ManageIQ/manageiq/pull/13148

Comment 30 Ilanit Stein 2017-02-07 12:17:04 UTC
*** Bug 1322094 has been marked as a duplicate of this bug. ***

Comment 31 Tasos Papaioannou 2017-05-04 15:58:39 UTC
Verified on 5.8.0.13-rc2 for RHEV 3.6 and RHV 4.1.

Comment 32 CFME Bot 2019-07-23 15:30:29 UTC
New commit detected on ManageIQ/ovirt_metrics/master:

https://github.com/ManageIQ/ovirt_metrics/commit/ca8395331efc3251f77458c1c1118a5e8916016d
commit ca8395331efc3251f77458c1c1118a5e8916016d
Author:     Roberto Ciatti <gekorob.github.com>
AuthorDate: Mon Jul 22 12:30:54 2019 -0400
Commit:     Roberto Ciatti <gekorob.github.com>
CommitDate: Mon Jul 22 12:30:54 2019 -0400

    Remove test and assignment of legacy adapter

    The legacy adapter layer is causing connection endpoint validation
    failure for newer postgresql versions.

    This should solve:
    https://bugzilla.redhat.com/show_bug.cgi?id=1373460

 lib/ovirt_metrics.rb | 9 +-
 1 file changed, 1 insertion(+), 8 deletions(-)