Bug 1887149

Summary: [RFE] VM Disk stats should contain IOPS stats
Product: [oVirt] ovirt-engine-dwh Reporter: Shirly Radco <sradco>
Component: RFEs Assignee: Shirly Radco <sradco>
Status: CLOSED CURRENTRELEASE QA Contact: Guilherme Santos <gdeolive>
Severity: low Docs Contact:
Priority: high    
Version: 4.4.3 CC: ahadas, alitman, bugs, gdeolive, jean-louis, lleistne, michal.skrivanek, mperina, pelauter, sbonazzo, sradco, tnisan, vjuranek
Target Milestone: ovirt-4.4.5 Keywords: FutureFeature
Target Release: 4.4.5 Flags: pm-rhel: ovirt-4.4+
pelauter: planning_ack+
sbonazzo: devel_ack+
lleistne: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-dwh-4.4.5 Doc Type: Enhancement
Doc Text:
Feature: Collect VM disks IOPS stats to DWH database
Reason: Allow users to view VM disks IOPS stats
Result: VM disks IOPS stats are now saved to DWH database and aggregated to hourly and daily data.
Story Points: ---
Clone Of: 1880424 Environment:
Last Closed: 2021-03-18 15:12:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Metrics RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1880424    
Bug Blocks: 1899573    

Description Shirly Radco 2020-10-11 10:02:09 UTC
Add VM Disk IOPS stats to DWH and Grafana

+++ This bug was initially created as a clone of Bug #1880424 +++

Currently the DWH contains read/write stats for each VM:

v4_4_statistics_vms_disks_resources_usage_samples contains:
read_rate_bytes_per_second | read_latency_seconds | write_rate_bytes_per_second | write_latency_seconds | flush_latency_seconds

It would be nice if this could be extended with read and write IOPS stats.

As far as I can see VDSM already reads those stats, but they are never parsed/saved in the ovirt-engine.

I think IOPS stats are as important as the read/write byte stats, because a VM can generate a huge number of IOPS while its throughput stays low.
Without IOPS stats you cannot notice this.
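
To illustrate the point with made-up numbers (these are not measurements from this bug): throughput is roughly IOPS multiplied by the average request size, so two workloads can show almost the same byte rate while their IOPS differ by orders of magnitude. A minimal Python sketch; the function name and all figures are hypothetical:

```python
def throughput_bytes_per_second(iops: float, avg_request_bytes: float) -> float:
    """Approximate throughput as operations per second times average request size."""
    return iops * avg_request_bytes


# Hypothetical workloads: many small requests vs. a few large ones.
small_random = throughput_bytes_per_second(iops=20_000, avg_request_bytes=4 * 1024)
large_sequential = throughput_bytes_per_second(iops=80, avg_request_bytes=1024 * 1024)

print(f"small random:     {small_random / 1e6:.1f} MB/s at 20000 IOPS")
print(f"large sequential: {large_sequential / 1e6:.1f} MB/s at 80 IOPS")
```

With only the byte-rate columns, these two hypothetical VMs would look nearly identical in the DWH.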

--- Additional comment from Jean-Louis Dupond on 2020-09-18 14:52:18 UTC ---

attached a possible patch.
Might have forgotten things, but should be quite ok :)

--- Additional comment from Sandro Bonazzola on 2020-09-18 15:07:12 UTC ---

Shirly can you please review?

--- Additional comment from RHEL Program Management on 2020-09-18 15:07:16 UTC ---

The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

--- Additional comment from Michal Skrivanek on 2020-09-21 08:46:26 UTC ---

Can you please push (and follow up on) the patch to gerrit.ovirt.org?

Comment 1 Jean-Louis Dupond 2020-10-29 14:24:26 UTC
Already (partly) implemented:
https://gerrit.ovirt.org/#/c/111834/

It just needs an updated ETL export!

Comment 2 Aviv Litman 2020-11-19 15:30:12 UTC
This fix requires two steps: 

1. Add the requested columns (read and write IOPS stats) to DWH.
2. Add the requested columns into reports in Grafana.

The first step will be documented in this bug, 
and the second step will be documented in bug: https://bugzilla.redhat.com/show_bug.cgi?id=1899573

Comment 3 Guilherme Santos 2021-01-31 17:55:52 UTC
Hi Shirly, are the verification steps for this BZ to query the IOPS data from the newly created IOPS tables and compare it with the VDSM values (with virsh), or is there more to be tested? Thanks

Comment 4 Shirly Radco 2021-02-01 10:35:38 UTC
(In reply to Guilherme Santos from comment #3)
> Hi Shirly, are the verification steps for this BZ to query the IOPS data
> from the newly created IOPS tables and compare it with the VDSM values (with
> virsh), or is there more to be tested? Thanks

Also, have DWH running for at least 2 days and check that the daily and hourly aggregations work as expected.
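
As a rough idea of what "work as expected" could mean for the hourly level, here is a minimal Python sketch. It assumes, based only on the column names (for example read_ops_per_second next to max_read_ops_per_second), that an aggregated row holds the average and the maximum of the samples in its period; the actual DWH ETL logic is not shown in this bug. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_hourly(samples):
    """Group (timestamp, ops_per_second) samples by hour.

    Returns {hour_start: (average, maximum)}. This is an assumed
    aggregation rule, not the actual ETL implementation.
    """
    buckets = defaultdict(list)
    for ts, ops in samples:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(ops)
    return {hour: (mean(values), max(values)) for hour, values in sorted(buckets.items())}


# Hypothetical samples, not taken from a real DWH.
samples = [
    (datetime(2021, 3, 17, 10, 0), 100),
    (datetime(2021, 3, 17, 10, 20), 300),
    (datetime(2021, 3, 17, 11, 5), 50),
]

for hour, (avg_ops, max_ops) in aggregate_hourly(samples).items():
    print(hour, avg_ops, max_ops)
```

The same grouping by calendar day instead of by hour would give a comparable expectation for the daily table.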

Comment 6 Guilherme Santos 2021-03-17 22:11:33 UTC
Verified on:
ovirt-engine-4.4.5.9-0.1.el8ev.noarch

Steps:
1. Had an engine with hosts running for 1 ~ 2 days
2. Validated new columns
ovirt_engine_history=# select read_ops_per_second, write_ops_per_second from vm_disk_samples_history order by history_id desc limit 5;
 read_ops_per_second | write_ops_per_second 
---------------------+----------------------
               10052 |                 8436
               10052 |                 8426
               10052 |                 8420
               10052 |                 8412
               10052 |                 8406
(5 rows)

ovirt_engine_history=# select read_ops_per_second, max_read_ops_per_second, write_ops_per_second, max_write_ops_per_second from vm_disk_hourly_history order by history_id desc limit 5;
 read_ops_per_second | max_read_ops_per_second | write_ops_per_second | max_write_ops_per_second 
---------------------+-------------------------+----------------------+--------------------------
               10052 |                   10052 |                 7763 |                     7950
               10052 |                   10052 |                 7413 |                     7590
               10052 |                   10052 |                 7044 |                     7235
               10052 |                   10052 |                 6680 |                     6854
               10052 |                   10052 |                 6313 |                     6495
(5 rows)


ovirt_engine_history=# select read_ops_per_second, max_read_ops_per_second, write_ops_per_second, max_write_ops_per_second from vm_disk_daily_history order by history_id desc limit 5;
 read_ops_per_second | max_read_ops_per_second | write_ops_per_second | max_write_ops_per_second 
---------------------+-------------------------+----------------------+--------------------------
                   0 |                       0 |                    0 |                        0
               33972 |                   34910 |               190063 |                   323702
                   0 |                       0 |                    0 |                        0
               40066 |                   56160 |               126900 |                   182968
                   0 |                       0 |                    0 |                        0
(5 rows)

Results:
Columns are present in the DWH, and the hourly and daily aggregations work as expected.
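
One simple consistency check on the output above, sketched in Python: in every aggregated row the max_* value should not be lower than the corresponding value it sits next to. The write columns below are copied from the hourly and daily query results in this comment; the helper name is hypothetical, and the check assumes the non-max column is an average over the period.

```python
# (write_ops_per_second, max_write_ops_per_second) pairs from the hourly query output.
hourly_write = [(7763, 7950), (7413, 7590), (7044, 7235), (6680, 6854), (6313, 6495)]

# The same pair of columns from the daily query output.
daily_write = [(0, 0), (190063, 323702), (0, 0), (126900, 182968), (0, 0)]


def max_not_below_average(rows):
    """Return True if every row's max value is at least its average value."""
    return all(max_value >= avg_value for avg_value, max_value in rows)


print("hourly rows consistent:", max_not_below_average(hourly_write))
print("daily rows consistent: ", max_not_below_average(daily_write))
```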

Comment 7 Sandro Bonazzola 2021-03-18 15:12:51 UTC
This bugzilla is included in oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.