Bug 1181665 - [RFE][scale] Use events and not polling to detect disk usage [using the improvement from platform bug 1181659]
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: RFEs
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: 4.20.9
Assignee: Francesco Romani
QA Contact: guy chen
URL:
Whiteboard:
Depends On: 1181648 1181659
Blocks:
Reported: 2015-01-13 14:48 UTC by Francesco Romani
Modified: 2019-04-28 13:27 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-02-12 10:10:45 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+
sherold: Triaged+
mtessun: planning_ack+
michal.skrivanek: devel_ack+
eberman: testing_ack+


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 83832 0 master MERGED virt: config: drivemonitor: enable by default 2017-11-22 15:39:56 UTC

Description Francesco Romani 2015-01-13 14:48:52 UTC
Description of problem:
RHEV makes heavy use of thin-provisioned disks. VDSM support for them includes
monitoring their usage and transparently extending them, without the VM noticing.

This feature is built on polling disk usage, because there is no other means
to detect how full a disk is and, therefore, when it needs to be extended.

The polling may be very frequent, and it is among the biggest sources of load on libvirt, if not the single biggest.

To improve scalability and resource usage in general, we need an event to notify when disk usage exceeds a threshold. This will allow the feature to work
with much less system load.

The existing polling has to be kept, at least partially, as a fallback or recovery option.

This bug tracks the implementation of this feature in RHEV.
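
A minimal sketch of the idea, in Python since VDSM is written in Python. This is a toy model, not VDSM code: `ThinDrive`, `EventMonitor`, the chunk size and the free-space margin are hypothetical names and values chosen for illustration. It assumes the hypervisor delivers a one-shot notification when guest writes pass an armed threshold (the platform feature tracked in bug 1181659), and it keeps an explicit `poll()` as the fallback path.

```python
from dataclasses import dataclass
from typing import List, Optional

MiB = 1024 ** 2
GiB = 1024 ** 3


@dataclass
class ThinDrive:
    """Hypothetical model of a thin-provisioned drive."""
    name: str
    allocated: int                    # bytes allocated on storage
    written: int = 0                  # highest offset written by the guest
    threshold: Optional[int] = None   # armed watermark, None when disarmed


class EventMonitor:
    """Extends drives on threshold events; polling is only a fallback."""

    def __init__(self, chunk: int, free_margin: int) -> None:
        self.chunk = chunk
        self.free_margin = free_margin
        self.queries = 0                 # explicit usage queries issued
        self.extensions: List[str] = []  # drives extended, in order

    def arm(self, drive: ThinDrive) -> None:
        # Place the watermark free_margin bytes below the allocation end.
        drive.threshold = max(drive.allocated - self.free_margin, 0)

    def guest_write(self, drive: ThinDrive, nbytes: int) -> None:
        # Stand-in for the hypervisor: notify once when the watermark is passed.
        drive.written += nbytes
        if drive.threshold is not None and drive.written > drive.threshold:
            drive.threshold = None  # assumed one-shot: must be re-armed
            self.on_threshold(drive)

    def on_threshold(self, drive: ThinDrive) -> None:
        # Extend by one chunk, then re-arm against the new allocation.
        drive.allocated += self.chunk
        self.extensions.append(drive.name)
        self.arm(drive)

    def poll(self, drive: ThinDrive) -> None:
        # Fallback or recovery path: one explicit query per call.
        self.queries += 1
        if drive.allocated - drive.written < self.free_margin:
            self.on_threshold(drive)
```

In this model the number of usage queries stays at zero while events are delivered, whereas a polling-only monitor would issue one query per drive per polling interval regardless of guest activity.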


Version-Release number of selected component (if applicable):
4.17.0

Comment 1 Michal Skrivanek 2015-01-14 08:30:38 UTC
tentatively planned for 3.6 depending on support in QEMU and libvirt

Comment 2 Michal Skrivanek 2015-05-25 11:31:46 UTC
libvirt side in progress, https://www.redhat.com/archives/libvir-list/2015-May/msg00580.html

this may be a late delivery

Comment 3 Francesco Romani 2015-06-23 14:42:16 UTC
As per last update from libvirt developers, support will most likely slip to 7.3, so we cannot implement this.

Comment 4 Michal Skrivanek 2015-07-02 06:04:29 UTC
as per last comment moving to 4.0 due to platform dependency

Comment 6 Yaniv Kaul 2016-03-14 11:17:47 UTC
Moving to 4.1, as platform bug 1181659 is not yet approved for 7.3.

Comment 11 Red Hat Bugzilla Rules Engine 2016-12-27 16:38:13 UTC
This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Comment 13 Francesco Romani 2017-02-08 15:08:40 UTC
(eventually) moved to NEW because I can't work on this until we have the libvirt support available.

Comment 18 Yaniv Kaul 2017-10-15 08:21:51 UTC
Moving back to ASSIGNED, as attached patch was abandoned.

Comment 19 Michal Skrivanek 2017-11-20 15:25:51 UTC
design: https://github.com/oVirt/vdsm/blob/master/doc/thin-provisioning.md

Comment 20 Michal Skrivanek 2017-11-20 15:26:10 UTC
implementation: https://gerrit.ovirt.org/#/q/project:vdsm+branch:master+topic:drivemonitor_event

Comment 21 Michal Skrivanek 2017-11-24 15:10:56 UTC
this is now completed and enabled by default

Comment 22 guy chen 2018-02-04 06:23:12 UTC
This was tested and verified in a performance environment, and a good improvement was seen:


Lab Topology
System topology (using bare metal hosts)
3 DC
5 Clusters
235 Hosts
    Hera: 4 Hosts
    leopard: 2 Hosts
    Nested hosts: 229 VMs
3 SD
1020 VMs
    Hera: 690 VMs
    leopard: 330 VMs

Scenario matrix

Delta between the tests
Test Step	Old	New Build 06.12	Delta in HH:MM:SS	Delta in percentage
VM Stop	0:00:08	0:00:06	-0:00:02	-25.00%
VM Start	0:00:36	0:01:00	0:00:24	66.67%
Create VM From template	0:01:09	0:01:21	0:00:12	17.39%
Nested Host stop	0:00:34	0:00:27	-0:00:07	-20.59%
Nested Host start	0:00:50	0:00:30	-0:00:20	-40.00%
Create Nested Host from template	0:01:17	0:01:15	-0:00:02	-2.60%
Send maintenance to 10 hosts	0:00:08	0:00:07	-0:00:01	-12.50%
Reboot 10 nested hosts	0:05:15	0:05:25	0:00:10	3.17%
Reboot 50 nested hosts	0:05:57	0:05:46	-0:00:11	-3.08%
Reboot 80 nested hosts	0:10:35	0:10:34	-0:00:01	-0.16%
Reboot 100 nested hosts	0:15:09	0:11:40	-0:03:29	-22.99%
Engine restart	0:00:47	0:00:55	0:00:08	17.02%
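
For reference, the delta columns above can be recomputed from the Old and New timings. The helper names below (`parse_hms`, `delta`) are hypothetical; this is only a sketch of the arithmetic, where a negative delta means the new build was faster.

```python
from datetime import timedelta
from typing import Tuple


def parse_hms(text: str) -> timedelta:
    """Parse an H:MM:SS duration as used in the table."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def delta(old: str, new: str) -> Tuple[int, float]:
    """Return (difference in seconds, difference as a percentage of old)."""
    old_td, new_td = parse_hms(old), parse_hms(new)
    diff = (new_td - old_td).total_seconds()
    pct = round(100.0 * diff / old_td.total_seconds(), 2)
    return int(diff), pct
```

For example, `delta("0:15:09", "0:11:40")` gives the "Reboot 100 nested hosts" row: 209 seconds faster, about 23% less time.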

Comment 23 Sandro Bonazzola 2018-02-12 10:10:45 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

