1255474 – [RFE][SCALE] traffic shaping on ovirtmgmt interface

Summary: [RFE][SCALE] traffic shaping on ovirtmgmt interface

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	ovirt-engine
Classification:	oVirt
Component:	RFEs
Sub Component:
Version:	---
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Nobody
QA Contact:	Michael Burman
Docs Contact:
URL:	-
Whiteboard:
Depends On:	1271094 1340702 1346318 1348448
Blocks:	migration_improvements 1428232

Reported:	2015-08-20 16:45 UTC by Michal Skrivanek
Modified:	2019-04-28 09:39 UTC
CC List:	13 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2018-06-06 07:41:16 UTC
oVirt Team:	Network
Embargoed:
Dependent Products:
Flags:	ylavi: ovirt-future? mburman: testing_plan_complete+ ylavi: planning_ack? ylavi: devel_ack? ylavi: testing_ack?

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1364145	0	unspecified	CLOSED	Forbid mixing QoS and non-QoS on the same NIC	2021-02-22 00:41:40 UTC
oVirt gerrit	56429	0	master	MERGED	engine: Create default network QoS for ovirtmgmt upon DC creation	2016-04-26 13:27:03 UTC

Internal Links: 1364145

Description Michal Skrivanek 2015-08-20 16:45:00 UTC

Sharing the same interface for VM networks, management and migrations (and display, and storage) is discouraged, yet common.
Overloading the single interface, e.g. by a mass migration or a peak in VM activity, causes serious issues with engine-vdsm communication. We see timeouts and monitoring problems, which eventually make the host non-responsive, which in turn causes even worse issues.

In order to keep management working we should employ traffic shaping to guarantee some bandwidth is always available to vdsm.
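
As a rough illustration of the guarantee requested here, the sketch below models an idealized, work-conserving weighted share of a single link. It is not the oVirt or vdsm implementation: the function name, the traffic class names, the weights, and the 10000 Mbps capacity are all assumptions chosen only for the example.

```python
def weighted_share(capacity_mbps, demands_mbps, weights):
    """Idealized weighted fair share of one link (progressive filling).

    Each traffic class is offered capacity in proportion to its weight.
    A class that needs less than its offer keeps only what it needs, and
    the leftover is redistributed among the still-hungry classes.
    This is a simplified model, not a description of any real qdisc.
    """
    alloc = {name: 0.0 for name in demands_mbps}
    active = {name for name, demand in demands_mbps.items() if demand > 0}
    remaining = float(capacity_mbps)

    while active and remaining > 1e-9:
        total_weight = sum(weights[name] for name in active)
        # Classes whose outstanding demand fits inside their proportional offer.
        satisfied = {
            name for name in active
            if demands_mbps[name] - alloc[name]
            <= remaining * weights[name] / total_weight + 1e-9
        }
        if not satisfied:
            # Everyone wants more than their offer: split what is left by weight.
            for name in active:
                alloc[name] += remaining * weights[name] / total_weight
            break
        for name in satisfied:
            remaining -= demands_mbps[name] - alloc[name]
            alloc[name] = float(demands_mbps[name])
        active -= satisfied

    return alloc


# Hypothetical scenario: a mass migration and VM traffic both try to saturate
# a 10000 Mbps NIC while management needs a modest 50 Mbps.
shares = weighted_share(
    10000,
    {"management": 50, "migration": 20000, "vm": 20000},
    {"management": 1, "migration": 1, "vm": 1},
)
print(shares)
```

In this model the management class is never starved: even with a small weight it is offered a fixed fraction of the link, however much the other classes demand.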

Comment 2 Michal Skrivanek 2015-09-04 09:36:17 UTC

this is part of the overall migration improvement effort tracked by bug 1252426 (hence 4.0 timeframe)

Comment 3 Red Hat Bugzilla Rules Engine 2015-10-19 10:49:59 UTC

Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 5 Dan Kenigsberg 2015-12-09 14:58:12 UTC

rhev-3.6 features host network QoS (bug 1043226). with it, customers can manually set their own capping on migration network. As far as I understand, this RFE tracks setting up a magical good policy by default.

Comment 6 Michal Skrivanek 2015-12-10 09:54:16 UTC

(In reply to Dan Kenigsberg from comment #5)
> rhev-3.6 features host network QoS (bug 1043226). with it, customers can
> manually set their own capping on migration network. As far as I understand,
> this RFE tracks setting up a magical good policy by default.

yes, magic default QoS for management network to make sure heartbeats and essential communication works at all times

Comment 7 Dan Kenigsberg 2016-01-20 10:20:57 UTC

Meni, could you define a mgmt network with QoS (link share, later abs limit) and then define another network with no QoS on the same NIC via the vdsm API (as Engine blocks this config)? Then repeat on another host, and stress-test the host-to-host communication with two concurrent iperfs (one per network).
Please report the throughput of each network for each QoS flavour.
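
One possible way to script the measurement requested above is sketched here. It assumes iperf3 (the comment only says "iperfs"), an `iperf3 -s` server already listening on the peer host's address in each network, and placeholder IP addresses. The helper names are hypothetical, and the QoS setup itself is not covered.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def received_mbps(report):
    """Receiver-side throughput in Mbps from a parsed iperf3 JSON report."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def run_iperf3(server_ip, bind_ip, seconds=30):
    """Run one iperf3 TCP client bound to a local address; return its JSON report."""
    cmd = ["iperf3", "-c", server_ip, "-B", bind_ip, "-t", str(seconds), "-J"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def measure(streams, seconds=30):
    """Run one client per network at the same time.

    `streams` maps a network name to (peer_ip, local_ip) in that network.
    Returns a dict of network name -> received throughput in Mbps.
    """
    with ThreadPoolExecutor(max_workers=len(streams)) as pool:
        futures = {
            name: pool.submit(run_iperf3, peer_ip, local_ip, seconds)
            for name, (peer_ip, local_ip) in streams.items()
        }
        return {name: received_mbps(f.result()) for name, f in futures.items()}


# Example call (addresses are placeholders, not taken from this bug):
# measure({
#     "ovirtmgmt": ("192.0.2.10", "192.0.2.20"),
#     "other":     ("198.51.100.10", "198.51.100.20"),
# })
```

Running `measure` once per QoS flavour gives the per-network throughput figures to report.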

Comment 8 Dan Kenigsberg 2016-01-27 09:21:37 UTC

Assuming we can drop the Engine-side validation of QoS and non-QoS networks on the same NIC, this would be doable.

Comment 10 Dan Kenigsberg 2016-02-17 09:37:47 UTC

Applying QoS on the management network by default may have serious impact on host CPU. We need to measure that before applying it to most users.
Gil, can we find the time to do that?

Comment 11 Dan Kenigsberg 2016-03-15 13:54:11 UTC

clearing needinfo, as mburman reports no distinguishable effect on performance.

I suggest setting the default QoS on a network whenever it is assigned the management role.

To properly implement this RFE, we'd need to set a QoS on each cluster's management network. Since the network is a DC entity (until http://www.ovirt.org/feature/remove-dc-entity-network/ is implemented), setting its QoS on one cluster may put other clusters' hosts out of sync.

This is a bit ugly, but may be amended by the user clearing the default QoS or applying it on the other clusters as well.

Comment 12 Michael Burman 2016-06-15 12:19:06 UTC

This feature currently has no meaning. We depend on BZ 1346318.

Comment 13 Red Hat Bugzilla Rules Engine 2016-06-15 12:19:15 UTC

Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 14 Yaniv Lavi 2018-06-06 07:41:16 UTC

Closing old RFEs, please reopen if still needed.
Patches are always welcome.
