Bug 1304969 - [Docs][RFE] Document latency tolerance thresholds for the ovirtmgmt network
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: rhev-docs@redhat.com
QA Contact: rhev-docs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-02-05 07:20 UTC by Lucy Bopf
Modified: 2019-05-07 12:56 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-05 12:25:08 UTC
oVirt Team: Docs
Target Upstream Version:


Description Lucy Bopf 2016-02-05 07:20:57 UTC
The documentation should provide some information about latency tolerance for the ovirtmgmt (RHEV management) network in the following scenarios:

- GUI to Manager
- Manager to hosts
- Storage (NFS/Gluster/iSCSI)
- Live migration

This should include latency estimates for each, and an estimate of the point at which users may see errors or timeouts.
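One way to gather the latency estimates requested above is to time TCP connections between the endpoints in each scenario. The following is a minimal sketch, not an official RHEV tool: the function names are made up for illustration, and the host name and port in the usage note are placeholders.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, timeout=5.0):
    """Return the time in milliseconds taken to open a TCP connection."""
    start = time.perf_counter()
    # The connection is closed immediately; only the handshake is timed.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Summarize latency samples (milliseconds) as min, median and max."""
    return {
        "min": min(samples_ms),
        "median": statistics.median(samples_ms),
        "max": max(samples_ms),
    }


def probe(host, port, count=10):
    """Collect `count` connection-latency samples and summarize them."""
    samples = [tcp_connect_latency_ms(host, port) for _ in range(count)]
    return summarize(samples)
```

For example, `probe("manager.example.com", 443)` run from a client machine would approximate the GUI-to-Manager case. A connection handshake only approximates one network round trip, so the result is a lower bound on what an application-level request would see.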

Comment 4 Yaniv Lavi 2016-11-22 13:00:19 UTC
What is needed:
1. An estimate of the point at which the latency between GUI and Manager renders RHV unusable
2. An estimate of the point at which the latency between the Manager and the hosts renders RHV unusable
3. An estimate of the point at which the latency between the Storage and the engine renders RHV unusable
4. An estimate of the point at which you should assume that live migration is not going to succeed and has timed out

Can we provide this?
Is the scale effort planning to address this?

Comment 5 Yaniv Kaul 2016-11-22 13:07:51 UTC
This is quite an effort, but it is all tests, no development. The only plan we have to improve things here is the move to GWT-RPC, which is unlikely to happen in 4.1 (and may require re-testing item 1).

I don't think any of it belongs to the scale team, but regardless, moving NEEDINFO to Gil.

Note that no. 4 is quite impossible to estimate. Essentially, there's a race between the speed (affected by latency, bandwidth and available CPU) of migration and the speed at which the VM dirties its pages. Latency just adds more chance for the VM to 'win' - but we can't tell by how much really.
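The race described above can be illustrated with a toy pre-copy model. This is only a sketch under simplifying assumptions (constant dirty rate, a single TCP stream whose throughput is capped at window size divided by round-trip time). It is not how the migration code actually decides convergence, and all names and parameters are hypothetical.

```python
def effective_throughput(link_mb_s, window_mb, rtt_s):
    """Throughput in MB/s of one TCP stream, capped by window / RTT."""
    if rtt_s <= 0:
        return link_mb_s
    return min(link_mb_s, window_mb / rtt_s)


def precopy_rounds(memory_mb, dirty_rate_mb_s, throughput_mb_s,
                   downtime_budget_mb=50.0, max_rounds=30):
    """Return the number of copy rounds needed before the remaining dirty
    memory fits in the downtime budget, or None if the model never
    converges within max_rounds."""
    remaining = memory_mb
    for round_no in range(1, max_rounds + 1):
        # Time to send everything that is currently dirty.
        transfer_time_s = remaining / throughput_mb_s
        # Memory the guest dirties while that transfer is in flight.
        remaining = dirty_rate_mb_s * transfer_time_s
        if remaining <= downtime_budget_mb:
            return round_no
    return None
```

In this model the migration converges only when the effective throughput exceeds the dirty rate. Higher latency lowers the effective throughput of a window-limited stream, so the same guest workload can move from converging to never converging, which is why a single latency threshold is hard to state.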

Comment 6 Gil Klein 2016-12-05 18:47:55 UTC
Don't we have a clear timeout that we can just document for all of the scenarios described in comment #4?

Comment 7 Yaniv Lavi 2016-12-06 23:43:26 UTC
(In reply to Gil Klein from comment #6)
> Don't we have a clear timeout that we can just document for all of the
> scenarios described in comment #4?

Usability is not the same as the failure point. We need testing on this.

Comment 8 Gil Klein 2017-01-03 11:51:36 UTC
(In reply to Yaniv Dary from comment #4)
> What is needed:
> 1. An estimate of the point at which the latency between GUI and Manager
> renders RHV unusable
Do we have any way to test this?
> 2. An estimate of the point at which the latency between the Manager and the
> hosts renders RHV unusable
Pavel, can you help measure this and reply back?
> 3. An estimate of the point at which the latency between the Storage and the
> engine renders RHV unusable
Raz, can you help measure this and reply back?
> 4. An estimate of the point at which you should assume that live migration
> is not going to work, but has timed-out
I agree with Yaniv K's insight in comment #5. It depends on the guest memory activity. As this is not RHV-specific, I think we should check whether platform has any docs about it.
> 
> Can we provide this?
> Is the scale effort planning to address this?

Comment 9 Yaniv Lavi 2017-02-07 08:56:25 UTC
Moving to future until info is provided.

