Bug 1322529 - nested virt: speed on host network interfaces is not reported in Webadmin
Summary: nested virt: speed on host network interfaces is not reported in Webadmin
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 3.6.5
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: ---
Assignee: Marcin Mirecki
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On:
Blocks: migration_improvements
TreeView+ depends on / blocked
 
Reported: 2016-03-30 16:23 UTC by jniederm
Modified: 2022-06-30 08:06 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-16 15:20:57 UTC
oVirt Team: Network
Embargoed:
sbonazzo: ovirt-4.2-


Attachments (Terms of Use)
network interfaces subtab (27.75 KB, image/png)
2016-03-30 16:23 UTC, jniederm
no flags Details
caps (12.47 KB, text/plain)
2016-03-31 11:38 UTC, jniederm
no flags Details
stats (5.41 KB, text/plain)
2016-03-31 11:38 UTC, jniederm
no flags Details
sys_class_net_star_speeds (356 bytes, text/plain)
2016-03-31 11:40 UTC, jniederm
no flags Details
vdsm.log (9.70 MB, text/plain)
2016-03-31 11:42 UTC, jniederm
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-46769 0 None None None 2022-06-30 08:06:26 UTC
oVirt gerrit 57108 0 master ABANDONED net: Network speed reported as -1 if it can not be determined 2016-11-20 01:58:34 UTC
oVirt gerrit 57116 0 master NEW webadmin: Handle -1 nic speed in Webadmin 2016-05-23 09:32:09 UTC

Description jniederm 2016-03-30 16:23:47 UTC
Created attachment 1141838 [details]
network interfaces subtab

Description of problem:
All NIC appears to have speed 0 Mbps.

Version-Release number of selected component (if applicable):
3.6 master, commit d35c04d1686850dd03708377

How reproducible:
100%

Steps to Reproduce:
1. Add a host and select it
2. Select Network Interfaces subtab
3. Check "Speed" column

Actual results:
zeros

Expected results:
real link speeds, eg 100Mpbs, 1000Mbps

Comment 1 Dan Kenigsberg 2016-03-30 18:56:13 UTC
Please share `vdsClient -s 0 getVdsCaps` and `vdsClient -s 0 getVdsStats` from that host, as well as the output of `more /sys/class/net/*/speed |cat`

Comment 2 Dan Kenigsberg 2016-03-30 18:56:40 UTC
oh, vdsm.log might be of interest, too.

Comment 3 jniederm 2016-03-31 11:38:22 UTC
Created attachment 1142168 [details]
caps

Comment 4 jniederm 2016-03-31 11:38:58 UTC
Created attachment 1142182 [details]
stats

Comment 5 jniederm 2016-03-31 11:40:12 UTC
Created attachment 1142184 [details]
sys_class_net_star_speeds

It did not work

Comment 6 jniederm 2016-03-31 11:42:50 UTC
Created attachment 1142185 [details]
vdsm.log

Comment 7 Dan Kenigsberg 2016-03-31 12:06:19 UTC
Are the speeds updated when you press the "refresh capabilities" button?

Comment 8 jniederm 2016-03-31 12:10:17 UTC
no

Comment 9 jniederm 2016-03-31 12:12:46 UTC
The problem is in engine. I use following patch fixing it: https://gerrit.ovirt.org/#/c/53670. The patch itself was rejected, see gerrit comments for details.

Comment 10 Dan Kenigsberg 2016-04-01 20:26:29 UTC
The patch was rejected because refreshing capabilities was thought to pull fresh speeds from the host. in comment 8 you say that this is not the case so we need a deeper look

Comment 11 jniederm 2016-04-26 11:22:04 UTC
Properly reported speed is required for computation of migration bandwidth. Rising severity.

Comment 12 Michal Skrivanek 2016-04-28 07:32:24 UTC
rising once more as this is a crucial component for one of the high profile 4.0 features, which is already under test

Comment 13 Marcin Mirecki 2016-04-29 08:49:28 UTC
The problem only occurs on nested hosts which use "virtio" nics.
For these, the zero speed is the "expected" behavior. This is because there is no "set speed" that a virtio-net interface can obtain. The nic is a software implementation of a network interface and obtains the fastest possible speed it can. That speed will vary depending on the network/CPU/memory load on the virtualisation host, as well as the destination of any particular traffic.
For example, when moving packets between 2 VMs on the same host, the packets can be sent as fast as the host's cpu can move them from one process to another.

The question is whether we should report this as 0, or some other message (like 'varying')?

Since this is only a problem for nested hosts, I suppose this is not a critical issue, as such a setup is probably not very often used in production environments.

Comment 14 jniederm 2016-04-29 13:05:45 UTC
How is then possible that getVdsStats is able to report the speed (1000) whereas getVdsCaps behaves probably as you described (it reports 0)?
See attachment 1142182 [details] and attachment 1142168 [details].

Comment 15 Marcin Mirecki 2016-04-29 14:11:40 UTC
The way we calculate speed, is we look at: /sys/class/net/%s/speed
and in case we fail we return 0.
This is what we return in caps.
In stats we also use the same value, but it later goes through this line of code:
        ifrate = last_sample.interfaces[ifid].speed or 1000
which is the cause of the different values.

This is definitely a bug. The two values should be the same.
The question is which value would you like to have? 
0, 1000, -1 (to hint an undetermined value)?
Let me know what you would expect.

Comment 16 Dan Kenigsberg 2016-05-01 05:57:05 UTC
Marcin, can you fix this problem on Engine side, by inventing a silly default (e.g. 1000mbps) when speed is unknown? Than, we can drop the lying on Vdsm's getVdsStats.

Comment 17 Sandro Bonazzola 2016-05-02 10:08:17 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.

Comment 18 jniederm 2016-05-02 11:54:48 UTC
I'd rather go with reporting -1 and change UI to report "n/a". Silly value could be confusing for both engine devs and users.

E.g. this bug was reported to be able link speed for estimation of migration bandwidth. Till comment 13 I was sure that link speed of virtio NICs actually is 1000 Mbps.

Comment 19 Marcin Mirecki 2016-05-05 16:50:33 UTC
Submitted 2 patches (1 vdsm, 1 engine) done to Jakub's suggestions.
Vdsm now reports unknown speed as -1. This will be the case when network speed can not be determined (error on read). This is done for both vdsStats and vdsCaps. 

If the nic is down, the speed will be reported as 0 in vdsCaps.For vdsStats, it will still be 1000. I suppose this is also an error, but want to ask fromani (the author) if there is no secret purpose of this.

Engine will display this -1 as N/A. This will affect the speed, rxrate and txrate (since speed is used to calculate it).

Let me know if this suits you.

Comment 20 jniederm 2016-05-06 18:49:53 UTC
Looks good to me. If Francesco allows it, it would be great to have the speed reporting unified.

Comment 21 Dan Kenigsberg 2016-05-08 09:50:14 UTC
I'm afraid that I own the original lie regarding unknown->1000. If Engine (including Engine-3.6) can handle that, make sure that speeds are reported uniformly by all verbs.

Comment 22 Yaniv Lavi 2016-05-23 13:22:52 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 23 Yaniv Lavi 2016-05-23 13:25:18 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 24 Dan Kenigsberg 2017-07-16 15:20:57 UTC
I'd love to clean this piece of ugly behavior, but I am afraid that nested virt is not supported in RHV, and the effect is not very important to fix.


Note You need to log in before you can comment on or make changes to this bug.