Description of problem:

According to the 'Table 5.4. Bandwidth Explained' documentation [1]:

~~~
Defined by user (in Mbps). This value is divided by the number of
concurrent migrations (default is 2, to account for ingoing and
outgoing migration). Therefore, the user-defined bandwidth must be
large enough to accommodate all concurrent migrations. For example, if
the Custom bandwidth is defined as 600 Mbps, a virtual machine
migration's maximum bandwidth is actually 300 Mbps.
~~~

According to 'lib/vdsm/api/vdsm-api.yml':

~~~
- defaultvalue: needs updating
  description: maximal bandwidth used by the migration (MiB/s)
  name: maxBandwidth
  type: int
  added: '4.0'
~~~

The engine sends the migration bandwidth limit in Mbps, but VDSM interprets it as MiB/s.

[1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2/html/administration_guide/sect-cluster_tasks#Cluster_Migration_Policy_Settings_Explained

Version-Release number of selected component (if applicable):
4.2.6.4

How reproducible:
Always

Steps to Reproduce:
1. Configure the cluster's migration policy and set the migration bandwidth limit to 10 Mbps.

Actual results:
The actual bandwidth consumed is 40 Mbps.

Expected results:
According to the documentation, and with max concurrent migrations equal to 2, the actual bandwidth consumed should be 5 Mbps.
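To make the mismatch concrete, here is a minimal Python sketch of the arithmetic behind the numbers reported above — the helper names are hypothetical, not actual engine or VDSM code, and 1 MiB/s is treated as a plain 8 Mbps for simplicity:

~~~
# Sketch of the reported mismatch: the engine halves the user-defined
# Mbps value for the two concurrent migrations and sends the result
# unconverted, while the API field is read on the VDSM side as MiB/s.

def engine_sends(user_mbps, concurrent_migrations=2):
    # Engine side: split the user-defined bandwidth across migrations.
    return user_mbps // concurrent_migrations

def vdsm_interprets_as_mbps(api_value):
    # VDSM side: the same number, read as MiB/s, is worth ~8x more
    # when expressed back in Mbps.
    return api_value * 8

sent = engine_sends(10)               # 5, meant as Mbps
print(vdsm_interprets_as_mbps(sent))  # 40 Mbps actually consumed
~~~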
Reassigned:
ovirt-engine-4.3.0-0.4.master.20181230173049.gitef04cb4.el7.noarch
qemu-kvm-ev-2.12.0-18.el7_6.1.1.x86_64
vdsm-4.30.4-81.gitad6147e.el7.x86_64
libvirt-client-4.5.0-10.el7_6.3.x86_64

When setting the migration bandwidth limit to 10 Mbps, the actual value in the source host's 'virsh domjobinfo' is:

Memory bandwidth: 52.016 MiB/s

Observing the REST API at /ovirt-engine/api/clusters/ shows that the 10 Mbps value was set successfully:

~~~
<bandwidth>
    <assignment_method>custom</assignment_method>
    <custom_value>10</custom_value>
</bandwidth>
~~~

A UI fix is also needed: the value in the Webadmin Edit Cluster dialog (Migration Policy => Migration bandwidth limit) is shown in Mbps instead of MiB/s.
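For reference, the same cluster setting can be pulled over the REST API — a sketch only; the engine host name, cluster id, credentials, and CA path below are placeholders, not real values:

~~~
# Sketch of querying the cluster bandwidth setting over the oVirt
# REST API; URL, cluster id, and credentials are placeholders.
import requests

resp = requests.get(
    "https://engine.example.com/ovirt-engine/api/clusters/CLUSTER_ID",
    auth=("admin@internal", "PASSWORD"),
    headers={"Accept": "application/xml"},
    verify="/etc/pki/ovirt-engine/ca.pem",  # engine CA, if available
)
resp.raise_for_status()
# The response should contain the <bandwidth> element shown above,
# with <custom_value>10</custom_value>.
print(resp.text)
~~~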
Target release should be set once a package build is known to fix an issue. Since this bug is not in the MODIFIED state, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
Maybe I am wrong, but 10 Mbps divided by 2 (the number of concurrent migrations) and divided by 8 (to convert the units to MiB/s) is equal to 0.

According to './lib/vdsm/virt/migration.py', '_maxBandwidth' falls back to the default value of 'migration_max_bandwidth' in that case:

~~~
self._maxBandwidth = int(
    kwargs.get('maxBandwidth') or
    config.getint('vars', 'migration_max_bandwidth')
)
~~~

Which is in fact '52', according to './lib/vdsm/common/config.py.in':

~~~
('migration_max_bandwidth', '52',
 'Maximum bandwidth for migration, in MiBps, 0 means libvirt\'s '
 'default, since 0.10.x default in libvirt is unlimited'),
~~~

So I would say this behavior is expected and we are getting the correct bandwidth: 52 MiB/s.

To test this you need to select a maxBandwidth >= 16 Mbps, or you will get the value of the 'migration_max_bandwidth' configuration variable.
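Putting the two pieces together, a minimal sketch of how the effective per-migration bandwidth is derived — assuming plain integer division as in the figures above; the function name and structure are illustrative, not VDSM's:

~~~
# Sketch of the effective bandwidth computation described above.
MIGRATION_MAX_BANDWIDTH_DEFAULT = 52  # MiB/s, from config.py.in

def effective_bandwidth(user_mbps, concurrent_migrations=2):
    """Per-migration bandwidth in MiB/s for a given cluster setting."""
    # Integer division twice: split across concurrent migrations,
    # then convert Mbps -> MiB/s (treated as a plain factor of 8).
    value = user_mbps // concurrent_migrations // 8
    # 0 is falsy, so VDSM falls back to the configured default.
    return value or MIGRATION_MAX_BANDWIDTH_DEFAULT

print(effective_bandwidth(10))  # 0 -> falls back to 52 MiB/s
print(effective_bandwidth(16))  # 1 MiB/s, lowest non-default value
print(effective_bandwidth(30))  # 1 MiB/s
print(effective_bandwidth(33))  # 2 MiB/s
~~~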
I can confirm what Miguel writes in the last sentence above. When I set the migration bandwidth to 30, the bandwidth reported by virsh is around 1 MiB/s, which is the expected limit (30 // 2 // 8 == 1). If I change the migration bandwidth to 33, the actual bandwidth is around 2 MiB/s (33 // 2 // 8 == 2).

The values reported in the UI and the REST API are the same, and they should be in Mbps according to the documentation cited by Miguel at the beginning of the bug description.

Together with Miguel's explanations in Comment 0 and Comment 3, the migration bandwidth behavior looks correct to me. Meital, could you please clarify what exactly is wrong and why?
Hi Milan and Miguel,

After your explanation I now understand the behavior. As a customer, I would like to see an explanation of the value and the scale units (MiB/s vs. Mbps), so my suggestion is to add this explanation to the tooltip (i) or to the mouseover on the text box.

Miguel, I didn't understand how you saw 40 when you set 10 Mbps - in this case it should be 52, right?

Milan, why not have the engine send MiB/s instead of Mbps? (Meaning: why don't the engine and VDSM use the same scale units?)
(In reply to meital avital from comment #5)
> Milan, why not have the engine send MiB/s instead of Mbps? (Meaning: why
> don't the engine and VDSM use the same scale units?)

I don't know; perhaps the units were changed in Engine at some point without changing the API (nobody wants to change the API, to avoid breaking cross-version compatibility).

Since the original bug is apparently fixed, I'm moving the bug back to MODIFIED. If UI improvements are needed, let's have a separate bug for them.
> Hi Milan and Miguel,
>
> After your explanation I now understand the behavior. As a customer, I
> would like to see an explanation of the value and the scale units (MiB/s
> vs. Mbps), so my suggestion is to add this explanation to the tooltip (i)
> or to the mouseover on the text box.

I am not sure the customer needs to know about the unit conversion performed by the engine. Regarding the value, I guess we could explain the lower limit, but I wouldn't do that either, because nobody is going to use such a low bandwidth. Bear in mind that hypervisors have at least 1000 Mbps interfaces (I think 10000 Mbps is the most common nowadays), and I would say VMs with 8 or 16 GB of RAM are quite common too. Migrating a VM with that much RAM at such a low bandwidth will make the migration fail or, in the best case, take ages (see the rough numbers below).

> Miguel, I didn't understand how you saw 40 when you set 10 Mbps - in this
> case it should be 52, right?

Actually, I didn't use 10 Mbps in my local environment but 400 Mbps. Obviously, I wasn't aware of the lower limit at that time, or I wouldn't have written 10 Mbps in the bug report.

> Milan, why not have the engine send MiB/s instead of Mbps? (Meaning: why
> don't the engine and VDSM use the same scale units?)

I thought the same when I found the problem, and that is certainly another possibility. But if we did that, we would need to change not only the message in the dialog but also all the references in the documentation, which IIRC were a lot.
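To put rough numbers on the migration-time point above — a back-of-the-envelope sketch only; it assumes a static RAM image and ignores dirty-page retransmission, which makes real migrations slower:

~~~
# Back-of-the-envelope migration times for a 16 GiB VM at several
# bandwidths; real migrations take longer than this.
ram_mib = 16 * 1024  # 16 GiB of RAM, in MiB

# 1 MiB/s: the floor from the integer division above;
# 52 MiB/s: the VDSM default; 1192 MiB/s: roughly a 10 Gbps link.
for bw_mibps in (1, 52, 1192):
    minutes = ram_mib / bw_mibps / 60
    print(f"{bw_mibps:>5} MiB/s -> ~{minutes:.1f} minutes")
# 1 MiB/s -> ~273 min; 52 MiB/s -> ~5.3 min; 1192 MiB/s -> ~0.2 min
~~~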
Verified on:
ovirt-engine-4.3.0-0.4.master.20181230173049.gitef04cb4.el7.noarch
qemu-kvm-ev-2.12.0-18.el7_6.1.1.x86_64
vdsm-4.30.4-81.gitad6147e.el7.x86_64
libvirt-client-4.5.0-10.el7_6.3.x86_64
This bug is included in the oVirt 4.3.0 release, published on February 4th, 2019. Since the problem described in this bug report should be resolved in the oVirt 4.3.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.