1. Proposed title of this feature request
Improve the information in the migration summary message in the RHV UI
3. What is the nature and description of the request?
When migrating multiple virtual machines, the reported total time of each migration is nearly the same as the migration duration. It can be higher if there were retries.
But when a migration has to wait for other migrations to finish before it can start, this wait time is not added to the total time.
4. Why does the customer need this? (List the business requirements here)
The current information can lead to a misunderstanding of the whole migration process.
5. How would the customer like to achieve this? (List the functional requirements here)
The "total time of migration" should be the time between starting the migration process (clicking the migrate button) and the completion of the migration of the VM itself.
If I select multiple VMs (e.g. 10) and only 2 migrations are allowed in parallel, the start of the whole migration process is much earlier than the finish of the last migration.
6. For each functional requirement listed, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
Select 10 VMs to migrate. The total time of the last VM should be about 5 times that of the first migrated VM, because only 2 migrations are allowed in parallel and the last VM has to wait for the other VMs to finish.
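The expected numbers for this test can be sketched with a small simulation. This is only an illustration, not engine or vdsm code: it assumes all 10 requests are submitted at the same moment, that every migration takes the same 20 seconds (a made-up value), and that requests are served first-in, first-out with 2 slots. The function name `simulate_migrations` is hypothetical.

```python
import heapq

def simulate_migrations(durations, max_parallel=2):
    """Return (wait, duration, total) per VM for a FIFO queue with max_parallel slots.

    Assumption: every request is submitted at t=0.
    """
    free_at = [0.0] * max_parallel      # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    results = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest free slot; equals the wait time
        end = start + duration          # equals the total time, since submission is t=0
        heapq.heappush(free_at, end)
        results.append((start, duration, end))
    return results

# 10 VMs, 20 seconds each (hypothetical), 2 migrations in parallel
timings = simulate_migrations([20.0] * 10, max_parallel=2)
for i, (wait, duration, total) in enumerate(timings, start=1):
    print(f"vm{i}: waited {wait:.0f}s, duration {duration:.0f}s, total {total:.0f}s")
```

Under these assumptions the first VM has a total of 20 seconds and the last one a total of 100 seconds, while the duration stays at 20 seconds for every VM.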
7. Is there already an existing RFE upstream or in Red Hat Bugzilla?
8. Does the customer have any specific timeline dependencies and which release would they like to target (i.e. RHEL5, RHEL6)?
9. Is the sales team involved in this request and do they have any additional input?
10. List any affected packages or components.
11. Would the customer be able to assist in testing this functionality if implemented?
sure, it's just that it was already supposed to be like that. Arik, do we now do any throttling on engine side? We should initiate all the migrations at the same time and queue in vdsm
(In reply to Michal Skrivanek from comment #7)
> sure, it's just that it was already supposed to be like that. Arik, do we
> now do any throttling on engine side? We should initiate all the migrations
> at the same time and queue in vdsm
Well, there are several operations that may take time which will not be included in the current measurements:
1. the time it took to validate the input at the client side (should be negligible)
2. the time it took to process the request (that depends on the number of threads that are available for processing requests).
3. if there are multiple VMs being selected together, we execute them as multiple-actions, which may cause their validation to run in parallel - initiating the threads and waiting for results may introduce some overhead.
4. the time it took to validate the migration request (e.g., that there is an available host).
5. the time it took to schedule the destination host.
From that point on we start our 'timer'. The timer stops when the event reporting that the migration completed is produced. Another phase that may take some time - the infrastructure used to hold 2 connections with VDSM so theoretically if we're about to send 3 migration requests at the same time, the third will be sent to VDSM only after we get response from VDSM about one of the first two requests. I'm not sure if it has changed though.
Unfortunately, we cannot start the timer on the client side since the client side may not be in sync with the engine. I see the motivation for having such a measurement, but really, the time that the steps above are expected to take is negligible compared to what we measure today.
(In reply to Arik from comment #8)
> Another phase that may take some time -
> the infrastructure used to hold 2 connections with VDSM so theoretically if
> we're about to send 3 migration requests at the same time, the third will be
> sent to VDSM only after we get response from VDSM about one of the first two
> requests. I'm not sure if it has changed though.
well, it's queued on the semaphore inside a separate thread, so all 3 requests should be processed immediately. But looking at the bug description, it looks more like the requests are already queued up in the engine and are not sent to vdsm.
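To illustrate what "queued on the semaphore inside a separate thread" means for the two measurements, here is a minimal sketch. It is not vdsm's actual code: the limit of 2, the `migrate` function, and the 0.1 second sleep standing in for the real migration are all assumptions made for the example.

```python
import threading
import time

MAX_PARALLEL = 2                        # assumed limit, taken from the discussion
slots = threading.Semaphore(MAX_PARALLEL)
timings = {}                            # vm name -> (wait, duration, total) in seconds
lock = threading.Lock()

def migrate(vm, work_seconds):
    queued_at = time.monotonic()        # request accepted; "total" would start here
    with slots:                         # blocks while MAX_PARALLEL migrations are running
        started_at = time.monotonic()   # "duration" would start here
        time.sleep(work_seconds)        # stand-in for the real migration
        finished_at = time.monotonic()
    with lock:
        timings[vm] = (started_at - queued_at,
                       finished_at - started_at,
                       finished_at - queued_at)

threads = [threading.Thread(target=migrate, args=(f"vm{i}", 0.1)) for i in (1, 2, 3)]
for t in threads:
    t.start()                           # returns immediately for all three requests
for t in threads:
    t.join()

for vm, (wait, duration, total) in sorted(timings.items()):
    print(f"{vm}: wait {wait:.2f}s, duration {duration:.2f}s, total {total:.2f}s")
```

All three requests are accepted right away, but one of them spends roughly one migration's worth of time waiting on the semaphore. If the duration timer only starts after the semaphore is acquired, that wait shows up in the total and not in the duration.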
Well, most of the data is already included. Let me try to describe it with facts and an example.
I started a migration of 3 virtual machines at the same time.
2018-02-26 09:47:19.082-05 | 62 | Migration started (VM: vm1, Source: hostC, Destination: hostA, User: someone).
2018-02-26 09:47:19.334-05 | 62 | Migration started (VM: vm2, Source: hostC, Destination: hostA, User: someone).
2018-02-26 09:47:19.581-05 | 62 | Migration started (VM: vm3, Source: hostC, Destination: hostB, User: someone).
2018-02-26 09:47:41.499-05 | 63 | Migration completed (VM: vm1, Source: hostC, Destination: hostA, Duration: 22 seconds, Total: 22 seconds, Actual downtime: 228ms)
2018-02-26 09:47:42.637-05 | 63 | Migration completed (VM: vm2, Source: hostC, Destination: hostA, Duration: 23 seconds, Total: 23 seconds, Actual downtime: 284ms)
2018-02-26 09:48:07.171-05 | 63 | Migration completed (VM: vm3, Source: hostC, Destination: hostB, Duration: 47 seconds, Total: 47 seconds, Actual downtime: (N/A))
As you can see, the migration started for all three systems at the same time
=> 2018-02-26 09:47:19 (ignoring milliseconds)
But "Duration" and "Total" are always the same.
While "Total" shows the correct value, the "Duration" for vm3 should be lower, probably somewhere around 20 seconds, as the duration should only measure the actual migration.
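The arithmetic behind this can be checked from the timestamps in the events above. The sketch below recomputes "Total" per VM and then estimates the duration of vm3. The log does not record when vm3 actually started transferring, so the estimate rests on an assumption: that vm3 started as soon as vm1 completed and freed one of the 2 slots on the source host.

```python
from datetime import datetime

# Timestamps copied from the events above (UTC offset dropped).
started = {
    "vm1": "2018-02-26 09:47:19.082",
    "vm2": "2018-02-26 09:47:19.334",
    "vm3": "2018-02-26 09:47:19.581",
}
completed = {
    "vm1": "2018-02-26 09:47:41.499",
    "vm2": "2018-02-26 09:47:42.637",
    "vm3": "2018-02-26 09:48:07.171",
}

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")

# "Total": from the "Migration started" event to the "Migration completed" event.
total = {vm: (parse(completed[vm]) - parse(started[vm])).total_seconds()
         for vm in started}

# ASSUMPTION: vm3 began transferring when vm1 completed and freed a slot.
assumed_vm3_start = parse(completed["vm1"])
estimated_vm3_duration = (parse(completed["vm3"]) - assumed_vm3_start).total_seconds()
estimated_vm3_wait = total["vm3"] - estimated_vm3_duration

for vm, seconds in total.items():
    print(f"{vm}: total {seconds:.3f}s")
print(f"vm3: estimated duration {estimated_vm3_duration:.3f}s, "
      f"estimated wait {estimated_vm3_wait:.3f}s")
```

The recomputed totals (22.417, 23.303 and 47.590 seconds) match the reported 22, 23 and 47 seconds. Under the assumption above, vm3 would have waited about 21.9 seconds and migrated for about 25.7 seconds.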
Does this make sense to you?
yes, thanks a lot. That matches the implementation, it seems the total is just lost somewhere in the process later on
1. Migrate one VM and check that the duration is reported at the end of the migration
2. Migrate 5 VMs and check that the duration is reported for each VM
3. Migrate a VM with each migration policy and check that the policy has no effect on the report
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.