Bug 1539777
| Summary: | Improve Migration summary message | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Steffen Froemer <sfroemer> |
| Component: | ovirt-engine | Assignee: | Shmuel Melamud <smelamud> |
| Status: | CLOSED ERRATA | QA Contact: | Israel Pinto <ipinto> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.1.6 | CC: | ahadas, apinnick, lsurette, michal.skrivanek, rbalakri, Rhev-m-bugs, sfroemer, smelamud, srevivo, ykaul, ylavi |
| Target Milestone: | ovirt-4.2.3 | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Previously, the migration summary message showed the same value for 'total migration time' and 'actual migration time'. This value was calculated as the period of time from the start of execution of the migration command until the end of the entire migration process. In the current release, 'actual migration time' is calculated from the first migration progress event to the end of the entire migration process. If the migration command is run several times, 'actual migration time' reflects only the last run, while the 'total migration time' reflects the total time for all runs. (See the sketch below this table.) | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-05-15 17:47:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
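The Doc Text above distinguishes two values that the fix separates. The following is a minimal sketch of that arithmetic, with hypothetical timestamps; none of the variable names come from the ovirt-engine code.

```python
from datetime import datetime

# Hypothetical timestamps, for illustration only (not engine internals):
command_started = datetime(2018, 2, 26, 9, 47, 19)  # migration command starts executing
first_progress  = datetime(2018, 2, 26, 9, 47, 45)  # first migration progress event
migration_done  = datetime(2018, 2, 26, 9, 48, 7)   # end of the entire migration process

# 'Total' covers everything from the start of command execution to completion.
total_migration_time = migration_done - command_started

# After the fix, 'Actual' starts at the first progress event, so if the
# command is rerun, only the last run is counted.
actual_migration_time = migration_done - first_progress

print(f"Total: {total_migration_time.total_seconds():.0f}s, "
      f"Actual: {actual_migration_time.total_seconds():.0f}s")
# Total: 48s, Actual: 22s
```

Before the fix, both fields of the summary message reported the first of these two values.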
Description
Steffen Froemer
2018-01-29 15:22:27 UTC
Michal Skrivanek (comment #7):

Sure, it's just that it was already supposed to be like that. Arik, do we now do any throttling on the engine side? We should initiate all the migrations at the same time and queue in vdsm.

Arik (comment #8):

(In reply to Michal Skrivanek from comment #7)
> Sure, it's just that it was already supposed to be like that. Arik, do we now do any throttling on the engine side? We should initiate all the migrations at the same time and queue in vdsm.

Well, there are several operations that may take time and will not be included in the current measurements:

1. The time it took to validate the input on the client side (should be negligible).
2. The time it took to process the request (this depends on the number of threads available for processing requests).
3. If multiple VMs are selected together, we execute them as multiple actions, which may cause their validations to run in parallel; initiating the threads and waiting for results may introduce some overhead.
4. The time it took to validate the migration request (e.g., that there is an available host).
5. The time it took to schedule the destination host.

From that time we start our 'timer'. The timer stops when the event that the migration completed is produced.

Another phase that may take some time: the infrastructure used to hold 2 connections with VDSM, so theoretically, if we're about to send 3 migration requests at the same time, the third will be sent to VDSM only after we get a response from VDSM about one of the first two requests. I'm not sure if that has changed, though.

Unfortunately, we cannot start the timer on the client side, since the client side may not be in sync with the engine. I see the motivation for having such a measurement, but really, the time the steps above are expected to take is negligible compared to what we measure today.

(In reply to Arik from comment #8)
> Another phase that may take some time: the infrastructure used to hold 2 connections with VDSM, so theoretically, if we're about to send 3 migration requests at the same time, the third will be sent to VDSM only after we get a response from VDSM about one of the first two requests. I'm not sure if that has changed, though.

Well, it's queued on the semaphore inside a separate thread, so all 3 requests should be processed immediately. But looking at the bug description, it looks more like the requests already queued up in the engine and were not sent to vdsm.

Well, most of the data is already included. Let me try to describe it with facts and an example. I started a migration of 3 virtual machines at the same time:

```
2018-02-26 09:47:19.082-05 | 62 | Migration started (VM: vm1, Source: hostC, Destination: hostA, User: someone).
2018-02-26 09:47:19.334-05 | 62 | Migration started (VM: vm2, Source: hostC, Destination: hostA, User: someone).
2018-02-26 09:47:19.581-05 | 62 | Migration started (VM: vm3, Source: hostC, Destination: hostB, User: someone).
2018-02-26 09:47:41.499-05 | 63 | Migration completed (VM: vm1, Source: hostC, Destination: hostA, Duration: 22 seconds, Total: 22 seconds, Actual downtime: 228ms)
2018-02-26 09:47:42.637-05 | 63 | Migration completed (VM: vm2, Source: hostC, Destination: hostA, Duration: 23 seconds, Total: 23 seconds, Actual downtime: 284ms)
2018-02-26 09:48:07.171-05 | 63 | Migration completed (VM: vm3, Source: hostC, Destination: hostB, Duration: 47 seconds, Total: 47 seconds, Actual downtime: (N/A))
```

As you can see, the migration started for all three systems at the same time, 2018-02-26 09:47:19 (ignoring milliseconds). But in fact, "Duration" and "Total" are always the same. While "Total" does show the correct value, the "Duration" for vm3 should be lower, probably somewhere around ~20 seconds, as the duration should only measure the real migration progress. Does this make sense to you?

Yes, thanks a lot. That matches the implementation; it seems the total is just lost somewhere in the process later on.

Verified with:
Engine version: 4.2.3.2-0.1.el7

Steps:
1. Migrate one VM and check that the duration is reported at the end of the migration.
2. Migrate 5 VMs and check that the duration is reported for each VM.
3. Migrate a VM with each migration policy and check that there is no effect on the report.

PASS

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1488
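As a complement to the verification steps above, one way to inspect the "Migration completed" audit events (code 63 in the excerpt earlier in this bug) is through the oVirt Python SDK. This is only a sketch: the engine URL and credentials are placeholders, and the regular expression simply matches the message format shown in the log excerpt above.

```python
import re
import ovirtsdk4 as sdk

# Placeholder connection details; adjust for the engine under test.
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,  # use ca_file=... instead in a real setup
)

events_service = connection.system_service().events_service()

# Look at recent events and keep the "Migration completed" ones (code 63).
for event in events_service.list(max=200):
    if event.code == 63:
        match = re.search(r'Duration: (\d+) seconds, Total: (\d+) seconds',
                          event.description)
        if match:
            duration, total = map(int, match.groups())
            # With the fix, Duration (actual) should not exceed Total.
            print(event.time, duration, total, duration <= total)

connection.close()
```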