Bug 1282744
Summary: | Actual downtime - Sometimes libvirt doesn't report 'downtime_net' in jobStats while migrating VM/s | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Michael Burman <mburman>
Component: | libvirt | Assignee: | Jiri Denemark <jdenemar>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 7.2 | CC: | dyuan, fjin, jdenemar, mburman, rbalakri, zpeng
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | virt | |
Fixed In Version: | libvirt-1.3.3-1.el7 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-11-03 18:30:54 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Michael Burman
2015-11-17 10:11:33 UTC
Could you attach libvirtd debug logs from both source and destination hosts?

Created attachment 1096025
source log

Created attachment 1096052
destination log
Fixed upstream by v1.3.2-86-gcb483a6:

commit cb483a68fdc3503efc9b0996570e58aaf0c11c17
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Feb 23 10:47:01 2016 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Mar 8 16:26:00 2016 +0100

    qemu: Fix a race when computing migration downtime

    Computing a total downtime during a migration requires us to store a time stamp when guest CPUs get stopped. The value (and all other statistics) is then transferred to the destination to compute the downtime. Because the stopped time stamp is stored by a STOP event handler while the statistics which will be sent over to the destination are copied synchronously within qemuMigrationWaitForCompletion. Depending on the timing of STOP and MIGRATION events, we may end up copying (and transferring) statistics without the stopped time stamp set. Let's make sure we always use the correct time stamp.

    https://bugzilla.redhat.com/show_bug.cgi?id=1282744

    Signed-off-by: Jiri Denemark <jdenemar>

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

Tested with pure libvirt:
libvirt-2.0.0-8.el7.x86_64
qemu-kvm-rhev-2.6.0-24.el7.x86_64

Steps:
1. Prepare two machines.
2. Loop live migration 60 times.
3. Check the downtime after every migration, on both source and target.

The statistics always report the actual downtime.
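The race described in the commit message above can be illustrated with a small, self-contained model. This is only a sketch, not libvirt code: `JobStats`, `on_stop_event`, `snapshot_for_destination` and `migrate` are hypothetical names, the time stamps are made up, and the `patch_snapshot` branch is just one assumed way of making sure the transferred copy carries the stopped time stamp. It is not a description of how the upstream commit actually does it.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class JobStats:
    """Hypothetical stand-in for the per-job migration statistics."""
    stopped_ms: Optional[int] = None   # time stamp taken when guest CPUs stop
    downtime_ms: Optional[int] = None  # computed later on the destination


def on_stop_event(current: JobStats, now_ms: int) -> None:
    """Model of the STOP event handler: records when the CPUs stopped."""
    current.stopped_ms = now_ms


def snapshot_for_destination(current: JobStats) -> JobStats:
    """Model of the synchronous copy of statistics sent to the destination."""
    return replace(current)


def compute_downtime(received: JobStats, resumed_ms: int) -> JobStats:
    """Destination side: downtime is only computable with a stopped stamp."""
    if received.stopped_ms is not None:
        received.downtime_ms = resumed_ms - received.stopped_ms
    return received


def migrate(stop_before_snapshot: bool, patch_snapshot: bool) -> JobStats:
    """Run one modelled migration with a chosen STOP/snapshot ordering."""
    current = JobStats()
    if stop_before_snapshot:
        on_stop_event(current, now_ms=1000)
        sent = snapshot_for_destination(current)
    else:
        # Racy ordering: the copy is taken before the STOP handler ran.
        sent = snapshot_for_destination(current)
        on_stop_event(current, now_ms=1000)
        if patch_snapshot:
            # Assumed mitigation: refresh the copy before it is transferred.
            sent.stopped_ms = current.stopped_ms
    return compute_downtime(sent, resumed_ms=1300)


if __name__ == "__main__":
    print("STOP first:         ", migrate(True, False).downtime_ms)
    print("snapshot first:     ", migrate(False, False).downtime_ms)
    print("snapshot + refresh: ", migrate(False, True).downtime_ms)
```

In this model the downtime goes missing only when the snapshot is taken before the STOP handler runs, which matches the "sometimes" in the bug summary.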
Tested with RHV; both source and target use this build:
libvirt-2.0.0-8.el7.x86_64
vdsm-4.18.13-1.el7ev.x86_64
3.10.0-505.el7.x86_64

Ran ping-pong migration between 2 RHEL 7.3 servers and checked the event log in the UI, which reports:

Migration completed (VM: n2, Source: A, Destination: B, Duration: 53 seconds, Total: 53 seconds, Actual downtime: 368ms)
Migration completed (VM: n2, Source: B, Destination: A, Duration: 53 seconds, Total: 53 seconds, Actual downtime: 310ms)
Migration completed (VM: n2, Source: A, Destination: B, Duration: 54 seconds, Total: 54 seconds, Actual downtime: 365ms)
Migration completed (VM: n2, Source: B, Destination: A, Duration: 55 seconds, Total: 55 seconds, Actual downtime: 302ms)
Migration completed (VM: n2, Source: A, Destination: B, Duration: 53 seconds, Total: 53 seconds, Actual downtime: 373ms)
Migration completed (VM: n2, Source: B, Destination: A, Duration: 53 seconds, Total: 53 seconds, Actual downtime: 303ms)
Migration completed (VM: n2, Source: A, Destination: B, Duration: 53 seconds, Total: 53 seconds, Actual downtime: 387ms)
Migration completed (VM: n2, Source: B, Destination: A, Duration: 53 seconds, Total: 53 seconds, Actual downtime: 299ms)
Migration completed (VM: n2, Source: A, Destination: B, Duration: 54 seconds, Total: 54 seconds, Actual downtime: 389ms)
Migration completed (VM: n2, Source: B, Destination: A, Duration: 54 seconds, Total: 54 seconds, Actual downtime: 266ms)

All migrations report the actual downtime. Works as expected; moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHSA-2016-2577.html
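When repeating the RHV verification over many migrations, the "Migration completed" events can be checked mechanically rather than by eye. The sketch below assumes the event text has been copied out of the UI into a string and that every event follows exactly the format quoted in this report. `EVENT_RE`, `parse_downtimes` and `missing_downtime` are hypothetical helper names, not part of RHV, vdsm or libvirt.

```python
import re

# Pattern for the event format quoted in this report (an assumption: other
# RHV versions may word the event differently).
EVENT_RE = re.compile(
    r"Migration completed \(VM: (?P<vm>[^,]+), Source: (?P<src>[^,]+), "
    r"Destination: (?P<dst>[^,]+), Duration: (?P<duration>\d+) seconds, "
    r"Total: (?P<total>\d+) seconds, Actual downtime: (?P<downtime>\d+)ms\)"
)

# Two events copied from the verification comment above.
SAMPLE = (
    "Migration completed (VM: n2, Source: A, Destination: B, "
    "Duration: 53 seconds, Total: 53 seconds, Actual downtime: 368ms)\n"
    "Migration completed (VM: n2, Source: B, Destination: A, "
    "Duration: 53 seconds, Total: 53 seconds, Actual downtime: 310ms)\n"
)


def parse_downtimes(log_text):
    """Return (source, destination, downtime in ms) for each complete event."""
    return [
        (m["src"], m["dst"], int(m["downtime"]))
        for m in EVENT_RE.finditer(log_text)
    ]


def missing_downtime(log_text):
    """Count 'Migration completed' events without a numeric downtime."""
    total = log_text.count("Migration completed")
    return total - len(parse_downtimes(log_text))


if __name__ == "__main__":
    for src, dst, downtime in parse_downtimes(SAMPLE):
        print(f"{src} -> {dst}: {downtime} ms")
    print("events without downtime:", missing_downtime(SAMPLE))
```

A non-zero result from `missing_downtime` would point to the symptom this bug describes; zero means every event carried a numeric downtime.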