Bug 1212077 - Add support for QMP event notification when migration finishes
Summary: Add support for QMP event notification when migration finishes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 1212032
Depends On: 580006
Blocks:
 
Reported: 2015-04-15 13:59 UTC by Jiri Denemark
Modified: 2015-11-19 06:28 UTC
CC List: 22 users

Fixed In Version: libvirt-1.2.17-3.el7
Doc Type: Enhancement
Doc Text:
Clone Of: 580006
Environment:
Last Closed: 2015-11-19 06:28:54 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID: Red Hat Product Errata RHBA-2015:2202
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2015-11-19 08:17:58 UTC

Description Jiri Denemark 2015-04-15 13:59:35 UTC
+++ This bug was initially created as a clone of Bug #580006 +++

Description of problem:
QEMU will be sending a QMP event when migration finishes, and libvirt needs to
start using it to avoid polling if possible.
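
For reference, a minimal sketch of the QMP interaction involved, using virsh qemu-monitor-command against an example domain named r71 and assuming a QEMU new enough to support the "events" migration capability (libvirt is expected to do this itself once the feature is implemented):

# virsh qemu-monitor-command r71 '{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"events","state":true}]}}'
{"return":{}}

Once enabled, QEMU emits an event roughly like the following whenever the migration status changes (sample output, abbreviated):

{"timestamp":{"seconds":1440146815,"microseconds":903000},"event":"MIGRATION","data":{"status":"completed"}}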


Version-Release number of selected component (if applicable):
libvirt-1.2.8-16.el7
qemu-kvm-rhev-2.2*


How reproducible:


Steps to Reproduce:
1. enable debug logs for libvirt (an example configuration is sketched after these steps)
2. start a domain
3. migrate it to another host (don't use progress indication in virsh)
4. watch logs to see if libvirt is constantly polling for migration status
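
For step 1, a minimal sketch of a debug-log configuration; the log file path matches the one used in the verification comments below, the filter values are only an example, and both keys may already be present in your libvirtd.conf:

# cat >> /etc/libvirt/libvirtd.conf <<'EOF'
log_filters="1:qemu 1:libvirt"
log_outputs="1:file:/var/log/libvirtd_debug.log"
EOF
# systemctl restart libvirtd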


Actual results:
Lots of debug messages are printed every 50ms when libvirt sends the query-migrate
command to QEMU.


Expected results:
No query-migrate command is sent to QEMU (unless explicitly requested by a
client).
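
One way to check this, assuming the debug log configured in the reproduction steps, is to watch how often the polling function shows up while a migration is running; after the fix it should appear only when a client explicitly asks for job statistics, not continuously (this is essentially the check used in the verification comments below):

# tailf /var/log/libvirtd_debug.log | grep qemuMonitorGetMigrationStatus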


Additional info:

Comment 1 Jiri Denemark 2015-04-15 14:00:11 UTC
*** Bug 1212032 has been marked as a duplicate of this bug. ***

Comment 2 Jiri Denemark 2015-06-02 12:51:33 UTC
Patches sent upstream for review: https://www.redhat.com/archives/libvir-list/2015-June/msg00064.html

Comment 3 Jiri Denemark 2015-06-11 08:18:41 UTC
The third version of the series is now ACKed upstream: https://www.redhat.com/archives/libvir-list/2015-June/msg00451.html

I will push the patches once the corresponding QEMU part is applied upstream.

Comment 4 Jiri Denemark 2015-07-09 20:11:28 UTC
Implemented upstream by v1.2.17-66-ga60ee61..v1.2.17-70-g3409f5b:

commit a60ee613c499c255faf7c36e7cd8582b5a79ca57
Author: Jiri Denemark <jdenemar>
Date:   Thu May 28 13:35:52 2015 +0200

    qemu_monitor: Wire up MIGRATION event
    
    Thanks to Juan's work QEMU finally emits an event whenever migration
    state changes.
    
    Signed-off-by: Jiri Denemark <jdenemar>

commit 3df4d2a45aaafd0fe2191c648c83a609726638f7
Author: Jiri Denemark <jdenemar>
Date:   Tue Jul 7 18:17:25 2015 +0200

    qemu: Enable migration events on QMP monitor
    
    Even if QEMU supports migration events it doesn't send them by default.
    We have to enable them by calling migrate-set-capabilities. Let's enable
    migration events everytime we can and clear QEMU_CAPS_MIGRATION_EVENT in
    case migrate-set-capabilities does not support events.
    
    Signed-off-by: Jiri Denemark <jdenemar>

commit 66c95964a5706c223f4857e55396318c857abb6f
Author: Jiri Denemark <jdenemar>
Date:   Tue May 26 13:42:06 2015 +0200

    qemuDomainGetJobStatsInternal: Support migration events
    
    When QEMU supports migration events the qemuDomainJobInfo structure will
    no longer be updated with migration statistics. We have to enter a job
    and explicitly ask QEMU every time virDomainGetJob{Info,Stats} is
    called.
    
    Signed-off-by: Jiri Denemark <jdenemar>

commit 6d2edb6a42d0d41971fcf53c16cdc785d36c3ecc
Author: Jiri Denemark <jdenemar>
Date:   Fri May 29 08:37:59 2015 +0200

    qemu: Update migration state according to MIGRATION event
    
    We don't need to call query-migrate every 50ms when we get the current
    migration state via MIGRATION event.
    
    Signed-off-by: Jiri Denemark <jdenemar>

commit 3409f5bc4e0ea870982f4eee3bb6f97fe9f9f883
Author: Jiri Denemark <jdenemar>
Date:   Fri May 29 08:38:44 2015 +0200

    qemu: Wait for migration events on domain condition
    
    Since we already support the MIGRATION event, we just need to make sure
    the domain condition is signalled whenever a p2p connection drops or the
    domain is paused due to IO error and we can avoid waking up every 50 ms
    to check whether something happened.
    
    Signed-off-by: Jiri Denemark <jdenemar>
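
The second commit above turns the capability on via migrate-set-capabilities. A quick way to double-check on a running domain that it actually got enabled is to query the migration capabilities over QMP, assuming a QEMU that supports the "events" capability (example domain name r71; output abbreviated and only illustrative):

# virsh qemu-monitor-command r71 '{"execute":"query-migrate-capabilities"}'
{"return":[{"state":false,"capability":"xbzrle"},...,{"state":true,"capability":"events"}]}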

Comment 8 Michal Privoznik 2015-07-13 12:31:00 UTC
I'm afraid this introduced a small regression. Fortunately, I have a patch for that:

https://www.redhat.com/archives/libvir-list/2015-July/msg00503.html

Moving back to ASSIGNED.

Comment 9 Michal Privoznik 2015-07-13 13:29:00 UTC
Pushed upstream:

commit 45cc2fca5c97d7e6d6e00389c6f459b5660d0f1f
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jul 13 14:15:03 2015 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Mon Jul 13 15:07:12 2015 +0200

    qemuProcessHandleMigrationStatus: Update migration status more frequently
    
    After Jirka's migration patches libvirt is listening on migration
    events from qemu instead of actively polling on the monitor. There is,
    however, a little regression (introduced in 6d2edb6a42d0d41). The
    problem is, the current status of migration job is updated in
    qemuProcessHandleMigrationStatus if and only if migration job was
    started. But eventually every asynchronous job may result in
    migration. Therefore, since this job is not strictly a
    migration job, internal state was not updated and later checks failed:
    
      virsh # save fedora22 /tmp/fedora22_ble.save
      error: Failed to save domain fedora22 to /tmp/fedora22_ble.save
      error: operation failed: domain save job: is not active
    
    Signed-off-by: Michal Privoznik <mprivozn>

v1.2.17-90-g45cc2fc

To POST:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2015-July/msg00590.html

Comment 10 Fangge Jin 2015-08-21 11:03:23 UTC
Reproduced the issue described in comment #9 on build:
  libvirt-1.2.17-2.el7
  qemu-kvm-rhev-2.3.0-18.el7.x86_64

# virsh save r71 /tmp/r71.save
error: Failed to save domain r71 to /tmp/r71.save
error: operation failed: domain save job: is not active


Verified it on build:
  libvirt-1.2.17-5.el7
  qemu-kvm-rhev-2.3.0-18.el7.x86_64

# virsh save r71 /tmp/r71.save

Domain r71 saved to /tmp/r71.save

Comment 11 Fangge Jin 2015-08-21 11:12:50 UTC
Reproduced this bug on build:
  libvirt-1.2.8-16.el7.x86_64
  qemu-kvm-rhev-2.2.0-9.el7.x86_64

Got this log 20 times per minute during migration:
2015-08-21 03:26:17.583+0000: 9231: debug : qemuMonitorGetMigrationStatus:2271 : mon=0x7fbe2c272b00


Verified this bug on build:
  libvirt-1.2.17-5.el7
  qemu-kvm-rhev-2.3.0-18.el7.x86_64

Scenario 1 (migration completes successfully without any error):

On source host:
# virsh migrate r71 qemu+ssh://10.66.6.6/system --live

# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 08:46:31.522+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to setup
2015-08-21 08:46:31.524+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to active
2015-08-21 08:46:34.829+0000: 32696: debug : qemuMonitorGetMigrationStatus:2132 : mon:0x7fcf60020230 vm:0x7fcf2026e4c0 json:1 fd:26
2015-08-21 08:46:55.903+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to completed
2015-08-21 08:46:55.903+0000: 32694: debug : qemuMonitorGetMigrationStatus:2132 : mon:0x7fcf60020230 vm:0x7fcf2026e4c0 json:1 fd:26

# virsh event r71 --loop --all
event 'lifecycle' for domain r71: Suspended Paused
event 'lifecycle' for domain r71: Stopped Migrated

Get domjobinfo during migration:
# virsh domjobinfo r71
Job type:         Unbounded   
Time elapsed:     5310         ms
Data processed:   94.084 MiB
Data remaining:   764.305 MiB
Data total:       1.103 GiB
Memory processed: 94.084 MiB
Memory remaining: 764.305 MiB
Memory total:     1.103 GiB
Memory bandwidth: 20.001 MiB/s
Constant pages:   69495       
Normal pages:     23886       
Normal data:      93.305 MiB
Expected downtime: 300          ms
Setup time:       3            ms

On target host:
# virsh event r71 --loop --all
event 'agent-lifecycle' for domain r71: state: 'disconnected' reason: 'domain started'
event 'lifecycle' for domain r71: Started Migrated
event 'lifecycle' for domain r71: Resumed Migrated
event 'agent-lifecycle' for domain r71: state: 'connected' reason: 'channel event'

# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 08:46:31.497+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to active
2015-08-21 08:46:55.927+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to completed
2015-08-21 08:46:55.927+0000: 5107: debug : qemuProcessHandleMigrationStatus:1536 : got MIGRATION event without a migration job


Scenario 2 (p2p disconnect during migration):

On source host:
Migrate via p2p, and execute the command "iptables -A INPUT -s 10.66.6.6 -j DROP" during migration; after 30s, the migration fails:
# virsh migrate r71 qemu+ssh://10.66.6.6/system --live --p2p
error: operation failed: Lost connection to destination host

# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 08:58:12.042+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to setup
2015-08-21 08:58:12.045+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to active
2015-08-21 08:58:42.034+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to cancelling
2015-08-21 08:58:42.035+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to cancelled

# virsh event r71 --loop --all

On target host:
# virsh event r71 --loop --all
event 'agent-lifecycle' for domain r71: state: 'disconnected' reason: 'domain started'
event 'lifecycle' for domain r71: Started Migrated

# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 08:58:12.018+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to active


Scenario 3 (I/O error during migration):
a) Without the --abort-on-error option, migration can succeed even if an I/O error occurs during migration.
On source host:
# virsh migrate r71 qemu+ssh://10.66.6.6/system --live

# virsh event r71 --loop --all
event 'io-error' for domain r71: /var/lib/libvirt/images/mypool/disk1.img (virtio-disk1) pause
event 'io-error-reason' for domain r71: /var/lib/libvirt/images/mypool/disk1.img (virtio-disk1) pause due to enospc
event 'lifecycle' for domain r71: Suspended I/O Error
event 'io-error' for domain r71: /var/lib/libvirt/images/mypool/disk1.img (virtio-disk1) pause
event 'io-error-reason' for domain r71: /var/lib/libvirt/images/mypool/disk1.img (virtio-disk1) pause due to enospc
event 'lifecycle' for domain r71: Stopped Migrated

# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 09:52:47.957+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to setup
2015-08-21 09:52:47.958+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to active
2015-08-21 09:55:31.130+0000: 32688: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7fcf2026e4c0 r71 changed state to completed
2015-08-21 09:55:31.131+0000: 32694: debug : qemuMonitorGetMigrationStatus:2132 : mon:0x7fcf60010bd0 vm:0x7fcf2026e4c0 json:1 fd:26

On target host:
# virsh event r71 --loop --all
event 'agent-lifecycle' for domain r71: state: 'disconnected' reason: 'domain started'
event 'lifecycle' for domain r71: Started Migrated
event 'lifecycle' for domain r71: Resumed Migrated
event 'agent-lifecycle' for domain r71: state: 'connected' reason: 'channel event'


# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 09:52:47.930+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to active
2015-08-21 09:55:31.134+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to completed
2015-08-21 09:55:31.134+0000: 5107: debug : qemuProcessHandleMigrationStatus:1536 : got MIGRATION event without a migration job


b) With the --abort-on-error option, migration is cancelled if an I/O error occurs during migration.
On source host:
# virsh migrate r71 qemu+ssh://10.66.6.6/system --live --abort-on-error
error: internal error: early end of file from monitor: possible problem:
2015-08-21T10:50:17.782271Z qemu-kvm: error while loading state section id 2(ram)
2015-08-21T10:50:17.782445Z qemu-kvm: load of migration failed: Input/output error

# virsh event r71 --loop --all
event 'io-error' for domain r71: /var/lib/libvirt/images/mypool/disk1.img (virtio-disk1) pause
event 'io-error-reason' for domain r71: /var/lib/libvirt/images/mypool/disk1.img (virtio-disk1) pause due to enospc
event 'lifecycle' for domain r71: Suspended I/O Error
...
event 'lifecycle' for domain r71: Resumed Unpaused
event 'lifecycle' for domain r71: Resumed Migrated
...
event 'lifecycle' for domain r71: Suspended I/O Error
...

# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 10:50:15.676+0000: 14267: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7f4d30229550 r71 changed state to setup
2015-08-21 10:50:15.680+0000: 14267: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7f4d30229550 r71 changed state to active
2015-08-21 10:50:17.871+0000: 14267: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7f4d30229550 r71 changed state to cancelling
2015-08-21 10:50:17.871+0000: 14267: debug : qemuProcessHandleMigrationStatus:1535 : Migration of domain 0x7f4d30229550 r71 changed state to cancelled

On target host:
# tailf /var/log/libvirtd_debug.log | grep -E 'qemuProcessHandleMigrationStatus|qemuMonitorGetMigrationStatus'
2015-08-21 10:50:15.647+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to active
2015-08-21 10:50:17.782+0000: 5107: debug : qemuProcessHandleMigrationStatus:1532 : Migration of domain 0x7f596401ea50 r71 changed state to failed

# virsh event r71 --loop --all
event 'agent-lifecycle' for domain r71: state: 'disconnected' reason: 'domain started'
event 'lifecycle' for domain r71: Started Migrated
event 'lifecycle' for domain r71: Stopped Failed

Comment 12 Fangge Jin 2015-08-21 11:16:29 UTC
Is there anything I missed in my verification steps?

Comment 13 Fangge Jin 2015-08-21 11:23:43 UTC
And a small question: why is there always this log on the target host after migration finishes:

2015-08-21 09:55:31.134+0000: 5107: debug : qemuProcessHandleMigrationStatus:1536 : got MIGRATION event without a migration job

Comment 14 Jiri Denemark 2015-08-31 12:34:58 UTC
It would be nice to also check that migration works as expected with the qemu-kvm package (without the "-rhev" suffix), with which libvirt should still be using polling for the migration status.
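
A simple way to confirm which QEMU package a host is actually running before repeating the scenarios (just an ordinary package query, nothing libvirt-specific):

# rpm -qa 'qemu-kvm*'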

Comment 15 Fangge Jin 2015-09-01 05:21:00 UTC
Verified with the qemu-kvm package, versions:
libvirt-1.2.17-6.el7.x86_64
qemu-kvm-1.5.3-101.el7.x86_64

Steps:
1. # virsh migrate rhel7d1 qemu+ssh://10.66.6.6/system --live --verbose
Migration: [100 %]

Got this log 20 times per minute during migration, which indicates that libvirt still polls for the migration status 20 times per minute:
debug : qemuMonitorGetMigrationStatus:2132 : mon:0x7f98b800fe90 vm:0x7f98742254f0 json:1 fd:26

2. Get domjobinfo during migration:
# virsh domjobinfo rhel7d1
Job type:         Unbounded   
Time elapsed:     4594         ms
Data processed:   61.614 MiB
Data remaining:   709.559 MiB
Data total:       1.102 GiB
Memory processed: 61.614 MiB
Memory remaining: 709.559 MiB
Memory total:     1.102 GiB
Memory bandwidth: 15.029 MiB/s
Constant pages:   91761       
Normal pages:     15541       
Normal data:      60.707 MiB
Expected downtime: 30           ms
Setup time:       2            ms


So the verification with the qemu-kvm package passed.

Comment 17 errata-xmlrpc 2015-11-19 06:28:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html

