Bug 916067
| Summary: | cancelling migration with Ctrl+C during block migration (full or incremental disk copy), then migrating again, destroys the domain | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | hongming <honzhang> |
| Component: | qemu-kvm | Assignee: | Paolo Bonzini <pbonzini> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 6.4 | CC: | areis, chayang, cwei, dgilbert, dyuan, juzhang, knoel, mkenneth, mzhan, neil, pbonzini, qzhang, rbalakri, rpacheco, shu, virt-maint, weizhan, xwei, ydu, zpeng |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-0.12.1.2-2.467.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-07-22 06:02:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
We don't support migration with block migration, but I'll try to reproduce upstream and come back.

Reproduced with QEMU:

```
$ /usr/libexec/qemu-kvm -drive file=/vm/test.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -S -vnc :0 -monitor stdio
$ /usr/libexec/qemu-kvm-rhel -drive file=/vm/test2.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -S -vnc :1 -incoming tcp:localhost:12345 -monitor stdio
```

On the first instance:

```
(qemu) migrate -d -b tcp:localhost:12345
(qemu) migrate_cancel
(qemu) migrate -d -b tcp:localhost:12345
(qemu) qemu-system-x86_64: /home/pbonzini/work/redhat-git/qemu-kvm-rhel6/block.c:3915: bdrv_set_dirty_tracking: Assertion `!bs->dirty_bitmap' failed.
Aborted
```

Works upstream. I'm leaving it open in case OpenStack starts using migration with non-shared storage.

It would be nice to see this working. We're using block migration to create VM backups with zero downtime (simply "cont" the original VM after the migration completes), but this bug throws a wrench into those plans. I can confirm that it works with upstream qemu as well.

I was able to narrow down the issue:

- bad: qemu-kvm-0.12.1.2-2.295.el6_3.8.x86_64.rpm
- good: qemu-kvm-0.12.1.2-2.295.el6_3.2.x86_64.rpm

So it broke somewhere between those two versions.
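The failed assertion can be modeled in a few lines. This is a sketch of the suspected lifecycle bug, not qemu's actual code: `migrate -b` enables per-device dirty tracking, and if `migrate_cancel` never frees the dirty bitmap, the next `migrate -b` trips the enable-while-already-enabled assertion and aborts the process. The `BlockDriverState` class and helper names below are illustrative only.

```python
# Minimal model of the failure mode -- NOT qemu's actual code.
# bdrv_set_dirty_tracking() asserts the bitmap does not already
# exist; if a cancelled block migration never frees it, the next
# "migrate -b" fails that assertion and the process aborts.

class BlockDriverState:
    def __init__(self):
        self.dirty_bitmap = None  # None while dirty tracking is off

def set_dirty_tracking(bs, enable):
    if enable:
        # models the C assertion seen in the backtrace above
        assert bs.dirty_bitmap is None, "bdrv_set_dirty_tracking: !bs->dirty_bitmap"
        bs.dirty_bitmap = bytearray(4096)
    else:
        bs.dirty_bitmap = None

def block_migration(bs, cancel_cleans_up):
    """Start a block migration, then cancel it."""
    set_dirty_tracking(bs, True)       # migration start enables tracking
    if cancel_cleans_up:
        set_dirty_tracking(bs, False)  # fixed behaviour: cancel frees the bitmap

bs = BlockDriverState()
block_migration(bs, cancel_cleans_up=False)      # first migrate + buggy cancel
try:
    block_migration(bs, cancel_cleans_up=False)  # second migrate
except AssertionError as e:
    print("aborted:", e)                         # mirrors the Abort on RHEL 6
```

When the cancel path frees the bitmap (as upstream does), the second migration enables tracking cleanly and no assertion fires.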
Looking at the changelog, I suspect the issue is in here somewhere:

```
* Mon Oct 15 2012 Michal Novotny <minovotn> - qemu-kvm-0.12.1.2-2.295.el6_3.4
- kvm-bitmap-add-a-generic-bitmap-and-bitops-library.patch [bz#852458]
- kvm-bitops-fix-test_and_change_bit.patch [bz#852458]
- kvm-add-hierarchical-bitmap-data-type-and-test-cases.patch [bz#852458]
- kvm-block-implement-dirty-bitmap-using-HBitmap.patch [bz#852458]
- kvm-block-return-count-of-dirty-sectors-not-chunks.patch [bz#852458]
- kvm-block-allow-customizing-the-granularity-of-the-dirty.patch [bz#852458]
- kvm-mirror-use-target-cluster-size-as-granularity.patch [bz#852458]
- kvm-virtio-console-Fix-failure-on-unconnected-pty.patch [bz#861049]
- Resolves: bz#852458 (copy cluster-sized blocks to the target of live storage migration)
- Resolves: bz#861049 (Fedora 16 and 17 guests hang during boot)
```

@minovotn, any ideas?

*** Bug 983415 has been marked as a duplicate of this bug. ***

Based on https://rhn.redhat.com/rhn/errata/details/Details.do?eid=25114 I tested this out with qemu-kvm-0.12.1.2-2.415.el6_5.3. While the behaviour has changed, it has unfortunately regressed further: in the newer version, the first migration completes, but the source VM is frozen. By frozen I mean the source VM uses 100% CPU time and does not respond over the monitor interface (not even a prompt is given); it must be killed with `kill`. Since the first migration rendered the source VM inoperable, I was unable to test a second migration.

*** Bug 1102566 has been marked as a duplicate of this bug. ***
*** Fix included in qemu-kvm-0.12.1.2-2.467.el6 ***

Reproduced on qemu-kvm-rhev-0.12.1.2-2.460.el6.x86_64:

```
[root@dhcp-9-242 staf-kvm-devel]# virsh migrate --verbose --live --persistent 2k12r2 qemu+ssh://10.66.8.191/system --copy-storage-inc
root@10.66.8.191's password:
Migration: [  6 %]^Cerror: operation aborted: migration job: canceled by client
[root@dhcp-9-242 staf-kvm-devel]# virsh migrate --verbose --live --persistent 2k12r2 qemu+ssh://10.66.8.191/system --copy-storage-inc
root@10.66.8.191's password:
error: Unable to read from monitor: Connection reset by peer
[root@dhcp-9-242 staf-kvm-devel]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     2k12r2                         shut off
```

Verified on qemu-kvm-rhev-0.12.1.2-2.469.el6.x86_64:

```
[root@dhcp-9-242 staf-kvm-devel]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     2k12r2                         shut off
[root@dhcp-9-242 staf-kvm-devel]# virsh start 2k12r2
Domain 2k12r2 started
[root@dhcp-9-242 staf-kvm-devel]# virsh migrate --verbose --live --persistent 2k12r2 qemu+ssh://10.66.8.191/system --copy-storage-inc
root@10.66.8.191's password:
Migration: [  6 %]^Cerror: operation aborted: migration job: canceled by client
[root@dhcp-9-242 staf-kvm-devel]# virsh migrate --verbose --live --persistent 2k12r2 qemu+ssh://10.66.8.191/system --copy-storage-inc
root@10.66.8.191's password:
Migration: [100 %]
[root@dhcp-9-242 staf-kvm-devel]#
```

Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1275.html
Created attachment 703301 [details]: libvirtd_debug.log

Description of problem:
Cancelling the migration with Ctrl+C during block migration (full disk copy or incremental disk copy) and then migrating again destroys the domain.

Version-Release number of selected component (if applicable):
- libvirt-0.10.2-18.el6.x86_64
- qemu-kvm-rhev-0.12.1.2-2.355.el6.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Start the domain:

```
# virsh start rhel6.3
Domain rhel6.3 started
```

2. Start an incremental block migration and cancel it with Ctrl+C:

```
# virsh migrate --verbose --live --persistent rhel6.3 qemu+ssh://10.66.6.76/system --copy-storage-inc
root@10.66.6.76's password:
Migration: [  9 %]^Cerror: operation aborted: migration job: canceled by client
```

3. Check that the domain is still running:

```
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 28    rhel6.3                        running
```

4. Migrate again:

```
# virsh migrate --verbose --live --persistent rhel6.3 qemu+ssh://10.66.6.76/system --copy-storage-inc
root@10.66.6.76's password:
error: Unable to read from monitor: Connection reset by peer
```

5. The domain has been destroyed:

```
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel6.3                        shut off
```

The error log is as follows:

```
2013-02-21 08:13:24.962+0000: 9552: error : qemuMonitorIORead:513 : Unable to read from monitor: Connection reset by peer
2013-02-21 08:13:24.962+0000: 9552: debug : qemuMonitorIO:646 : Error on monitor Unable to read from monitor: Connection reset by peer
2013-02-21 08:13:24.962+0000: 9552: debug : virEventPollUpdateHandle:146 : EVENT_POLL_UPDATE_HANDLE: watch=40 events=12
2013-02-21 08:13:24.962+0000: 9552: debug : virEventPollInterruptLocked:697 : Skip interrupt, 1 -747411360
2013-02-21 08:13:24.962+0000: 9552: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7f4ab801a640
2013-02-21 08:13:24.962+0000: 9552: debug : qemuMonitorIO:680 : Triggering error callback
2013-02-21 08:13:24.962+0000: 9552: debug : qemuProcessHandleMonitorError:342 : Received error on 0x7f4ab810bae0 'rhel6.3'
```

Actual results:
As above.

Expected results:
The domain works fine after the migration is cancelled.

Additional info: