Bug 671100 - possible migration failure due to erroneous interpretation of subsection
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Paolo Bonzini
QA Contact: Virtualization Bugs
Keywords: Triaged
Duplicates: 629453
Depends On:
Blocks: 580954
Reported: 2011-01-20 05:36 EST by Paolo Bonzini
Modified: 2011-05-19 09:01 EDT

See Also:
Fixed In Version: qemu-kvm-0.12.1.2-2.143.el6
Doc Type: Bug Fix
Doc Text:
Cause: a possible ambiguity in the migration format is handled incorrectly by the receiving end. Consequence: in very rare cases migration may fail. Fix: the ambiguity is resolved correctly. Result: incoming migration data is interpreted correctly and migration succeeds.
Last Closed: 2011-05-19 07:34:39 EDT

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:0534
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm security, bug fix, and enhancement update
Last Updated: 2011-05-19 07:20:36 EDT

Description Paolo Bonzini 2011-01-20 05:36:30 EST
See http://permalink.gmane.org/gmane.comp.emulators.qemu/87767 and thread.

> Although it's rare to happen in live migration, when the head of a
> byte stream contains 0x05 which is the marker of subsection, the
> loader gets corrupted because vmstate_subsection_load() continues even
> the device doesn't require it.  This patch adds a checker whether
> subsection is needed, and skips following routines if not needed.

This was reported with Kemari, but it is not limited to it.  After a VMS_STRUCT field, a 0x05 byte that belongs to the parent data stream is wrongly parsed as the start of a subsection.

Having subsections nested under a VMS_STRUCT is simply not going to work, so the patch linked above is the right solution.
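
The ambiguity can be illustrated with a toy loader. This is a minimal sketch in Python, not QEMU's code: the `Description` class, the `load` function, the `check_needed` flag, and the byte layout (marker, one-byte name length, name) are all hypothetical. Only the 0x05 marker value comes from the report.

```python
from dataclasses import dataclass, field

SUBSECTION_MARKER = 0x05  # marker value quoted in the report


class LoadError(Exception):
    """Raised when the toy loader cannot interpret the stream."""


@dataclass
class Description:
    """Hypothetical stand-in for a device state description."""
    name: str
    payload_len: int  # size of the fixed payload in bytes
    subsections: dict = field(default_factory=dict)  # name -> Description


def load(stream: bytes, pos: int, desc: Description, check_needed: bool) -> int:
    """Consume desc's payload and any subsections; return the new position."""
    pos += desc.payload_len
    if check_needed and not desc.subsections:
        # Behaviour with the check: this description declares no
        # subsections, so the next byte is left for the parent.
        return pos
    # Behaviour without the check: any 0x05 byte is taken as a marker.
    while pos < len(stream) and stream[pos] == SUBSECTION_MARKER:
        if pos + 1 >= len(stream):
            raise LoadError("truncated subsection header")
        name_len = stream[pos + 1]
        name = stream[pos + 2:pos + 2 + name_len].decode("ascii", "replace")
        sub = desc.subsections.get(name)
        if sub is None:
            raise LoadError(f"unknown subsection {name!r} in {desc.name}")
        pos = load(stream, pos + 2 + name_len, sub, check_needed)
    return pos


if __name__ == "__main__":
    nested = Description("nested", payload_len=2)
    # Two bytes of nested payload, then parent data that starts with 0x05.
    stream = b"\xaa\xbb" + b"\x05\x01\x02"
    print(load(stream, 0, nested, check_needed=True))
    try:
        load(stream, 0, nested, check_needed=False)
    except LoadError as err:
        print("load failed:", err)
```

With `check_needed=False` the loader treats the parent's 0x05 byte as a subsection marker and fails. With `check_needed=True` it stops at the end of the nested payload and leaves that byte for the parent.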
Comment 3 Paolo Bonzini 2011-02-01 04:01:13 EST
The bug is very rare; we can provide a patched package that will always fail without this patch and always pass with it.  Is that okay?
Comment 7 Mike Cao 2011-02-15 00:26:59 EST
(In reply to comment #3)
> The bug is very rare; we can provide a patched package that will always fail
> without this patch and always pass with it.  Is that okay?

I will use this workaround to verify this bug. Could you provide me with the scratch build?
Mike
Comment 8 Paolo Bonzini 2011-02-15 11:31:37 EST
I placed them at http://people.redhat.com/pbonzini/bz671100/

The ".pbtest" rpms won't pass a save/restore (virsh save/virsh restore), the ".pbfixed" rpms will.

Unfortunately, due to bug 677712, you won't be able to restore a ".pbtest" vm with the ".pbfixed" rpms, which would also be a nice test.
Comment 9 Shaolong Hu 2011-02-18 03:42:04 EST
Reproduced on qemu-kvm-0.12.1.2-2.113.el6.pbtest.x86_64.rpm from http://people.redhat.com/pbonzini/bz671100/ with the following steps.

Reproduce Procedure:
---------------------
1. boot guest A with:

/usr/libexec/qemu-kvm -M rhel6.0.0 -enable-kvm -m 2G -smp 2 -name RHEL-Server-6.0_64_raw -uuid `uuidgen` -rtc base=utc -boot order=cd -drive file=./RHEL-Server-6.0_64_raw,if=none,id=drive-virtio-disk0,format=raw,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,script=/etc/qemu-ifup,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:7b:a2:fa -usb -device usb-tablet,id=input0 -vnc :10 -monitor stdio

2. boot guest B with:

/usr/libexec/qemu-kvm -M rhel6.0.0 -enable-kvm -m 2G -smp 2 -name RHEL-Server-6.0_64_raw -uuid `uuidgen` -rtc base=utc -boot order=cd -drive file=./RHEL-Server-6.0_64_raw,if=none,id=drive-virtio-disk0,format=raw,cache=none,werror=stop,rerror=stop -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,script=/etc/qemu-ifup,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:7b:a2:fa -usb -device usb-tablet,id=input0 -vnc :10 -monitor stdio -incoming tcp:0:5555

3. in the monitor of guest A:
   (qemu) migrate -d tcp:xx.xx.xx.xx:5555
4. in the monitor of guest A:
   (qemu) info migrate

Actual results:
----------------
After step 4, info migrate reports that the migration failed; nothing is logged in dmesg on guest A or B.


Verified this bug on qemu-kvm-0.12.1.2-2.113.el6.pbfixed.x86_64.rpm from http://people.redhat.com/pbonzini/bz671100/ and on qemu-kvm-0.12.1.2-2.146.el6 with the same steps as above.

Actual results:
----------------
After step 4, info migrate reports that the migration completed.

Conclusion:
-------------
According to the above results, this bug has been resolved.
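
Step 4 is checked by eye in the steps above. When the test is repeated many times, the status could instead be extracted from captured monitor text. A minimal sketch in Python, assuming the output contains a line of the form `Migration status: <word>`; the report does not show the actual output, so both that line format and the `migration_status` helper name are assumptions.

```python
import re
from typing import Optional


def migration_status(info_migrate_output: str) -> Optional[str]:
    """Return the status word from captured `info migrate` text, or None.

    Assumes a line such as "Migration status: completed"; adjust the
    pattern if the monitor prints something different.
    """
    match = re.search(r"^Migration status:\s*(\S+)",
                      info_migrate_output, re.MULTILINE)
    return match.group(1) if match else None
```

A caller would treat `completed` as a pass and `failed` as a reproduction of the bug; `None` means the expected line was not found in the captured text.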
Comment 11 Dor Laor 2011-03-24 06:31:29 EDT
*** Bug 629453 has been marked as a duplicate of this bug. ***
Comment 12 Paolo Bonzini 2011-05-05 09:18:02 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: a possible ambiguity in the migration format is handled incorrectly by the receiving end.

Consequence: in very rare cases migration may fail.

Fix: the ambiguity is resolved correctly.

Result: incoming migration data is interpreted correctly and migration succeeds.
Comment 13 errata-xmlrpc 2011-05-19 07:34:39 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
Comment 14 errata-xmlrpc 2011-05-19 09:01:07 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html
