Bug 1978526 - Storage is not copied at all when doing VM live migration with --copy-storage-inc
Summary: Storage is not copied at all when doing VM live migration with --copy-storage-inc
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1978716
 
Reported: 2021-07-02 05:12 UTC by Fangge Jin
Modified: 2022-01-13 04:30 UTC (History)

Fixed In Version: libvirt-7.6.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1978716
Environment:
Last Closed: 2021-12-07 21:57:54 UTC
Type: Bug
Target Upstream Version: 7.6.0
Embargoed:


Attachments
libvirtd log (85.91 KB, application/x-bzip)
2021-07-02 05:12 UTC, Fangge Jin
The qmp log of src and dest hosts (14.10 KB, application/gzip)
2021-07-02 08:19 UTC, Han Han

Description Fangge Jin 2021-07-02 05:12:02 UTC
Created attachment 1797017 [details]
libvirtd log

Description of problem:
As subject

Version-Release number of selected component (if applicable):
qemu-kvm-6.0.0-7.el9.x86_64
libvirt-7.4.0-1.el9.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Start a vm with local storage

2. Pre-create the disk image on target host manually
# qemu-img create -f qcow2 /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2 10G

# qemu-img info /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2 -U
image: /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 196 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false


3. Migrate the vm with --copy-storage-inc
# virsh migrate avocado-vt-vm1 qemu+ssh://******/system --live --verbose --copy-storage-inc

4. Check the disk image on the dest host; its disk size is still small
# qemu-img info /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2 -U
image: /var/lib/avocado/data/avocado-vt/images/jeos-27-x86_64.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 5.01 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

5. Try to log in to the vm on the dest host; the login fails

6. Check libvirtd.log; no blockdev-add command is found



Actual results:
As description


Expected results:
Disk should be copied during migration when --copy-storage-inc is used


Additional info:
Can't reproduce with --copy-storage-all

Comment 1 Fangge Jin 2021-07-02 05:12:26 UTC
Created attachment 1797029 [details]
domain xml

Comment 2 Han Han 2021-07-02 08:19:43 UTC
Created attachment 1797066 [details]
The qmp log of src and dest hosts

Reproduced on libvirt-7.4.0-1.el9.x86_64 qemu-kvm-6.0.0-4.el9.x86_64.
See the qmp log files generated by qemu-monitor.stp.

Comment 3 Peter Krempa 2021-07-02 11:58:59 UTC
The problem happens because the wrong constant was used in the logical expression that determines whether storage migration needs to take place.

The original condition is:

bool storageMigration = flags & (VIR_MIGRATE_NON_SHARED_DISK | QEMU_MONITOR_MIGRATE_NON_SHARED_INC);

The correct one is:

bool storageMigration = flags & (VIR_MIGRATE_NON_SHARED_DISK | VIR_MIGRATE_NON_SHARED_INC);

QEMU_MONITOR_MIGRATE_NON_SHARED_INC equals 0x04
VIR_MIGRATE_NON_SHARED_INC equals 0x80
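
A minimal sketch of why the wrong constant disables the copy only for --copy-storage-inc. The DEMO_ constants are local stand-ins so that the example compiles without libvirt headers: the 0x04 and 0x80 values come from this comment, while 0x01 for the live flag and 0x40 for VIR_MIGRATE_NON_SHARED_DISK are assumed from libvirt's public virDomainMigrateFlags enum. The helper names are hypothetical and are not libvirt functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the real constants (assumed values, see lead-in). */
enum {
    DEMO_VIR_MIGRATE_LIVE                    = 1 << 0, /* 0x01, --live */
    DEMO_VIR_MIGRATE_NON_SHARED_DISK         = 1 << 6, /* 0x40, --copy-storage-all */
    DEMO_VIR_MIGRATE_NON_SHARED_INC          = 1 << 7, /* 0x80, --copy-storage-inc */
    DEMO_QEMU_MONITOR_MIGRATE_NON_SHARED_INC = 1 << 2  /* 0x04, internal monitor flag */
};

/* Original condition: the mask mixes a public flag with an internal one. */
static bool storage_migration_buggy(unsigned long flags)
{
    return (flags & (DEMO_VIR_MIGRATE_NON_SHARED_DISK |
                     DEMO_QEMU_MONITOR_MIGRATE_NON_SHARED_INC)) != 0;
}

/* Corrected condition: both mask bits come from the public flag set. */
static bool storage_migration_fixed(unsigned long flags)
{
    return (flags & (DEMO_VIR_MIGRATE_NON_SHARED_DISK |
                     DEMO_VIR_MIGRATE_NON_SHARED_INC)) != 0;
}

/* Checks mirroring the behaviour reported in this bug. */
static void run_checks(void)
{
    unsigned long inc = DEMO_VIR_MIGRATE_LIVE | DEMO_VIR_MIGRATE_NON_SHARED_INC;
    unsigned long all = DEMO_VIR_MIGRATE_LIVE | DEMO_VIR_MIGRATE_NON_SHARED_DISK;

    assert(!storage_migration_buggy(inc)); /* incremental copy is skipped */
    assert(storage_migration_buggy(all));  /* full copy still works */
    assert(storage_migration_fixed(inc));
    assert(storage_migration_fixed(all));
}
```

With the buggy mask, bit 0x80 is never tested, so a --copy-storage-inc request leaves storageMigration false, whereas a --copy-storage-all request (0x40) still sets it. This matches the observation that only the incremental variant failed.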

Comment 4 Peter Krempa 2021-07-12 14:42:27 UTC
Fixed upstream:

commit b249fa78718cd6c21109b385b568ecd3d6a3a8dd
Author: Peter Krempa <pkrempa>
Date:   Fri Jul 2 14:17:58 2021 +0200

    NEWS: Mention implications of the bug in migration code
    
    Wrong flag use could have user-visible implications. Mention the fix.
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

commit f58349c9c6d26d98e7c8c195b1160d0c0cfff080
Author: Peter Krempa <pkrempa>
Date:   Fri Jul 2 14:17:57 2021 +0200

    qemu: migration: Use correct flag constant for enabling storage migration
    
    The 'storageMigration' flag is supposed to be true if storage migration
    is requested, which is based on VIR_MIGRATE_NON_SHARED_DISK or
    VIR_MIGRATE_NON_SHARED_INC flags. The assignment to the variable used
    QEMU_MONITOR_MIGRATE_NON_SHARED_INC (0x04) instead of
    VIR_MIGRATE_NON_SHARED_INC (0x80), caused libvirtd to skip the actual
    copy of data.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1978526
    Fixes: da69f4b2084bff140238e450e264d6036ebef898
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

v7.5.0-44-gb249fa7871

Comment 5 Han Han 2021-07-16 03:25:42 UTC
Tested on libvirt v7.5.0-97-g133d05a15e and qemu-6.0.0-1.fc35.x86_64 with the steps from comment 0: PASS

Comment 9 Han Han 2021-08-23 02:43:40 UTC
Verified on libvirt-7.6.0-2.el9.x86_64 qemu-kvm-6.0.0-12.el9.x86_64:
1. Prepare an nbd export based on an image with an OS installed
2. Create an image backed by the nbd export on both hosts
# qemu-img info /var/lib/libvirt/images/backing.qcow2 
image: /var/lib/libvirt/images/backing.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 26 MiB
cluster_size: 65536
backing file: json:{"file":{"driver":"nbd","server":{"type":"inet","host":"10.0.150.247","port":"10809"}}}
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

3. Refresh the storage pool that contains the backing image

4. Start a VM with the backing image, then migrate it with --copy-storage-inc
# virsh migrate fedora34 qemu+ssh://root.150.247/system --live --verbose --copy-storage-inc
Migration: [100 %]

5. Log in to the VM and write some data to the disk
# dd if=/dev/zero of=file bs=10M count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 2.77315 s, 37.8 MB/s

No I/O error.

Comment 12 Han Han 2022-01-13 04:30:53 UTC
Covered by RHEL-120247

