VM xml, dest xml, libvirtd log

Description of problem:
As subject: storage migration (--copy-storage-all) to an NVMe disk fails
with "cannot precreate storage for disk type 'nvme'".

Version-Release number of selected component (if applicable):
libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64
qemu-kvm-4.2.0-17.module+el8.2.0+6141+0f540f16.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM
2. Copy-storage migrate to an NVMe disk:

➜  ~ xmllint --xpath //disk /tmp/nvme.xml
<disk type="nvme" device="disk">
  <driver name="qemu" type="raw" cache="directsync" error_policy="report" rerror_policy="report" io="native" discard="unmap" detect_zeroes="unmap"/>
  <source type="pci" managed="yes" namespace="1">
    <address domain="0x0000" bus="0x44" slot="0x00" function="0x0"/>
  </source>
  <backingStore/>
  <target dev="sda" bus="sata"/>
  <address type="drive" controller="0" bus="0" target="0" unit="0"/>
</disk>

➜  ~ virsh migrate demo qemu+ssh://XXXXX/system --xml /tmp/nvme.xml --verbose --copy-storage-all
error: internal error: cannot precreate storage for disk type 'nvme'

Actual results:
As above

Expected results:
VM migrated

Additional info:
Referring to https://libvirt.org/formatdomain.html, the migration should be
supported:

  The difference between <disk type='nvme'> and <hostdev/> is that the
  latter is plain host device assignment with all its limitations (e.g. no
  live migration), while the former makes hypervisor to run the NVMe disk
  through hypervisor's block layer thus enabling all features provided by
  the layer (e.g. snapshots, domain migration, etc.).
I don't think we will be able to pre-create storage for NVMe, and we are definitely not planning on adding a storage driver for NVMe disks. Without pre-creation (i.e. if you format your destination drive correctly beforehand) the storage migration should work.
Agreed with Peter here; I'm not quite sure how one would pre-create a PCI device. NVMe disks are PCI devices after all and thus are different from file-based disks (where the corresponding file can be created anytime). In theory we could create NVMe namespaces, but that is not trivial and certainly supported by only a very few disks (in fact I haven't seen one yet). Therefore I think this should be closed.
(In reply to Michal Privoznik from comment #3)
> Agreed with Peter here, I'm not quite sure how would one pre-create a PCI
> device. NVMe disks are PCI devices after all and thus are different to file
> based disks (where the corresponding file can be created anytime). In theory
> we could create NVMe namespaces, but that is not trivial and certainly
> supported by only a very few disks (in fact I haven't seen one yet).
> Therefore I think this should be closed.

Note that the host NVMe address

  <address domain="0x0000" bus="0x44" slot="0x00" function="0x0"/>

exists on the host. It should be reused by libvirt, just like a
copy-storage migration to a host block device.
(In reply to Han Han from comment #4)

Ah, so maybe what we should do is silently ignore precreate of an NVMe disk
if requested. I mean, for network disks we already do that:

https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_migration.c#L223

and maybe we can do the same for NVMe disks.
Patch proposed upstream:

https://www.redhat.com/archives/libvir-list/2020-May/msg00253.html

but according to my review I think we need a slightly better approach. My
previous comment simplified things too much.
(In reply to Michal Privoznik from comment #6)
> Patch proposed upstream:
>
> https://www.redhat.com/archives/libvir-list/2020-May/msg00253.html
>
> but according to my review I think we need slightly better approach. My
> previous comment simplified things too much.

The v2 doesn't work:

➜  ~ virsh -k0 migrate demo qemu+ssh://hp-dl385g10-15.lab.eng.pek2.redhat.com/system --verbose --copy-storage-all --xml /tmp/demo.xml
error: internal error: cannot precreate storage for disk type 'nvme'

It does not meet the condition:

    /* Skip disks we don't want to migrate and already existing disks. */
    if (!qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks) ||
        (diskSrcPath && virFileExists(diskSrcPath))) {
        continue;
    }

Gdb debug process:

(gdb) s
qemuMigrationDstPrecreateStorage (incremental=false, migrate_disks=0x0, nmigrate_disks=0, nbd=0x7fffd800f5f0, vm=0x7fffd8008be0) at ../../src/qemu/qemu_migration.c:312
312         if (!(conn = virGetConnectStorage()))
(gdb) n
315         for (i = 0; i < nbd->ndisks; i++) {
(gdb)
318             g_autofree char *nvmePath = NULL;
(gdb)
320             VIR_DEBUG("Looking up disk target '%s' (capacity=%llu)",
(gdb)
323             if (!(disk = virDomainDiskByTarget(vm->def, nbd->disks[i].target))) {
(gdb)
330             if (disk->src->type == VIR_STORAGE_TYPE_NVME) {
(gdb)
331                 virPCIDeviceAddressGetSysfsFile(&disk->src->nvme->pciAddr, &nvmePath);
(gdb)
332                 diskSrcPath = nvmePath;
(gdb)
338             if (!qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks) ||
(gdb) p nvmePath
$1 = 0x7fffd801a970 "/sys/bus/pci/devices/0000:45:00.0"
(gdb) n
343             if (incremental) {
(gdb) p diskSrcPath
(In reply to Han Han from comment #7)
> The v2 doesn't work:
>
> ➜  ~ virsh -k0 migrate demo
> qemu+ssh://hp-dl385g10-15.lab.eng.pek2.redhat.com/system --verbose
> --copy-storage-all --xml /tmp/demo.xml
>
> error: internal error: cannot precreate storage for disk type 'nvme'

This if() statement on line 338 looks like this:

    /* Skip disks we don't want to migrate and already existing disks. */
    if (!qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks) ||
        (diskSrcPath && virFileExists(diskSrcPath))) {
        continue;
    }

Since diskSrcPath is set to nvmePath (and thus both are set), then I guess
virFileExists() must have returned false for the path. Can you please check
if that is so?

I've just tested what I was suggesting and it works for me. There is one
quirk - the device has to exist. In my case:

$ test -e /sys/bus/pci/devices/0000\:02\:00.0/; echo $?
0
(In reply to Michal Privoznik from comment #8)
> Since diskSrcPath is set to nvmePath (and thus both are set), then I guess
> virFileExists() must have returned false for the path. Can you please check
> if that is so?

Yes. qemuMigrationAnyCopyDisk() returns true and virFileExists() returns
false:

338             if (!qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks) ||
(gdb) s
qemuMigrationAnyCopyDisk (nmigrate_disks=nmigrate_disks@entry=0, migrate_disks=migrate_disks@entry=0x0, disk=<optimized out>, disk=<optimized out>) at ../../src/qemu/qemu_migration.c:283
283         if (nmigrate_disks) {
(gdb) finish
Run till exit from #0  qemuMigrationAnyCopyDisk (nmigrate_disks=nmigrate_disks@entry=0, migrate_disks=migrate_disks@entry=0x0, disk=<optimized out>, disk=<optimized out>) at ../../src/qemu/qemu_migration.c:283
0x00007fffae5766d2 in qemuMigrationDstPrecreateStorage (incremental=false, migrate_disks=0x0, nmigrate_disks=0, nbd=0x7fffdc00e4b0, vm=0x7fffdc019dd0) at ../../src/qemu/qemu_migration.c:338
338             if (!qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks) ||
Value returned is $4 = true
(gdb) s
virFileExists (path=path@entry=0x7fffdc01cca0 "/sys/bus/pci/devices/0000:45:00.0") at ../../src/util/virfile.c:1876
1876        return access(path, F_OK) == 0;
(gdb) finish
Run till exit from #0  virFileExists (path=path@entry=0x7fffdc01cca0 "/sys/bus/pci/devices/0000:45:00.0") at ../../src/util/virfile.c:1876
0x00007fffae5766eb in qemuMigrationDstPrecreateStorage (incremental=false, migrate_disks=0x0, nmigrate_disks=0, nbd=0x7fffdc00e4b0, vm=0x7fffdc019dd0) at ../../src/qemu/qemu_migration.c:338
338             if (!qemuMigrationAnyCopyDisk(disk, nmigrate_disks, migrate_disks) ||
Value returned is $5 = false

> I've just tested what I was suggesting and it works for me. There is one
> quirk - the device has to exist. In my case:
>
> $ test -e /sys/bus/pci/devices/0000\:02\:00.0/; echo $?
> 0
(In reply to Han Han from comment #9)
> virFileExists (path=path@entry=0x7fffdc01cca0 "/sys/bus/pci/devices/0000:45:00.0") at ../../src/util/virfile.c:1876
> 1876        return access(path, F_OK) == 0;
> [...]
> Value returned is $5 = false

This means that the path "/sys/bus/pci/devices/0000:45:00.0" is not
accessible, i.e. the PCI device doesn't exist, and therefore libvirt tries
to pre-create the NVMe disk, which fails. Are you sure the path really
exists? E.g. ls -lZ $path or something similar.
(In reply to Michal Privoznik from comment #10)
> This means that the path "/sys/bus/pci/devices/0000:45:00.0" is not
> accessible. This means that the PCI device doesn't exist and therefore
> libvirt tries to pre-create the NVMe disk which fails. Are you sure the path
> really exist? e.g. ls -lZ $path or something similar.

Sorry, I used the wrong NVMe PCI address in the v2 test above... In fact it
works:

➜  ~ virsh -k0 migrate demo qemu+ssh://hp-dl385g10-15.lab.eng.pek2.redhat.com/system --verbose --copy-storage-all --xml /tmp/demo.xml
Migration: [100 %]
(In reply to Han Han from comment #11)
> Sorry I used the wrong nvme pci address on v2 test above...
> In fact it works:
> ➜ ~ virsh -k0 migrate demo
> qemu+ssh://hp-dl385g10-15.lab.eng.pek2.redhat.com/system --verbose
> --copy-storage-all --xml /tmp/demo.xml
> Migration: [100 %]

Great! Can you send the patch then, please? I think this is the best
behaviour we can expect. Creating NVMe disks is an oxymoron (we could
potentially create namespaces, but let's not go down that hole), and if a
correct PCI address is provided then migration succeeds; if a wrong PCI
address is provided then libvirt prints a sensible error, in my opinion.
I've posted the patch upstream: https://www.redhat.com/archives/libvir-list/2020-May/msg01152.html
Pushed to master as:

a5a297f387 qemu: Skip pre-creation of NVMe disks

v6.4.0-4-ga5a297f387
Created attachment 1715033 [details]
the VM xml of src and dest

Verified on:
libvirt-6.6.0-5.module+el8.3.0+8092+f9e72d7e.x86_64
qemu-kvm-5.1.0-7.module+el8.3.0+8099+dba2fe3e.x86_64

Steps:
1. Start a VM and prepare a destination VM xml with an NVMe disk:

<domain>
...
  <disk type='nvme' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <target dev='vda' bus='virtio'/>
    <source type='pci' managed='yes' namespace='1'>
      <address domain='0x0000' bus='0x44' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
  </disk>
...
</domain>

2. Format the NVMe disk on the dest host with the same virtual size as the
src disk:

# virsh domblklist q35
 Target   Source
---------------------------------------------
 vda      /var/lib/libvirt/images/q35.qcow2

# qemu-img info -U /var/lib/libvirt/images/q35.qcow2
image: /var/lib/libvirt/images/q35.qcow2
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 850 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

➜  ~ qemu-img create /dev/nvme0n1 10G -f qcow2
Formatting '/dev/nvme0n1', fmt=qcow2 cluster_size=65536 compression_type=zlib size=10737418240 lazy_refcounts=off refcount_bits=16

3. Migrate the VM with --copy-storage-all and --xml:

# virsh migrate q35 qemu+ssh://XXXX/system --copy-storage-all --xml /tmp/src-dst-XMLs/q35-dst.xml

If the virtual size of the src disk is not equal to that on the dest, it
will hit the error of
https://bugzilla.redhat.com/show_bug.cgi?id=1822538#c14
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5137