Description of problem:
VFIO post-copy migration is not currently supported. This bug tracks future support for VFIO post-copy migration.

Version-Release number of selected component (if applicable):
qemu-kvm-8.0.0-9.el9.x86_64
5.14.0-344.2520_944365724.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Update the MT2910 CX-7 firmware and make sure the VF has VF_MIGRATION_MODE set to MIGRATION_ENABLED

$ sudo flint -d 0000:b1:00.0 query full
FW Version: 28.37.1014
FW Release Date: 4.5.2023
Part Number: MCX713106AC-VEA_Ax
Description: NVIDIA ConnectX-7 HHHL Adapter Card; 200GbE; Dual-port QSFP112; PCIe 5.0 x16; Crypto Enabled; Secure Boot Enabled

$ sudo mstconfig -d b1:00.0 query VF_MIGRATION_MODE
Device #1:
----------
Device type: ConnectX7
Name: MCX713106AC-VEA_Ax
Description: NVIDIA ConnectX-7 HHHL Adapter Card; 200GbE; Dual-port QSFP112; PCIe 5.0 x16; Crypto Enabled; Secure Boot Enabled
Device: b1:00.0
Configurations: Next Boot
VF_MIGRATION_MODE MIGRATION_ENABLED(2)

Note: The minimum firmware version for migration is 28.36.1010

The cmd we use to enable the VF_MIGRATION_MODE:
# sudo mstconfig -d b1:00.1 set VF_MIGRATION_MODE=2
# reboot

2. Create 2 VFs and set up the VFs for vfio migration

create 2 VFs
# sudo sh -c "echo 2 > /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs"

set up the VFs' MAC addresses
# sudo sh -c "ip link set ens4f0np0 vf 0 mac 52:54:00:93:93:01"
# sudo sh -c "ip link set ens4f0np0 vf 1 mac 52:54:00:93:93:02"

unbind the VFs from the mlx5_core driver
# sudo sh -c "echo 0000:b1:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind"
# sudo sh -c "echo 0000:b1:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind"

set switchdev mode on the PF
# sudo sh -c "devlink dev eswitch set pci/0000:b1:00.0 mode switchdev"

enable the migration feature on the two VFs
# sudo sh -c "devlink port function set pci/0000:b1:00.0/1 migratable enable"
# sudo sh -c "devlink port function set pci/0000:b1:00.0/2 migratable enable"

bind the VFs to mlx5_vfio_pci
# modprobe mlx5_vfio_pci
# sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/new_id"
# sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/remove_id"

check the PF and VF status
$ lshw -c network -businfo
WARNING: you should run this program as super-user.
Bus info          Device         Class    Description
===========================================================
pci@0000:b1:00.0  ens4f0np0      network  MT2910 Family [ConnectX-7]
pci@0000:b1:00.1  ens4f1np1      network  MT2910 Family [ConnectX-7]
pci@0000:b1:00.2                 network  ConnectX Family mlx5Gen Virtual Function
pci@0000:b1:00.3                 network  ConnectX Family mlx5Gen Virtual Function
pci@0000:b1:00.0  ens4f0npf0vf0  network  MT2910 Family [ConnectX-7]
pci@0000:b1:00.0  ens4f0npf0vf1  network  MT2910 Family [ConnectX-7]

3. Start a VM with a mlx5_vfio_pci VF
The detailed qemu-kvm cmd line script is as follows:
/usr/libexec/qemu-kvm \
...
-device '{"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}' \

4. Start a target VM with a mlx5_vfio_pci VF in listening mode

4.1 The detailed qemu-kvm cmd line script is as follows:
/usr/libexec/qemu-kvm \
...
-device '{"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}' \
-incoming defer \

4.2 Put the target VM into listening mode
(qemu) migrate_incoming tcp:[::]:5800
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
socket address: [
tcp::::5800
]

4.3 Set up the target VM for post-copy migration
(qemu) migrate_set_capability postcopy-ram on

5. Set up the migration capability and parameter of the source VM
(qemu) migrate_set_capability postcopy-ram on
(qemu) info migrate_capabilities
xbzrle: off
rdma-pin-all: off
auto-converge: off
zero-blocks: off
compress: off
events: off
postcopy-ram: on
x-colo: off
release-ram: off
return-path: off
pause-before-switchover: off
multifd: off
dirty-bitmaps: off
postcopy-blocktime: off
late-block-activate: off
x-ignore-shared: off
validate-uuid: off
background-snapshot: off
zero-copy-send: off
postcopy-preempt: off
switchover-ack: off

6. Migrate the VM from the source host to the target host
(qemu) migrate tcp:10.8.3.14:5800

7. Switch to post-copy migration when the migration status is active
(qemu) info migrate
...
Migration status: active
(qemu) migrate_start_postcopy

8. Check the migration status

In the source VM:
(qemu) qemu-kvm: failed to save SaveStateEntry with id(name): 1(ram): -5
qemu-kvm: Detected IO failure for postcopy. Migration paused.
(qemu) info status
VM status: paused (finish-migrate)

In the target VM:
(qemu) qemu-kvm: VFIO_MAP_DMA failed: Bad address
qemu-kvm: vfio_dma_map(0x5623accb2e80, 0xc0000, 0xa000, 0x7f8408400000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
CPU #0:
RAX=ffffffff988528d0 RBX=ffffffff9981a940 RCX=0000000000000001 RDX=4000000000000000
RSI=0000000000000087 RDI=000000000002eb8c RBP=0000000000000000 RSP=ffffffff99803ea8
R8 =00000015972ce95b R9 =0000000010020401 R10=00000000000000f2 R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=ffffffff988528db RFL=00000252 [---ZA--] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00100
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0000 0000000000000000 ffffffff 00c00100
FS =0000 0000000000000000 ffffffff 00c00100
GS =0000 ff19505777a00000 ffffffff 00c00100
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 fffffe0000003000 00004087 00008b00 DPL=0 TSS64-busy
GDT= fffffe0000001000 0000007f
IDT= fffffe0000000000 00000fff
CR0=80050033 CR2=00005629e7bc7728 CR3=0000000102e18003 CR4=00771ef0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
Opmask00=0000000004000000 Opmask01=0000000000000000 Opmask02=00000000001fffff Opmask03=0000000000000000
Opmask04=0000000000000000 Opmask05=0000000000000000 Opmask06=0000000000080008 Opmask07=0000000000000000
ZMM00=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 2a8409477a0fbccb 61af6b781c66c585
ZMM01=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 6c6b8b4f37f27ca5 b26336ab4f3215f2 ZMM02=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 16946e0d94c47c37 ad1ba4c2d2a230bb ZMM03=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 7cbf9536f2d2662e 2cd98d61e2c0d73b ZMM04=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000140 ZMM05=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000040 ZMM06=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 80209ce0d0be156e a68b554dcc9714e9 ZMM07=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 4fd1d561c2c28d5a de07a402878fcb02 ZMM08=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 fcd3a6df221e7884 f9d715b294d16898 ZMM09=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ba937123029be06d b197fde5058ee59a ZMM10=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 43d8f60cfd021ee8 bb8f7fc2a6c71122 ZMM11=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 a093d3fed298814d 0bdbb738d9f6da0d ZMM12=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ZMM13=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ZMM14=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 a54ff53a3c6ef372 bb67ae856a09e667 
ZMM15=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 5be0cd191f83d9ab 9b05688c510e527f ZMM16=0000000000000000 0000000000000000 0000000000000000 0000000000000000 620bf8df241ff96e 000000000008a410 17d29074308c0483 000000000008a108 ZMM17=0000000000000000 0000000000000000 0000000000000000 0000000000000000 a7dab260e69ca5b2 0000000000094558 dbe526ba9a2ede27 0000000000094390 ZMM18=0000000000000000 0000000000000000 0000000000000000 0000000000000000 229c430d8fda5706 0000000000094618 67d5102fcfb7eea3 00000000000945c0 ZMM19=0000000000000000 0000000000000000 0000000000000000 0000000000000000 533bfcb0f7c26d7f 00000000000b1848 9f2fd2f9c124c755 0000000000094680 ZMM20=0000000000000000 0000000000000000 0000000000000000 0000000000000000 69d2e6fb3bdd91db 00000000000b3228 fb7e69b9a43ada3b 00000000000b18c0 ZMM21=0000000000000000 0000000000000000 0000000000000000 0000000000000000 e93bf67906460ab9 00000000000c20c0 69d2e6fb3bdd91db 00000000000b3228 ZMM22=0000000000000000 0000000000000000 0000000000000000 0000000000000000 fb7e69b9a43ada3b 00000000000b18c0 533bfcb0f7c26d7f 00000000000b1848 ZMM23=0000000000000000 0000000000000000 0000000000000000 0000000000000000 9f2fd2f9c124c755 0000000000094680 229c430d8fda5706 0000000000094618 ZMM24=0000000000000000 0000000000000000 0000000000000000 0000000000000000 67d5102fcfb7eea3 00000000000945c0 a7dab260e69ca5b2 0000000000094558 ZMM25=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ZMM26=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ZMM27=0000000000000000 0000000000000000 0000000000000000 0000000000000000 e823242d2628322d 34e7bfee32312432 34ee232c24333238 32ee2d3431eebfef ZMM28=0000000000000000 0000000000000000 0000000000000000 0000000000000000 a2fabfbfbfbfbfbf bfdfbfbf15e8a678 a2d9bfbfbfbfbfbf bfccbfbf15e8bfef 
ZMM29=0000000000000000 0000000000000000 0000000000000000 0000000000000000 4141414141414141 4141414141414141 4141414141414141 4141414141414141 ZMM30=0000000000000000 0000000000000000 0000000000000000 0000000000000000 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a ZMM31=0000000000000000 0000000000000000 0000000000000000 0000000000000000 2020202020202020 2020202020202020 2020202020202020 2020202020202020 CPU #1: EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1 ES =0000 00000000 0000ffff 00009300 CS =f000 ffff0000 0000ffff 00009b00 SS =0000 00000000 0000ffff 00009300 DS =0000 00000000 0000ffff 00009300 FS =0000 00000000 0000ffff 00009300 GS =0000 00000000 0000ffff 00009300 LDT=0000 00000000 0000ffff 00008200 TR =0000 00000000 0000ffff 00008b00 GDT= 00000000 0000ffff IDT= 00000000 0000ffff CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80 FPR0=0000000000000000 0000 FPR1=0000000000000000 0000 FPR2=0000000000000000 0000 FPR3=0000000000000000 0000 FPR4=0000000000000000 0000 FPR5=0000000000000000 0000 FPR6=0000000000000000 0000 FPR7=0000000000000000 0000 XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000 XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000 XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000 XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000 CPU #2: EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1 ES =0000 00000000 0000ffff 00009300 CS =f000 ffff0000 0000ffff 00009b00 SS =0000 00000000 0000ffff 
00009300 DS =0000 00000000 0000ffff 00009300 FS =0000 00000000 0000ffff 00009300 GS =0000 00000000 0000ffff 00009300 LDT=0000 00000000 0000ffff 00008200 TR =0000 00000000 0000ffff 00008b00 GDT= 00000000 0000ffff IDT= 00000000 0000ffff CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80 FPR0=0000000000000000 0000 FPR1=0000000000000000 0000 FPR2=0000000000000000 0000 FPR3=0000000000000000 0000 FPR4=0000000000000000 0000 FPR5=0000000000000000 0000 FPR6=0000000000000000 0000 FPR7=0000000000000000 0000 XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000 XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000 XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000 XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000 CPU #3: EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1 ES =0000 00000000 0000ffff 00009300 CS =f000 ffff0000 0000ffff 00009b00 SS =0000 00000000 0000ffff 00009300 DS =0000 00000000 0000ffff 00009300 FS =0000 00000000 0000ffff 00009300 GS =0000 00000000 0000ffff 00009300 LDT=0000 00000000 0000ffff 00008200 TR =0000 00000000 0000ffff 00008b00 GDT= 00000000 0000ffff IDT= 00000000 0000ffff CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80 FPR0=0000000000000000 0000 FPR1=0000000000000000 0000 FPR2=0000000000000000 0000 FPR3=0000000000000000 0000 FPR4=0000000000000000 0000 FPR5=0000000000000000 0000 FPR6=0000000000000000 0000 FPR7=0000000000000000 0000 
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
dst_93.sh: line 45: 9565 Aborted (core dumped)

Actual results:
The VM with a mlx5_vfio_pci VF cannot do post-copy migration; the target qemu-kvm aborts with a core dump.

Expected results:
The VM with a mlx5_vfio_pci VF can do post-copy migration
or
qemu-kvm reports that post-copy migration is not supported for the mlx5_vfio_pci VF

Additional info:
(1) The full qemu-kvm cmd is like:
/usr/libexec/qemu-kvm \
-name guest=rhel93,debug-threads=on \
-machine pc-q35-rhel9.2.0,usb=off,dump-guest-core=off,memory-backend=pc.ram,hpet=off,acpi=on \
-accel kvm \
-cpu Icelake-Server,ds=on,ss=on,dtes64=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off \
-m 8192 \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":8589934592}' \
-overcommit mem-lock=off \
-smp 4,sockets=4,dies=1,cores=1,threads=1 \
-uuid ce70e79f-8854-490a-8b0b-f5261a9b8bad \
-no-user-config \
-nodefaults \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot strict=on \
-device '{"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}' \
-device '{"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"}' \
-device '{"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"}' \
-device '{"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"}' \
-device '{"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"}' \
-device '{"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"}' \
-device '{"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"}' \
-device '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}' \
-device '{"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"}' \
-device '{"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"}' \
-device '{"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"}' \
-device '{"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"}' \
-device '{"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"}' \
-device '{"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/images/migration/RHEL93.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "virtio-blk-pci", "id": "image1", "drive": "drive_image1", "bootindex": 1, "write-cache": "on", "bus": "pci.2", "addr": "0x0"}' \
-vnc 0.0.0.0:93 \
-device '{"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"}' \
-device '{"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.6","addr":"0x0"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.7","addr":"0x0"}' \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \
-device '{"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}' \
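
(2) Step 1 states a minimum firmware version of 28.36.1010 for migration. The check can be scripted before running the remaining steps. The following is only a sketch: fw_at_least is a hypothetical helper name, the comparison relies on GNU sort -V, and CURRENT_FW defaults to the version shown in the flint output of this report rather than being read from the device.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is at least version $2.
fw_at_least() {
    local have="$1" need="$2"
    # sort -V orders version strings field by field; if the smaller of the
    # two is the required version, the installed one is new enough.
    [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

MIN_FW="28.36.1010"                      # minimum stated in this report
# Assumption: on a real host the value could be taken from the
# "FW Version:" line of "flint -d <device> query full".
CURRENT_FW="${CURRENT_FW:-28.37.1014}"   # version from the flint output in step 1

if fw_at_least "$CURRENT_FW" "$MIN_FW"; then
    echo "firmware $CURRENT_FW is new enough for VF migration"
else
    echo "firmware $CURRENT_FW is older than $MIN_FW"
fi
```

Setting CURRENT_FW in the environment before running the script overrides the default, which makes it easy to try against other versions.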
Moved to backlog per internal discussion, since it may be a while before this is implemented.
I don't know that anyone is considering enabling post-copy with device assignment. Doing so would certainly require support for IOMMU page faulting, which is some way off. However, the ability of these steps to trigger a fault and backtrace in QEMU is concerning. Do we need a means for a device to block post-copy?
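
As a stopgap on the test or management side, the switch to post-copy could be refused whenever the VM has an assigned device. The sketch below is a host-side workaround only, not a QEMU mechanism: has_vfio_device and postcopy_allowed are hypothetical helper names, and the check simply looks for a vfio-pci device option in the QEMU command line passed in as a string.

```shell
#!/usr/bin/env bash
# Hypothetical host-side guard, not a QEMU feature: succeeds when the given
# QEMU command line assigns a vfio-pci device (JSON or key=value syntax).
has_vfio_device() {
    printf '%s\n' "$1" | grep -Eq \
        -e '"driver"[[:space:]]*:[[:space:]]*"vfio-pci"' \
        -e '-device[[:space:]]+vfio-pci'
}

# Returns non-zero (and explains why) when post-copy should not be started.
postcopy_allowed() {
    if has_vfio_device "$1"; then
        echo "refusing post-copy: vfio-pci device present" >&2
        return 1
    fi
    return 0
}

# Example using the device option from this report.
CMDLINE='-device {"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}'
if postcopy_allowed "$CMDLINE"; then
    echo "ok to issue migrate_start_postcopy"
fi
```

A check of this kind only keeps a script from issuing migrate_start_postcopy; it does not stop the command from being entered at the monitor, so the question of a blocker inside QEMU remains open.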