Bug 2229868 - [vfio migration]Disable postcopy for VM with migratable vfio device
Summary: [vfio migration]Disable postcopy for VM with migratable vfio device
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Cédric Le Goater
QA Contact: Yanghang Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-08 06:06 UTC by Yanghang Liu
Modified: 2024-01-24 08:40 UTC

Fixed In Version: qemu-kvm-8.0.0-16.el9_3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:28:04 UTC
Type: ---
Target Upstream Version:
Embargoed:



Links
System                  ID                                                    Private  Priority  Status  Summary  Last Updated
Gitlab                  redhat/rhel/src/qemu-kvm qemu-kvm merge_requests 318  0        None      None    None     2023-09-11 15:29:48 UTC
Red Hat Issue Tracker   RHELPLAN-164788                                       0        None      None    None     2023-08-08 06:09:21 UTC
Red Hat Product Errata  RHSA-2023:6368                                        0        None      None    None     2023-11-07 08:28:10 UTC

Description Yanghang Liu 2023-08-08 06:06:05 UTC
Description of problem:
Post-copy migration of a VM with a vfio device is not currently supported.
This bug tracks future support for vfio post-copy migration.

Version-Release number of selected component (if applicable):
qemu-kvm-8.0.0-9.el9.x86_64
5.14.0-344.2520_944365724.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Update the MT2910 (ConnectX-7) firmware and make sure VF_MIGRATION_MODE is set to MIGRATION_ENABLED

$ sudo  flint -d 0000:b1:00.0 query full
FW Version:            28.37.1014
FW Release Date:       4.5.2023
Part Number:           MCX713106AC-VEA_Ax
Description:           NVIDIA ConnectX-7 HHHL Adapter Card; 200GbE; Dual-port QSFP112; PCIe 5.0 x16; Crypto Enabled; Secure Boot Enabled

$ sudo mstconfig -d   b1:00.0  query VF_MIGRATION_MODE
Device #1:
----------

Device type:    ConnectX7       
Name:           MCX713106AC-VEA_Ax
Description:    NVIDIA ConnectX-7 HHHL Adapter Card; 200GbE; Dual-port QSFP112; PCIe 5.0 x16; Crypto Enabled; Secure Boot Enabled
Device:         b1:00.0         

Configurations:                                      Next Boot
         VF_MIGRATION_MODE                           MIGRATION_ENABLED(2)


Note:
  The minimum firmware version for migration is 28.36.1010.
  The command used to enable VF_MIGRATION_MODE:
    # sudo mstconfig -d   b1:00.1  set VF_MIGRATION_MODE=2
    # reboot 
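
The firmware requirement in the note above can be checked from the `flint` output. This is a minimal sketch, assuming the `FW Version:` line has the format shown above and that versions compare numerically component by component; the helper names are hypothetical and not part of any NVIDIA tool.

```python
import re

# Minimum firmware version for migration, as quoted in the note above.
MIN_FW_VERSION = "28.36.1010"


def parse_fw_version(flint_output: str) -> str:
    """Extract the 'FW Version' field from `flint -d <dev> query full` output."""
    match = re.search(r"^FW Version:\s*(\d+(?:\.\d+)+)", flint_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'FW Version' line found")
    return match.group(1)


def version_key(version: str) -> tuple:
    """Turn '28.37.1014' into (28, 37, 1014) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supports_migration(fw_version: str, minimum: str = MIN_FW_VERSION) -> bool:
    """Return True when fw_version is at least the minimum version."""
    return version_key(fw_version) >= version_key(minimum)


# Sample taken from the flint output in step 1.
SAMPLE = """FW Version:            28.37.1014
FW Release Date:       4.5.2023
"""

if __name__ == "__main__":
    version = parse_fw_version(SAMPLE)
    print(version, supports_migration(version))
```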


2. Create 2 VFs and set up the VFs for vfio migration

create 2 VFs
# sudo sh -c  "echo 2 > /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs"

setup vf's mac address
# sudo sh -c  "ip link set ens4f0np0 vf 0 mac 52:54:00:93:93:01"
# sudo sh -c  "ip link set ens4f0np0 vf 1 mac 52:54:00:93:93:02"

unbind the VFs from the mlx5_core driver
# sudo sh -c  "echo 0000:b1:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind"
# sudo sh -c  "echo 0000:b1:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind"

set switchdev mode on PF
# sudo sh -c "devlink dev eswitch set pci/0000:b1:00.0 mode switchdev"

enable two VFs' migration feature
# sudo sh -c "devlink port function set pci/0000:b1:00.0/1 migratable enable"
# sudo sh -c "devlink port function set pci/0000:b1:00.0/2 migratable enable"

bind vf to mlx5_vfio_pci
# modprobe mlx5_vfio_pci
# sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/new_id"
# sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/remove_id"

check the PF and VF status
$ lshw -c network -businfo
WARNING: you should run this program as super-user.
Bus info          Device         Class          Description
===========================================================
pci@0000:b1:00.0  ens4f0np0      network        MT2910 Family [ConnectX-7]
pci@0000:b1:00.1  ens4f1np1      network        MT2910 Family [ConnectX-7]
pci@0000:b1:00.2                 network        ConnectX Family mlx5Gen Virtual Function
pci@0000:b1:00.3                 network        ConnectX Family mlx5Gen Virtual Function
pci@0000:b1:00.0  ens4f0npf0vf0  network        MT2910 Family [ConnectX-7]
pci@0000:b1:00.0  ens4f0npf0vf1  network        MT2910 Family [ConnectX-7]
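
The `lshw` listing above does not show which driver each VF ended up bound to. One way to confirm the rebind is to read the `driver` symlink that sysfs exposes for each PCI device. This is a minimal sketch; the function name and the `sysfs_root` parameter are hypothetical (the parameter exists only so the lookup can be pointed at a test directory).

```python
import os


def bound_driver(bdf: str, sysfs_root: str = "/sys"):
    """Return the name of the driver bound to PCI device `bdf`, or None.

    sysfs exposes /sys/bus/pci/devices/<bdf>/driver as a symlink to the
    bound driver's directory; the link is absent when no driver is bound.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # The two VFs created in step 2.
    for vf in ("0000:b1:00.2", "0000:b1:00.3"):
        print(vf, bound_driver(vf))
```

After step 2, both VFs would be expected to report `mlx5_vfio_pci` rather than `mlx5_core`.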


3. Start a VM with an mlx5_vfio_pci VF

The detailed qemu-kvm cmd line script is as follows:

/usr/libexec/qemu-kvm \
...
-device '{"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}' \


4. Start a target VM with an mlx5_vfio_pci VF in listening mode

4.1 The detailed qemu-kvm cmd line script is as follows:

/usr/libexec/qemu-kvm \
...
-device '{"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}' \
-incoming defer \

4.2 Put the target VM into listening mode

(qemu) migrate_incoming tcp:[::]:5800

(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
socket address: [
        tcp::::5800
]

4.3 Enable post-copy migration on the target VM

(qemu) migrate_set_capability postcopy-ram on


5. Set up the migration capabilities and parameters of the source VM

(qemu) migrate_set_capability postcopy-ram on

(qemu)  info migrate_capabilities 
xbzrle: off
rdma-pin-all: off
auto-converge: off
zero-blocks: off
compress: off
events: off
postcopy-ram: on
x-colo: off
release-ram: off
return-path: off
pause-before-switchover: off
multifd: off
dirty-bitmaps: off
postcopy-blocktime: off
late-block-activate: off
x-ignore-shared: off
validate-uuid: off
background-snapshot: off
zero-copy-send: off
postcopy-preempt: off
switchover-ack: off


6. Migrate the VM from the source host to the target host

(qemu) migrate tcp:10.8.3.14:5800

7. Switch to post-copy migration once the migration status is active

(qemu) info migrate
... 
Migration status: active
(qemu) migrate_start_postcopy


8. Check the migration status

In the source VM:

(qemu) 
qemu-kvm: failed to save SaveStateEntry with id(name): 1(ram): -5
qemu-kvm: Detected IO failure for postcopy. Migration paused.


(qemu) info status
VM status: paused (finish-migrate)


In the target VM:
(qemu) 
qemu-kvm: VFIO_MAP_DMA failed: Bad address
qemu-kvm: vfio_dma_map(0x5623accb2e80, 0xc0000, 0xa000, 0x7f8408400000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
CPU #0:
RAX=ffffffff988528d0 RBX=ffffffff9981a940 RCX=0000000000000001 RDX=4000000000000000
RSI=0000000000000087 RDI=000000000002eb8c RBP=0000000000000000 RSP=ffffffff99803ea8
R8 =00000015972ce95b R9 =0000000010020401 R10=00000000000000f2 R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=ffffffff988528db RFL=00000252 [---ZA--] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00100
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00100
FS =0000 0000000000000000 ffffffff 00c00100
GS =0000 ff19505777a00000 ffffffff 00c00100
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 fffffe0000003000 00004087 00008b00 DPL=0 TSS64-busy
GDT=     fffffe0000001000 0000007f
IDT=     fffffe0000000000 00000fff
CR0=80050033 CR2=00005629e7bc7728 CR3=0000000102e18003 CR4=00771ef0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
Opmask00=0000000004000000 Opmask01=0000000000000000 Opmask02=00000000001fffff Opmask03=0000000000000000
Opmask04=0000000000000000 Opmask05=0000000000000000 Opmask06=0000000000080008 Opmask07=0000000000000000
ZMM00=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 2a8409477a0fbccb 61af6b781c66c585
ZMM01=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 6c6b8b4f37f27ca5 b26336ab4f3215f2
ZMM02=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 16946e0d94c47c37 ad1ba4c2d2a230bb
ZMM03=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 7cbf9536f2d2662e 2cd98d61e2c0d73b
ZMM04=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000140
ZMM05=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000040
ZMM06=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 80209ce0d0be156e a68b554dcc9714e9
ZMM07=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 4fd1d561c2c28d5a de07a402878fcb02
ZMM08=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 fcd3a6df221e7884 f9d715b294d16898
ZMM09=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ba937123029be06d b197fde5058ee59a
ZMM10=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 43d8f60cfd021ee8 bb8f7fc2a6c71122
ZMM11=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 a093d3fed298814d 0bdbb738d9f6da0d
ZMM12=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM13=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM14=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 a54ff53a3c6ef372 bb67ae856a09e667
ZMM15=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 5be0cd191f83d9ab 9b05688c510e527f
ZMM16=0000000000000000 0000000000000000 0000000000000000 0000000000000000 620bf8df241ff96e 000000000008a410 17d29074308c0483 000000000008a108
ZMM17=0000000000000000 0000000000000000 0000000000000000 0000000000000000 a7dab260e69ca5b2 0000000000094558 dbe526ba9a2ede27 0000000000094390
ZMM18=0000000000000000 0000000000000000 0000000000000000 0000000000000000 229c430d8fda5706 0000000000094618 67d5102fcfb7eea3 00000000000945c0
ZMM19=0000000000000000 0000000000000000 0000000000000000 0000000000000000 533bfcb0f7c26d7f 00000000000b1848 9f2fd2f9c124c755 0000000000094680
ZMM20=0000000000000000 0000000000000000 0000000000000000 0000000000000000 69d2e6fb3bdd91db 00000000000b3228 fb7e69b9a43ada3b 00000000000b18c0
ZMM21=0000000000000000 0000000000000000 0000000000000000 0000000000000000 e93bf67906460ab9 00000000000c20c0 69d2e6fb3bdd91db 00000000000b3228
ZMM22=0000000000000000 0000000000000000 0000000000000000 0000000000000000 fb7e69b9a43ada3b 00000000000b18c0 533bfcb0f7c26d7f 00000000000b1848
ZMM23=0000000000000000 0000000000000000 0000000000000000 0000000000000000 9f2fd2f9c124c755 0000000000094680 229c430d8fda5706 0000000000094618
ZMM24=0000000000000000 0000000000000000 0000000000000000 0000000000000000 67d5102fcfb7eea3 00000000000945c0 a7dab260e69ca5b2 0000000000094558
ZMM25=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM26=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM27=0000000000000000 0000000000000000 0000000000000000 0000000000000000 e823242d2628322d 34e7bfee32312432 34ee232c24333238 32ee2d3431eebfef
ZMM28=0000000000000000 0000000000000000 0000000000000000 0000000000000000 a2fabfbfbfbfbfbf bfdfbfbf15e8a678 a2d9bfbfbfbfbfbf bfccbfbf15e8bfef
ZMM29=0000000000000000 0000000000000000 0000000000000000 0000000000000000 4141414141414141 4141414141414141 4141414141414141 4141414141414141
ZMM30=0000000000000000 0000000000000000 0000000000000000 0000000000000000 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a
ZMM31=0000000000000000 0000000000000000 0000000000000000 0000000000000000 2020202020202020 2020202020202020 2020202020202020 2020202020202020
CPU #1:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
CPU #2:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
CPU #3:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
dst_93.sh: line 45:  9565 Aborted                 (core dumped) 
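
The failure in step 8 can be recognised from a handful of log lines without reading the register dump. This is a minimal sketch that scans log text for the messages quoted above; the labels and the function name are hypothetical, and the match strings are copied from this report rather than from any stable QEMU interface.

```python
# Substrings copied from the source and target output in step 8.
FAILURE_SIGNATURES = {
    "source-postcopy-io-failure": "Detected IO failure for postcopy",
    "target-vfio-dma-map-failure": "VFIO_MAP_DMA failed",
    "target-hardware-error": "hardware error: vfio: DMA mapping failed",
}


def find_failures(log_text: str) -> list:
    """Return the labels of all known failure signatures present in log_text."""
    return [label for label, needle in FAILURE_SIGNATURES.items() if needle in log_text]


SOURCE_LOG = """qemu-kvm: failed to save SaveStateEntry with id(name): 1(ram): -5
qemu-kvm: Detected IO failure for postcopy. Migration paused.
"""

TARGET_LOG = """qemu-kvm: VFIO_MAP_DMA failed: Bad address
qemu-kvm: vfio_dma_map(0x5623accb2e80, 0xc0000, 0xa000, 0x7f8408400000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
"""

if __name__ == "__main__":
    print("source:", find_failures(SOURCE_LOG))
    print("target:", find_failures(TARGET_LOG))
```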

Actual results:
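Post-copy migration of a VM with an mlx5_vfio_pci VF fails: the source pauses the migration with an I/O failure, and the target qemu-kvm aborts with a VFIO DMA mapping error.

Expected results:
Either post-copy migration of a VM with an mlx5_vfio_pci VF succeeds,
or
qemu-kvm reports an error indicating that post-copy migration is not supported for the mlx5_vfio_pci VF.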


Additional info:
(1) The full qemu-kvm command line is:
/usr/libexec/qemu-kvm \
-name guest=rhel93,debug-threads=on \
-machine pc-q35-rhel9.2.0,usb=off,dump-guest-core=off,memory-backend=pc.ram,hpet=off,acpi=on \
-accel kvm \
-cpu Icelake-Server,ds=on,ss=on,dtes64=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off \
-m 8192 \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":8589934592}' \
-overcommit mem-lock=off \
-smp 4,sockets=4,dies=1,cores=1,threads=1 \
-uuid ce70e79f-8854-490a-8b0b-f5261a9b8bad \
-no-user-config \
-nodefaults \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot strict=on \
-device '{"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}' \
-device '{"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"}' \
-device '{"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"}' \
-device '{"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"}' \
-device '{"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"}' \
-device '{"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"}' \
-device '{"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"}' \
-device '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}' \
-device '{"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"}' \
-device '{"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"}' \
-device '{"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"}' \
-device '{"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"}' \
-device '{"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"}' \
-device '{"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/images/migration/RHEL93.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "virtio-blk-pci", "id": "image1", "drive": "drive_image1", "bootindex": 1, "write-cache": "on", "bus": "pci.2", "addr": "0x0"}' \
-vnc 0.0.0.0:93 \
-device '{"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"}' \
-device '{"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.6","addr":"0x0"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.7","addr":"0x0"}' \
-monitor stdio \
-qmp tcp:0:5555,server,nowait \
-device '{"driver":"vfio-pci","host":"0000:b1:00.2","id":"hostdev0","bus":"pci.4","addr":"0x0","enable-migration":"on"}' \
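
The command line above also exposes a QMP socket (`-qmp tcp:0:5555,server,nowait`), so the source-side monitor steps 5 to 7 could be scripted instead of typed. This is a rough sketch, assuming the socket is reachable on localhost port 5555 and using the QMP commands `migrate-set-capabilities`, `migrate`, `query-migrate` and `migrate-start-postcopy`; the function names are hypothetical and error handling is deliberately thin.

```python
import json
import socket
import time


def setup_commands(uri: str) -> list:
    """QMP messages for steps 5 and 6: enable postcopy-ram, then start migrating."""
    return [
        {"execute": "qmp_capabilities"},  # leave QMP capability-negotiation mode
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [{"capability": "postcopy-ram", "state": True}]}},
        {"execute": "migrate", "arguments": {"uri": uri}},
    ]


def run_postcopy(uri: str, host: str = "localhost", port: int = 5555,
                 poll_interval: float = 0.5) -> dict:
    """Drive steps 5 to 7 over QMP and return the last reply received."""
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8")
        json.loads(stream.readline())  # discard the QMP greeting

        def execute(command: dict) -> dict:
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            while True:  # skip asynchronous events until the reply arrives
                reply = json.loads(stream.readline())
                if "return" in reply or "error" in reply:
                    return reply

        for command in setup_commands(uri):
            reply = execute(command)
            if "error" in reply:
                return reply

        while True:  # step 7: wait until the migration is active
            reply = execute({"execute": "query-migrate"})
            if "error" in reply:
                return reply
            status = reply["return"].get("status")
            if status == "active":
                return execute({"execute": "migrate-start-postcopy"})
            if status in ("failed", "cancelled", "completed"):
                return reply
            time.sleep(poll_interval)


if __name__ == "__main__":
    print(run_postcopy("tcp:10.8.3.14:5800"))
```

The target VM still has to be prepared separately as in step 4 (`migrate_incoming` and `postcopy-ram` enabled) before this is run.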

Comment 1 John Ferlan 2023-08-09 18:34:41 UTC
Moved to backlog per internal discussion since it may be a while before this is implemented

Comment 2 Alex Williamson 2023-08-09 18:35:42 UTC
I don't know that anyone is considering enabling post-copy with device assignment.  Doing so would certainly require support of IOMMU page faulting, which is some ways off.  However, the ability of these steps to trigger a fault and backtrace in QEMU is concerning.  Do we need a means for a device to block post-copy?

Comment 4 Peter Xu 2023-08-24 16:53:47 UTC
Indeed, I am not aware of any explicit blocker for postcopy when there are assigned devices.  It could have been overlooked when adding vfio migration for precopy, which silently allowed postcopy to happen as well.

To add one, one option is to fail the migration properly by checking ram_block_discard_is_disabled() (e.g., we can fail postcopy_start() when ram block discard is disabled).  VFIO in this case would be only one of the providers of features that disable ram discard.  In general, postcopy will have issues as long as any page is pinned, so I'd assume the ram_block_discard_is_disabled() check is the closest we can get so far from qemu.  This should automatically also fail postcopy for other ram discard disablers, like vDPA.

One thing I'm not sure about, though, is that we seem to support the failover-vf feature (which I am not extremely familiar with).  In that case, IIUC, we'll have a VFIO device but still allow postcopy.  But maybe it's fine: IIUC the vfio device needs to be unplugged before migration starts in that case (with the paired virtio device taking its place).  Hopefully that unplug operation will also trigger a proper qemu_vfio_close() -> ram_block_discard_disable(false), so postcopy hopefully won't be affected in this case.  But I really don't know enough about the feature to be sure.

Copy Juan too.
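
The guard proposed in this comment can be illustrated with a toy model. This is not QEMU code: the class and function below only borrow the names mentioned in the comment (ram_block_discard_disable(), ram_block_discard_is_disabled(), postcopy_start()), and the counter-based behaviour is an assumption made for illustration.

```python
class PostcopyNotSupported(Exception):
    """Raised when post-copy is refused up front instead of failing later."""


class RamDiscardState:
    """Toy stand-in for a global 'RAM discard disabled' counter."""

    def __init__(self):
        self._disabled_count = 0

    def disable(self, state: bool) -> None:
        """disable(True) when a user such as VFIO attaches, disable(False) when it detaches."""
        if state:
            self._disabled_count += 1
        elif self._disabled_count > 0:
            self._disabled_count -= 1
        else:
            raise RuntimeError("unbalanced disable(False)")

    def is_disabled(self) -> bool:
        return self._disabled_count > 0


def postcopy_start(ram_discard: RamDiscardState) -> str:
    """Refuse to enter post-copy while any user has RAM discard disabled."""
    if ram_discard.is_disabled():
        raise PostcopyNotSupported(
            "post-copy is not supported while RAM discard is disabled "
            "(for example by an assigned device)")
    return "postcopy-active"


if __name__ == "__main__":
    state = RamDiscardState()
    state.disable(True)       # an assigned device pins guest memory
    try:
        postcopy_start(state)
    except PostcopyNotSupported as err:
        print("refused:", err)
    state.disable(False)      # device unplugged, as in the failover case
    print(postcopy_start(state))
```

Because the check looks only at the counter, any other user that disables RAM discard would be covered by the same guard, which matches the vDPA remark in the comment.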

Comment 6 Yanhui Ma 2023-08-31 07:28:29 UTC
(In reply to Peter Xu from comment #4)

There is an existing failover post-copy bug: post-copy migration is allowed with failover, but it still has some issues.
Bug 1817965 - Live post-copy migration of the vm with failover VF device fails.

Comment 7 Yanghang Liu 2023-09-01 09:41:37 UTC
This issue can be reproduced via libvirt:

Test env:
5.14.0-355.el9.x86_64
qemu-kvm-8.0.0-13.el9.x86_64
libvirt-9.7.0-1.el9.x86_64

Test steps:
(1) Create an MT2910 VF and set up the VF for migration
(2) Start a Q35 + SeaBIOS RHEL93 domain

    <hostdev mode='subsystem' type='pci' managed='no'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0xb1' slot='0x00' function='0x2'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </hostdev>

(3) do post-copy migration for the domain
$ sudo virsh migrate --verbose --live --postcopy --postcopy-after-precopy  rhel93 qemu+ssh://10.8.3.15/system
$ sudo virsh migrate-postcopy rhel93

(4) check the source qemu-kvm log

2023-09-01 09:19:04.792+0000: initiating migration
2023-09-01T09:19:08.861674Z qemu-kvm: failed to save SaveStateEntry with id(name): 3(ram): -5
2023-09-01T09:19:08.861782Z qemu-kvm: Detected IO failure for postcopy. Migration paused.

(5) check the target qemu-kvm log:

2023-09-01T09:19:08.794938Z qemu-kvm: VFIO_MAP_DMA failed: Bad address
2023-09-01T09:19:08.794981Z qemu-kvm: vfio_dma_map(0x5596eb7ce630, 0xc0000, 0x7000, 0x7fb76e400000) = -2 (No such file or directory)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
CPU #0:
RAX=00000000ffffffa2 RBX=00007ffcbcb25c08 RCX=0000000000000001 RDX=0000000000000004
RSI=00007f5d5ffb7d94 RDI=00007ffcbcb25f82 RBP=000000000000434c RSP=00007ffcbcb25708
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=00007f5d5fffac80
R12=00007ffcbcb25f80 R13=0000000000000006 R14=0000000000000004 R15=00007f5d5ffb7d94
RIP=00007f5d5fecff86 RFL=00000202 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0033 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
SS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 00007f5d60154240 ffffffff 00c00000
GS =0000 0000000000000000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 fffffe0000003000 00004087 00008b00 DPL=0 TSS64-busy
GDT=     fffffe0000001000 0000007f
IDT=     fffffe0000000000 00000fff
CR0=80050033 CR2=00007f5d6014a000 CR3=0000000102c1c006 CR4=00771ef0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
Opmask00=0000000001000840 Opmask01=0000000000000001 Opmask02=00000000fffff7ff Opmask03=0000000000000000
Opmask04=0000000000000000 Opmask05=00000000fc00001d Opmask06=000000000000ec00 Opmask07=0000000000000000
ZMM00=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 00656c61636f6c2f 62696c2f7273752f
ZMM01=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 00656c61636f6c2f 62696c2f7273752f
ZMM02=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 5f5f004554415649 52505f4342494c47
ZMM03=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000068745f00
ZMM04=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 6c5f6c74706e5f5f 5f62645f64616572
ZMM05=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM06=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM07=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM08=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM09=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM10=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM11=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM12=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM13=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM14=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM15=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM16=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM17=0000000000000000 0000000000000000 0000000000000000 0000000000000000 3034323a383d4d41 455254535f4c414e 52554f4a00433d53 4547415353454d5f
ZMM18=0000000000000000 0000000000000000 0000000000000000 0000000000000000 00007ffcbcb25f2f 0000000000000008 0000000000000010 00007f5d5fffbb00
ZMM19=0000000000000000 0000000000000000 0000000000000000 0000000000000000 4154454e4f4d5f43 4c2f386674752e43 2f656c61636f6c2f 62696c2f7273752f
ZMM20=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000041 0000000000005952 4154454e4f4d5f43 4c2f386674752e43
ZMM21=0000000000000000 0000000000000000 0000000000000000 0000000000000000 000055c44df16d2e 000055c44df16d25 000055c44df16d70 000055c44df16d65
ZMM22=0000000000000000 0000000000000000 0000000000000000 0000000000000000 000055c44df16d59 000055c44df16d53 000055c44df16d14 000055c44df16d0c
ZMM23=0000000000000000 0000000000000000 0000000000000000 0000000000000000 000055c44df16cfb 000055c44df16cf1 000055c44df16ce0 000055c44df16cd6
ZMM24=0000000000000000 0000000000000000 0000000000000000 0000000000000000 000055c44df16cc5 000055c44df16cbc 000055c44df16cab 000055c44df16ca4
ZMM25=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM26=0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
ZMM27=0000000000000000 0000000000000000 0000000000000000 0000000000000000 00020e150d08bff2 fc2b352b262e2b37 202c1e2f2c34232a bff7ec051314ed02
ZMM28=0000000000000000 0000000000000000 0000000000000000 0000000000000000 bf2d202b20332022 bff0ecf8f4f7f7ec 0e1208ed0e0d1e21 2dbf2b202c2a2e21
ZMM29=0000000000000000 0000000000000000 0000000000000000 0000000000000000 4141414141414141 4141414141414141 4141414141414141 4141414141414141
ZMM30=0000000000000000 0000000000000000 0000000000000000 0000000000000000 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a 1a1a1a1a1a1a1a1a
ZMM31=0000000000000000 0000000000000000 0000000000000000 0000000000000000 2020202020202020 2020202020202020 2020202020202020 2020202020202020
CPU #1:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
CPU #2:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
CPU #3:
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000
2023-09-01 09:19:08.971+0000: shutting down, reason=failed

Comment 8 Yanghang Liu 2023-09-01 09:54:34 UTC
Tested with Cédric's build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=55043628

Test steps:
(1) Create an MT2910 VF and set up the VF for migration
(2) Start a Q35 + SeaBIOS RHEL 9.3 domain
(3) Attempt a post-copy migration of the domain:
$  sudo virsh migrate --verbose --live --postcopy --postcopy-after-precopy  rhel93 qemu+ssh://10.8.3.15/system
error: internal error: unable to execute QEMU command 'migrate': 0000:b1:00.2: VFIO migration is not supported with postcopy migration

Comment 13 Yanghang Liu 2023-09-12 10:14:02 UTC
Tested with Cédric's build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=55203844

Test result: PASS

Test steps:
(1) Create an MT2910 VF and set up the VF for migration
(2) Start a Q35 + OVMF RHEL 9.3 domain
(3) Attempt a post-copy migration of the domain:
$ sudo /bin/virsh migrate --verbose --persistent --postcopy  --live rhel93-ovmf qemu+ssh://10.8.3.14/system
error: internal error: unable to execute QEMU command 'migrate': 0000:b1:00.2: VFIO migration is not supported with postcopy migration

Comment 18 Yanan Fu 2023-09-20 01:45:31 UTC
QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Comment 21 Yanghang Liu 2023-09-20 07:36:05 UTC
Verification:

Test env: qemu-kvm-8.0.0-16.el9_3.x86_64

Test result: PASS

Test steps:
(1) Create an MT2910 VF and set up the VF for migration
(2) Start a Q35 + SeaBIOS RHEL 9.3 VM
(3) Attempt a post-copy migration of the VM:
# /bin/virsh migrate --verbose --persistent  --postcopy  --timeout 3 --timeout-postcopy  --live rhel93-seabios qemu+ssh://10.73.212.98/system
error: internal error: unable to execute QEMU command 'migrate': 0000:22:00.1: VFIO migration is not supported with postcopy migration
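
The pass criterion in the test runs above is that virsh fails with the "VFIO migration is not supported with postcopy migration" message, prefixed by the PCI address of the VF. A minimal sketch of how that check could be automated follows, assuming the error text stays exactly as quoted in this report. The function name rejected_vfio_device and the regular expression are illustrative only and are not part of virsh, libvirt, or QEMU.

```python
import re
from typing import Optional

# Error text copied from the virsh output quoted in this report. The PCI
# address (domain:bus:slot.function) differs per host, so it is captured.
# Assumption: the message wording is stable across builds.
_VFIO_POSTCOPY_ERROR = re.compile(
    r"(?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"VFIO migration is not supported with postcopy migration"
)


def rejected_vfio_device(virsh_stderr: str) -> Optional[str]:
    """Return the PCI address named in the rejection message, or None
    if the output does not contain the expected message."""
    match = _VFIO_POSTCOPY_ERROR.search(virsh_stderr)
    return match.group("bdf") if match else None


if __name__ == "__main__":
    sample = (
        "error: internal error: unable to execute QEMU command 'migrate': "
        "0000:22:00.1: VFIO migration is not supported with postcopy migration"
    )
    print(rejected_vfio_device(sample))  # prints 0000:22:00.1
```

A test harness could capture the stderr of the virsh migrate command and treat a non-None return value as PASS. Any other error, or a migration that succeeds, would indicate that postcopy was not blocked for the VFIO device.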

Comment 23 errata-xmlrpc 2023-11-07 08:28:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368

