Bug 2236659
Summary: | [vfio migration] The Q35 + OVMF VM with a mlx5_vfio_pci VF can not be migrated
---|---
Product: | Red Hat Enterprise Linux 9
Component: | qemu-kvm
qemu-kvm sub component: | Live Migration
Version: | 9.3
Hardware: | x86_64
OS: | Linux
Status: | CLOSED MIGRATED
Severity: | high
Priority: | high
Reporter: | Yanghang Liu <yanghliu>
Assignee: | Cédric Le Goater <clegoate>
QA Contact: | Yanghang Liu <yanghliu>
CC: | alex.williamson, chayang, clegoate, coli, jinzhao, juzhang, kraxel, vgoyal, virt-maint, xuwei, yalzhang, yanghliu, zhguo
Keywords: | MigratedToJIRA, Triaged
Target Milestone: | rc
Last Closed: | 2023-09-22 13:14:24 UTC
Description (Yanghang Liu, 2023-09-01 06:42:51 UTC)
221127:vfio_device_dirty_tracking_update section 0x0 - 0x9ffff -> update [0x0 - 0x9ffff]
221127:vfio_device_dirty_tracking_update section 0xa0000 - 0xaffff -> update [0x0 - 0xaffff]
221127:vfio_device_dirty_tracking_update section 0xc0000 - 0xc3fff -> update [0x0 - 0xc3fff]
221127:vfio_device_dirty_tracking_update section 0xc4000 - 0xdffff -> update [0x0 - 0xdffff]
221127:vfio_device_dirty_tracking_update section 0xe0000 - 0xfffff -> update [0x0 - 0xfffff]
(1) ^^^ legacy real mode stuff below 1M
221127:vfio_device_dirty_tracking_update section 0x100000 - 0x7fffffff -> update [0x0 - 0x7fffffff]
(2) ^^^ ram below 4G
221127:vfio_device_dirty_tracking_update section 0x80000000 - 0x807fffff -> update [0x0 - 0x807fffff]
(3) ^^^ emulated vga pci memory bar (not fully sure; 'info pci' in the qemu monitor or 'cat /proc/iomem' in the guest should tell)
221127:vfio_device_dirty_tracking_update section 0x100000000 - 0x27fffffff -> update [0x100000000 - 0x27fffffff]
(4) ^^^ ram above 4G
221127:vfio_device_dirty_tracking_update section 0x383800000000 - 0x383800001fff -> update [0x100000000 - 0x383800001fff]
221127:vfio_device_dirty_tracking_update section 0x383800003000 - 0x3838000fffff -> update [0x100000000 - 0x3838000fffff]
(5) ^^^ the assigned nic
221127:vfio_device_dirty_tracking_start nr_ranges 2 32:[0x0 - 0x807fffff], 64:[0x100000000 - 0x3838000fffff]

vfio combines (1)+(2)+(3) into one range, which is fine since there are almost no holes there. It also tries to combine (4)+(5) into one range, which probably fails because there is a huge hole between the end of (4) and the start of (5). vfio can't cope with the (new) edk2 behavior of placing PCI BARs at the end of the available address space.

QEMU computes the DMA logging ranges which are given to the driver to track, for the 32-bit and the 64-bit range. We could imagine introducing more ranges to overcome the issue, but it would have no effect on the driver since only the overall region size is taken into account. See routine mlx5vf_create_tracker() in the kernel. To reduce the overall size, we could exclude the device RAM regions, but how can we know these are not used by the target device or any other device? For the record, MLX5 HW has a 42-bit address space limitation for dirty tracking (the minimum is 12). In this case, 46 bits was requested.

(In reply to Cédric Le Goater from comment #4)
> QEMU computes the DMA logging ranges which are given to the driver to
> track for the 32-bit and 64-bit ranges. We could imagine introducing
> more ranges to overcome the issue but it would have no effect on the
> driver since only the overall region size is taken into account. See
> routine mlx5vf_create_tracker() in kernel.

This statement is incorrect. We could probably add more ranges. Sorry for the noise.

Started a discussion on the topic upstream.
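The 46-bit figure follows directly from the trace above: the combined 64-bit range ends at 0x3838000fffff (the assigned NIC's BAR), so the tracker has to cover a 46-bit span even though the RAM above 4G alone would fit in 34 bits, and 46 exceeds the 42-bit MLX5 limit. A quick check of that arithmetic (shell one-liners added here for illustration, not from the original report):

$ python3 -c 'print((0x3838000fffff).bit_length())'
46
$ python3 -c 'print((0x27fffffff).bit_length())'
34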
Test for https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=55203844.

Test env:
host:
source: virtlab1025.lab.eng.rdu2.redhat.com
target: virtlab1024.lab.eng.rdu2.redhat.com
5.14.0-362.1.1.el9_3.x86_64
libvirt-9.5.0-7.el9.x86_64
qemu-kvm-8.0.0-13.el9_3.vfio_20230908.x86_64
VM:
5.14.0-362.2.1.el9_3.x86_64

Test result: PASS

Test steps:
1. Create an MT2910 VF and set it up for vfio migration on the source host (a setup sketch follows this log).
2. Create an MT2910 VF and set it up for vfio migration on the target host.
3. Start a Q35 + OVMF VM with a mlx5_vfio_pci VF on the source host. The XML:
<os firmware='efi'>
  <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
  <firmware>
    <feature enabled='yes' name='enrolled-keys'/>
    <feature enabled='yes' name='secure-boot'/>
  </firmware>
  <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
  <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/var/lib/libvirt/qemu/nvram/rhel93_VARS.fd</nvram>
  <boot dev='hd'/>
</os>
...
<hostdev mode='subsystem' type='pci' managed='no'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0xb1' slot='0x00' function='0x1'/>
  </source>
</hostdev>
4. Migrate the VM:
$ sudo /bin/virsh migrate --verbose --persistent --live rhel93-ovmf qemu+ssh://10.8.3.14/system
Migration: [100.00 %]
5. Check the migration capabilities during the migration:
$ sudo virsh qemu-monitor-command --hmp rhel93-ovmf 'info migrate_capabilities'
xbzrle: off
rdma-pin-all: off
auto-converge: off
zero-blocks: off
compress: off
events: on
postcopy-ram: off
x-colo: off
release-ram: off
return-path: on
pause-before-switchover: on
multifd: off
dirty-bitmaps: off
postcopy-blocktime: off
late-block-activate: off
x-ignore-shared: off
validate-uuid: off
background-snapshot: off
zero-copy-send: off
postcopy-preempt: off
switchover-ack: off
$ sudo virsh qemu-monitor-command --hmp rhel93-ovmf 'info migrate'
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Migration status: active
total time: 8171 ms
expected downtime: 373 ms
setup: 197 ms
transferred ram: 844070 kbytes
throughput: 280.80 mbps
remaining ram: 460 kbytes
total ram: 4215216 kbytes
duplicate: 848638 pages
skipped: 0 pages
normal: 208745 pages
normal bytes: 834980 kbytes
dirty sync count: 4
page size: 4 kbytes
multifd bytes: 0 kbytes
pages-per-second: 37594
dirty pages rate: 49 pages
precopy ram: 844070 kbytes
vfio device transferred: 112 kbytes
6. Check the migration status on the source host:
$ sudo virsh qemu-monitor-event rhel93-ovmf --loop
event MIGRATION at 1694512584.234664 for domain 'rhel93-ovmf': {"status":"setup"}
event MIGRATION_PASS at 1694512584.398320 for domain 'rhel93-ovmf': {"pass":1}
event MIGRATION at 1694512584.431836 for domain 'rhel93-ovmf': {"status":"active"}
event MIGRATION_PASS at 1694512591.977791 for domain 'rhel93-ovmf': {"pass":2}
event MIGRATION_PASS at 1694512592.247532 for domain 'rhel93-ovmf': {"pass":3}
event MIGRATION_PASS at 1694512592.405233 for domain 'rhel93-ovmf': {"pass":4}
event STOP at 1694512592.438034 for domain 'rhel93-ovmf': <null>
event MIGRATION at 1694512592.439205 for domain 'rhel93-ovmf': {"status":"pre-switchover"}
event MIGRATION at 1694512592.440122 for domain 'rhel93-ovmf': {"status":"device"}
event MIGRATION_PASS at 1694512592.564528 for domain 'rhel93-ovmf': {"pass":5}
event MIGRATION at 1694512592.762087 for domain 'rhel93-ovmf': {"status":"completed"}
$ sudo tail -f /var/log/libvirt/qemu/rhel93-ovmf.log
2023-09-12 09:56:24.233+0000: initiating migration
2023-09-12 09:56:32.796+0000: shutting down, reason=migrated
$ sudo virsh domjobinfo rhel93-ovmf --completed
Job type: Completed
Operation: Outgoing migration
Time elapsed: 9747 ms
Time elapsed w/o network: 9744 ms
Data processed: 825.028 MiB
Data remaining: 0.000 B
Data total: 4.020 GiB
Memory processed: 825.028 MiB
Memory remaining: 0.000 B
Memory total: 4.020 GiB
Memory bandwidth: 99.608 MiB/s
Dirty rate: 0 pages/s
Page size: 4096 bytes
Iteration: 5
Postcopy requests: 0
Constant pages: 848671
Normal pages: 208934
Normal data: 816.148 MiB
Total downtime: 353 ms
Downtime w/o network: 350 ms
Setup time: 197 ms
7. Check the VM status on the target host:
# ifconfig or lspci ← We can get the VF info via ifconfig or lspci
# dmesg ← There is no error in the VM dmesg
8. Migrate the VM back:
$ sudo /bin/virsh migrate --verbose --persistent --live rhel93-ovmf qemu+ssh://10.8.3.15/system
Migration: [100.00 %]
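Steps 1 and 2 are not expanded in the log above. For reference, a minimal sketch of what an MT2910 VF setup for vfio migration typically looks like, assuming the PF sits at 0000:b1:00.0 and the VF comes up at 0000:b1:00.1 (the VF address matches the hostdev XML, but the exact sequence and the devlink port index are illustrative, not copied from this bug):

$ echo 1 | sudo tee /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs          # create one VF on the PF
$ echo 0000:b1:00.1 | sudo tee /sys/bus/pci/devices/0000:b1:00.1/driver/unbind   # detach the default mlx5_core driver
$ sudo devlink port function set pci/0000:b1:00.0/1 migratable enable       # mark the VF migration-capable while unbound
$ echo mlx5_vfio_pci | sudo tee /sys/bus/pci/devices/0000:b1:00.1/driver_override  # prefer the variant driver
$ echo 0000:b1:00.1 | sudo tee /sys/bus/pci/drivers_probe                   # bind it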
Test for https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=55203844.

Test env:
host:
source: virtlab1025.lab.eng.rdu2.redhat.com
target: virtlab1024.lab.eng.rdu2.redhat.com
5.14.0-362.1.1.el9_3.x86_64
libvirt-9.5.0-7.el9.x86_64
qemu-kvm-8.0.0-13.el9_3.vfio_20230908.x86_64
VM:
Win2022

Test result: PASS

Test steps:
1. Create an MT2910 VF and set it up for vfio migration on the source host (as sketched above).
2. Create an MT2910 VF and set it up for vfio migration on the target host.
3. Start a Q35 + OVMF Win2022 VM with a mlx5_vfio_pci VF on the source host. The XML:
<os firmware='efi'>
  <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
  <firmware>
    <feature enabled='yes' name='enrolled-keys'/>
    <feature enabled='yes' name='secure-boot'/>
  </firmware>
  <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd</loader>
  <nvram template='/usr/share/edk2/ovmf/OVMF_VARS.secboot.fd'>/home/yanghliu/.config/libvirt/qemu/nvram/win2022_VARS.fd</nvram>
  <boot dev='hd'/>
</os>
...
<hostdev mode='subsystem' type='pci' managed='no'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0xb1' slot='0x00' function='0x1'/>
  </source>
</hostdev>
4. Migrate the VM:
$ sudo /bin/virsh migrate --verbose --persistent --live win2022 qemu+ssh://10.8.3.14/system
Migration: [100.00 %]
5. Check the migration status on the source host:
$ sudo virsh qemu-monitor-event win2022 --loop
event MIGRATION at 1694773382.231271 for domain 'win2022': {"status":"setup"}
event MIGRATION_PASS at 1694773382.396051 for domain 'win2022': {"pass":1}
event MIGRATION at 1694773382.427334 for domain 'win2022': {"status":"active"}
event MIGRATION_PASS at 1694773404.416982 for domain 'win2022': {"pass":2}
event MIGRATION_PASS at 1694773406.742163 for domain 'win2022': {"pass":3}
event MIGRATION_PASS at 1694773407.134693 for domain 'win2022': {"pass":4}
event MIGRATION_PASS at 1694773407.337088 for domain 'win2022': {"pass":5}
event MIGRATION_PASS at 1694773407.550906 for domain 'win2022': {"pass":6}
event MIGRATION_PASS at 1694773407.763225 for domain 'win2022': {"pass":7}
event MIGRATION_PASS at 1694773407.963020 for domain 'win2022': {"pass":8}
event STOP at 1694773407.989569 for domain 'win2022': <null>
event MIGRATION at 1694773407.990726 for domain 'win2022': {"status":"pre-switchover"}
event MIGRATION at 1694773407.991567 for domain 'win2022': {"status":"device"}
event MIGRATION_PASS at 1694773408.138123 for domain 'win2022': {"pass":9}
event MIGRATION at 1694773408.334748 for domain 'win2022': {"status":"completed"}
$ sudo tail -f /var/log/libvirt/qemu/win2022.log
2023-09-15 10:18:04.633+0000: initiating migration
2023-09-15 10:18:27.835+0000: shutting down, reason=migrated
$ sudo virsh domjobinfo win2022 --completed
Job type: Completed
Operation: Outgoing migration
Time elapsed: 24365 ms
Time elapsed w/o network: 24363 ms
Data processed: 2.414 GiB
Data remaining: 0.000 B
Data total: 4.020 GiB
Memory processed: 2.414 GiB
Memory remaining: 0.000 B
Memory total: 4.020 GiB
Memory bandwidth: 107.875 MiB/s
Dirty rate: 0 pages/s
Page size: 4096 bytes
Iteration: 5
Postcopy requests: 0
Constant pages: 435240
Normal pages: 630714
Normal data: 2.406 GiB
Total downtime: 437 ms
Downtime w/o network: 435 ms
Setup time: 201 ms
6. Check the VM status on the target host:
# ipconfig ← We can get the VF info via ipconfig
7. Migrate the VM back:
$ sudo /bin/virsh migrate --verbose --persistent --live win2022 qemu+ssh://10.8.3.15/system
Migration: [100.00 %]
$ sudo virsh domjobinfo win2022 --completed
Job type: Completed
Operation: Outgoing migration
Time elapsed: 27308 ms
Time elapsed w/o network: 27306 ms
Data processed: 2.671 GiB
Data remaining: 0.000 B
Data total: 4.020 GiB
Memory processed: 2.671 GiB
Memory remaining: 0.000 B
Memory total: 4.020 GiB
Memory bandwidth: 105.727 MiB/s
Dirty rate: 0 pages/s
Page size: 4096 bytes
Iteration: 9
Postcopy requests: 0
Constant pages: 435516
Normal pages: 697801
Normal data: 2.662 GiB
Total downtime: 378 ms
Downtime w/o network: 376 ms
Setup time: 196 ms
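For both runs, one quick way to confirm before migrating that the VF is bound to the variant driver on each host (an assumed check, using the VF address from the XML above, not commands from the original report):

$ basename $(readlink /sys/bus/pci/devices/0000:b1:00.1/driver)
mlx5_vfio_pci
$ lspci -s b1:00.1 -k | grep 'in use'
	Kernel driver in use: mlx5_vfio_pci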
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates, and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.