Bug 1627272 - boot guest with q35+vIOMMU+ device assignment, qemu crash when return assigned network devices from vfio driver to ixgbe in guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Peter Xu
QA Contact: Sitong Liu
URL:
Whiteboard:
Depends On:
Blocks: 1649160 1629562 1647719 1651787
Reported: 2018-09-10 14:20 UTC by Pei Zhang
Modified: 2019-08-22 09:19 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.12.0-19.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1629562 1647719 (view as bug list)
Environment:
Last Closed: 2019-08-22 09:18:53 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:2553 None None None 2019-08-22 09:19:46 UTC

Description Pei Zhang 2018-09-10 14:20:04 UTC
Description of problem:
Boot qemu with 2 assigned ixgbe network devices, then start testpmd in the guest on these 2 devices. Next, quit testpmd and return the assigned NICs from the vfio driver to the ixgbe driver. Running "# ifconfig up" on the ixgbe NIC then crashes qemu.


Version-Release number of selected component (if applicable):
3.10.0-944.el7.x86_64
qemu-kvm-rhev-2.12.0-13.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Boot qemu with assigned network devices, refer to [1]

2. In guest, load vfio and start testpmd, refer to [2]

3. In guest, quit testpmd, then return the NICs from the vfio driver to ixgbe

# testpmd> quit

# dpdk-devbind --bind=ixgbe 0000:01:00.0
# dpdk-devbind --bind=ixgbe 0000:02:00.0

4. In guest, bring up the ixgbe NIC; qemu crashes

# dpdk-devbind --status
...
Network devices using kernel driver
===================================
0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp1s0 drv=ixgbe unused=vfio-pci 
0000:02:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp2s0 drv=ixgbe unused=vfio-pci 
0000:03:00.0 'Virtio network device 1041' if=eth0 drv=virtio-pci unused=vfio-pci *Active*

# ifconfig enp1s0 up

(qemu) qemu-kvm: VFIO_MAP_DMA: -17
qemu-kvm: vfio_dma_map(0x55635ff4ad00, 0x100000000, 0x180000000, 0x7fcc00000000) = -17 (File exists)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
CPU #0:
RAX=0000000000000820 RBX=0000000000000082 RCX=ffff88793cded000 RDX=ffff9bcd00c50000
RSI=0000000200000025 RDI=0000000000000081 RBP=ffff887a2570ba48 RSP=ffff887a2570b9f8
R8 =0000010000000031 R9 =000000017cdc7604 R10=0000000000000002 R11=ffff887a3499d000
R12=0000000000000080 R13=ffff88793fd14540 R14=0000000000000204 R15=ffff88793cdec0c0
RIP=ffffffff8ffe9975 RFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 00007fa8b4cab740 ffffffff 00c00000
GS =0000 ffff887a3fc00000 ffffffff 00c00000
LDT=0000 0000000000000000 000fffff 00000000
TR =0040 ffff887a3fc04000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff887a3fc0c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007fa8b485b224 CR3=0000000269f14000 CR4=001607f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0020 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=8000000000000000 c01f FPR7=0000000000000000 0000
XMM00=000000ff0000ff000000000000ff0000 XMM01=47445800707500307331706e65006769
XMM02=0000000100000000ff00000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000210000000000000000 XMM05=40404040404040404040404040404040
XMM06=5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b XMM07=20202020202020202020202020202020
XMM08=00000000000000202020002020000000 XMM09=ffff00ffffffffffffffffffff000000
XMM10=00202020002020000000000000000000 XMM11=ffffffffffffff000000ff0000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
CPU #1:
RAX=ffffffff90169100 RBX=ffffffff90758560 RCX=0100000000000000 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff88793cedfea8 RSP=ffff88793cedfea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=7fffffffffffffff
R12=0000000000000001 R13=ffff88793cedc000 R14=ffff88793cedc000 R15=ffff88793cedc000
RIP=ffffffff90169306 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff887a3fc80000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff887a3fc84000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff887a3fc8c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007f7678001000 CR3=0000000030a10000 CR4=001607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000ff00000000000000 XMM01=6e6974747568530a002e2e2e6579420a
XMM02=00000000000000000000000000000000 XMM03=000000000000ff000000000000000000
XMM04=313d7178742d2d00313d7178722d2d00 XMM05=00000000000000020000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=0000ff000000ff0000ffffffffffffff
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
CPU #2:
RAX=ffffffff90169100 RBX=ffffffff90758560 RCX=0100000000000000 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff88793cee3ea8 RSP=ffff88793cee3ea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000019 R11=7fffffffffffffff
R12=0000000000000002 R13=ffff88793cee0000 R14=ffff88793cee0000 R15=ffff88793cee0000
RIP=ffffffff90169306 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff887a3fd00000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff887a3fd04000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff887a3fd0c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007f76c96e2b68 CR3=0000000030a10000 CR4=001607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=000000000000000000000000ff000000 XMM01=25252525252525252525252525252525
XMM02=00000000000000000000000000000000 XMM03=ffffffffffffffffffff000000ff0000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=0000ff000000ff0000ffffffffffffff
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
CPU #3:
RAX=ffffffff90169100 RBX=ffffffff90758560 RCX=0100000000000000 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff88793cee7ea8 RSP=ffff88793cee7ea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=7fffffffffffffff
R12=0000000000000003 R13=ffff88793cee4000 R14=ffff88793cee4000 R15=ffff88793cee4000
RIP=ffffffff90169306 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff887a3fd80000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff887a3fd84000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff887a3fd8c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007f76c8ee1b68 CR3=0000000030a10000 CR4=001607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=000000000000000000000000ff000000 XMM01=25252525252525252525252525252525
XMM02=00000000000000000000000000000000 XMM03=ffffffffffffffffffff000000ff0000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=0000ff000000ff0000ffffffffffffff
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
Aborted (core dumped)


Actual results:
qemu crashes.


Expected results:
qemu should not crash.


Reference:
[1]
/usr/libexec/qemu-kvm -name rhel7.6 -M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-device vfio-pci,host=0000:81:00.0,bus=root.1 \
-device vfio-pci,host=0000:81:00.1,bus=root.2 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=88:66:da:5f:dd:01,bus=root.3 \
-drive file=/home/images_nfv-virt-rt-kvm/rhel7.6.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.4 \
-vnc :2 \
-monitor stdio


[2]
# modprobe vfio
# modprobe vfio-pci

# dpdk-devbind --bind=vfio-pci 0000:01:00.0
# dpdk-devbind --bind=vfio-pci 0000:02:00.0

# echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

# /usr/bin/testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_ixgbe.so \
-w 0000:01:00.0 -w 0000:02:00.0 \
-- \
--nb-cores=2 \
-i \
--disable-rss \
--rxq=1 --txq=1 


Additional info:
1. This is a regression bug.
qemu-kvm-rhev-2.12.0-12.el7.x86_64 works well.

Comment 3 Alex Williamson 2018-09-10 20:43:19 UTC
Errno 17 is EEXIST, vIOMMU probably didn't invalidate all the mappings from the DPDK domain before the device was added back to the static identity domain.
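
For readers decoding the failing call in the log, here is a minimal sketch that converts the logged values to GiB. It assumes the arguments in the vfio_dma_map(...) line are, in order, container, iova, size and vaddr; the report does not state this, and the helper name dma_range_gib is made up for illustration.

```shell
# Values copied from the vfio_dma_map(...) line in the crash log.
# Assumed argument order (not stated in the report): container, iova, size, vaddr.
iova=0x100000000
size=0x180000000

gib=$((1024 * 1024 * 1024))

# Print the IOVA range of a mapping request in GiB.
# $1 = start address, $2 = length (both may be hex).
dma_range_gib() {
    start=$(( $1 / gib ))
    end=$(( ($1 + $2) / gib ))
    len=$(( $2 / gib ))
    echo "IOVA range: ${start} GiB - ${end} GiB (${len} GiB)"
}

dma_range_gib "$iova" "$size"
```

Under that assumption, the rejected request covers a 6 GiB range starting at 4 GiB, and EEXIST ("File exists" in the log) means that at least part of this range was still mapped when the new mapping was requested.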

Comment 4 Peter Xu 2018-09-13 05:58:05 UTC
(In reply to Alex Williamson from comment #3)
> Errno 17 is EEXIST, vIOMMU probably didn't invalidate all the mappings from
> the DPDK domain before the device was added back to the static identity
> domain.

Agreed.

It's very possible that we didn't handle the case where the context entry switches from valid to invalid.  In that case we currently might skip the shadow sync, when we should actually do an "unmap all" instead.
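
The suspected failure mode can be pictured with a toy model. This is only a sketch and not QEMU code: the names dma_map, dma_unmap_all, try_map and the mappings list are hypothetical, and the return value 17 merely stands in for EEXIST.

```shell
# Toy model: a space-separated list stands in for the host-side DMA mappings.
mappings=""

# Map an IOVA; fail with 17 (standing in for EEXIST) if it is already mapped.
dma_map() {
    case " $mappings " in
        *" $1 "*) return 17 ;;
    esac
    mappings="$mappings $1"
}

# Drop every mapping, as an "unmap all" would.
dma_unmap_all() {
    mappings=""
}

# Try to map and report the outcome without aborting the script.
try_map() {
    if dma_map "$1"; then
        echo "map $1: ok"
    else
        echo "map $1: failed, rc=$?"
    fi
}

# testpmd running: the guest's DPDK domain maps an address.
try_map 0x100000000

# Device returned to ixgbe without any unmap (the suspected bug):
# the next mapping of the same address collides with the stale one.
try_map 0x100000000

# With an "unmap all" when the context entry becomes invalid,
# the same mapping request succeeds.
dma_unmap_all
try_map 0x100000000
```

The second call fails with rc=17 and the third succeeds, which mirrors the reasoning above: stale mappings left behind by the old domain make a later mapping of the same range fail.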

I'm preparing a package for testing.  Will update soon.

Comment 5 Peter Xu 2018-09-13 06:21:27 UTC
Pei, could you try this package to see whether it can fix the problem?

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=18303398

Comment 6 Peter Xu 2018-09-13 06:55:32 UTC
I made a mistake when building the previous package. Please try this one instead:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=18303752

Please make sure that the package is suffixed with bz1627272_2 not bz1627272.

Comment 7 Pei Zhang 2018-09-13 07:45:55 UTC
(In reply to Peter Xu from comment #6)
> I made a mistake when building the previous package. Please try this one instead:
> 
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=18303752
> 
> Please make sure that the package is suffixed with bz1627272_2 not bz1627272.

Peter, this bug is gone with the above build. Thanks.

Versions:
3.10.0-945.el7.x86_64
qemu-kvm-rhev-2.12.0-15.el7.bz1627272_2.x86_64

Testing:
Tested with both PF and VF, following the same steps as in the Description; everything works as expected.
  - Qemu works well
  - No errors in either host or guest dmesg
  - testpmd in guest can receive packets.

Comment 8 Peter Xu 2018-09-13 08:02:10 UTC
Thanks for the quick feedback, Pei.

Posted a fix upstream:

https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg01555.html

Comment 19 Miroslav Rezanina 2018-11-21 15:14:06 UTC
Fix included in qemu-kvm-rhev-2.12.0-19.el7
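
A quick way to compare a build against the fixed version is sketched below. The helper version_ge is hypothetical, and it relies on GNU "sort -V", which approximates RPM's version ordering but is not identical to it.

```shell
# Hypothetical helper: succeed if version-release $1 sorts at or after $2.
# GNU "sort -V" is used as an approximation of RPM version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed="2.12.0-19.el7"

# Builds mentioned in this report.
for build in 2.12.0-12.el7 2.12.0-13.el7 2.12.0-21.el7; do
    if version_ge "$build" "$fixed"; then
        echo "$build: contains the fix"
    else
        echo "$build: predates the fix"
    fi
done
```

On a real host the build string could come from "rpm -q --qf '%{VERSION}-%{RELEASE}\n' qemu-kvm-rhev". Scratch builds such as 2.12.0-15.el7.bz1627272_2 sort before the fixed version even though they carry the patch, so this check only applies to regular builds.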

Comment 21 Sitong Liu 2019-01-21 06:58:23 UTC
== Verification ==

Versions:
qemu-kvm-rhev-2.12.0-21.el7.x86_64
3.10.0-993.el7.x86_64

Steps:
1. Boot vm with q35 + vIOMMU and the assigned NICs.

# /usr/libexec/qemu-kvm -M q35,kernel-irqchip=split  \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
...
-device vfio-pci,host=0000:5e:00.0,bus=root.1 \
-device vfio-pci,host=0000:5e:00.1,bus=root.2 \
...

2. In vm, load vfio and start testpmd.

# modprobe vfio
# modprobe vfio-pci

# dpdk-devbind --bind=vfio-pci 0000:01:00.0
# dpdk-devbind --bind=vfio-pci 0000:02:00.0

# echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

# testpmd> start

3. In vm, quit testpmd, return nic from vfio driver to ixgbe.

# testpmd> quit

# dpdk-devbind --bind=ixgbe 0000:01:00.0
# dpdk-devbind --bind=ixgbe 0000:02:00.0

# ifconfig enp1s0 up

4. In vm, start testpmd and send packets from another host.

# ifconfig enp1s0 down

# dpdk-devbind --bind=vfio-pci 0000:01:00.0
# dpdk-devbind --bind=vfio-pci 0000:02:00.0

# testpmd> start

Result:

After step 3, qemu works well, no error in host/guest dmesg.
After step 4, testpmd in guest can receive packets well.

Move to 'Verified'.
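
The "qemu works well" check could be automated along these lines. This is a sketch that assumes qemu's stderr has been captured to a file, which the report does not do (it runs with -monitor stdio); the helper name has_vfio_map_failure is made up, and the patterns are the two messages from the original crash log.

```shell
# Succeed if the given log file contains the failure signature from this report.
has_vfio_map_failure() {
    grep -Eq 'VFIO_MAP_DMA: -17|vfio: DMA mapping failed' "$1"
}

# Demonstration with a temporary file standing in for a captured qemu log.
log=$(mktemp)
printf '%s\n' 'qemu-kvm: VFIO_MAP_DMA: -17' > "$log"

if has_vfio_map_failure "$log"; then
    echo "failure signature found"
else
    echo "log is clean"
fi

rm -f "$log"
```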

Comment 23 errata-xmlrpc 2019-08-22 09:18:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2553

