Bug 1647719 - boot guest with q35+vIOMMU+ device assignment, qemu crash when return assigned network devices from vfio driver to ixgbe in guest [rhel-7.6.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Peter Xu
QA Contact: Sitong Liu
URL:
Whiteboard:
Depends On: 1627272
Blocks:
Reported: 2018-11-08 08:27 UTC by Oneata Mircea Teodor
Modified: 2019-01-29 18:32 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.12.0-18.el7_6.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1627272
Environment:
Last Closed: 2019-01-29 18:32:27 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0209 0 None None None 2019-01-29 18:32:35 UTC

Description Oneata Mircea Teodor 2018-11-08 08:27:55 UTC
This bug has been copied from bug #1627272 and has been proposed to be backported to 7.6 z-stream (EUS).

Comment 4 Miroslav Rezanina 2018-12-06 07:39:58 UTC
Fix included in qemu-kvm-rhev-2.12.0-18.el7_6.3

Comment 6 Sitong Liu 2018-12-18 05:38:58 UTC
Reproduced the qemu crash on qemu-kvm-rhev-2.12.0-18.el7_6.2.x86_64.

Reproduction steps follow steps 1-5 in the Description of bz 1627272.

qemu cli:

/usr/libexec/qemu-kvm -name rhel7.6 -M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-device vfio-pci,host=0000:5e:00.0,bus=root.1 \
-device vfio-pci,host=0000:5e:00.1,bus=root.2 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=88:66:da:5f:dd:01,bus=root.3 \
-drive file=/home/images_nfv-virt-rt-kvm/rhel7.6.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.4 \
-vnc :2 \
-monitor stdio
 
qemu crash:

(qemu) qemu-kvm: VFIO_MAP_DMA: -17
qemu-kvm: vfio_dma_map(0x560ae1cdcd00, 0x100000000, 0x180000000, 0x7f81c0000000) = -17 (File exists)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
CPU #0:
RAX=ffffffffbb969a20 RBX=ffffffffbbf583e0 RCX=0000000000000048 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffffffffbbe03eb0 RSP=ffffffffbbe03eb0
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=ffffffffbbe00000 R14=ffffffffbbe00000 R15=ffffffffbbe00000
RIP=ffffffffbb969c26 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff8b313fc00000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff8b313fc04000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff8b313fc0c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=0000000001cf5cb0 CR3=000000007deae000 CR4=007607f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=742033342d34372d6d7640746f6f725b
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=20002020000000000000000000000000 XMM11=ffffffff000000ffffffffffffffffff
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
CPU #1:
RAX=ffffffffbb969a20 RBX=ffffffffbbf583e0 RCX=0000000000000048 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff8b303cee7ea8 RSP=ffff8b303cee7ea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=0000000000000000
R12=0000000000000001 R13=ffff8b303cee4000 R14=ffff8b303cee4000 R15=ffff8b303cee4000
RIP=ffffffffbb969c26 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff8b313fc80000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff8b313fc84000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff8b313fc8c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007feaa9666000 CR3=000000007eeee000 CR4=007607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0020 [ST=0] FTW=00 MXCSR=00009fc0
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=8000000000000000 c01f FPR7=0000000000000000 0000
XMM00=0000ff0000ff0000ff00000000000000 XMM01=4241006c77006c73006769666e6f632f
XMM02=5251504f4e4d4c4b4a49484746454443 XMM03=00000000000000412f30303a30300030
XMM04=00000000000000000000000000000000 XMM05=6d736d695f646d00646961726d646f6e
XMM06=30302c303030302c3a65727574616566 XMM07=3a353530303a6c65646f6d3a36303030
XMM08=ffffffffffffffffffffffffffffffff XMM09=30303a726f646e65763a757063363878
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
CPU #2:
RAX=0000000000000720 RBX=0000000000000082 RCX=ffff8b303cdea000 RDX=ffffb66180c50000
RSI=0000000200000025 RDI=0000000000000071 RBP=ffff8b3136f87a48 RSP=ffff8b3136f879f8
R8 =0000010000000031 R9 =000000017cdc75c4 R10=0000000000000002 R11=ffff8b3134972000
R12=0000000000000070 R13=ffff8b303fd14540 R14=00000000000001c4 R15=ffff8b303cde90c0
RIP=ffffffffbb7e9d45 RFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 00007f2910a13740 ffffffff 00c00000
GS =0000 ffff8b313fd00000 ffffffff 00c00000
LDT=0000 0000000000000000 000fffff 00000000
TR =0040 ffff8b313fd04000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff8b313fd0c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007f29105c0230 CR3=000000026fa24000 CR4=007607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000307331706e65
XMM02=000000000000002600007fffde4796e0 XMM03=00000000000000000000000000000000
XMM04=00000000000000210000000000000000 XMM05=40404040404040404040404040404040
XMM06=5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b XMM07=20202020202020202020202020202020
XMM08=00000000000000202020002020000000 XMM09=ffff00ffffffffffffffffffff000000
XMM10=00202020002020000000000000000000 XMM11=ffffffffffffff000000ff0000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
CPU #3:
RAX=ffffffffbb969a20 RBX=ffffffffbbf583e0 RCX=0000000000000048 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff8b303cef7ea8 RSP=ffff8b303cef7ea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=000000d8b2e69000
R12=0000000000000003 R13=ffff8b303cef4000 R14=ffff8b303cef4000 R15=ffff8b303cef4000
RIP=ffffffffbb969c26 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff8b313fd80000 ffffffff 00c00000
LDT=0000 0000000000000000 000fffff 00000000
TR =0040 ffff8b313fd84000 00002087 00008b00 DPL=0 TSS64-busy
GDT=     ffff8b313fd8c000 0000007f
IDT=     ffffffffff528000 00000fff
CR0=80050033 CR2=00007f6fd0be6e00 CR3=000000007eeee000 CR4=007607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001fa0
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=ffffffffffffff00ffffffff0000ff00 XMM01=55555555555555555555555555555555
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000032000000200000004700000030 XMM05=00000035000000580000002000000050
XMM06=000000740000002d0000003000000034 XMM07=00000061000000640000004100000020
XMM08=00000000000000000000000000000000 XMM09=00000030000000350000005b00000020
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
Aborted                 (core dumped)
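
The failing call in the log above can be decoded with a short script. This is only a sketch: reading the four printed values as container pointer, IOVA, size and host virtual address is an assumption (the report does not label them), and the helper name decode_dma_map_failure is made up for illustration.

```python
import errno
import os
import re

# Matches the error line printed in the crash log above.
# Assumed argument order: container, iova, size, vaddr.
_DMA_MAP_RE = re.compile(
    r"vfio_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), "
    r"(0x[0-9a-f]+), (0x[0-9a-f]+)\) = (-?\d+)"
)

GIB = 1 << 30


def decode_dma_map_failure(line):
    """Extract the mapped range and the errno name from one log line."""
    m = _DMA_MAP_RE.search(line)
    if m is None:
        raise ValueError("not a vfio_dma_map failure line")
    container, iova, size, vaddr = (int(g, 16) for g in m.groups()[:4])
    ret = int(m.group(5))
    return {
        "iova": iova,
        "size": size,
        "end": iova + size,
        "vaddr": vaddr,
        # The return value is a negated errno, so flip the sign.
        "errno_name": errno.errorcode.get(-ret, "unknown"),
        "errno_text": os.strerror(-ret),
    }


line = ("qemu-kvm: vfio_dma_map(0x560ae1cdcd00, 0x100000000, 0x180000000, "
        "0x7f81c0000000) = -17 (File exists)")
info = decode_dma_map_failure(line)
print(f"range: {info['iova'] / GIB:g} GiB to {info['end'] / GIB:g} GiB")
print(f"error: {info['errno_name']} ({info['errno_text']})")
```

Under that reading, the rejected request covers a 6 GiB region starting at the 4 GiB boundary, and errno 17 is EEXIST, which matches the "File exists" text in the log.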

Versions:
kernel-3.10.0-978.el7.x86_64

Comment 7 Sitong Liu 2018-12-18 06:34:33 UTC
Reproduction

Comment 8 Sitong Liu 2018-12-18 06:36:10 UTC
After updating the host kernel to 3.10.0-957.4.1.el7.x86_64, got the same qemu crash as in comment 6.

Comment 9 Sitong Liu 2018-12-18 06:36:37 UTC
Verification

Comment 10 Sitong Liu 2018-12-18 06:39:30 UTC
Versions:
qemu-kvm-rhev-2.12.0-18.el7_6.3.x86_64
kernel-3.10.0-957.4.1.el7.x86_64

Steps:

1. Boot VM with device assignment and viommu.

/usr/libexec/qemu-kvm -name rhel7.6 -M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-device vfio-pci,host=0000:5e:00.0,bus=root.1 \
-device vfio-pci,host=0000:5e:00.1,bus=root.2 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=88:66:da:5f:dd:01,bus=root.3 \
-drive file=/home/images_nfv-virt-rt-kvm/rhel7.6.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.4 \
-vnc :2 \
-monitor stdio

2. In guest, load vfio and start testpmd.

# modprobe vfio
# modprobe vfio-pci

# dpdk-devbind --bind=vfio-pci 0000:01:00.0
# dpdk-devbind --bind=vfio-pci 0000:02:00.0

# echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

# /usr/bin/testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_ixgbe.so \
-w 0000:01:00.0 -w 0000:02:00.0 \
-- \
--nb-cores=2 \
-i \
--disable-rss \
--rxq=1 --txq=1 

3. In guest, quit testpmd, then return the NICs from the vfio driver to ixgbe.

testpmd> quit

# dpdk-devbind --bind=ixgbe 0000:01:00.0
# dpdk-devbind --bind=ixgbe 0000:02:00.0

4. In guest, bring up the ixgbe NICs, and check guest dmesg.

# dpdk-devbind --status
...
Network devices using kernel driver
===================================
0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp1s0 drv=ixgbe unused=vfio-pci 
0000:02:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp2s0 drv=ixgbe unused=vfio-pci 
0000:03:00.0 'Virtio network device 1041' if=eth0 drv=virtio-pci unused=vfio-pci *Active*

# ifconfig enp1s0 up
# ifconfig enp2s0 up

# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 88:66:da:5f:dd:01 brd ff:ff:ff:ff:ff:ff
7: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:14:22:c4 brd ff:ff:ff:ff:ff:ff
8: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:14:22:c6 brd ff:ff:ff:ff:ff:ff

# dmesg
...
[  704.935362] DMAR: 64bit 0000:01:00.0 uses identity mapping
[  704.957388] pps pps0: new PPS source ptp0
[  704.957392] ixgbe 0000:01:00.0: registered PHC device on enp1s0
[  705.090617] IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready
[  709.456882] ixgbe 0000:01:00.0 enp1s0: NIC Link is Up 10 Gbps, Flow Control: None
[  709.457035] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[  715.103574] DMAR: 64bit 0000:02:00.0 uses identity mapping
[  715.124992] pps pps1: new PPS source ptp1
[  715.124996] ixgbe 0000:02:00.0: registered PHC device on enp2s0
[  715.258117] IPv6: ADDRCONF(NETDEV_UP): enp2s0: link is not ready
[  719.605068] ixgbe 0000:02:00.0 enp2s0: NIC Link is Up 10 Gbps, Flow Control: None
[  719.605221] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
...

5. Repeat steps 2-4; qemu and guest continue to work well.
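
When repeating the steps, the check in step 4 could be automated along the following lines. This is a minimal sketch, not part of the original verification: the function names are hypothetical, and the parsing assumes the `dpdk-devbind --status` and dmesg line formats shown in step 4.

```python
import re

# One device line from `dpdk-devbind --status`: PCI address, quoted
# description, then key=value fields such as if=, drv= and unused=.
_DEV_RE = re.compile(
    r"^(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) '.*?'(?P<rest>.*)$"
)
# ixgbe link-up message as it appears in the guest dmesg.
_LINK_UP_RE = re.compile(r"ixgbe (?P<bdf>\S+) (?P<ifname>\S+): NIC Link is Up")


def parse_devbind_status(text):
    """Return {pci_address: {"if": ..., "drv": ..., "unused": ...}}."""
    devices = {}
    for raw in text.splitlines():
        m = _DEV_RE.match(raw.strip())
        if not m:
            continue  # headings, separators, "..." lines
        tokens = m.group("rest").split()
        devices[m.group("bdf")] = dict(
            t.split("=", 1) for t in tokens if "=" in t
        )
    return devices


def links_up(dmesg_text):
    """Return the set of interface names ixgbe reported as link up."""
    return {m.group("ifname") for m in _LINK_UP_RE.finditer(dmesg_text)}


def nics_returned_to_ixgbe(status_text, dmesg_text, addresses):
    """True if every address is bound to ixgbe and its link came up."""
    devices = parse_devbind_status(status_text)
    up = links_up(dmesg_text)
    for bdf in addresses:
        dev = devices.get(bdf)
        if dev is None or dev.get("drv") != "ixgbe":
            return False
        if dev.get("if") not in up:
            return False
    return True


# Sample text copied from the step 4 output above.
STATUS = """\
0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp1s0 drv=ixgbe unused=vfio-pci
0000:02:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp2s0 drv=ixgbe unused=vfio-pci
0000:03:00.0 'Virtio network device 1041' if=eth0 drv=virtio-pci unused=vfio-pci *Active*
"""
DMESG = """\
[  709.456882] ixgbe 0000:01:00.0 enp1s0: NIC Link is Up 10 Gbps, Flow Control: None
[  719.605068] ixgbe 0000:02:00.0 enp2s0: NIC Link is Up 10 Gbps, Flow Control: None
"""
ok = nics_returned_to_ixgbe(STATUS, DMESG, ["0000:01:00.0", "0000:02:00.0"])
print("NICs back on ixgbe with link up:", ok)
```

In a real run the two text inputs would come from the guest (for example by capturing the command output), which is left out here to keep the sketch self-contained.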

Comment 11 Sitong Liu 2018-12-18 06:46:29 UTC
According to comment 10, marking this bz as Verified.

Best regards,
Sitong

Comment 13 errata-xmlrpc 2019-01-29 18:32:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0209

