Bug 1647719
Summary: | boot guest with q35+vIOMMU+ device assignment, qemu crash when return assigned network devices from vfio driver to ixgbe in guest [rhel-7.6.z] | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Oneata Mircea Teodor <toneata> |
Component: | qemu-kvm-rhev | Assignee: | Peter Xu <peterx> |
Status: | CLOSED ERRATA | QA Contact: | Sitong Liu <siliu> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 7.7 | CC: | ailan, alex.williamson, chayang, hhuang, jinzhao, juzhang, michen, mrezanin, mtessun, peterx, pezhang, siliu, virt-maint, yfu |
Target Milestone: | rc | Keywords: | Regression, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.12.0-18.el7_6.3 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | 1627272 | Environment: | |
Last Closed: | 2019-01-29 18:32:27 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1627272 | ||
Bug Blocks: |
Description
Oneata Mircea Teodor
2018-11-08 08:27:55 UTC
Fix included in qemu-kvm-rhev-2.12.0-18.el7_6.3

Reproduced qemu crash on qemu-kvm-rhev-2.12.0-18.el7_6.2.x86_64.

Reproduce steps according to bz 1627272 Description steps 1-5.

qemu cli:
/usr/libexec/qemu-kvm -name rhel7.6 -M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-device vfio-pci,host=0000:5e:00.0,bus=root.1 \
-device vfio-pci,host=0000:5e:00.1,bus=root.2 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=88:66:da:5f:dd:01,bus=root.3 \
-drive file=/home/images_nfv-virt-rt-kvm/rhel7.6.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.4 \
-vnc :2 \
-monitor stdio \

qemu crash:
(qemu) qemu-kvm: VFIO_MAP_DMA: -17
qemu-kvm: vfio_dma_map(0x560ae1cdcd00, 0x100000000, 0x180000000, 0x7f81c0000000) = -17 (File exists)
qemu: hardware error: vfio: DMA mapping failed, unable to continue

CPU #0:
RAX=ffffffffbb969a20 RBX=ffffffffbbf583e0 RCX=0000000000000048 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffffffffbbe03eb0 RSP=ffffffffbbe03eb0
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=ffffffffbbe00000 R14=ffffffffbbe00000 R15=ffffffffbbe00000
RIP=ffffffffbb969c26 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff8b313fc00000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff8b313fc04000 00002087 00008b00 DPL=0 TSS64-busy
GDT= ffff8b313fc0c000 0000007f
IDT= ffffffffff528000 00000fff
CR0=80050033 CR2=0000000001cf5cb0 CR3=000000007deae000 CR4=007607f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=742033342d34372d6d7640746f6f725b
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=20002020000000000000000000000000 XMM11=ffffffff000000ffffffffffffffffff
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000

CPU #1:
RAX=ffffffffbb969a20 RBX=ffffffffbbf583e0 RCX=0000000000000048 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff8b303cee7ea8 RSP=ffff8b303cee7ea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=0000000000000000
R12=0000000000000001 R13=ffff8b303cee4000 R14=ffff8b303cee4000 R15=ffff8b303cee4000
RIP=ffffffffbb969c26 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff8b313fc80000 ffffffff 00c00000
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff8b313fc84000 00002087 00008b00 DPL=0 TSS64-busy
GDT= ffff8b313fc8c000 0000007f
IDT= ffffffffff528000 00000fff
CR0=80050033 CR2=00007feaa9666000 CR3=000000007eeee000 CR4=007607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0020 [ST=0] FTW=00 MXCSR=00009fc0
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=8000000000000000 c01f FPR7=0000000000000000 0000
XMM00=0000ff0000ff0000ff00000000000000 XMM01=4241006c77006c73006769666e6f632f
XMM02=5251504f4e4d4c4b4a49484746454443 XMM03=00000000000000412f30303a30300030
XMM04=00000000000000000000000000000000 XMM05=6d736d695f646d00646961726d646f6e
XMM06=30302c303030302c3a65727574616566 XMM07=3a353530303a6c65646f6d3a36303030
XMM08=ffffffffffffffffffffffffffffffff XMM09=30303a726f646e65763a757063363878
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000

CPU #2:
RAX=0000000000000720 RBX=0000000000000082 RCX=ffff8b303cdea000 RDX=ffffb66180c50000
RSI=0000000200000025 RDI=0000000000000071 RBP=ffff8b3136f87a48 RSP=ffff8b3136f879f8
R8 =0000010000000031 R9 =000000017cdc75c4 R10=0000000000000002 R11=ffff8b3134972000
R12=0000000000000070 R13=ffff8b303fd14540 R14=00000000000001c4 R15=ffff8b303cde90c0
RIP=ffffffffbb7e9d45 RFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 00007f2910a13740 ffffffff 00c00000
GS =0000 ffff8b313fd00000 ffffffff 00c00000
LDT=0000 0000000000000000 000fffff 00000000
TR =0040 ffff8b313fd04000 00002087 00008b00 DPL=0 TSS64-busy
GDT= ffff8b313fd0c000 0000007f
IDT= ffffffffff528000 00000fff
CR0=80050033 CR2=00007f29105c0230 CR3=000000026fa24000 CR4=007607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000307331706e65
XMM02=000000000000002600007fffde4796e0 XMM03=00000000000000000000000000000000
XMM04=00000000000000210000000000000000 XMM05=40404040404040404040404040404040
XMM06=5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b XMM07=20202020202020202020202020202020
XMM08=00000000000000202020002020000000 XMM09=ffff00ffffffffffffffffffff000000
XMM10=00202020002020000000000000000000 XMM11=ffffffffffffff000000ff0000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000

CPU #3:
RAX=ffffffffbb969a20 RBX=ffffffffbbf583e0 RCX=0000000000000048 RDX=0000000000000000
RSI=0000000000000000 RDI=0000000000000046 RBP=ffff8b303cef7ea8 RSP=ffff8b303cef7ea8
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000000 R11=000000d8b2e69000
R12=0000000000000003 R13=ffff8b303cef4000 R14=ffff8b303cef4000 R15=ffff8b303cef4000
RIP=ffffffffbb969c26 RFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
ES =0000 0000000000000000 ffffffff 00c00000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 0000000000000000 ffffffff 00c00000
GS =0000 ffff8b313fd80000 ffffffff 00c00000
LDT=0000 0000000000000000 000fffff 00000000
TR =0040 ffff8b313fd84000 00002087 00008b00 DPL=0 TSS64-busy
GDT= ffff8b313fd8c000 0000007f
IDT= ffffffffff528000 00000fff
CR0=80050033 CR2=00007f6fd0be6e00 CR3=000000007eeee000 CR4=007607e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001fa0
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=ffffffffffffff00ffffffff0000ff00 XMM01=55555555555555555555555555555555
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000032000000200000004700000030 XMM05=00000035000000580000002000000050
XMM06=000000740000002d0000003000000034 XMM07=00000061000000640000004100000020
XMM08=00000000000000000000000000000000 XMM09=00000030000000350000005b00000020
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
Aborted (core dumped)

Versions:
kernel-3.10.0-978.el7.x86_64

Reproduction
Updated host kernel to 3.10.0-957.4.1.el7.x86_64, got same qemu crash as comment 6.

Verification
Versions:
qemu-kvm-rhev-2.12.0-18.el7_6.3.x86_64
kernel-3.10.0-957.4.1.el7.x86_64

Steps:
1. Boot VM with device assignment and viommu.
/usr/libexec/qemu-kvm -name rhel7.6 -M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,chassis=1 \
-device pcie-root-port,id=root.2,chassis=2 \
-device pcie-root-port,id=root.3,chassis=3 \
-device pcie-root-port,id=root.4,chassis=4 \
-device vfio-pci,host=0000:5e:00.0,bus=root.1 \
-device vfio-pci,host=0000:5e:00.1,bus=root.2 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=88:66:da:5f:dd:01,bus=root.3 \
-drive file=/home/images_nfv-virt-rt-kvm/rhel7.6.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.4 \
-vnc :2 \
-monitor stdio \

2. In guest, load vfio and start testpmd.
# modprobe vfio
# modprobe vfio-pci
# dpdk-devbind --bind=vfio-pci 0000:01:00.0
# dpdk-devbind --bind=vfio-pci 0000:02:00.0
# echo 4 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# /usr/bin/testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_ixgbe.so \
-w 0000:01:00.0 -w 0000:02:00.0 \
-- \
--nb-cores=2 \
-i \
--disable-rss \
--rxq=1 --txq=1

3. In guest, quit testpmd, then return the NICs from the vfio driver to ixgbe.
# testpmd> quit
# dpdk-devbind --bind=ixgbe 0000:01:00.0
# dpdk-devbind --bind=ixgbe 0000:02:00.0

4. In guest, bring up the ixgbe NICs and check guest dmesg.
# dpdk-devbind --status
...
Network devices using kernel driver
===================================
0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp1s0 drv=ixgbe unused=vfio-pci
0000:02:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' if=enp2s0 drv=ixgbe unused=vfio-pci
0000:03:00.0 'Virtio network device 1041' if=eth0 drv=virtio-pci unused=vfio-pci *Active*

# ifconfig enp1s0 up
# ifconfig enp2s0 up
# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 88:66:da:5f:dd:01 brd ff:ff:ff:ff:ff:ff
7: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:14:22:c4 brd ff:ff:ff:ff:ff:ff
8: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:14:22:c6 brd ff:ff:ff:ff:ff:ff

# dmesg
...
[ 704.935362] DMAR: 64bit 0000:01:00.0 uses identity mapping
[ 704.957388] pps pps0: new PPS source ptp0
[ 704.957392] ixgbe 0000:01:00.0: registered PHC device on enp1s0
[ 705.090617] IPv6: ADDRCONF(NETDEV_UP): enp1s0: link is not ready
[ 709.456882] ixgbe 0000:01:00.0 enp1s0: NIC Link is Up 10 Gbps, Flow Control: None
[ 709.457035] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[ 715.103574] DMAR: 64bit 0000:02:00.0 uses identity mapping
[ 715.124992] pps pps1: new PPS source ptp1
[ 715.124996] ixgbe 0000:02:00.0: registered PHC device on enp2s0
[ 715.258117] IPv6: ADDRCONF(NETDEV_UP): enp2s0: link is not ready
[ 719.605068] ixgbe 0000:02:00.0 enp2s0: NIC Link is Up 10 Gbps, Flow Control: None
[ 719.605221] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
...

5. Repeat steps 2-4; qemu and guest keep working well.

According to comment 10, mark this bz as Verified.
Best regards,
Sitong

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0209
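
For readers decoding the crash message quoted in the description, the numbers in the vfio_dma_map() line can be checked with a little shell arithmetic. This is only a sketch: it assumes the printed arguments are, in order, the container pointer, the start address (IOVA), the size, and the host virtual address, which fits the layout of the log line but is not stated in this report. The variable names are illustrative.

```bash
#!/usr/bin/env bash
# Values copied from the vfio_dma_map() error line in the crash log.
iova=0x100000000   # assumed: start address of the mapping that failed
size=0x180000000   # assumed: length of the mapping that failed
ret=-17            # return value printed by qemu-kvm

# First address past the region; bash evaluates the hex strings as integers.
end=$(( iova + size ))

printf 'region : 0x%x - 0x%x\n' "$(( iova ))" "$(( end - 1 ))"
printf 'start  : %d GiB\n' "$(( iova >> 30 ))"
printf 'length : %d GiB\n' "$(( size >> 30 ))"

# On Linux, errno 17 is EEXIST, which the log itself renders as "File exists".
if [ "$(( -ret ))" -eq 17 ]; then
    echo 'errno  : EEXIST (File exists)'
fi
```

Under those assumptions the failed request covers 6 GiB starting at the 4 GiB boundary. That would be consistent with the part of the 8G guest's RAM that sits above 4 GiB, if 2 GiB is placed below 4 GiB, and EEXIST suggests the request collided with a mapping that was already present for that range.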