Bug 1358653 - [RFE] Interrupt remapping support for Intel vIOMMUs
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Peter Xu
QA Contact: Pei Zhang
Depends On: 1350196 1370005
Blocks: 1273718
Reported: 2016-07-21 08:18 UTC by Peter Xu
Modified: 2016-11-07 21:24 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.6.0-22.el7
Last Closed: 2016-11-07 21:24:56 UTC

Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:2673 normal SHIPPED_LIVE qemu-kvm-rhev bug fix and enhancement update 2016-11-08 01:06:13 UTC

Description Peter Xu 2016-07-21 08:18:17 UTC
Description of problem:

Allow vIOMMUs for Intel guests to support interrupt remapping. 

Interrupt remapping (IR) is essential for interrupt protection: it shields the system from malicious and faulty interrupts. On host systems, IOMMU hardware protects the host kernel, but guests do not yet have such protection. We need IR to provide a safer environment (along with DMA remapping) for the guest kernel.

Version-Release number of selected component (if applicable):

N/A

How reproducible:

N/A

Steps to Reproduce:

N/A

Actual results:

In the guest, running the command:

  # journalctl -k | grep remap

returns nothing.

Expected results:

In the guest, running the command:

  # journalctl -k | grep remap

should show something like:

"DMAR-IR: Enabled IRQ remapping in x2apic mode"

Comment 3 Miroslav Rezanina 2016-08-22 18:26:20 UTC
Fix included in qemu-kvm-rhev-2.6.0-22.el7

Comment 6 Peter Xu 2016-08-24 19:28:42 UTC
Sorry, I should have posted this comment beforehand. Anyway...

In general, IR testing can be based on the following three aspects:

1. IOAPIC interrupts (e.g., e1000)
2. MSI/MSIX interrupts (e.g., virtio-net-pci)
3. vhost backends (e.g., tap+virtio-net-pci, with vhostforce)

So basically you were testing exactly the simple cases, and I think that should cover the verification of this bz.

One more thing: we should at least make sure interrupts are delivered correctly in the guest for each test case. For network cards, this means:

1. we can see interrupts for the specific device in /proc/interrupts (the count should be non-zero on at least one vCPU)
2. a very basic functionality test passes (e.g., for network cards, we can just test ssh or ping after setting up an IP address for the specific port)
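
A minimal sketch of how check 1 could be scripted, assuming the usual /proc/interrupts layout (IRQ number, one counter per CPU, chip name, device name as the last field). The helper names irq_total and irq_delivered are hypothetical and not part of any existing tool:

```shell
# Print "<irq> <total>" for every /proc/interrupts line on stdin whose
# last field is exactly the given device name. The total is the sum of
# the per-CPU counters.
irq_total() {
    awk -v dev="$1" '
        $NF == dev {
            total = 0
            # Field 1 is the IRQ number (e.g. "22:"); per-CPU counters follow
            # until the first non-numeric field (the interrupt chip name).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
            print $1, total
        }'
}

# Succeed only if at least one matching line has a non-zero total.
irq_delivered() {
    irq_total "$1" | awk '$2 > 0 { found = 1 } END { exit !found }'
}
```

For example, in the guest: `irq_delivered enp0s2 < /proc/interrupts && echo delivered`. Matching on the last field is a simplification; lines where several devices share one IRQ would need a looser match.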

Also, we should always provide the following parameter when IR is enabled:

  -global ioapic.version=0x20

This raises the emulated IOAPIC version to 0x20. It is required when IOAPIC interrupts are used in RHEL guests (and with some old upstream kernels, possibly versions < 4.0); it is optional in most other cases.

Regarding the problem you encountered, it is very likely caused by the IOAPIC version issue. Please raise the version to 0x20 and retry. Ideally, the problem should go away.

Comment 7 Pei Zhang 2016-08-25 05:48:06 UTC
(In reply to Peter Xu from comment #6)
> In general, IR testing can be based on the following three aspects:
> 
> 1. IOAPIC interrupts (e.g., e1000)
> 2. MSI/MSIX interrupts (e.g., virtio-net-pci)
> 3. vhost backends (e.g., tap+virtio-net-pci, with vhostforce)
> 
> So basically you were testing exactly all simple cases, and I think that
> should cover this bz verification. 
> 
> One more thing is that, we'd better at least make sure interrupts are
> delivered correctly in guest for each test case. For network cards, it means:
> 
> 1. we can see interrupts for specific device in /proc/interrupts (should be
> non-zero at least on one vCPU)
> 2. very basic functionality test (e.g., for net cards, we can just test ssh
> or ping, after setting up an IP for specific port)
> 
Thanks Peter.

Because of the new bug [1] below, I will continue verifying with virtio-net-pci/vhost-user after bug [1] is fixed.
[1] Bug 1370005 - Fail to get network device info(eth0) in guest with virtio-net-pci/vhostforce

Re-verified with e1000 and virtio-net-pci; both work as expected.
Steps:
1. Boot the guest with 'kernel-irqchip=split' and 'intremap=true', and test with e1000 and virtio-net-pci.
Note: <qemu-command-line1>:
/usr/libexec/qemu-kvm -name rhel7.3 \
-cpu host -m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-spice port=5901,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-monitor stdio \
-device ahci,id=ahci0 \
-drive file=/home/pezhang/rhel7.3.qcow2,format=qcow2,if=none,id=drive-system-disk,werror=stop,rerror=stop \
-device ide-drive,bus=ahci0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
-usbdevice tablet \

(1) With e1000, works well.
<qemu-command-line1> \
-M q35,kernel-irqchip=split \
-device intel-iommu,intremap=true \
-netdev tap,id=hostnet0 \
-device e1000,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:61 \
-global ioapic.version=0x20 \

1) Interrupts can be found
# ifconfig
enp0s2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.73.75.192  netmask 255.255.252.0  broadcast 10.73.75.255
        inet6 2620:52:0:4948:1054:ff:fe5c:8861  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::1054:ff:fe5c:8861  prefixlen 64  scopeid 0x20<link>
        ether 12:54:00:5c:88:61  txqueuelen 1000  (Ethernet)
        RX packets 6426  bytes 483215 (471.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 388  bytes 55131 (53.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
...
 22:       1748       1745       1742       1753  IR-IO-APIC-fasteoi   enp0s2
...

2) ssh and ping work well.

(2) With virtio-net-pci, works well.
<qemu-command-line1> \
-M q35,kernel-irqchip=split \
-device intel-iommu,intremap=true \
-netdev tap,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:61 \
-global ioapic.version=0x20 \

1) Interrupts can be found
# lspci | grep Virtio
00:02.0 Ethernet controller: Red Hat, Inc Virtio network device

# cat /proc/interrupts | grep virtio
 27:          0          0          0          0  IR-PCI-MSI-edge      virtio0-config
 28:        674        668        672        676  IR-PCI-MSI-edge      virtio0-input.0
 29:          0          1          0          0  IR-PCI-MSI-edge      virtio0-output.0

2) ssh and ping work well.


> One thing to mention is that, we'd better provide the following parameter
> all the time when IR is enabled:
> 
>   -global ioapic.version=0x20
> 
> This will boost emulated IOAPIC version to 0x20. This is required when
> IOAPIC interrupts are used in RHEL guests (and some old upstream kernels,
> possibly version <4.0). And this is optional in most other cases though.
> 
> Regarding to the problem you have encountered, it looks very likely caused
> by the IOAPIC version issue. Please try to boost it to 0x20 and retry.
> Ideally, the bug should disappear itself.

Yes, with this option the e1000 network now works. Thanks for pointing this out.


Thank you,
Pei

Comment 8 Peter Xu 2016-08-26 02:44:01 UTC
Pei, 

One thing to mention: kernel-irqchip=off can still be used with IR. It should work, but I don't think we need to add it to the QE workflow, since "kernel-irqchip=off" is only for debugging and is merely "good to have". I just want to make sure you know about it, in case it helps one day.

(PS. Thank you for the verification work. :)

-- peterx

Comment 9 Pei Zhang 2016-09-22 08:20:33 UTC
(In reply to Pei Zhang from comment #7)
> [...]
> As below new bug[1] exists, so will continue verify with
> virtio-net-pci/vhost-user after bug[1] is fixed.
> [1]Bug 1370005 - Fail to get network device info(eth0) in guest with
> virtio-net-pci/vhostforce 
> [...]
Bug 1370005 has been fixed, so I continued verification with virtio-net-pci/vhost-user.

Versions:
Host:
3.10.0-510.rt56.415.el7.x86_64
qemu-kvm-rhev-2.6.0-26.el7.x86_64

Guest:
3.10.0-510.rt56.415.el7.x86_64

Steps:
1. Start slirp in the background
/usr/libexec/qemu-kvm \
-net none \
-net socket,vlan=0,udp=localhost:4444,localaddr=localhost:5555 \
-net user,vlan=0

2. Boot the guest with virtio-net-pci/vhostforce
/usr/libexec/qemu-kvm -name rhel7.3 -M q35,kernel-irqchip=split \
-device intel-iommu,intremap=true \
-cpu IvyBridge -m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-spice port=5901,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-monitor stdio \
-device ahci,id=ahci0 \
-drive file=/home/pezhang/rhel7.3.qcow2,format=qcow2,if=none,id=drive-system-disk,werror=stop,rerror=stop \
-device ide-drive,bus=ahci0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
-chardev socket,id=char0,path=/tmp/vubr.sock,server \
-device virtio-net-pci,netdev=mynet1,mac=54:52:00:1a:2c:01 \
-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-serial unix:/tmp/monitor,server,nowait \

3. Check interrupts and network.

# journalctl -k | grep remap
Sep 22 16:05:59 localhost.localdomain kernel: DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
Sep 22 16:05:59 localhost.localdomain kernel: DMAR-IR: Enabled IRQ remapping in x2apic mode
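
The log check above can also be scripted. A minimal sketch, assuming the kernel log is fed on stdin; ir_enabled is a hypothetical helper name, and only the prefix of the message is matched so that the trailing mode (x2apic here) does not matter:

```shell
# Succeed if the kernel log on stdin reports that IRQ remapping was enabled.
# Only the message prefix is matched; the APIC mode at the end is ignored.
ir_enabled() {
    grep -q 'DMAR-IR: Enabled IRQ remapping'
}
```

For example, in the guest: `journalctl -k | ir_enabled && echo "IR enabled"`.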

(1) Interrupts show as expected
# lspci | grep Virtio
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device

# cat /proc/interrupts | grep virtio
 26:          0          0          0          0  IR-PCI-MSI-edge      virtio0-config
 27:       1594       1736       1662       1615  IR-PCI-MSI-edge      virtio0-input.0
 28:          0          0          0          0  IR-PCI-MSI-edge      virtio0-output.0
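
As far as I know, the IR- prefix on the chip name (IR-PCI-MSI-edge here) is what shows that the interrupt goes through remapping. A rough sketch that lists any lines lacking that prefix, assuming the /proc/interrupts layout above; not_remapped is a hypothetical helper, and its argument is an awk regular expression matched against the last field:

```shell
# Print /proc/interrupts lines (read from stdin) whose last field matches
# the given pattern but whose chip name does not start with "IR-".
# No output means every matching interrupt is remapped.
not_remapped() {
    awk -v pat="$1" '
        $NF ~ pat {
            # Skip the IRQ number and the per-CPU counters; the first
            # non-numeric field after them is the chip name.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) ;
            if (i <= NF && $i !~ /^IR-/) print
        }'
}
```

For example, in the guest: `not_remapped virtio < /proc/interrupts` should print nothing when IR is in effect.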

(2) Network works well: the guest gets an IP address and wget works.
# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::5652:ff:fe1a:2c01  prefixlen 64  scopeid 0x20<link>
        inet6 fec0::5652:ff:fe1a:2c01  prefixlen 64  scopeid 0x40<site>
        ether 54:52:00:1a:2c:01  txqueuelen 1000  (Ethernet)
        RX packets 43  bytes 6306 (6.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 81  bytes 9591 (9.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# wget http://download.eng.bos.redhat.com/brewroot/packages/kernel-rt/3.10.0/510.rt56.415.el7/x86_64/kernel-rt-devel-3.10.0-510.rt56.415.el7.x86_64.rpm
...
Saving to: ‘kernel-rt-devel-3.10.0-510.rt56.415.el7.x86_64.rpm’

100%[======================================>] 9,529,792    253KB/s   in 26s    

2016-09-22 16:16:31 (354 KB/s) - ‘kernel-rt-devel-3.10.0-510.rt56.415.el7.x86_64.rpm’ saved [9529792/9529792]

Comment 10 Pei Zhang 2016-09-22 08:22:52 UTC
Setting this bug to 'VERIFIED' per Comment 6, Comment 7, Comment 8 and Comment 9.

Comment 12 errata-xmlrpc 2016-11-07 21:24:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html

