Bug 1026178 - irqbalance service not work properly with 82599EB PF/VF
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Bandan Das
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2013-11-04 01:26 EST by Xu Han
Modified: 2014-01-21 17:38 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 17:38:34 EST
Type: Bug
Description Xu Han 2013-11-04 01:26:38 EST
Description of problem:
The irqbalance service does not work properly with an 82599EB PF/VF.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-1.5.3-10.el7.x86_64
kernel-3.10.0-40.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot a guest with an 82599EB PF/VF assigned.
# /usr/libexec/qemu-kvm -nodefaults -M pc -m 2G -cpu Nehalem \
    -smp 4,cores=2,threads=2,sockets=1 -boot menu=on -monitor stdio \
    -vga qxl -spice disable-ticketing,port=5931 \
    -drive file=/home/vfio-RHEL7.0-64.qcow2_v3,id=guest-img,if=none,cache=none,aio=native \
    -device virtio-blk-pci,scsi=off,drive=guest-img,id=os-disk,bootindex=1 \
    -device virtio-balloon-pci,id=balloon \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -qmp tcp:0:5555,server,nowait -serial unix:/tmp/guest-sock,server,nowait \
    -device vfio-pci,host=05:10.0,id=vf0

2. Start irqbalance in the guest.
# service irqbalance start

3. Check the interrupt counts and smp_affinity in the guest, twice, 120 seconds apart.
# cat /proc/interrupts | grep ens5; \
  cat /proc/irq/42/smp_affinity; \
  sleep 120; \
  cat /proc/interrupts | grep ens5; \
  cat /proc/irq/42/smp_affinity


Actual results:
step 2:
# service irqbalance status
Redirecting to /bin/systemctl status  irqbalance.service
irqbalance.service - irqbalance daemon
   Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; enabled)
   Active: active (running) since Sun 2013-11-03 22:49:52 MST; 21min ago
 Main PID: 603 (irqbalance)
   CGroup: /system.slice/irqbalance.service
           └─603 /usr/sbin/irqbalance --foreground

Nov 03 22:49:52 localhost.localdomain systemd[1]: Started irqbalance daemon.
Nov 03 23:00:15 localhost.localdomain systemd[1]: Started irqbalance daemon.
Nov 03 23:01:49 localhost.localdomain systemd[1]: Started irqbalance daemon.

step 3:
 42:         13         13         34     193847   PCI-MSI-edge      ens5-TxRx-0
 43:          5          7          7          5   PCI-MSI-edge      ens5
8 <-- smp_affinity
 42:         13         13         34     975747   PCI-MSI-edge      ens5-TxRx-0
 43:          5          7          7          5   PCI-MSI-edge      ens5
8 <-- smp_affinity

# cat /proc/irq/42/affinity_hint 
0


Expected results:
The irqbalance service should work properly.

Additional info:
# lspci -vvv -s 00:05.0 | grep -i MSI
	Capabilities: [70] MSI-X: Enable+ Count=3 Masked-
	Capabilities: [a0] Express (v0) Endpoint, MSI 00
Comment 3 Alex Williamson 2013-11-04 09:22:38 EST
(In reply to xuhan from comment #0)
> step 3:
>  42:         13         13         34     193847   PCI-MSI-edge      ens5-TxRx-0
>  43:          5          7          7          5   PCI-MSI-edge      ens5
> 8 <-- smp_affinity

smp_affinity is a bitmap, so 8 means CPU3 is targeted for the interrupt.
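
For readers unfamiliar with the mask format, here is a minimal sketch of how a value read from /proc/irq/N/smp_affinity could be decoded into CPU numbers. The helper name `cpus_from_affinity_mask` is hypothetical and not part of any tool mentioned in this bug; it only assumes the mask is hexadecimal, with bit N standing for CPU N.

```python
def cpus_from_affinity_mask(mask_text):
    """Return the CPU numbers selected by a hexadecimal smp_affinity mask.

    Hypothetical helper for illustration only. Bit N of the mask
    corresponds to CPU N.
    """
    # Systems with many CPUs print the mask as comma-separated 32-bit
    # groups (for example "00000001,00000000"), so drop the commas first.
    mask = int(mask_text.strip().replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


print(cpus_from_affinity_mask("8"))  # [3]
print(cpus_from_affinity_mask("f"))  # [0, 1, 2, 3]
```

With the value 8 reported in comment 0, the only bit set is bit 3, which is why the interrupt is expected to land on CPU3 alone.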

>  42:         13         13         34     975747   PCI-MSI-edge      ens5-TxRx-0
>  43:          5          7          7          5   PCI-MSI-edge      ens5
> 8 <-- smp_affinity

Tada, only CPU3's interrupt count increased.
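
The comparison above can be reproduced mechanically. This is a small sketch, not an existing tool: the helper names `per_cpu_counts` and `per_cpu_delta` are made up for illustration, and the code assumes the guest has 4 CPUs (as in the `-smp 4` command line) so that the first four numeric columns of a /proc/interrupts row are the per-CPU counters.

```python
def per_cpu_counts(line, ncpus):
    """Parse one /proc/interrupts row into (irq, [count per CPU])."""
    fields = line.split()
    irq = fields[0].rstrip(":")
    return irq, [int(value) for value in fields[1:1 + ncpus]]


def per_cpu_delta(before, after, ncpus):
    """Return how many interrupts each CPU handled between two rows."""
    irq_before, counts_before = per_cpu_counts(before, ncpus)
    irq_after, counts_after = per_cpu_counts(after, ncpus)
    if irq_before != irq_after:
        raise ValueError("rows belong to different IRQs")
    return [new - old for old, new in zip(counts_before, counts_after)]


# The two IRQ 42 rows from comment 0, taken 120 seconds apart.
before = " 42:         13         13         34     193847   PCI-MSI-edge      ens5-TxRx-0"
after = " 42:         13         13         34     975747   PCI-MSI-edge      ens5-TxRx-0"
print(per_cpu_delta(before, after, ncpus=4))  # [0, 0, 0, 781900]
```

Only the last column (CPU3) changes, which matches an smp_affinity of 8.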

> # cat /proc/irq/42/affinity_hint 
> 0

Seems like you're making an assumption that affinity_hint should be showing something else.  What do you think it should be showing?  What does it show on bare metal?

> Expected results:
> irqbalance service could work properly.

I don't see how it's not working, please double check the results and clarify exactly where it's not working.
Comment 4 Xu Han 2013-11-08 03:37:28 EST
I currently have no environment available for further testing. I will update this bug with the results once testing is finished.
Comment 5 Alex Williamson 2013-11-08 09:09:15 EST
Re-adding needinfo
Comment 6 Xu Han 2013-11-19 03:25:59 EST
Tested again with qemu-kvm-1.5.3-19.el7.x86_64.

This time I observed the IRQ migrating from one CPU to another.
# while true; do cat /proc/interrupts | grep ens; sleep 1; done
...
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     116767          9        113          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     117587          9        113          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     118461          9        113          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     119410          9        113          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9        188          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9        668          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       1574          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       2421          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       3170          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       4173          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       4884          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       5530          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
 45:     120145          9       6202          4   PCI-MSI-edge      ens5-TxRx-0
 46:          6          5          6          7   PCI-MSI-edge      ens5
...
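
One way to spot the migration in output like this is to look at which CPU's counter grows the most between consecutive samples. The sketch below is only an illustration under that assumption. The function name `busiest_cpu_changes` is hypothetical, and the sample tuples are the first seven IRQ 45 rows copied from the listing above.

```python
def busiest_cpu_changes(samples):
    """Report where the CPU handling most interrupts changes.

    `samples` is a list of per-CPU counter tuples taken at successive
    times. Returns (sample_index, old_cpu, new_cpu) tuples.
    """
    changes = []
    previous_cpu = None
    for index in range(1, len(samples)):
        deltas = [now - before
                  for before, now in zip(samples[index - 1], samples[index])]
        if max(deltas) <= 0:
            continue  # no interrupts arrived in this interval
        cpu = deltas.index(max(deltas))
        if previous_cpu is not None and cpu != previous_cpu:
            changes.append((index, previous_cpu, cpu))
        previous_cpu = cpu
    return changes


# Per-CPU counters for IRQ 45 (ens5-TxRx-0), one tuple per second.
irq45 = [
    (116767, 9, 113, 4),
    (117587, 9, 113, 4),
    (118461, 9, 113, 4),
    (119410, 9, 113, 4),
    (120145, 9, 188, 4),
    (120145, 9, 668, 4),
    (120145, 9, 1574, 4),
]
print(busiest_cpu_changes(irq45))  # [(5, 0, 2)]
```

The single reported change, from CPU0 to CPU2, corresponds to the point where the first column stops at 120145 and the third column starts climbing.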

The affinity_hint value in comment 0 was provided only as additional information. I am not actually sure what value it should be, but I have seen a non-zero value for another device before, so I wondered whether it was related to this issue.

In any case, as this test result shows, the irqbalance service works properly.

Thanks,
xuhan
Comment 7 Alex Williamson 2014-01-21 17:38:34 EST
Comment 6 confirms irqbalance works as expected with this device, not sure why this bz is still open.  closing.
