Bug 1029343 - irqbalance is broken in guest
Summary: irqbalance is broken in guest
Keywords:
Status: CLOSED DUPLICATE of bug 960383
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Radim Krčmář
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-12 08:32 UTC by Chao Yang
Modified: 2013-11-28 18:23 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-11-28 18:23:55 UTC
Target Upstream Version:
Embargoed:


Attachments: none

Description Chao Yang 2013-11-12 08:32:13 UTC
Description of problem:
Booted a guest with a virtio-net-pci NIC and started irqbalance in debug mode in the guest, then stressed the virtio NIC with netperf. The corresponding interrupt was consistently delivered to the 2nd vCPU only; irqbalance never spread the load to the other vCPUs.

Version-Release number of selected component (if applicable):
2.6.32-430.el6.x86_64
qemu-kvm-0.12.1.2-2.415.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a RHEL 6 guest with a virtio-net-pci NIC (see the CLI under "Additional info").
2. Stop the irqbalance service in the guest and run irqbalance in debug mode.
3. Stress the virtio NIC with netperf and watch which vCPU services its interrupt (see the sketch below).
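
A minimal sketch of that flow, assuming a netperf peer reachable from the guest (the peer address 192.168.122.1 and the 300-second run length are placeholders, and the exact irqbalance debug invocation may vary between versions):

# on the netperf peer (e.g. the virtualization host), start the netperf server
netserver

# in the guest, stop the irqbalance service and run irqbalance in the foreground with debug output
service irqbalance stop
irqbalance --debug

# in a second guest shell, generate sustained traffic on the virtio NIC (peer address is a placeholder)
netperf -H 192.168.122.1 -l 300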

Actual results:
The virtio-net interrupt is serviced only by the 2nd vCPU; irqbalance does not spread the load to the other vCPUs.

Expected results:
irqbalance balances the interrupt load across the guest's vCPUs.

Additional info:
CLI:
/usr/libexec/qemu-kvm -M rhel6.5.0 -cpu host -enable-kvm -m 4096 -realtime mlock=off \
    -smp 4,sockets=2,cores=2,threads=1 -nodefaults -monitor stdio -boot menu=on \
    -rtc base=utc,clock=host,driftfix=slew \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0 \
    -drive file=/home/rhel6.5.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
    -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=46:1a:4a:42:48:25,bus=pci.0 \
    -k en-us -vga cirrus \
    -device intel-hda,id=sound0,bus=pci.0 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0 -vnc :1


-- In guest:

# lspci | grep Eth
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
# dmesg | grep 00:05.0
pci 0000:00:05.0: reg 10: [io  0xc0c0-0xc0df]
pci 0000:00:05.0: reg 14: [mem 0xf2022000-0xf2022fff]
pci 0000:00:05.0: reg 30: [mem 0xf2030000-0xf203ffff pref]
virtio-pci 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
virtio-pci 0000:00:05.0: irq 28 for MSI/MSI-X
virtio-pci 0000:00:05.0: irq 29 for MSI/MSI-X
virtio-pci 0000:00:05.0: irq 30 for MSI/MSI-X


-- A snip from output of irqbalance in guest:
----------------------------------------------------------------------------
Package 0:  numa_node is 0 cpu mask is 00000003 (load 0)
        Cache domain 0:  numa_node is 0 cpu mask is 00000001  (load 0) 
                CPU number 0  numa_node is 0 (load 0)
                  Interrupt 30 node_num is -1 (ethernet/1) 
          Interrupt 25 node_num is -1 (storage/1) 
        Cache domain 1:  numa_node is 0 cpu mask is 00000002  (load 0) 
                CPU number 1  numa_node is 0 (load 0)
                  Interrupt 28 node_num is -1 (ethernet/1) 
          Interrupt 11 node_num is -1 (legacy/1) 
  Interrupt 26 node_num is -1 (other/1) 
Package 0:  numa_node is 0 cpu mask is 0000000c (load 13000)
        Cache domain 2:  numa_node is 0 cpu mask is 00000004  (load 26000) 
                CPU number 2  numa_node is 0 (load 26000)
                  Interrupt 29 node_num is -1 (ethernet/1) 
          Interrupt 9 node_num is -1 (legacy/1) 
        Cache domain 3:  numa_node is 0 cpu mask is 00000008  (load 0) 
                CPU number 3  numa_node is 0 (load 0)
          Interrupt 24 node_num is -1 (storage/1) 
  Interrupt 10 node_num is -1 (other/1) 
  Interrupt 27 node_num is -1 (other/1) 



-----------------------------------------------------------------------------
Package 0:  numa_node is 0 cpu mask is 00000003 (load 0)
        Cache domain 0:  numa_node is 0 cpu mask is 00000001  (load 0) 
                CPU number 0  numa_node is 0 (load 0)
                  Interrupt 30 node_num is -1 (ethernet/1) 
          Interrupt 25 node_num is -1 (storage/1) 
        Cache domain 1:  numa_node is 0 cpu mask is 00000002  (load 0) 
                CPU number 1  numa_node is 0 (load 0)
                  Interrupt 28 node_num is -1 (ethernet/1) 
          Interrupt 11 node_num is -1 (legacy/1) 
  Interrupt 26 node_num is -1 (other/1) 
Package 0:  numa_node is 0 cpu mask is 0000000c (load 16500)
        Cache domain 2:  numa_node is 0 cpu mask is 00000004  (load 33000) 
                CPU number 2  numa_node is 0 (load 33000)
                  Interrupt 29 node_num is -1 (ethernet/1) 
          Interrupt 9 node_num is -1 (legacy/1) 
        Cache domain 3:  numa_node is 0 cpu mask is 00000008  (load 0) 
                CPU number 3  numa_node is 0 (load 0)
          Interrupt 24 node_num is -1 (storage/1) 
  Interrupt 10 node_num is -1 (other/1) 
  Interrupt 27 node_num is -1 (other/1) 



-----------------------------------------------------------------------------
Package 0:  numa_node is 0 cpu mask is 00000003 (load 500)
        Cache domain 0:  numa_node is 0 cpu mask is 00000001  (load 1000) 
                CPU number 0  numa_node is 0 (load 1000)
                  Interrupt 30 node_num is -1 (ethernet/750) 
          Interrupt 25 node_num is -1 (storage/994) 
        Cache domain 1:  numa_node is 0 cpu mask is 00000002  (load 0) 
                CPU number 1  numa_node is 0 (load 0)
                  Interrupt 28 node_num is -1 (ethernet/1) 
          Interrupt 11 node_num is -1 (legacy/1) 
  Interrupt 26 node_num is -1 (other/1) 
Package 0:  numa_node is 0 cpu mask is 0000000c (load 19000)
        Cache domain 2:  numa_node is 0 cpu mask is 00000004  (load 38000) 
                CPU number 2  numa_node is 0 (load 38000)
                  Interrupt 29 node_num is -1 (ethernet/1) 
          Interrupt 9 node_num is -1 (legacy/1) 
        Cache domain 3:  numa_node is 0 cpu mask is 00000008  (load 0) 
                CPU number 3  numa_node is 0 (load 0)
          Interrupt 24 node_num is -1 (storage/1) 
  Interrupt 10 node_num is -1 (other/1) 
  Interrupt 27 node_num is -1 (other/1)
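
For reference, the pinning shown above can also be confirmed directly from /proc/interrupts in the guest while netperf is running; a small sketch (IRQs 28-30 are the virtio-net MSI/MSI-X vectors from the dmesg output above):

# watch the per-CPU counters for the virtio-net vectors;
# only one CPU column of IRQ 29 keeps increasing during the netperf run (CPU 2 in the debug output above)
watch -n1 "grep -E '^ *(28|29|30):' /proc/interrupts"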

Comment 2 Chao Yang 2013-11-12 09:40:40 UTC
Also reproducible with the 6.4.z kernel and qemu-kvm, as well as with a RHEL 6.5 guest:

2.6.32-358.18.1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.355.el6_4.9.x86_64

Comment 3 Radim Krčmář 2013-11-12 14:12:19 UTC
One interrupt can't be balanced across multiple CPUs with '-cpu host', because the guest kernel currently selects physical-mode x2APIC, which delivers a given interrupt to a single CPU (booting the guest with the nox2apic kernel parameter works around this).
irqbalance then can't do much with just one interrupt source; moving it to another CPU does not make anything better.

But there is a deficiency in irqbalance that the reporter might have meant:
even with priority balancing, irqbalance still sets smp_affinity to just one CPU, although we could achieve better balance if it considered wider masks.
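
As an illustration of that last point (a sketch only; IRQ 29 is the virtio-net vector that ends up pinned in the debug output above), the single-CPU mask irqbalance writes can be inspected and widened by hand, but with physical-mode x2APIC the interrupt still ends up on one CPU:

# show the mask irqbalance assigned to the virtio-net vector (a single-CPU mask such as 00000004)
cat /proc/irq/29/smp_affinity

# allow all four vCPUs by hand; with physical x2APIC the interrupt still lands on a single CPU,
# whereas with xAPIC (e.g. after booting the guest with nox2apic) it may move within the mask
echo f > /proc/irq/29/smp_affinity
cat /proc/irq/29/smp_affinity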

Comment 4 Radim Krčmář 2013-11-28 18:23:55 UTC
This is expected, for reasons in comment #3; see the parent bug for behavior details.

*** This bug has been marked as a duplicate of bug 960383 ***

