Bug 820081 - smp affinity behavior differs between 6.3 guest and 6.3 host after cpu offline/online [NEEDINFO]
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: Unspecified Unspecified
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Igor Mammedov
QA Contact: Virtualization Bugs
Docs Contact:
Depends On:
Blocks: 833569
 
Reported: 2012-05-09 01:55 EDT by Shaolong Hu
Modified: 2014-08-29 11:41 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-08-29 11:41:33 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
imammedo: needinfo? (shu)


Description Shaolong Hu 2012-05-09 01:55:13 EDT
Description of problem:
------------------------
smp affinity behavior differs between 6.3 guest and 6.3 host after cpu offline/online.


Version-Release number of selected component (if applicable):
--------------------------------------------------------------
qemu-kvm-0.12.1.2-2.290.el6.x86_64
host: 2.6.32-269.el6.x86_64
guest: 2.6.32-262.el6.x86_64


How reproducible:
------------------
100%


Steps to Reproduce:
--------------------
1. On the host, use IRQ 17 for the test. The mask indicates that IRQ 17 is routed to CPU 0 and CPU 1, and /proc/interrupts shows the IRQ 17 count increasing only under CPU 0 and CPU 1:

[root@localhost ~]# cat /proc/irq/17/smp_affinity
3

[root@localhost ~]# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 17:       2516       1492       3576       1512   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb7

2. On the host, offline CPU 1. /proc/interrupts shows the IRQ 17 count increasing only under CPU 0, while the mask is still 3. Online CPU 1 again, and the IRQ 17 count automatically increases under both CPU 0 and CPU 1 again.

[root@localhost ~]# echo 0 > /sys/devices/system/cpu/cpu1/online
[root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu1/online

3. On the guest, run the same test with IRQ 11. The mask is f by default; set it to 3 manually, then offline and online CPU 1. The IRQ 11 count increases only under CPU 0, and the count under CPU 1 stays frozen at its value from before the offline:

[root@localhost ~]# echo 3 > /proc/irq/11/smp_affinity

[root@localhost ~]# cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
 11:         54         58         56         60   IO-APIC-fasteoi   uhci_hcd:usb1

4. At this point, if the mask of IRQ 11 is set to 3 again, the count under CPU 1 begins to increase, but the count under CPU 0 freezes.
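
For reading the masks quoted in these steps, a small helper can expand a hexadecimal smp_affinity value into the CPU numbers it selects. This is only a sketch and not part of the original test: mask_to_cpus is a hypothetical helper name, and it assumes a single hex word without the comma-separated groups used on systems with many CPUs.

```shell
# Print the CPU numbers selected by a hexadecimal smp_affinity mask.
# Assumption: the mask is one hex word (e.g. 3 or f), with no commas.
mask_to_cpus() {
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        # Bit N set in the mask means CPU N is allowed to handle the IRQ.
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
}

mask_to_cpus 3   # mask used in the test: CPU 0 and CPU 1
mask_to_cpus f   # guest default mask: CPU 0 through CPU 3
```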


Expected results:
-------------------
In the guest, after a CPU offline/online cycle, the IRQ is routed correctly to the CPUs allowed by the mask.
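
One possible way to check this expectation without eyeballing /proc/interrupts is to sample the per-CPU counters twice and list the CPUs whose count grew. The sketch below is an illustration only: irq_counts and active_cpus are hypothetical helper names, the number of CPUs is passed in by hand, and the sampling interval is arbitrary.

```shell
# Print the per-CPU counters for one IRQ from a /proc/interrupts-style file.
# Usage: irq_counts IRQ NCPUS [FILE]   (FILE defaults to /proc/interrupts)
irq_counts() {
    awk -v irq="$1:" -v n="$2" '
        $1 == irq {
            # Fields 2..n+1 hold the counters for CPU0..CPU(n-1).
            for (i = 2; i <= n + 1; i++)
                printf "%s%s", $i, (i <= n ? " " : "\n")
        }' "${3:-/proc/interrupts}"
}

# Print the CPU numbers whose counter grew between two samples.
# Usage: active_cpus "BEFORE COUNTS" "AFTER COUNTS"
active_cpus() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 1; i <= NF; i++) before[i] = $i; next }
        {
            out = ""
            for (i = 1; i <= NF; i++)
                if (($i + 0) > (before[i] + 0))
                    out = out (out == "" ? "" : " ") (i - 1)
            print out
        }'
}

# Example on a live system (IRQ 11, 4 CPUs, 5 second window):
#   before=$(irq_counts 11 4); sleep 5; after=$(irq_counts 11 4)
#   active_cpus "$before" "$after"
```

With mask 3, the expected result corresponds to active_cpus reporting both CPU 0 and CPU 1 while the device is generating interrupts.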
Comment 1 Shaolong Hu 2012-05-09 01:59:29 EDT
The host has 4 processors:

[root@localhost ~]# cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz


Guest command line:

/usr/libexec/qemu-kvm -enable-kvm -M rhel6.3.0 -m 4G -name rhel6.3-64 -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=nfs/RHEL-Server-6.3-64-virtio.qcow2,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0 -netdev tap,id=netdev0,fd=6 6<>/dev/tap6 -device virtio-net-pci,netdev=netdev0,id=device-net0,mac=02:a3:a6:ed:4b:94 -boot order=cd -monitor stdio -usb -device usb-tablet,id=input0 -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 -vnc :10 -monitor tcp::1234,server,nowait -qmp tcp:0:5555,server,nowait -chardev socket,id=qmp_monitor_id_qmpmonitor1,path=/tmp/qmp,server,nowait -mon chardev=qmp_monitor_id_qmpmonitor1,mode=control -smp 4
Comment 4 RHEL Product and Program Management 2012-07-10 02:49:04 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 5 RHEL Product and Program Management 2012-07-10 22:07:15 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 8 Igor Mammedov 2013-04-11 08:40:35 EDT
There were several related patches on rhkml, so it might be fixed by now.
Please retest.

Please move to 6.6 due to limited capacity and this still being tech-preview.
Comment 11 juzhang 2014-07-24 21:52:25 EDT
Hi Shu,

Can you reply to comment 10?

Best Regards,
Junyi
Comment 12 Igor Mammedov 2014-08-29 11:41:33 EDT
I've tested with the 2.6.32-498.el6.x86_64 kernel on AMD/Intel hosts and a VM, using
the ahci IRQ.

As a result, behaviour on the physical hosts and in the VM was the same. (Make sure that the irqbalance service is turned off so that it won't skew the results.)

Test env: 4-core host/VM with the nox2apic option on the kernel command line to make sure
that the APIC is in flat mode and does RR IRQ balancing.

1. set smp_affinity for the ahci IRQ to 'f'
2. create load: dd if=/dev/sda of=/dev/null
 * watch the ahci IRQ load being RR distributed among 4 CPUs
3. offline CPU1
4. online CPU1
 * the ahci IRQ is not distributed to CPU1 (phys hosts and VM)
5. re-set smp_affinity for the ahci IRQ to 'f'
 * that retriggers IRQ distribution and the IRQ is handled on all 4 CPUs again

Conclusion:
 * the kernel doesn't distribute the IRQ to a hotplugged CPU when doing RR in flat mode; it never worked (regardless of whether it's a VM or baremetal) and needs to be fixed upstream if someone asks for it.
 * to force an IRQ routing change after CPU hotplug, re-set smp_affinity for the IRQ
 * closing bug as worksforme (baremetal and VM behave identically)
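
The workaround of re-setting the affinity after CPU hotplug could be scripted roughly as follows. This is a sketch under assumptions, not a tested fix: reset_irq_affinity is a hypothetical helper name, the optional second argument exists only so the function can be tried against a scratch directory, and writing under /proc/irq requires root.

```shell
# Re-write the current affinity mask of an IRQ so the routing is reprogrammed.
# Usage: reset_irq_affinity IRQ [BASEDIR]   (BASEDIR defaults to /proc/irq)
reset_irq_affinity() {
    f="${2:-/proc/irq}/$1/smp_affinity"
    # Read the mask that is currently configured; fail if the IRQ is unknown.
    mask=$(cat "$f") || return 1
    # Writing the same value back is the "re-set" step from the test above.
    echo "$mask" > "$f"
}
```

For example, after bringing CPU1 back online with `echo 1 > /sys/devices/system/cpu/cpu1/online`, running `reset_irq_affinity 11` as root would repeat the manual re-set for IRQ 11.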
