Bug 820081

Summary: smp affinity behavior differs between 6.3 guest and 6.3 host after cpu offline/online
Product: Red Hat Enterprise Linux 6
Reporter: Shaolong Hu <shu>
Component: qemu-kvm
Assignee: Igor Mammedov <imammedo>
Status: CLOSED WORKSFORME
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 6.3
CC: acathrow, areis, bsarathy, chayang, juzhang, knoel, michen, mkenneth, qzhang, shuang, shu, virt-bugs, virt-maint
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-08-29 15:41:33 UTC
Bug Blocks: 833569

Description Shaolong Hu 2012-05-09 05:55:13 UTC
Description of problem:
------------------------
smp affinity behavior differs between 6.3 guest and 6.3 host after cpu offline/online.


Version-Release number of selected component (if applicable):
--------------------------------------------------------------
qemu-kvm-0.12.1.2-2.290.el6.x86_64
host: 2.6.32-269.el6.x86_64
guest: 2.6.32-262.el6.x86_64


How reproducible:
------------------
100%


Steps to Reproduce:
--------------------
1. On the host, take IRQ 17 for the test. The mask indicates IRQ 17 is routed to CPU 0 and CPU 1, and in /proc/interrupts the IRQ 17 counters increase only under CPU 0 and CPU 1:

[root@localhost ~]# cat /proc/irq/17/smp_affinity
3

[root@localhost ~]# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 17:       2516       1492       3576       1512   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb7
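
A quick way to confirm which counters are actually climbing (not part of the original report; watch -d highlights the fields that change between samples):

[root@localhost ~]# watch -n1 -d 'grep "^ *17:" /proc/interrupts'

Only the CPU0 and CPU1 columns should flash while the mask is 3.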

2. On the host, offline CPU 1. /proc/interrupts now shows the IRQ 17 counter increasing only under CPU 0, while the mask is still 3. Online CPU 1 again, and the IRQ 17 counters increase under both CPU 0 and CPU 1 automatically.

[root@localhost ~]# echo 0 > /sys/devices/system/cpu/cpu1/online
[root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu1/online

3. In the guest, run the same test with IRQ 11. The mask is f by default; set it to 3 manually, then offline and online CPU 1. The IRQ 11 counter increases only under CPU 0, and the counter under CPU 1 freezes at its value from before the offline:

[root@localhost ~]# echo 3 > /proc/irq/11/smp_affinity

[root@localhost ~]# cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
 11:         54         58         56         60   IO-APIC-fasteoi   uhci_hcd:usb1

4. At this point, if the mask of IRQ 11 is set to 3 again, the counter under CPU 1 begins to increase, but the counter under CPU 0 freezes.
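
For reference, steps 3-4 as one script run as root in the guest (a sketch based on this report; IRQ 11 and the cpu1 sysfs path are taken from the text above):

#!/bin/sh
IRQ=11
echo 3 > /proc/irq/$IRQ/smp_affinity            # mask 3 = CPU0|CPU1
grep "^ *$IRQ:" /proc/interrupts                # counters before
echo 0 > /sys/devices/system/cpu/cpu1/online    # offline CPU1
echo 1 > /sys/devices/system/cpu/cpu1/online    # online it again
sleep 10                                        # let some interrupts arrive
grep "^ *$IRQ:" /proc/interrupts                # bug: the CPU1 column is frozen
echo 3 > /proc/irq/$IRQ/smp_affinity            # step 4: rewrite the mask
sleep 10
grep "^ *$IRQ:" /proc/interrupts                # now CPU1 climbs and CPU0 freezes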


Expected results:
-------------------
In the guest, after a CPU offline/online cycle, the IRQ should be routed correctly to the CPUs allowed by the mask.

Comment 1 Shaolong Hu 2012-05-09 05:59:29 UTC
host has 4 processors:

[root@localhost ~]# cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz


guest cmd:

/usr/libexec/qemu-kvm -enable-kvm -M rhel6.3.0 -m 4G -name rhel6.3-64 -rtc base=utc,clock=host,driftfix=slew -no-kvm-pit-reinjection -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 -drive file=nfs/RHEL-Server-6.3-64-virtio.qcow2,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0 -netdev tap,id=netdev0,fd=6 6<>/dev/tap6 -device virtio-net-pci,netdev=netdev0,id=device-net0,mac=02:a3:a6:ed:4b:94 -boot order=cd -monitor stdio -usb -device usb-tablet,id=input0 -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 -vnc :10 -monitor tcp::1234,server,nowait -qmp tcp:0:5555,server,nowait -chardev socket,id=qmp_monitor_id_qmpmonitor1,path=/tmp/qmp,server,nowait -mon chardev=qmp_monitor_id_qmpmonitor1,mode=control -smp 4

Comment 4 RHEL Program Management 2012-07-10 06:49:04 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 5 RHEL Program Management 2012-07-11 02:07:15 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 8 Igor Mammedov 2013-04-11 12:40:35 UTC
There were several related patches on rhkml; it might be fixed by now.
Please retest.

Please move to 6.6 due to limited capacity and this still being tech-preview.

Comment 11 juzhang 2014-07-25 01:52:25 UTC
Hi Shu,

Can you reply to comment 10?

Best Regards,
Junyi

Comment 12 Igor Mammedov 2014-08-29 15:41:33 UTC
I've tested with the 2.6.32-498.el6.x86_64 kernel on AMD and Intel hosts, and in a VM, using the ahci IRQ.

As a result, the behaviour of the physical hosts and the VM was the same. (Make sure the irqbalance service is turned off so that it won't skew the results.)

Test env: a 4-core host/VM with the nox2apic option on the kernel command line, to make sure the APIC is in flat mode and does round-robin (RR) IRQ balancing.

1. set smp_affinity for the ahci IRQ to 'f'
2. create load: dd if=/dev/sda of=/dev/null
 * watch the ahci IRQ load being RR-distributed among the 4 CPUs
3. offline CPU1
4. online CPU1
 * the ahci IRQ is not distributed to CPU1 (on physical hosts and in the VM)
5. re-set smp_affinity for the ahci IRQ to 'f'
 * that retriggers IRQ distribution and it is handled on all 4 CPUs again (the whole sequence is scripted below)
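
The same procedure as a script (a sketch of the test above, not the exact commands used; the ahci IRQ number is machine-specific, so it is looked up from /proc/interrupts):

#!/bin/sh
IRQ=$(awk -F: '/ahci/ {gsub(/ /,"",$1); print $1; exit}' /proc/interrupts)
service irqbalance stop                         # irqbalance would rewrite the mask
echo f > /proc/irq/$IRQ/smp_affinity            # 1. allow all 4 CPUs
dd if=/dev/sda of=/dev/null bs=1M count=2048 &  # 2. generate ahci interrupts
sleep 5; grep "^ *$IRQ:" /proc/interrupts       #    RR distribution on all CPUs
echo 0 > /sys/devices/system/cpu/cpu1/online    # 3. offline CPU1
echo 1 > /sys/devices/system/cpu/cpu1/online    # 4. online CPU1
sleep 5; grep "^ *$IRQ:" /proc/interrupts       #    the CPU1 column no longer climbs
echo f > /proc/irq/$IRQ/smp_affinity            # 5. re-set the mask
sleep 5; grep "^ *$IRQ:" /proc/interrupts       #    handled on all 4 CPUs again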

Conclusion:
 * the kernel doesn't distribute the IRQ to a hotplugged CPU when doing RR in flat mode; it never worked (regardless of whether it's a VM or bare metal) and needs to be fixed upstream if someone asks for it.
 * to force an IRQ routing change after CPU hotplug, re-set smp_affinity for the IRQ
 * closing the bug as WORKSFORME (bare metal and VM behave identically)
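
In practice the workaround reduces to re-writing the existing mask after each online (a sketch; $IRQ stands for whichever IRQ is affected):

MASK=$(cat /proc/irq/$IRQ/smp_affinity)         # save the current mask
echo 1 > /sys/devices/system/cpu/cpu1/online    # bring the CPU back
echo $MASK > /proc/irq/$IRQ/smp_affinity        # re-writing it retriggers routing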

Comment 13 Red Hat Bugzilla 2023-09-14 01:28:58 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.