Bug 805362

Summary: guest kernel call trace when hotplug vcpu
Product: Red Hat Enterprise Linux 6
Reporter: Shaolong Hu <shu>
Component: qemu-kvm
Assignee: Igor Mammedov <imammedo>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: low
Docs Contact:
Priority: low
Version: 6.3
CC: acathrow, bsarathy, imammedo, juzhang, michen, mkenneth, tburke, virt-maint, wdai, xfu
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.267.el6
Doc Type: Bug Fix
Doc Text: No documentation needed
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-06-20 11:45:18 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Shaolong Hu 2012-03-21 02:41:54 UTC
Description of problem:
------------------------
This bug tracks the guest kernel warning seen when hotplugging a vcpu, from:
https://bugzilla.redhat.com/show_bug.cgi?id=562886#c56

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
host: qemu-kvm-250rhev/kernel-252
guest: kernel-252

How reproducible:
-----------------
100%

Steps to Reproduce:
--------------------
1. Boot the guest with "-smp 1,maxcpus=4".

2. In the qemu monitor, plug the first vcpu; the guest kernel prints a call trace:
(qemu) cpu_set 1 online

3. Plug the second and third vcpus; no further call trace appears.

4. Call trace:

irq 9: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-252.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff810dbfeb>] ? __report_bad_irq+0x2b/0xa0
 [<ffffffff810dc1ec>] ? note_interrupt+0x18c/0x1d0
 [<ffffffff8102e1b9>] ? ack_apic_level+0x79/0x1b0
 [<ffffffff810dc90d>] ? handle_fasteoi_irq+0xcd/0xf0
 [<ffffffff8100df09>] ? handle_irq+0x49/0xa0
 [<ffffffff814ffddc>] ? do_IRQ+0x6c/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 [<ffffffff81072713>] ? __do_softirq+0x73/0x1d0
 [<ffffffff812ca1c2>] ? acpi_irq+0x16/0x31
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81072545>] ? irq_exit+0x85/0x90
 [<ffffffff814ffde5>] ? do_IRQ+0x75/0xf0
 [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff810377cb>] ? native_safe_halt+0xb/0x10
 [<ffffffff810149dd>] ? default_idle+0x4d/0xb0
 [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
 [<ffffffff814de93a>] ? rest_init+0x7a/0x80
 [<ffffffff81c21f7b>] ? start_kernel+0x424/0x430
 [<ffffffff81c2133a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff81c21438>] ? x86_64_start_kernel+0xfa/0x109
handlers:
[<ffffffff812ca1ac>] (acpi_irq+0x0/0x31)
Disabling IRQ #9

Comment 2 Igor Mammedov 2012-03-25 12:01:10 UTC
I've investigated this and found the root cause. It no longer looks harmless, so I think it is better to fix it in 6.3.

The call trace is caused by an IRQ storm that occurs when qemu doesn't lower the SCI interrupt line after raising it in:

  qemu_system_cpu_hot_add
    -> pm_update_sci
      -> qemu_set_irq(s->irq, sci_level /* =1 */);

In upstream qemu-kvm the interrupt issue is fixed by commit
 633aa0acfe2c4d3 "Fix pci hotplug to generate level triggered interrupt"
...
@@ -514,7 +520,9 @@ static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val)
             break;
         default:
             break;
-   }
+    }
+
+    pm_update_sci(s);
 
     PIIX4_DPRINTF("gpe write %x <== %d\n", addr, val);
 }
...
which I forgot to backport.

Since the fix touches PCI hotplug, I need to re-test all hotplug scenarios, including Windows guests, before posting the fix.

Comment 5 FuXiangChun 2012-04-09 03:38:29 UTC
According to the test result for the case below, the guest kernel doesn't show a call trace during vcpu hotplug, so this bug is fixed.

https://tcms.engineering.redhat.com/run/35769/#caserun_964525
rpm -qa|grep qemu
qemu-kvm-0.12.1.2-2.267.el6.x86_64

Comment 7 Dor Laor 2012-04-22 11:32:20 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation needed

Comment 8 errata-xmlrpc 2012-06-20 11:45:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0746.html