Bug 616296

Summary:	guest kernel panic when boot with nmi_watchdog=1
Product:	Red Hat Enterprise Linux 6	Reporter:	Suqin Huang <shuang>
Component:	kernel	Assignee:	Karen Noel <knoel>
Status:	CLOSED ERRATA	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	medium	Docs Contact:
Priority:	low
Version:	6.1	CC:	ddumas, dhoward, dzickus, hateya, Jes.Sorensen, knoel, lihuang, llim, michen, mkenneth, ndai, plyons, tburke, virt-maint
Target Milestone:	rc	Keywords:	Reopened, RHELNAK, ZStream
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	kernel-2.6.32-83.el6	Doc Type:	Bug Fix
Doc Text:	While not mandated by any specification, Linux systems rely on NMIs (Non-maskable Interrupts) being blocked by an IF-enabling (Interrupt Flag) STI instruction (an x86 instruction that enables interrupts; Set Interrupts); this is also the common behavior of all known hardware. Prior to this update, kernel panic could occur on guests using NMIs extensively (for example, a Linux system with the nmi_watchdog kernel parameter enabled). With this update, an NMI is disallowed when interrupts are blocked by an STI. This is done by checking for the condition and requesting an interrupt window exit if it occurs. As a result, kernel panic no longer occurs.	Story Points:	---
Clones:	651343
Last Closed:	2011-05-23 20:42:49 UTC
Bug Blocks:	562808, 580953, 651343, 683783

Description Suqin Huang 2010-07-20 04:59:28 UTC

Description of problem:


Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.96.el6.x86_64

How reproducible:
sometimes

Steps to Reproduce:
1. boot rhel5.5-64 guest
2. update kernel to 2.6.18-207.el5
3. add nmi_watchdog=1 to kernel line, and reboot guest.
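After the reboot in step 3, it can help to confirm that the parameter actually reached the guest kernel and whether any NMIs are being counted. The following is a minimal sketch, not part of the original report: the helper names `kernel_param` and `nmi_counts` are hypothetical, and the sample strings are illustrative stand-ins for the contents of /proc/cmdline and /proc/interrupts inside the guest.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line.

    Returns "" for a bare flag (no "=value") and None if the
    parameter is absent. The first occurrence wins.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else ""
    return None


def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters from /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Keep only the numeric columns; newer kernels append a
            # textual description after the counters.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []


# Illustrative inputs (assumed, not captured from the failing guest).
cmdline = "ro root=/dev/VolGroup00/LogVol00 console=tty0 nmi_watchdog=1"
interrupts = "           CPU0       CPU1\nNMI:         12          7\n"

print(kernel_param(cmdline, "nmi_watchdog"))
print(nmi_counts(interrupts))
```

On a live guest the two strings would come from `open("/proc/cmdline").read()` and `open("/proc/interrupts").read()`.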
  
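Actual results:
Guest kernel panics with "Kernel panic - not syncing: nmi watchdog".

Expected results:
Guest boots without a kernel panic.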

Additional info:

CPU 0 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.18-207.el5 #1
RIP: 0010:[<ffffffff8005d67c>]  [<ffffffff8005d67c>] iret_label+0x0/0x2
RSP: 0000:ffffffff804a1fd8  EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffffffff8006bdae RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8030f738
RBP: 0000000000090000 R08: ffffffff80452000 R09: 000000000000003e
R10: ffff81011fc58038 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff80421000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff80452000, task ffffffff8030db60)
Stack:  ffffffff8005d67c 0000000000000010 0000000000010086 ffffffff804a1fd8
 0000000000000000
Call Trace:
 <NMI>  [<ffffffff8005d67c>] iret_label+0x0/0x2
 <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2


Code: 48 cf 0f ba e2 03 73 1b fb 57 e8 35 4d 00 00 5f 65 48 8b 0c 
Kernel panic - not syncing: nmi watchdog
 BUG: warning at kernel/panic.c:137/panic() (Not tainted)

Call Trace:
 <NMI>  [<ffffffff8009278b>] panic+0x1da/0x1eb
 [<ffffffff8006c53d>] _show_stack+0xdb/0xea
 [<ffffffff8006c630>] show_registers+0xe4/0x100
 [<ffffffff800652c5>] die_nmi+0x66/0xa3
 [<ffffffff80065a8e>] nmi_watchdog_tick+0x157/0x1d3
 [<ffffffff80065629>] default_do_nmi+0x81/0x225
 [<ffffffff80065919>] do_nmi+0x43/0x61
 [<ffffffff80064eef>] nmi+0x7f/0x88
 [<ffffffff8006bdae>] default_idle+0x0/0x50
 [<ffffffff8005d67c>] iret_label+0x0/0x2
 <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2

BUG: warning at drivers/input/serio/i8042.c:846/i8042_panic_blink() (Not tainted)

Call Trace:
 <NMI>  [<ffffffff8020d0f6>] i8042_panic_blink+0x112/0x2a5
 [<ffffffff80092731>] panic+0x180/0x1eb
 [<ffffffff8006c53d>] _show_stack+0xdb/0xea
 [<ffffffff8006c630>] show_registers+0xe4/0x100
 [<ffffffff800652c5>] die_nmi+0x66/0xa3
 [<ffffffff80065a8e>] nmi_watchdog_tick+0x157/0x1d3
 [<ffffffff80065629>] default_do_nmi+0x81/0x225
 [<ffffffff80065919>] do_nmi+0x43/0x61
 [<ffffffff80064eef>] nmi+0x7f/0x88
 [<ffffffff8006bdae>] default_idle+0x0/0x50
 [<ffffffff8005d67c>] iret_label+0x0/0x2
 <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2

BUG: warning at drivers/input/serio/i8042.c:849/i8042_panic_blink() (Not tainted)

Call Trace:
 <NMI>  [<ffffffff8020d1df>] i8042_panic_blink+0x1fb/0x2a5
 [<ffffffff80092731>] panic+0x180/0x1eb
 [<ffffffff8006c53d>] _show_stack+0xdb/0xea
 [<ffffffff8006c630>] show_registers+0xe4/0x100
 [<ffffffff800652c5>] die_nmi+0x66/0xa3
 [<ffffffff80065a8e>] nmi_watchdog_tick+0x157/0x1d3
 [<ffffffff80065629>] default_do_nmi+0x81/0x225
 [<ffffffff80065919>] do_nmi+0x43/0x61
 [<ffffffff80064eef>] nmi+0x7f/0x88
 [<ffffffff8006bdae>] default_idle+0x0/0x50
 [<ffffffff8005d67c>] iret_label+0x0/0x2
 <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2

BUG: warning at drivers/input/serio/i8042.c:851/i8042_panic_blink() (Not tainted)

Call Trace:
 <NMI>  [<ffffffff8020d25c>] i8042_panic_blink+0x278/0x2a5
 [<ffffffff80092731>] panic+0x180/0x1eb
 [<ffffffff8006c53d>] _show_stack+0xdb/0xea
 [<ffffffff8006c630>] show_registers+0xe4/0x100
 [<ffffffff800652c5>] die_nmi+0x66/0xa3
 [<ffffffff80065a8e>] nmi_watchdog_tick+0x157/0x1d3
 [<ffffffff80065629>] default_do_nmi+0x81/0x225
 [<ffffffff80065919>] do_nmi+0x43/0x61
 [<ffffffff80064eef>] nmi+0x7f/0x88
 [<ffffffff8006bdae>] default_idle+0x0/0x50
 [<ffffffff8005d67c>] iret_label+0x0/0x2
 <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2
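The traces above repeat the same frames several times, so comparing them by eye is tedious. A small parser can reduce each trace to its list of symbols. This is only a sketch: `parse_frames` and `symbol_counts` are hypothetical helper names, and the regular expression assumes the `[<address>] symbol+0xoff/0xsize` frame layout seen in this 2.6.18 oops output.

```python
import re
from collections import Counter

# Matches frames such as " [<ffffffff800652c5>] die_nmi+0x66/0xa3".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(oops_text):
    """Return (address, symbol, offset, size) tuples in order of appearance."""
    return [
        (int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(oops_text)
    ]


def symbol_counts(oops_text):
    """Count how often each symbol appears across all frames."""
    return Counter(sym for _, sym, _, _ in parse_frames(oops_text))


# A few frames copied from the first panic trace in this report.
sample = """\
 <NMI>  [<ffffffff8009278b>] panic+0x1da/0x1eb
 [<ffffffff800652c5>] die_nmi+0x66/0xa3
 [<ffffffff80065a8e>] nmi_watchdog_tick+0x157/0x1d3
 <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2
"""

frames = parse_frames(sample)
for addr, sym, off, size in frames:
    print(f"{addr:#x} {sym}+{off:#x}/{size:#x}")
```

Feeding two complete traces through `symbol_counts` and comparing the results gives a quick view of which frames differ between them.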

Comment 1 Suqin Huang 2010-07-20 05:01:10 UTC

cmd:
qemu -name 'vm1' -monitor stdio -drive file='/RHEL-Server-5.5-64.qcow2',if=none,id=drive-virtio-disk1,media=disk,cache=none,boot=on,format=qcow2 \
-device virtio-blk-pci,drive=drive-virtio-disk1,id=virtio-disk1 \
-net nic,vlan=0,netdev=idW3JQnj,model=virtio,macaddr='02:30:25:46:a2:d1' -netdev tap,id=idW3JQnj,ifname='virtio_0_8000',script='/qemu-ifup',downscript='no',vhost=on \
-m 4096 -smp 2 -vnc :0 -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host -M rhel6.0.0

Comment 3 RHEL Program Management 2010-07-20 05:17:41 UTC

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 4 Suqin Huang 2010-07-20 08:06:09 UTC

host kernel: 2.6.32-44.1.el6.x86_64

Comment 5 Suqin Huang 2010-07-28 08:08:05 UTC

This issue also happens when running NMI testing on a RHEL 5 host.
kvm: kvm-83-164.el5_5.17  
kernel: 2.6.18-194.10.1.el5

guest: rhel5.5-64

Comment 6 Dor Laor 2010-07-28 12:52:37 UTC

*** Bug 616337 has been marked as a duplicate of this bug. ***

Comment 7 Don Zickus 2010-07-28 14:55:25 UTC

I'll post what I posted in bz616337:

That is not expected to work for KVM (or any virt guests really) as the ioapic
timer interrupts aren't emulated as NMIs to the guest.

Though I wouldn't expect it to panic because with nmi_watchdog autodetection,
and the fact there are no emulated performance counters, it defaults to the
equivalent of nmi_watchdog=1.  In that case you just see an error message and
the kernel proceeds forward.

So the priority should be low for this as it isn't expected to be a valid user
configuration.  But it might be a bug that will need to be fixed somewhere.

Can you attach a boot log?  I am curious to see where this fails in the boot sequence.

Cheers,
Don

Comment 8 Jes Sorensen 2010-07-28 15:01:35 UTC

I agree, the main issue will be to make sure that we don't have qemu-kvm
exit with an unhandled KVM exit.

When I tested it, it seemed to be a little random when precisely it died,
this was with nmi_watchdog testing. It completed booting pretty far, then
suddenly died.

Will have a closer look shortly.

Jes

Comment 9 Jes Sorensen 2010-07-29 16:12:06 UTC

I was able to reproduce this the other day, but with the latest rhel6 updates I
can no longer reproduce this bug. Could you please try and retest with at
least:

qemu-kvm-0.12.1.2-2.104.el6.x86_64
kernel-2.6.32-54.el6.x86_64

I am also running the -194 kernel in the RHEL5 guest.

Thanks,
Jes

Comment 10 Jes Sorensen 2010-07-29 17:43:56 UTC

Urgh, ignore my previous comment, I can reproduce it again now :( I tried a bunch of times before without problems, then one last try and it came back.

Jes

Comment 11 Linda Wang 2010-08-02 15:04:21 UTC

Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
In kvm guest, this is not expected to work for KVM (or any virt guests really) as the ioapic
timer interrupts aren't emulated as nmi's to the guest.

Comment 15 Ryan Lerch 2010-09-28 02:51:09 UTC

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,2 +1 @@
-In kvm guest, this is not expected to work for KVM (or any virt guests really) as the ioapic
-timer interrupts aren't emulated as nmi's to the guest.
+I/O Advanced Programmable Interrupt Controller (I/O APIC) timer interrupts are not emulated as non-maskable interrupts (NMIs) to virtualized guests. Consequently, if a virtualized guest uses the kernel parameter <command>nmi_watchdog=1</command>, the guest kernel will panic on boot.

Comment 16 Ryan Lerch 2010-09-28 03:01:40 UTC

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-I/O Advanced Programmable Interrupt Controller (I/O APIC) timer interrupts are not emulated as non-maskable interrupts (NMIs) to virtualized guests. Consequently, if a virtualized guest uses the kernel parameter <command>nmi_watchdog=1</command>, the guest kernel will panic on boot.
+I/O Advanced Programmable Interrupt Controller (I/O APIC) timer interrupts are not emulated as non-maskable interrupts (NMIs) to virtualized guests. Consequently, if a virtualized guest uses the kernel parameter nmi_watchdog=1, the guest kernel will panic on boot.

Comment 17 Gleb Natapov 2010-11-09 10:02:23 UTC

*** Bug 643844 has been marked as a duplicate of this bug. ***

Comment 18 Avi Kivity 2010-11-09 10:05:09 UTC

We should fix this for 6.0.z as well.

Comment 19 Dor Laor 2010-11-09 10:42:00 UTC

(In reply to comment #18)
> We should fix this for 6.0.z as well.

Why? is that a common use case?

Comment 20 Avi Kivity 2010-11-09 14:46:26 UTC

Yes (I think).  Users won't know to disable the nmi watchdog unless they read the release notes (which they won't).

Comment 21 RHEL Program Management 2010-11-12 15:19:41 UTC

This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 22 Aristeu Rozanski 2010-11-17 19:47:33 UTC

Patch(es) available on kernel-2.6.32-83.el6

Comment 24 Suqin Huang 2011-01-12 02:40:11 UTC

I can reproduce this bug:

1. host
kernel: 2.6.32-94.el6.x86_64 
qemu: qemu-kvm-0.12.1.2-2.128.el6.x86_64

cpu: 
processor	: 11
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 8
model name	: Six-Core AMD Opteron(tm) Processor 2427
stepping	: 0
cpu MHz		: 800.000
cache size	: 512 KB
physical id	: 1
siblings	: 6
core id		: 5
cpu cores	: 6
apicid		: 13
initial apicid	: 13
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
bogomips	: 4399.65
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate


2. guest:
rhel5.5-64 (2.6.18-194.32.1.el5)
rhel5.6-64 (2.6.18-235.el5)

Command line: ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 console=tty0 nmi_watchdog=1

3. cmd:
/usr/libexec/qemu-kvm -drive file='/usr/images/RHEL-Server-5.6-64-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=on,format=qcow2,aio=native \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 \
-device virtio-net-pci,mac=9a:65:49:e6:22:35,netdev=idnlDWl9,id=ndev00idnlDWl9,bus=pci.0,addr=0x3 -netdev tap,id=idnlDWl9,script='/usr/scripts/qemu-ifup-switch',downscript='no' \
-m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=none -M rhel6.0.0 -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm

Comment 25 Suqin Huang 2011-01-12 02:40:51 UTC

4. 2.6.18-235.el5 call trace info:

Unable to handle kernel NULL pointer dereference at 0000000000000d89 RIP:
 [<ffffffff8006539a>] do_debug+0x71/0x138
PGD 0
Oops: 0000 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-235.el5 #1
RIP: 0010:[<ffffffff8006539a>]  [<ffffffff8006539a>] do_debug+0x71/0x138
RSP: 0000:ffffffff804a8e88  EFLAGS: 00010097
RAX: 0000000000000000 RBX: 0000000000000cf8 RCX: 0000000000000000
RDX: ffffffff804a8f08 RSI: 0000000000000003 RDI: ffffffff80311748
RBP: 00000000ffff4ff0 R08: 0000000000000004 R09: 00000000f4002000
R10: 0000000000000018 R11: ffffffff8022841e R12: ffff81007ff8f7a0
R13: 0000000000000016 R14: 0000000000000004 R15: 0000000000000010
FS:  0000000000000000(0000) GS:ffffffff80424000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000d89 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff81007ff9e000, task ffff81007ff8f7a0)
Stack:  0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 <#DB>  [<ffffffff80064d73>] debug+0x93/0x9f
 [<ffffffff8022841e>] pci_conf1_write+0x0/0xdb
 [<ffffffff8005d67c>] iret_label+0x0/0x2
 <<EOE>>  <NMI>  [<ffffffff801755e0>] vgacon_set_cursor_size+0x6d/0xce
 <<EOE>>  [<ffffffff8009308c>] __call_console_drivers+0x5b/0x69
 [<ffffffff800938dc>] printk+0x52/0xbd
 [<ffffffff800938dc>] printk+0x52/0xbd
 [<ffffffff8001730f>] release_console_sem+0x1a5/0x1f9
 [<ffffffff800aa386>] kallsyms_lookup+0xc2/0x17b
 [<ffffffff800aa386>] kallsyms_lookup+0xc2/0x17b
 [<ffffffff800aa386>] kallsyms_lookup+0xc2/0x17b
 [<ffffffff800aa386>] kallsyms_lookup+0xc2/0x17b
 [<ffffffff8006c6fc>] printk_address+0x9f/0xab
 [<ffffffff800938dc>] printk+0x52/0xbd
 [<ffffffff800938dc>] printk+0x52/0xbd
 [<ffffffff800a7d4e>] module_text_address+0x33/0x3c
 [<ffffffff800a0e5e>] kernel_text_address+0x1a/0x26
 [<ffffffff8006c3e2>] dump_trace+0x206/0x22f
 [<ffffffff8006c43f>] show_trace+0x34/0x47
 [<ffffffff8006c544>] _show_stack+0xdb/0xea
 [<ffffffff8006c5e0>] show_registers+0x8d/0x100
 [<ffffffff800650ce>] __die+0xad/0xff
 [<ffffffff8006748d>] do_page_fault+0x74d/0x874
 [<ffffffff8005dde9>] error_exit+0x0/0x84
 [<ffffffff8022841e>] pci_conf1_write+0x0/0xdb
 [<ffffffff8006539a>] do_debug+0x71/0x138
 [<ffffffff8006538f>] do_debug+0x66/0x138
 [<ffffffff80064d73>] debug+0x93/0x9f
 [<ffffffff8022841e>] pci_conf1_write+0x0/0xdb
 [<ffffffff8005d67c>] iret_label+0x0/0x2


Code: f6 83 91 00 00 00 02 74 01 fb 40 f6 c5 0f 74 0b 49 83 bc 24
RIP  [<ffffffff8006539a>] do_debug+0x71/0x138
 RSP <ffffffff804a8e88>
CR2: 0000000000000d89
 <0>Kernel panic - not syncing: Fatal exception

Comment 26 Suqin Huang 2011-01-12 02:41:30 UTC

5. 2.6.18-194.32.1.el5 call trace info:


 Pid: 2207, comm: S56rawdevices Not tainted 2.6.18-194.32.1.el5 #1
 RIP: 0010:[<ffffffff8005d67c>]  [<ffffffff8005d67c>] iret_label+0x0/0x2
 RSP: 0000:ffff8100026cffd8  EFLAGS: 00000086
 RAX: 0000000000000280 RBX: ffff81007e8a5b08 RCX: 0000000000000000
 RDX: ffff8100745d6280 RSI: ffff81007f867558 RDI: ffff81007e150540
 RBP: ffff810000000000 R08: 0000000000000001 R09: 0000000000000003
 R10: 0000000000000000 R11: 00002ad84a035000 R12: 00002ad84a12a674
 R13: ffff81007f867558 R14: 00003ffffffff000 R15: ffff81007f4cc040
 FS:  00002ad84a365e10(0000) GS:ffff81007ff91840(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00002ad84a12a674 CR3: 0000000074eb7000 CR4: 00000000000006e0
 Process S56rawdevices (pid: 2207, threadinfo ffff810073924000, task ffff81007f4cc040)
 Stack:  ffffffff8005d67c 0000000000000010 0000000000000086 ffff8100026cffd8
  0000000000000000 0000000000000000 0000000000000000 0000000000000000
  0000000000000000 0000000000000000 0000000000000000 0000000000000000
 Call Trace:
  <NMI>  [<ffffffff8005d67c>] iret_label+0x0/0x2
  <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2
 
 
 Code: 48 cf 0f ba e2 03 73 1b fb 57 e8 35 4d 00 00 5f 65 48 8b 0c
 Kernel panic - not syncing: nmi watchdog
  BUG: warning at kernel/panic.c:137/panic() (Not tainted)
 
 Call Trace:
  <NMI>  [<ffffffff80091cb3>] panic+0x1da/0x1eb
  [<ffffffff8006bad1>] _show_stack+0xdb/0xea
  [<ffffffff8006bbc4>] show_registers+0xe4/0x100
  [<ffffffff800652c5>] die_nmi+0x66/0xa3
  [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3
  [<ffffffff80065629>] default_do_nmi+0x81/0x225
  [<ffffffff80065896>] do_nmi+0x43/0x61
  [<ffffffff80064eef>] nmi+0x7f/0x88
  [<ffffffff8005d67c>] iret_label+0x0/0x2
  <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2
 
 BUG: warning at drivers/input/serio/i8042.c:846/i8042_panic_blink() (Not tainted)
 
 Call Trace:
  <NMI>  [<ffffffff8020b11a>] i8042_panic_blink+0x112/0x2a5
  [<ffffffff80091c59>] panic+0x180/0x1eb
  [<ffffffff8006bad1>] _show_stack+0xdb/0xea
  [<ffffffff8006bbc4>] show_registers+0xe4/0x100
  [<ffffffff800652c5>] die_nmi+0x66/0xa3
  [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3
  [<ffffffff80065629>] default_do_nmi+0x81/0x225
  [<ffffffff80065896>] do_nmi+0x43/0x61
  [<ffffffff80064eef>] nmi+0x7f/0x88
  [<ffffffff8005d67c>] iret_label+0x0/0x2
  <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2
 
 BUG: warning at drivers/input/serio/i8042.c:849/i8042_panic_blink() (Not tainted)
 
 Call Trace:
  <NMI>  [<ffffffff8020b203>] i8042_panic_blink+0x1fb/0x2a5
  [<ffffffff80091c59>] panic+0x180/0x1eb
  [<ffffffff8006bad1>] _show_stack+0xdb/0xea
  [<ffffffff8006bbc4>] show_registers+0xe4/0x100
  [<ffffffff800652c5>] die_nmi+0x66/0xa3
  [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3
  [<ffffffff80065629>] default_do_nmi+0x81/0x225
  [<ffffffff80065896>] do_nmi+0x43/0x61
  [<ffffffff80064eef>] nmi+0x7f/0x88
  [<ffffffff8005d67c>] iret_label+0x0/0x2
  <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2
 
 BUG: warning at drivers/input/serio/i8042.c:851/i8042_panic_blink() (Not tainted)
 
 Call Trace:
  <NMI>  [<ffffffff8020b280>] i8042_panic_blink+0x278/0x2a5
  [<ffffffff80091c59>] panic+0x180/0x1eb
  [<ffffffff8006bad1>] _show_stack+0xdb/0xea
  [<ffffffff8006bbc4>] show_registers+0xe4/0x100
  [<ffffffff800652c5>] die_nmi+0x66/0xa3
  [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3
  [<ffffffff80065629>] default_do_nmi+0x81/0x225
  [<ffffffff80065896>] do_nmi+0x43/0x61
  [<ffffffff80064eef>] nmi+0x7f/0x88
  [<ffffffff8005d67c>] iret_label+0x0/0x2
  <<EOE>>  [<ffffffff8005d67c>] iret_label+0x0/0x2

Comment 27 Avi Kivity 2011-01-12 09:39:08 UTC

Suqin, your reproducers are all on AMD, yes?  If so, that's probably Bug 612436.  This bug is Intel specific.

Comment 28 Suqin Huang 2011-01-14 05:30:07 UTC

(In reply to comment #27)
> Suqin, your reproducers are all on AMD, yes?  If so, that's probably Bug
> 612436.  This bug is Intel specific.

Yes, tested on AMD.
Cannot reproduce on an Intel machine.
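Since the distinction between this bug and Bug 612436 comes down to the host CPU vendor, it can be useful to record the vendor and virtualization extension along with each reproducer. The sketch below is an illustration only: `cpu_vendor` and `virt_extension` are hypothetical helper names, and the sample text is a shortened stand-in for /proc/cpuinfo output like the one pasted in comment 24.

```python
def cpu_vendor(cpuinfo_text):
    """Return the first vendor_id value found, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vendor_id":
            return value.strip()
    return None


def virt_extension(cpuinfo_text):
    """Return "vmx" (Intel VT-x), "svm" (AMD-V), or None from the flags line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "vmx"
            if "svm" in flags:
                return "svm"
    return None


# Shortened, illustrative /proc/cpuinfo excerpt for an AMD host.
sample = "vendor_id\t: AuthenticAMD\nflags\t\t: fpu vme de svm npt lbrv\n"

print(cpu_vendor(sample), virt_extension(sample))
```

On a real host the text would come from `open("/proc/cpuinfo").read()`.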

Comment 29 Avi Kivity 2011-02-24 09:55:59 UTC

So, this bug can be closed?  As far as I can tell, the reopen was for a different bug (which is now in POST).

Comment 30 Suqin Huang 2011-02-24 10:11:23 UTC

Closing this bug according to comment 27 and comment 28.

Comment 31 Suqin Huang 2011-02-24 10:16:13 UTC

Sorry for the mistake; changing status to VERIFIED.

Comment 33 Martin Prpič 2011-04-12 12:45:39 UTC

    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-I/O Advanced Programmable Interrupt Controller (I/O APIC) timer interrupts are not emulated as non-maskable interrupts (NMIs) to virtualized guests. Consequently, if a virtualized guest uses the kernel parameter nmi_watchdog=1, the guest kernel will panic on boot.
+While not mandated by any specification, Linux systems rely on NMIs (Non-maskable Interrupts) being blocked by an IF-enabling (Interrupt Flag) STI instruction (an x86 instruction that enables interrupts; Set Interrupts); this is also the common behavior of all known hardware. Prior to this update, kernel panic could occur on guests using NMIs extensively (for example, a Linux system with the nmi_watchdog kernel parameter enabled). With this update, an NMI is disallowed when interrupts are blocked by an STI. This is done by checking for the condition and requesting an interrupt window exit if it occurs. As a result, kernel panic no longer occurs.
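The behavior described in the updated technical note (do not deliver an NMI while the guest is in the STI shadow; request an interrupt window exit and retry later) can be pictured with a toy model. This is a sketch under stated assumptions, not the kernel patch: `ToyVcpu`, `nmi_allowed`, `try_inject_nmi`, and the `BLOCKED_BY_STI` flag value are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Illustrative flag: the guest has just executed STI, so interrupt
# delivery is held off for one instruction (the "STI shadow").
BLOCKED_BY_STI = 0x1


@dataclass
class ToyVcpu:
    interruptibility: int = 0            # bitmask of blocking conditions
    nmi_pending: bool = False            # an NMI is waiting to be delivered
    window_exit_requested: bool = False  # hypervisor asked to be re-entered
    injected: int = 0                    # NMIs delivered so far


def nmi_allowed(vcpu):
    """An NMI may be delivered only when the STI shadow is not active."""
    return (vcpu.interruptibility & BLOCKED_BY_STI) == 0


def try_inject_nmi(vcpu):
    """Deliver a pending NMI if allowed; otherwise defer it.

    Returns True if an NMI was delivered on this attempt.
    """
    if not vcpu.nmi_pending:
        return False
    if nmi_allowed(vcpu):
        vcpu.nmi_pending = False
        vcpu.window_exit_requested = False
        vcpu.injected += 1
        return True
    # Blocked by STI: keep the NMI pending and ask for an exit once
    # the guest becomes interruptible again.
    vcpu.window_exit_requested = True
    return False


vcpu = ToyVcpu(interruptibility=BLOCKED_BY_STI, nmi_pending=True)
first = try_inject_nmi(vcpu)   # deferred while in the STI shadow
vcpu.interruptibility = 0      # shadow ends after the next instruction
second = try_inject_nmi(vcpu)  # now delivered
print(first, second, vcpu.injected)
```

In this model the NMI is never lost: it stays pending across the blocked attempt and is delivered on the next one.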

Comment 34 errata-xmlrpc 2011-05-23 20:42:49 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html