Bug 691717 - guest kernel panic when running dd on a guest booted with -cpu model_name
Summary: guest kernel panic when running dd on a guest booted with -cpu model_name
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: john cooper
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-03-29 10:33 UTC by Suqin Huang
Modified: 2014-07-25 03:46 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-09 12:59:37 UTC
Target Upstream Version:


Description Suqin Huang 2011-03-29 10:33:41 UTC
Description of problem:
guest kernel panics when running dd on a guest booted with -cpu Nehalem

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.152.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. cmd:
/usr/libexec/qemu-kvm -monitor stdio -drive file=/home/images/images/RHEL-Server-6.0-64-virtio.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idOpKnu6,mac=9a:c7:11:c1:0e:1a,netdev=idOpKnu6,id=ndev00idOpKnu6,bus=pci.0,addr=0x3 -netdev tap,id=idOpKnu6,vhost=on,script='/usr/Auto/autotest/client/tests/kvm/scripts/qemu-ifup',downscript='no' -m 4096 -smp 4,cores=2,threads=1,sockets=2 -cpu Penryn -spice port=8000,disable-ticketing -vga qxl -rtc base=localtime,clock=host,driftfix=none  -boot order=cdn,once=c,menu=off   -usbdevice tablet -enable-kvm
2. #mount -t tmpfs -o size=4G tmpfs /mnt
3. #for i in `seq 100`; do  dd if=/dev/zero of=/mnt/mem bs=100M count=40; done
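Note: step 3 writes a 4000M file into the 4G tmpfs of a 4G guest, so each pass forces heavy reclaim and swapping of everything else. As a hedged aside, not part of the original recipe, the pressure can be watched from a second guest terminal while the loop runs with

#vmstat 1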
  
Actual results:


Expected results:


Additional info:

1. guest RHEL6.0-x86_64 (2.6.32-71.18.2.el6.x86_64)

2. crash info

BUG: NMI Watchdog detected LOCKUP on CPU0, ip ffffffff8103bfca, registers:
CPU 0 
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]

Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 0, comm: swapper Not tainted 2.6.32-71.18.2.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff8103bfca>]  [<ffffffff8103bfca>] native_write_msr_safe+0xa/0x10
RSP: 0018:ffff880028203c60  EFLAGS: 00000002
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000080b
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000080b
RBP: ffff880028203c60 R08: ffffffff8170e000 R09: ffff880028203dc0
R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff817312ac
R13: 0000000000000000 R14: ffff880028203d18 R15: ffff8800282141a0
FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007ff337e09000 CR3: 0000000117cf3000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff8170e000, task ffffffff817320c0)
Stack:
 ffff880028203c70 ffffffff810353d7 ffff880028203ca0 ffffffff8103273d
<0> ffffffff817312ac ffffffff81731240 ffff880028203d18 ffffffff81731240
<0> ffff880028203ce0 ffffffff810dafea ffff880028203cc0 ffffffff81adee60
Call Trace:
 <IRQ> 
 [<ffffffff810353d7>] native_apic_msr_write+0x37/0x40
 [<ffffffff8103273d>] ack_apic_edge+0x3d/0x50
 [<ffffffff810dafea>] handle_edge_irq+0x6a/0x160
 [<ffffffff81015fb9>] handle_irq+0x49/0xa0
 [<ffffffff814d063c>] do_IRQ+0x6c/0xf0
 [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
 [<ffffffff81407311>] ? __netdev_alloc_skb+0x1/0x60
 [<ffffffffa006800a>] ? virtnet_poll+0x23a/0x8b0 [virtio_net]
 [<ffffffff814108f3>] ? net_rx_action+0x103/0x210
 [<ffffffff81073ce7>] ? __do_softirq+0xb7/0x1e0
 [<ffffffff810d8960>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ae5>] ? irq_exit+0x85/0x90
 [<ffffffff814d0645>] ? do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
 <EOI> 
 [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
 [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
 [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
 [<ffffffff814b172a>] ? rest_init+0x7a/0x80
 [<ffffffff818c3f19>] ? start_kernel+0x413/0x41f
 [<ffffffff818c333a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff818c3438>] ? x86_64_start_kernel+0xfa/0x109
Code: c0 89 c7 48 89 d0 44 89 06 48 c1 e0 20 89 f9 48 09 c8 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 89 f0 89 f9 48 89 e5 0f 30 <31> c0 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 89 f9 0f 33 89 c7 
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-71.18.2.el6.x86_64 (mockbuild.bos.redhat.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Wed Mar 2 14:17:40 EST 2011
Command line: ro root=/dev/mapper/vg_virtlab6685146-lv_root rd_LVM_LV=vg_virtlab6685146/lv_root rd_LVM_LV=vg_virtlab6685146/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us console=tty0 console=ttyS0,115200 rhgb irqpoll maxcpus=1 reset_devices cgroup_disable=memory  memmap=exactmap memmap=640K@0K memmap=131436K@33408K elfcorehdr=164844K memmap=64K$960K memmap=12K$3670004K memmap=272K$4194032K
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 000000000009dc00 (usable)
 BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000dfffd000 (usable)
 BIOS-e820: 00000000dfffd000 - 00000000e0000000 (reserved)
 BIOS-e820: 00000000fffbc000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
last_pfn = 0x120000 max_arch_pfn = 0x400000000
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 00000000000f0000 - 0000000000100000 (reserved)
 user: 00000000020a0000 - 000000000a0fb000 (usable)
 user: 00000000dfffd000 - 00000000e0000000 (reserved)
 user: 00000000fffbc000 - 0000000100000000 (reserved)
DMI 2.4 present.
last_pfn = 0xa0fb max_arch_pfn = 0x400000000
x86 PAT enabled: cpu 0, old 0x70106, new 0x7010600070106
x2apic enabled by BIOS, switching to x2apic ops
init_memory_mapping: 0000000000000000-000000000a0fb000
RAMDISK: 09cad000 - 0a0eeb00
ACPI: RSDP 00000000000fdac0 00014 (v00 BOCHS )
ACPI: RSDT 00000000dfffd4e0 00030 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
ACPI: FACP 00000000dffffda0 00074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
ACPI: DSDT 00000000dfffd7e0 0255C (v01   BXPC   BXDSDT 00000001 INTL 20090123)
ACPI: FACS 00000000dffffd40 00040
ACPI: SSDT 00000000dfffd610 001C1 (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
ACPI: APIC 00000000dfffd510 0008A (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
Setting APIC routing to cluster x2apic.
No NUMA configuration found
Faking a node at 0000000000000000-000000000a0fb000
Bootmem setup node 0 0000000000000000-000000000a0fb000
  NODE_DATA [0000000000009000 - 000000000003cfff]
  bootmap [000000000003d000 -  000000000003e41f] pages 2
(7 early reservations) ==> bootmem [0000000000 - 000a0fb000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
  #2 [0003000000 - 0003ca0ff8]    TEXT DATA BSS ==> [0003000000 - 0003ca0ff8]
  #3 [0009cad000 - 000a0eeb00]          RAMDISK ==> [0009cad000 - 000a0eeb00]
  #4 [000009dc00 - 0000100000]    BIOS reserved ==> [000009dc00 - 0000100000]
  #5 [0003ca1000 - 0003ca1051]              BRK ==> [0003ca1000 - 0003ca1051]
  #6 [0000008000 - 0000009000]          PGTABLE ==> [0000008000 - 0000009000]
found SMP MP-table at [ffff8800000fdb10] fdb10
kvm-clock: Using msrs 12 and 11
kvm-clock: cpu 0, msr 0:38c1901, boot clock
Zone PFN ranges:
  DMA      0x00000001 -> 0x00001000
  DMA32    0x00001000 -> 0x00100000
  Normal   0x00100000 -> 0x00100000
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
    0: 0x00000001 -> 0x000000a0
    0: 0x000020a0 -> 0x0000a0fb
ACPI: PM-Timer IO Port: 0xb008
Setting APIC routing to cluster x2apic.

3. qemu output
KVM internal error. Suberror: 2
extra data[0]: 80000402
extra data[1]: 80000b0d
rax 0000000000000000 rbx 0000000000000000 rcx 0000000000000000 rdx 0000000000000600
rsi 0000000000000000 rdi 0000000000000000 rsp 0000000000000000 rbp 0000000000000000
r8  0000000000000000 r9  0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
rip 00000000ffffffff rflags 00010002
cs 1000 (00010000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0000 (feffd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 0/ffff
idt 0/ffff
cr0 60000010 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
handle_dev_input: stop
[Thread 0x7f52bffff700 (LWP 2909) exited]

4. host
1). kernel: 2.6.32-125.el6.x86_64
2). cpuinfo:
processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
stepping	: 4
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5319.08
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual

Comment 2 Suqin Huang 2011-03-29 12:31:24 UTC
1. Can reproduce when booting the guest with -cpu Penryn.
2. Cannot reproduce when booting the guest with -cpu cpu64-rhel6,+x2apic.
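
For reference, a hedged diagnostic sketch (assuming this qemu-kvm build supports the usual -cpu model listing): list the defined models on the host, then compare the guest-visible CPU flags after booting with each model.

1) host:  /usr/libexec/qemu-kvm -cpu ?
2) guest: grep -m1 flags /proc/cpuinfo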

Comment 3 Dor Laor 2011-03-29 20:48:17 UTC
Is the reproduction consistent?
Gleb, any ideas about the above stack trace, or is it the NMI watchdog only?

Comment 4 john cooper 2011-03-29 22:07:14 UTC
(In reply to comment #0)
> Description of problem:
> guest kernel panic when run dd on the guest booted with -cpu Nehalem 
> 
> Version-Release number of selected component (if applicable):
> qemu-kvm-0.12.1.2-2.152.el6.x86_64
> 
> How reproducible:
> 100%

Suqin, is the stack trace consistent between repeated failure reproductions?
If you have logs of other failures, could you attach them, as I haven't
reproduced this yet.  Also, were there any diagnostic messages from kvm
on the host when the failure occurred (unhandled/ignored access complaints
to MSR 0x80b, etc.)?

(In reply to comment #3)
> Is the reproduce consistent?
> Gleb, any ideas about the above stack trace or is it the nmi watch dog only?

Don't know if the above is coherent but cpu #0 looks to be
"stuck" at the wrmsr instruction of native_write_msr_safe().
I don't immediately see how that could happen given the
unhandled MSR code.  Looks like msr 0x80b which Intel defines
as:

80BH IA32_X2APIC_EOI x2APIC EOI Register. (W/O) If (CPUID.01H:ECX.[bit 21] = 1 )

And the corresponding cpuid 0000_0001 data which flags the existence of
this MSR:

ecx:21 x2APIC Extended xAPIC Support The processor supports x2APIC feature.

So we may need support of this MSR in some form, perhaps just tossing
the write access as apparently the current x2apic emulation isn't
needing this EOI handshake.  No clue yet how/why a lockup was detected.
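
For reference, a hedged check of that CPUID bit from inside the guest: CPUID.01H:ECX bit 21 surfaces as the x2apic flag in /proc/cpuinfo, so

#grep -m1 -o x2apic /proc/cpuinfo

should print "x2apic" exactly when the guest kernel may use MSR 0x80b for EOI.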

Comment 5 Suqin Huang 2011-03-30 02:11:50 UTC
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 0, comm: swapper Not tainted 2.6.32-71.18.2.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff8146dc40>]  [<ffffffff8146dc40>] icmp_send+0x0/0x780
RSP: 0018:ffff8800282039d8  EFLAGS: 00000246
RAX: 0000000000000006 RBX: ffff880028203a40 RCX: 0000000000000000
RDX: 000000000000000a RSI: 0000000000000003 RDI: ffff88011ac86380
RBP: ffffffff81013ad3 R08: ffff8801158ce800 R09: ffff880028203c68
R10: ffff880116b9c350 R11: 0000000000000001 R12: ffff880028203950
R13: ffff880116b9c000 R14: ffff880028203b30 R15: ffff88011ac86380
FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f5f47706000 CR3: 00000001163b0000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>  [<ffffffffa02483b1>] ? reject_tg+0x351/0x414 [ipt_REJECT]
 [<ffffffffa022b1eb>] ? ipt_do_table+0x3cb/0x678 [ip_tables]
 [<ffffffff8140a03d>] ? __skb_checksum_complete_head+0x1d/0x70
 [<ffffffff8115788c>] ? __kmalloc+0x20c/0x220
 [<ffffffff8140a0a1>] ? __skb_checksum_complete+0x11/0x20
 [<ffffffffa01f0eb8>] ? nf_conntrack_in+0x318/0x9f0 [nf_conntrack]
 [<ffffffffa0233123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [<ffffffff81436edc>] ? nf_iterate+0x6c/0xb0
 [<ffffffff81441ae0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81436f94>] ? nf_hook_slow+0x74/0x100
 [<ffffffff81441ae0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81441e0a>] ? ip_local_deliver+0x5a/0xa0
 [<ffffffff8144130d>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff81441895>] ? ip_rcv+0x275/0x350
 [<ffffffff8140fd6b>] ? netif_receive_skb+0x38b/0x670
 [<ffffffffa0068395>] ? virtnet_poll+0x5c5/0x8b0 [virtio_net]
 [<ffffffff814108f3>] ? net_rx_action+0x103/0x210
 [<ffffffff81073ce7>] ? __do_softirq+0xb7/0x1e0
 [<ffffffff810d8960>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ae5>] ? irq_exit+0x85/0x90
 [<ffffffff814d0645>] ? do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
 [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
 [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
 [<ffffffff814b172a>] ? rest_init+0x7a/0x80
 [<ffffffff818c3f19>] ? start_kernel+0x413/0x41f
 [<ffffffff818c333a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff818c3438>] ? x86_64_start_kernel+0xfa/0x109
BUG: NMI Watchdog detected LOCKUP on CPU0, ip ffffffff8103bfca, registers:
CPU 0 
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]

Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 0, comm: swapper Not tainted 2.6.32-71.18.2.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff8103bfca>]  [<ffffffff8103bfca>] native_write_msr_safe+0xa/0x10
RSP: 0018:ffff880028203b80  EFLAGS: 00000006
RAX: 00000000000000ef RBX: 0000000000000001 RCX: 0000000000000830
RDX: 0000000000000001 RSI: 00000000000000ef RDI: 0000000000000830
RBP: ffff880028203b80 R08: ffffffff81adee60 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000400 R12: 000000000000ea60
R13: ffffffff81adee60 R14: 00000000000000ef R15: 00000000000000ef
FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f5f47706000 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff8170e000, task ffffffff817320c0)
Stack:
 ffff880028203be0 ffffffff810357bc 0000000000000000 ffffffff81adee60
<0> ffffffff81adee60 0000000000000086 ffff880028203bc0 ffffffff81adee60
<0> 0000000000011f60 0000000000000000 0000000000000000 ffffffff81adc1c0
Call Trace:
 <IRQ> 
 [<ffffffff810357bc>] x2apic_send_IPI_mask+0x6c/0xb0
 [<ffffffff8102f5fb>] lapic_timer_broadcast+0x1b/0x20
 [<ffffffff810a01a5>] tick_do_broadcast+0x85/0x90
 [<ffffffff810a0331>] tick_do_periodic_broadcast+0x41/0x50
 [<ffffffff810a0354>] tick_handle_periodic_broadcast+0x14/0x50
 [<ffffffff8101681f>] timer_interrupt+0x1f/0x30
 [<ffffffff810d8960>] handle_IRQ_event+0x60/0x170
 [<ffffffff810353d7>] ? native_apic_msr_write+0x37/0x40
 [<ffffffff810db046>] handle_edge_irq+0xc6/0x160
 [<ffffffff81015fb9>] handle_irq+0x49/0xa0
 [<ffffffff814d063c>] do_IRQ+0x6c/0xf0
 [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
 [<ffffffff814cb7b7>] ? _spin_unlock_irqrestore+0x17/0x20
 [<ffffffff8107e726>] ? mod_timer+0x146/0x230
 [<ffffffff8112f0c0>] ? sync_supers_timer_fn+0x0/0x20
 [<ffffffff8112f0b1>] ? bdi_arm_supers_timer+0x41/0x50
 [<ffffffff8112f0da>] ? sync_supers_timer_fn+0x1a/0x20
 [<ffffffff8107df17>] ? run_timer_softirq+0x197/0x340
 [<ffffffff81073ce7>] ? __do_softirq+0xb7/0x1e0
 [<ffffffff810d8960>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ae5>] ? irq_exit+0x85/0x90
 [<ffffffff814d0645>] ? do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
 <EOI> 
 [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
 [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
 [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
 [<ffffffff814b172a>] ? rest_init+0x7a/0x80
 [<ffffffff818c3f19>] ? start_kernel+0x413/0x41f
 [<ffffffff818c333a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff818c3438>] ? x86_64_start_kernel+0xfa/0x109
Code: c0 89 c7 48 89 d0 44 89 06 48 c1 e0 20 89 f9 48 09 c8 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 89 f0 89 f9 48 89 e5 0f 30 <31> c0 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 89 f9 0f 33 89 c7 
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-71.18.2.el6.x86_64 (mockbuild.bos.redhat.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Wed Mar 2 14:17:40 EST 2011
Command line: ro root=/dev/mapper/vg_virtlab6685146-lv_root rd_LVM_LV=vg_virtlab6685146/lv_root rd_LVM_LV=vg_virtlab6685146/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us console=tty0 console=ttyS0,115200 rhgb irqpoll maxcpus=1 reset_devices cgroup_disable=memory  memmap=exactmap memmap=640K@0K memmap=131436K@33408K elfcorehdr=164844K memmap=64K$960K memmap=12K$3670004K memmap=272K$4194032K

Comment 6 Suqin Huang 2011-03-30 03:12:17 UTC
host dmesg:

__ratelimit: 6 callbacks suppressed
kvm: 5083: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0x0
kvm: 5083: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 5083: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffd769dc
kvm: 5083: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
tap0: no IPv6 routers present
kvm: 5083: cpu1 unimplemented perfctr wrmsr: 0xc1 data 0x0
kvm: 5083: cpu1 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 5083: cpu1 unimplemented perfctr wrmsr: 0xc1 data 0xffd769dc
kvm: 5083: cpu1 unimplemented perfctr wrmsr: 0x186 data 0x530079
kvm: 5083: cpu2 unimplemented perfctr wrmsr: 0xc1 data 0x0
kvm: 5083: cpu2 unimplemented perfctr wrmsr: 0x186 data 0x130079
switch: port 2(tap0) entering disabled state
device tap0 left promiscuous mode
switch: port 2(tap0) entering disabled state
device tap0 entered promiscuous mode
switch: port 2(tap0) entering forwarding state
tap0: no IPv6 routers present
Access APIC ARBPRI register which is for P6
switch: port 2(tap0) entering disabled state
device tap0 left promiscuous mode
switch: port 2(tap0) entering disabled state
device tap0 entered promiscuous mode
switch: port 2(tap0) entering forwarding state
tap0: no IPv6 routers present
__ratelimit: 6 callbacks suppressed
kvm: 7208: cpu0 unhandled wrmsr: 0x198 data 0
kvm: 7208: cpu1 unhandled wrmsr: 0x198 data 0
kvm: 7208: cpu2 unhandled wrmsr: 0x198 data 0
kvm: 7208: cpu3 unhandled wrmsr: 0x198 data 0
switch: port 2(tap0) entering disabled state
device tap0 left promiscuous mode
switch: port 2(tap0) entering disabled state

Comment 7 Gleb Natapov 2011-03-30 09:36:19 UTC
(In reply to comment #3)
> Is the reproduce consistent?
> Gleb, any ideas about the above stack trace or is it the nmi watch dog only?
Looks like the nmi watchdog activates at random times. We should work on disabling all timer-based watchdogs in all relevant guests, not waste our time debugging something that can't work. I am more concerned with the KVM internal error that qemu produces.
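
For reference, a hedged sketch of how the timer-based watchdogs can be disabled at guest boot time, per Documentation/kernel-parameters.txt for 2.6.32 kernels (verify against the guest's own docs): appending

nmi_watchdog=0 nosoftlockup

to the guest kernel command line disables the NMI watchdog and the soft-lockup detector, respectively.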

Comment 8 Gleb Natapov 2011-03-30 09:40:18 UTC
(In reply to comment #4)
> (In reply to comment #0)
> > Description of problem:
> > guest kernel panic when run dd on the guest booted with -cpu Nehalem 
> > 
> > Version-Release number of selected component (if applicable):
> > qemu-kvm-0.12.1.2-2.152.el6.x86_64
> > 
> > How reproducible:
> > 100%
> 
> Suqin, Is the stack trace consistent between repeated failure reproductions?
> If you have logs of other failures could you attach them as I haven't
> reproduced this yet.  Also were there any diagnostic messages from kvm
> on the host when the failure occurred (unhandled/ignored access complaints
> to MSR 0x80b, etc..) ?
This is one of the x2apic MSRs and we emulate it, so no error message should be seen in dmesg about the MSR.

> 
> (In reply to comment #3)
> > Is the reproduce consistent?
> > Gleb, any ideas about the above stack trace or is it the nmi watch dog only?
> 
> Don't know if the above is coherent but cpu #0 looks to be
> "stuck" at the wrmsr instruction of native_write_msr_safe().
> I don't immediately see how that could happen given the
> unhandled MSR code.  Looks like msr 0x80b which Intel defines
> as:
> 
> 80BH IA32_X2APIC_EOI x2APIC EOI Register. (W/O) If (CPUID.01H:ECX.[bit 21] = 1
> )
> 
> And the corresponding cpuid 0000_0001 data which flags the existence of
> this MSR:
> 
> ecx:21 x2APIC Extended xAPIC Support The processor supports x2APIC feature.
> 
> So we may need support of this MSR in some form, perhaps just tossing
> the write access as apparently the current x2apic emulation isn't
> needing this EOI handshake.  No clue yet how/why a lockup was detected.
We fully emulate x2apic, otherwise the guest wouldn't even issue this MSR write.

Comment 9 john cooper 2011-03-30 13:16:08 UTC
(In reply to comment #7)
> (In reply to comment #3)
> > Is the reproduce consistent?
> > Gleb, any ideas about the above stack trace or is it the nmi watch dog only?
> Looks like nmi watchdog activates at random times. We should work on disabling
> all timer base watchdogs in all relevant guests, not waste our time debugging
> something that can't work.

The MSR access and soft lockup appear coincidental in the first
dump given the second stack dump above.

> I am more concerned with KVM internal error that
> qemu produces.

Which appears related to the NMI handling in the first scenario.
Could the NMI handling somehow be nesting?

Comment 10 Dor Laor 2011-03-30 14:36:53 UTC
(In reply to comment #7)
> (In reply to comment #3)
> > Is the reproduce consistent?
> > Gleb, any ideas about the above stack trace or is it the nmi watch dog only?
> Looks like nmi watchdog activates at random times. We should work on disabling
> all timer base watchdogs in all relevant guests, not waste our time debugging
> something that can't work. I am more concerned with KVM internal error that
> qemu produces.

Will just disabling nmi in the release notes make this go away?

Comment 11 Gleb Natapov 2011-03-30 14:44:21 UTC
(In reply to comment #10)
> (In reply to comment #7)
> > (In reply to comment #3)
> > > Is the reproduce consistent?
> > > Gleb, any ideas about the above stack trace or is it the nmi watch dog only?
> > Looks like nmi watchdog activates at random times. We should work on disabling
> > all timer base watchdogs in all relevant guests, not waste our time debugging
> > something that can't work. I am more concerned with KVM internal error that
> > qemu produces.
> 
> Will just disabling nmi in release notes will make this go away?
It will obviously make this go away, but if this is a genuine hang then the guest will get stuck instead. The bug reporter needs to test this.

Comment 12 john cooper 2011-03-30 22:42:08 UTC
(In reply to comment #11)
> (In reply to comment #10)
    :
> > Will just disabling nmi in release notes will make this go away?
> It will make this go away obviously, but in case this is genuine hang then
> guest will stuck instead. Bug reporter needs to test this.

Worth a quick try to see if the guest actually recovers or not.
Still the apparent correlation to the above cpu models vs. cpu64-rhel6,+x2apic
is disturbing.

Suqin, could you repeat your test for this expected failure mode
after disabling /proc/sys/kernel/nmi_watchdog to see if the guest
eventually makes progress?
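
A hedged sketch of that runtime toggle, run inside the guest (the file reads 1 while the NMI watchdog is enabled):

#cat /proc/sys/kernel/nmi_watchdog
#echo 0 > /proc/sys/kernel/nmi_watchdog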

Comment 13 Suqin Huang 2011-03-31 07:16:40 UTC
1. The guest didn't crash after disabling nmi_watchdog, but there is still a call trace:

BUG: soft lockup - CPU#0 stuck for 269s! [swapper:0]
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]
CPU 0:
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log ppdev parport_pc parport sg i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_net sr_mod cdrom virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 0, comm: swapper Not tainted 2.6.32-71.18.2.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff81073c9b>]  [<ffffffff81073c9b>] __do_softirq+0x6b/0x1e0
RSP: 0018:ffff880028203ed0  EFLAGS: 00000206
RAX: 00000000000104a0 RBX: ffff880028203f30 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81731240 RDI: ffffffff817312ac
RBP: ffffffff81013ad3 R08: ffffffff81adee60 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff880028203e50
R13: ffffffff8170fe18 R14: 0000000000000000 R15: 0000000000000202
FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f3ab38a9b08 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>  [<ffffffff810d8960>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810353d7>] ? native_apic_msr_write+0x37/0x40
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ae5>] ? irq_exit+0x85/0x90
 [<ffffffff814d0645>] ? do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
 [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
 [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
 [<ffffffff814b172a>] ? rest_init+0x7a/0x80
 [<ffffffff818c3f19>] ? start_kernel+0x413/0x41f
 [<ffffffff818c333a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff818c3438>] ? x86_64_start_kernel+0xfa/0x109
INFO: task kswapd0:50 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0       D ffff8801177a34e0     0    50      2 0x00000000
 ffff8801177a5610 0000000000000046 ffff8801ffffffff 00000610b38e2653
 0000000000000000 ffff88011adf3750 0000000000d7ae2e ffffffffb26b86fa
 ffff8801177a3a98 ffff8801177a5fd8 0000000000010518 ffff8801177a3a98
Call Trace:
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff81242d2d>] get_request_wait+0xcd/0x1a0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81239eeb>] ? elv_merge+0x1cb/0x200
 [<ffffffff81243413>] __make_request+0x93/0x4c0
 [<ffffffff81241882>] generic_make_request+0x1b2/0x4f0
 [<ffffffff81241c4f>] submit_bio+0x8f/0x120
 [<ffffffff81144b44>] swap_writepage+0x94/0xe0
 [<ffffffff81129a3c>] shmem_writepage+0x1cc/0x250
 [<ffffffff811242db>] pageout.clone.1+0x12b/0x300
 [<ffffffff811247e6>] shrink_page_list.clone.0+0x336/0x540
 [<ffffffff81122627>] ? ____pagevec_lru_add+0x167/0x180
 [<ffffffff81013c8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff81124ce5>] shrink_inactive_list+0x2f5/0x740
 [<ffffffff81059dc2>] ? finish_task_switch+0x42/0xd0
 [<ffffffff81125f3f>] shrink_zone+0x38f/0x520
 [<ffffffff81127429>] balance_pgdat+0x6a9/0x770
 [<ffffffff81127860>] ? isolate_pages_global+0x0/0x380
 [<ffffffff81127624>] kswapd+0x134/0x370
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811274f0>] ? kswapd+0x0/0x370
 [<ffffffff81091a86>] kthread+0x96/0xa0
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff810919f0>] ? kthread+0x0/0xa0
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
INFO: task jbd2/dm-0-8:401 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-0-8   D ffff88011fc23280     0   401      2 0x00000000
 ffff880115a9dd10 0000000000000046 ffff880115a9dcd8 ffff880115a9dcd4
 ffff880115a9dcc0 ffff88011fc23280 ffff880028216980 0000000100673f23
 ffff8801162db0e8 ffff880115a9dfd8 0000000000010518 ffff8801162db0e8
Call Trace:
 [<ffffffffa006f8f0>] jbd2_journal_commit_transaction+0x1c0/0x14e0 [jbd2]
 [<ffffffff810566e0>] ? __dequeue_entity+0x30/0x50
 [<ffffffff810116e0>] ? __switch_to+0xd0/0x320
 [<ffffffff8107d6bc>] ? lock_timer_base+0x3c/0x70
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa00760b8>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0076000>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff81091a86>] kthread+0x96/0xa0
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff810919f0>] ? kthread+0x0/0xa0
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
INFO: task auditd:1294 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
auditd        D ffff88011fc23080     0  1294      1 0x00000000
 ffff8801182b19f8 0000000000000082 0000000000000000 ffff880115444518
 ffff88011541c408 ffffffffa00c9a00 ffff88011b360678 0000000100673f13
 ffff8801170fba98 ffff8801182b1fd8 0000000000010518 ffff8801170fba98
Call Trace:
 [<ffffffffa006e082>] start_this_handle+0x262/0x4e0 [jbd2]
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa006e4e5>] jbd2_journal_start+0xb5/0x100 [jbd2]
 [<ffffffff8101187e>] ? __switch_to+0x26e/0x320
 [<ffffffffa00afce8>] ext4_journal_start_sb+0x58/0x90 [ext4]
 [<ffffffffa009b6b1>] ext4_da_write_begin+0x121/0x290 [ext4]
 [<ffffffff8110cb1e>] generic_file_buffered_write+0x10e/0x2a0
 [<ffffffff810722c7>] ? current_fs_time+0x27/0x30
 [<ffffffffa00c91fb>] ? ext4_xattr_security_get+0x2b/0x30 [ext4]
 [<ffffffff8110e470>] __generic_file_aio_write+0x250/0x480
 [<ffffffff8110e70f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa0091169>] ext4_file_write+0x39/0xb0 [ext4]
 [<ffffffff8116cf2a>] do_sync_write+0xfa/0x140
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120c93b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811ffda6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116d228>] vfs_write+0xb8/0x1a0
 [<ffffffff8116dc61>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
INFO: task rsyslogd:1312 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rsyslogd      D ffff88011fc23280     0  1312      1 0x00000080
 ffff880119dff2a8 0000000000000086 0000000000000000 ffff880119dff26c
 0000000000000000 ffff88011fc23280 ffff880028216980 000000010067431f
 ffff8801172e3a58 ffff880119dfffd8 0000000000010518 ffff8801172e3a58
Call Trace:
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff81242d2d>] get_request_wait+0xcd/0x1a0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81239eeb>] ? elv_merge+0x1cb/0x200
 [<ffffffff81243413>] __make_request+0x93/0x4c0
 [<ffffffff81241882>] generic_make_request+0x1b2/0x4f0
 [<ffffffff81241c4f>] submit_bio+0x8f/0x120
 [<ffffffff81144b44>] swap_writepage+0x94/0xe0
 [<ffffffff81129a3c>] shmem_writepage+0x1cc/0x250
 [<ffffffff811242db>] pageout.clone.1+0x12b/0x300
 [<ffffffff811247e6>] shrink_page_list.clone.0+0x336/0x540
 [<ffffffff8124d330>] ? blkiocg_update_dispatch_stats+0x90/0xe0
 [<ffffffff81013c8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff81124ce5>] shrink_inactive_list+0x2f5/0x740
 [<ffffffff8124d431>] ? blkiocg_update_io_add_stats+0x61/0x90
 [<ffffffff81125f3f>] shrink_zone+0x38f/0x520
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff811261ce>] do_try_to_free_pages+0xfe/0x500
 [<ffffffff8111cf4c>] ? get_page_from_freelist+0x15c/0x820
 [<ffffffff811267bf>] try_to_free_pages+0x9f/0x130
 [<ffffffff81127860>] ? isolate_pages_global+0x0/0x380
 [<ffffffff8111e6ee>] __alloc_pages_nodemask+0x3ee/0x850
 [<ffffffff81013ace>] ? common_interrupt+0xe/0x13
 [<ffffffff81150b8a>] alloc_pages_current+0x9a/0x100
 [<ffffffff8110c567>] __page_cache_alloc+0x87/0x90
 [<ffffffff811212bb>] __do_page_cache_readahead+0xdb/0x210
 [<ffffffff81121411>] ra_submit+0x21/0x30
 [<ffffffff8110e111>] filemap_fault+0x4b1/0x510
 [<ffffffff81135534>] __do_fault+0x54/0x500
 [<ffffffff81013c8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff81135ad7>] handle_pte_fault+0xf7/0xad0
 [<ffffffff8106d0c3>] ? do_syslog+0x3d3/0x4c0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8113669d>] handle_mm_fault+0x1ed/0x2b0
 [<ffffffff814ce203>] do_page_fault+0x123/0x3a0
 [<ffffffff814cbc75>] page_fault+0x25/0x30
INFO: task rs:main Q:Reg:2046 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rs:main Q:Reg D ffff88011fc23280     0  2046      1 0x00000080
 ffff88000d72baa8 0000000000000086 0000000000000000 0000000000000000
 ffff88011ac89700 0000000000000002 ffff8801190314d8 0000000100674013
 ffff880116279068 ffff88000d72bfd8 0000000000010518 ffff880116279068
Call Trace:
 [<ffffffffa006e082>] start_this_handle+0x262/0x4e0 [jbd2]
 [<ffffffff811568fb>] ? cache_alloc_refill+0x15b/0x240
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa006e4e5>] jbd2_journal_start+0xb5/0x100 [jbd2]
 [<ffffffffa00afce8>] ext4_journal_start_sb+0x58/0x90 [ext4]
 [<ffffffffa009778a>] ext4_dirty_inode+0x2a/0x60 [ext4]
 [<ffffffff811961fb>] __mark_inode_dirty+0x3b/0x160
 [<ffffffff81186c42>] file_update_time+0xf2/0x170
 [<ffffffff8110e440>] __generic_file_aio_write+0x220/0x480
 [<ffffffff8110e70f>] generic_file_aio_write+0x6f/0xe0
 [<ffffffffa0091169>] ext4_file_write+0x39/0xb0 [ext4]
 [<ffffffff8116cf2a>] do_sync_write+0xfa/0x140
 [<ffffffff81144b90>] ? end_swap_bio_write+0x0/0x80
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120c93b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811ffda6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116d228>] vfs_write+0xb8/0x1a0
 [<ffffffff810d42c2>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8116dc61>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
INFO: task irqbalance:1362 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
irqbalance    D ffff88011fc23680     0  1362      1 0x00000080
 ffff880117dcf468 0000000000000082 ffff880117dcf430 ffff880117dcf42c
 0000000000000000 ffff88011fc23680 ffff880028316980 00000001006743d7
 ffff8801172dbad8 ffff880117dcffd8 0000000000010518 ffff8801172dbad8
Call Trace:
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff81242d2d>] get_request_wait+0xcd/0x1a0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81239eeb>] ? elv_merge+0x1cb/0x200
 [<ffffffff81243413>] __make_request+0x93/0x4c0
 [<ffffffff81241882>] generic_make_request+0x1b2/0x4f0
 [<ffffffff81241c4f>] submit_bio+0x8f/0x120
 [<ffffffff81144b44>] swap_writepage+0x94/0xe0
 [<ffffffff81129a3c>] shmem_writepage+0x1cc/0x250
 [<ffffffff811242db>] pageout.clone.1+0x12b/0x300
 [<ffffffff811247e6>] shrink_page_list.clone.0+0x336/0x540
 [<ffffffff8112d3ed>] ? zone_statistics+0x7d/0xa0
 [<ffffffff81124ce5>] shrink_inactive_list+0x2f5/0x740
 [<ffffffff8111f9fa>] ? determine_dirtyable_memory+0x1a/0x30
 [<ffffffff8111faa7>] ? get_dirty_limits+0x27/0x2f0
 [<ffffffff81125f3f>] shrink_zone+0x38f/0x520
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff811261ce>] do_try_to_free_pages+0xfe/0x500
 [<ffffffff8111cf4c>] ? get_page_from_freelist+0x15c/0x820
 [<ffffffff811267bf>] try_to_free_pages+0x9f/0x130
 [<ffffffff81127860>] ? isolate_pages_global+0x0/0x380
 [<ffffffff8111e6ee>] __alloc_pages_nodemask+0x3ee/0x850
 [<ffffffff811560e2>] kmem_getpages+0x62/0x170
 [<ffffffff81156cfa>] fallback_alloc+0x1ba/0x270
 [<ffffffff8115674f>] ? cache_grow+0x2cf/0x320
 [<ffffffff81156a79>] ____cache_alloc_node+0x99/0x160
 [<ffffffff811573db>] kmem_cache_alloc+0x11b/0x190
 [<ffffffff81179a3b>] getname+0x3b/0x250
 [<ffffffff8116a302>] do_sys_open+0x32/0x140
 [<ffffffff8116a450>] sys_open+0x20/0x30
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
INFO: task hald:1584 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
hald          D ffff88011fc23680     0  1584      1 0x00000080
 ffff880117d19148 0000000000000082 0000000000000000 00000610b38e2653
 0000000000000000 ffff880115970210 0000000000d7ae44 0000000100674325
 ffff880116b150a8 ffff880117d19fd8 0000000000010518 ffff880116b150a8
Call Trace:
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff81242d2d>] get_request_wait+0xcd/0x1a0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81239eeb>] ? elv_merge+0x1cb/0x200
 [<ffffffff81243413>] __make_request+0x93/0x4c0
 [<ffffffff81241882>] generic_make_request+0x1b2/0x4f0
 [<ffffffff81013c8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff81241c4f>] submit_bio+0x8f/0x120
 [<ffffffff81144b44>] swap_writepage+0x94/0xe0
 [<ffffffff81129a3c>] shmem_writepage+0x1cc/0x250
 [<ffffffff811242db>] pageout.clone.1+0x12b/0x300
 [<ffffffff811247e6>] shrink_page_list.clone.0+0x336/0x540
 [<ffffffff81124ce5>] shrink_inactive_list+0x2f5/0x740
 [<ffffffff8111f9fa>] ? determine_dirtyable_memory+0x1a/0x30
 [<ffffffff8111faa7>] ? get_dirty_limits+0x27/0x2f0
 [<ffffffff81125f3f>] shrink_zone+0x38f/0x520
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff811261ce>] do_try_to_free_pages+0xfe/0x500
 [<ffffffff8111cf4c>] ? get_page_from_freelist+0x15c/0x820
 [<ffffffff811267bf>] try_to_free_pages+0x9f/0x130
 [<ffffffff81127860>] ? isolate_pages_global+0x0/0x380
 [<ffffffff8111e6ee>] __alloc_pages_nodemask+0x3ee/0x850
 [<ffffffff81150b8a>] alloc_pages_current+0x9a/0x100
 [<ffffffff8111cc6e>] __get_free_pages+0xe/0x50
 [<ffffffff81182414>] __pollwait+0xb4/0xf0
 [<ffffffff811d0bba>] mounts_poll+0x6a/0x70
 [<ffffffff8118274b>] do_sys_poll+0x29b/0x520
 [<ffffffff81182360>] ? __pollwait+0x0/0xf0
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff81182450>] ? pollwake+0x0/0x60
 [<ffffffff8103be2c>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff8103be39>] ? kvm_clock_get_cycles+0x9/0x10
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff81181b5d>] ? poll_select_set_timeout+0x8d/0xa0
 [<ffffffff81182bcc>] sys_poll+0x7c/0x110
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
INFO: task master:1714 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
master        D ffff88011fc23480     0  1714      1 0x00000084
 ffff8801181e9958 0000000000000086 0000000000000000 00000610b1c7f913
 0000000000000000 ffff880117227980 0000000000d77d42 0000000100672ab9
 ffff88011726da58 ffff8801181e9fd8 0000000000010518 ffff88011726da58
Call Trace:
 [<ffffffff8119e820>] ? sync_buffer+0x0/0x50
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff8119e860>] sync_buffer+0x40/0x50
 [<ffffffff814c9e7a>] __wait_on_bit_lock+0x5a/0xc0
 [<ffffffff8119e820>] ? sync_buffer+0x0/0x50
 [<ffffffff814c9f58>] out_of_line_wait_on_bit_lock+0x78/0x90
 [<ffffffff81091e30>] ? wake_bit_function+0x0/0x50
 [<ffffffff8119ea06>] __lock_buffer+0x36/0x40
 [<ffffffffa006f2b3>] do_get_write_access+0x473/0x500 [jbd2]
 [<ffffffffa006f491>] jbd2_journal_get_write_access+0x31/0x50 [jbd2]
 [<ffffffffa00bacc8>] __ext4_journal_get_write_access+0x38/0x80 [ext4]
 [<ffffffffa0097443>] ext4_reserve_inode_write+0x73/0xa0 [ext4]
 [<ffffffffa00974bc>] ext4_mark_inode_dirty+0x4c/0x1d0 [ext4]
 [<ffffffffa00977a0>] ext4_dirty_inode+0x40/0x60 [ext4]
 [<ffffffff811961fb>] __mark_inode_dirty+0x3b/0x160
 [<ffffffff81186c42>] file_update_time+0xf2/0x170
 [<ffffffff81176df2>] pipe_write+0x2d2/0x650
 [<ffffffff8116cf2a>] do_sync_write+0xfa/0x140
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8120c93b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811ffda6>] ? security_file_permission+0x16/0x20
 [<ffffffff8116d228>] vfs_write+0xb8/0x1a0
 [<ffffffff810d42c2>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8116dc61>] sys_write+0x51/0x90
 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
INFO: task qmgr:1724 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qmgr          D ffff88011fc23680     0  1724   1714 0x00000080
 ffff88011994f2a8 0000000000000086 0000000000000000 ffff88011994f26c
 0000000000000000 ffff88011fc23680 ffff880028216980 0000000100675221
 ffff880116881a58 ffff88011994ffd8 0000000000010518 ffff880116881a58
Call Trace:
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff81242d2d>] get_request_wait+0xcd/0x1a0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81239eeb>] ? elv_merge+0x1cb/0x200
 [<ffffffff81243413>] __make_request+0x93/0x4c0
 [<ffffffff81241882>] generic_make_request+0x1b2/0x4f0
 [<ffffffff81241c4f>] submit_bio+0x8f/0x120
 [<ffffffff81144b44>] swap_writepage+0x94/0xe0
 [<ffffffff81129a3c>] shmem_writepage+0x1cc/0x250
 [<ffffffff811242db>] pageout.clone.1+0x12b/0x300
 [<ffffffff811247e6>] shrink_page_list.clone.0+0x336/0x540
 [<ffffffff81013c8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff81124ce5>] shrink_inactive_list+0x2f5/0x740
 [<ffffffff81125f3f>] shrink_zone+0x38f/0x520
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff811261ce>] do_try_to_free_pages+0xfe/0x500
 [<ffffffff8111cf4c>] ? get_page_from_freelist+0x15c/0x820
 [<ffffffff811267bf>] try_to_free_pages+0x9f/0x130
 [<ffffffff81127860>] ? isolate_pages_global+0x0/0x380
 [<ffffffff8111e6ee>] __alloc_pages_nodemask+0x3ee/0x850
 [<ffffffff81150b8a>] alloc_pages_current+0x9a/0x100
 [<ffffffff8110c567>] __page_cache_alloc+0x87/0x90
 [<ffffffff811212bb>] __do_page_cache_readahead+0xdb/0x210
 [<ffffffff81121411>] ra_submit+0x21/0x30
 [<ffffffff8110e111>] filemap_fault+0x4b1/0x510
 [<ffffffff81135534>] __do_fault+0x54/0x500
 [<ffffffff81135ad7>] handle_pte_fault+0xf7/0xad0
 [<ffffffff8101d13f>] ? save_i387_xstate+0x1cf/0x220
 [<ffffffff8101247c>] ? do_signal+0x18c/0x800
 [<ffffffff8113669d>] handle_mm_fault+0x1ed/0x2b0
 [<ffffffff814ce203>] do_page_fault+0x123/0x3a0
 [<ffffffff814cbc75>] page_fault+0x25/0x30
INFO: task crond:1733 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
crond         D ffff8801160c14a0     0  1733      1 0x00000080
 ffff880118f393b8 0000000000000082 ffff8801ffffffff 00000610b03cee7d
 0000000000000000 ffff88011732c6e0 0000000000d7ae1a ffffffffb26b86fa
 ffff8801160c1a58 ffff880118f39fd8 0000000000010518 ffff8801160c1a58
Call Trace:
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff814c9753>] io_schedule+0x73/0xc0
 [<ffffffff81242d2d>] get_request_wait+0xcd/0x1a0
 [<ffffffff81091df0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81239eeb>] ? elv_merge+0x1cb/0x200
 [<ffffffff81243413>] __make_request+0x93/0x4c0
 [<ffffffff81241882>] generic_make_request+0x1b2/0x4f0
 [<ffffffff81241c4f>] submit_bio+0x8f/0x120
 [<ffffffff81144b44>] swap_writepage+0x94/0xe0
 [<ffffffff81129a3c>] shmem_writepage+0x1cc/0x250
 [<ffffffff811242db>] pageout.clone.1+0x12b/0x300
 [<ffffffff811247e6>] shrink_page_list.clone.0+0x336/0x540
 [<ffffffff81013c8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff81124ce5>] shrink_inactive_list+0x2f5/0x740
 [<ffffffff8111f9fa>] ? determine_dirtyable_memory+0x1a/0x30
 [<ffffffff8111faa7>] ? get_dirty_limits+0x27/0x2f0
 [<ffffffff81125f3f>] shrink_zone+0x38f/0x520
 [<ffffffff8109baf9>] ? ktime_get_ts+0xa9/0xe0
 [<ffffffff811261ce>] do_try_to_free_pages+0xfe/0x500
 [<ffffffff8111cf4c>] ? get_page_from_freelist+0x15c/0x820
 [<ffffffff811267bf>] try_to_free_pages+0x9f/0x130
 [<ffffffff81127860>] ? isolate_pages_global+0x0/0x380
 [<ffffffff8111e6ee>] __alloc_pages_nodemask+0x3ee/0x850
 [<ffffffff81150c83>] alloc_pages_vma+0x93/0x150
 [<ffffffff81134d04>] do_wp_page+0xf4/0x8d0
 [<ffffffff81059dd2>] ? finish_task_switch+0x52/0xd0
 [<ffffffff81135d2d>] handle_pte_fault+0x34d/0xad0
 [<ffffffff810961ff>] ? hrtimer_try_to_cancel+0x3f/0xd0
 [<ffffffff810962b2>] ? hrtimer_cancel+0x22/0x30
 [<ffffffff8103ccd8>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff8113669d>] handle_mm_fault+0x1ed/0x2b0
 [<ffffffff8103be2c>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff814ce203>] do_page_fault+0x123/0x3a0
 [<ffffffff814cbc75>] page_fault+0x25/0x30




2. host dmesg:

"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000001     0 13396   2313 0x00000080
 ffff880128b71a68 0000000000000086 0000000000000000 0000000000000004
 ffff880000035b08 0000000000000001 ffff880128b70000 000000010a619415
 ffff88010cce30f8 ffff880128b71fd8 000000000000f558 ffff88010cce30f8
Call Trace:
 [<ffffffff814da6b5>] schedule_timeout+0x215/0x2e0
 [<ffffffff81267a9d>] ? rb_insert_color+0x9d/0x160
 [<ffffffff811437f9>] ? vmap_page_range_noflush+0x279/0x370
 [<ffffffff814da333>] wait_for_common+0x123/0x180
 [<ffffffff8105cb10>] ? default_wake_function+0x0/0x20
 [<ffffffff814da44d>] wait_for_completion+0x1d/0x20
 [<ffffffff8108a938>] synchronize_sched+0x58/0x60
 [<ffffffff8108a8c0>] ? wakeme_after_rcu+0x0/0x20
 [<ffffffffa03a822d>] kvm_set_irq_routing+0x29d/0x2c0 [kvm]
 [<ffffffffa03a61c1>] kvm_vm_ioctl+0xb51/0xee0 [kvm]
 [<ffffffffa03aedd7>] ? kvm_arch_vcpu_put+0x47/0x60 [kvm]
 [<ffffffffa03a4b78>] ? vcpu_put+0x28/0x30 [kvm]
 [<ffffffffa03b7fc0>] ? kvm_arch_vcpu_ioctl_run+0x550/0xec0 [kvm]
 [<ffffffff8120b5d1>] ? avc_has_perm+0x71/0x90
 [<ffffffff8120cf64>] ? inode_has_perm+0x54/0xa0
 [<ffffffff81183b52>] vfs_ioctl+0x22/0xa0
 [<ffffffff8118401a>] do_vfs_ioctl+0x3aa/0x580
 [<ffffffff81184271>] sys_ioctl+0x81/0xa0
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b

Comment 14 Gleb Natapov 2011-03-31 07:43:15 UTC
(In reply to comment #13)
> 1. guest didn't crash if disable nmi_watchdog, but still have call trace
> 
> BUG: soft lockup - CPU#0 stuck for 269s! [swapper:0]
I thought we disabled soft lockup reporting when the kernel runs as a guest. Maybe in rhel6.1 and not in rhel6.0?

If the guest makes progress they are harmless.

> 2. host dmesg:
> 
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Now this is worrying. I don't see how kvm may be responsible for it though. Looks like a more general system-wide problem.

> qemu-kvm      D 0000000000000001     0 13396   2313 0x00000080
>  ffff880128b71a68 0000000000000086 0000000000000000 0000000000000004
>  ffff880000035b08 0000000000000001 ffff880128b70000 000000010a619415
>  ffff88010cce30f8 ffff880128b71fd8 000000000000f558 ffff88010cce30f8
> Call Trace:
>  [<ffffffff814da6b5>] schedule_timeout+0x215/0x2e0
>  [<ffffffff81267a9d>] ? rb_insert_color+0x9d/0x160
>  [<ffffffff811437f9>] ? vmap_page_range_noflush+0x279/0x370
>  [<ffffffff814da333>] wait_for_common+0x123/0x180
>  [<ffffffff8105cb10>] ? default_wake_function+0x0/0x20
>  [<ffffffff814da44d>] wait_for_completion+0x1d/0x20
>  [<ffffffff8108a938>] synchronize_sched+0x58/0x60
>  [<ffffffff8108a8c0>] ? wakeme_after_rcu+0x0/0x20
>  [<ffffffffa03a822d>] kvm_set_irq_routing+0x29d/0x2c0 [kvm]
>  [<ffffffffa03a61c1>] kvm_vm_ioctl+0xb51/0xee0 [kvm]
>  [<ffffffffa03aedd7>] ? kvm_arch_vcpu_put+0x47/0x60 [kvm]
>  [<ffffffffa03a4b78>] ? vcpu_put+0x28/0x30 [kvm]
>  [<ffffffffa03b7fc0>] ? kvm_arch_vcpu_ioctl_run+0x550/0xec0 [kvm]
>  [<ffffffff8120b5d1>] ? avc_has_perm+0x71/0x90
>  [<ffffffff8120cf64>] ? inode_has_perm+0x54/0xa0
>  [<ffffffff81183b52>] vfs_ioctl+0x22/0xa0
>  [<ffffffff8118401a>] do_vfs_ioctl+0x3aa/0x580
>  [<ffffffff81184271>] sys_ioctl+0x81/0xa0
>  [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b

Comment 15 Dor Laor 2011-03-31 11:00:19 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > 1. guest didn't crash if disable nmi_watchdog, but still have call trace
> > 
> > BUG: soft lockup - CPU#0 stuck for 269s! [swapper:0]
> I thought we disabled soft lockup reporting when kernel runs as a guest. May be
> in rhel6.1 and not in rhel6.0?
> 
> If teh guest make progress they are harmless.

QE, did someone add nmi_watchdog=1 to the guest cmdline or is it enabled by default?
(that's another timer, not the regular softlockup one)

Can QE re-try w/o qxl device?

> 
> > 2. host dmesg:
> > 
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Now this is worrying. I don't see how kvm may be responsible for it though.
> Looks like more general system wide problem.

The host trace looks like an additional bug, doesn't it?

Comment 16 Gleb Natapov 2011-03-31 11:16:56 UTC
(In reply to comment #15)
> > > 2. host dmesg:
> > > 
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Now this is worrying. I don't see how kvm may be responsible for it though.
> > Looks like more general system wide problem.
> 
> The host trace looks like additional bug, ain't it?

The host trace can explain the guest trace: if the qemu process does not make progress, guest code will not make progress either.

Comment 17 john cooper 2011-03-31 18:43:31 UTC
(In reply to comment #15)

> QE, did someone add nmi_watchdog=1 to the guest cmdline or is it enabled by
> default?
> (that's another time, not the softlockup regular one)

It appears softlockup_thresh is enabled @ 60s by default for a 6.0 guest.
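
As a hedged aside, on these 2.6.32-based guests that threshold is exposed at /proc/sys/kernel/softlockup_thresh; per the sysctl documentation for such kernels, writing -1 should disable soft-lockup detection entirely (verify against Documentation/sysctl/kernel.txt):

#cat /proc/sys/kernel/softlockup_thresh
#echo -1 > /proc/sys/kernel/softlockup_thresh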

(In reply to comment #16)
> (In reply to comment #15)
> > > > 2. host dmesg:
> > > > 
> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > Now this is worrying. I don't see how kvm may be responsible for it though.
> > > Looks like more general system wide problem.
> > 
> > The host trace looks like additional bug, ain't it?
> 
> The host trace can explain the guest trace. If qemu process does not make
> progress guest code will not make progress too.

But in that case I'm wondering how wide or narrow the window is for
the above scenario to occur.  It seems the guest would still need
to see differential progress of lockup detection alone relative to
the periodic kick to create the appearance of a lockup.  IOW I'd
expect the lockup timer reset and threshold detection to generally
track one another as the guest execution schedules and stalls (sans
acute memory pressure, etc.).  Also, I didn't see anywhere that
absolute host time enters into lockup detection, though I may have
missed something.

(In reply to comment #14)
> (In reply to comment #13)
> > 1. guest didn't crash if disable nmi_watchdog, but still have call trace
> > 
> > BUG: soft lockup - CPU#0 stuck for 269s! [swapper:0]
> I thought we disabled soft lockup reporting when kernel runs as a guest. May be
> in rhel6.1 and not in rhel6.0?
> 
> If teh guest make progress they are harmless.
> 
> > 2. host dmesg:
> > 
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Now this is worrying. I don't see how kvm may be responsible for it though.
> Looks like more general system wide problem.

Could be.  But I'd assume the host recovered to the extent the dmesg could
be captured after the guest terminated.  So I'm unsure kvm+qemu can be ruled
out as some sort of cause/contribution just yet.

Comment 18 john cooper 2011-04-01 02:11:47 UTC
(In reply to comment #17)
> (In reply to comment #15)

> > > The host trace looks like additional bug, ain't it?
> > 
> > The host trace can explain the guest trace. If qemu process does not make
> > progress guest code will not make progress too.
> 
> But in that case I'm wondering how wide/narrow the window is for
> the above scenario to occur.  It seems the guest would still need
> to see differential progress of lockup detection alone relative to
> the periodic kick to create the appearance of a lockup.

Which, upon further thought, is likely what is happening, as the
detection occurs in timer-expiration context and the watchdog
reset in task-scheduled context.  So for a stalled guest it
seems quite feasible for backlogged detection to overrun a
scheduled watchdog reset when the host scheduler resumes the
guest.

Comment 19 Suqin Huang 2011-04-01 03:08:07 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > 1. guest didn't crash if disable nmi_watchdog, but still have call trace
> > > 
> > > BUG: soft lockup - CPU#0 stuck for 269s! [swapper:0]
> > I thought we disabled soft lockup reporting when kernel runs as a guest. May be
> > in rhel6.1 and not in rhel6.0?
> > 
> > If teh guest make progress they are harmless.
> 
> QE, did someone add nmi_watchdog=1 to the guest cmdline or is it enabled by
> default?
it is enabled by default

> (that's another time, not the softlockup regular one)
> 
> Can QE re-try w/o qxl device?

Call trace without qxl & spice:


Pid: 0, comm: swapper Not tainted 2.6.32-71.18.2.el6.x86_64 #1 KVM
RIP: 0010:[<ffffffff81420453>]  [<ffffffff81420453>] sk_run_filter+0x43/0x910
RSP: 0018:ffff880028203c50  EFLAGS: 00000206
RAX: 0000000000000002 RBX: ffff880028203cf0 RCX: 000000000000002e
RDX: 0000000000000015 RSI: ffff8801194fc358 RDI: ffff88011684fd80
RBP: ffffffff81013ad3 R08: 0000000000000000 R09: 0000000000000800
R10: 0000000000000000 R11: ffff880028203cbc R12: ffff880028203bd0
R13: ffff8801194fc358 R14: 000000000000000b R15: 0000000000000800
FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f50b5978000 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>  [<ffffffff814a12da>] ? packet_rcv+0xea/0x440
 [<ffffffff81157b4b>] ? __kmalloc_node+0x7b/0x100
 [<ffffffff8140fcbb>] ? netif_receive_skb+0x2db/0x670
 [<ffffffffa004d395>] ? virtnet_poll+0x5c5/0x8b0 [virtio_net]
 [<ffffffff814108f3>] ? net_rx_action+0x103/0x210
 [<ffffffff81073ce7>] ? __do_softirq+0xb7/0x1e0
 [<ffffffff810d8960>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810142cc>] ? call_softirq+0x1c/0x30
 [<ffffffff81015f35>] ? do_softirq+0x65/0xa0
 [<ffffffff81073ae5>] ? irq_exit+0x85/0x90
 [<ffffffff814d0645>] ? do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ? ret_from_intr+0x0/0x11
 <EOI>  [<ffffffff8103be8b>] ? native_safe_halt+0xb/0x10
 [<ffffffff8101bd3d>] ? default_idle+0x3d/0x90
 [<ffffffff81011e96>] ? cpu_idle+0xb6/0x110
 [<ffffffff814b172a>] ? rest_init+0x7a/0x80
 [<ffffffff818c3f19>] ? start_kernel+0x413/0x41f
 [<ffffffff818c333a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff818c3438>] ? x86_64_start_kernel+0xfa/0x109

Comment 20 Gleb Natapov 2011-04-03 07:18:03 UTC
(In reply to comment #19)
> Call trace without qxl & spice:
Is this a softlockup message? You dropped the first line. If so, is it on the host or in the guest? I do not care about softlockups in a guest as long as the guest continues running. Can you reliably reproduce the host lockup?


Comment 21 RHEL Program Management 2011-04-04 01:46:35 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 22 Suqin Huang 2011-04-07 10:40:22 UTC
(In reply to comment #20)
> Is this a softlockup message? You dropped the first line. If so, is it on the
> host or in the guest? I do not care about softlockups in a guest as long as
> the guest continues running. Can you reliably reproduce the host lockup?

Cannot reproduce the host lockup without qxl & spice.

Comment 23 Gleb Natapov 2011-04-07 11:17:24 UTC
(In reply to comment #22)
> Cannot reproduce the host lockup without qxl & spice.

And with?

Comment 24 Suqin Huang 2011-04-08 08:00:40 UTC
(In reply to comment #23)
> > Cannot reproduce the host lockup without qxl & spice.
> 
> And with?
with vnc

Comment 25 Gleb Natapov 2011-04-08 09:22:22 UTC
(In reply to comment #24)
> > 
> > And with?
> with vnc

I mean: can you reproduce the host lockup with qxl & spice?

Comment 26 Suqin Huang 2011-04-12 05:55:07 UTC
(In reply to comment #25)
> (In reply to comment #24)
> > > 
> > > And with?
> > with vnc
> 
> I mean: can you reproduce the host lockup with qxl & spice?

Re-tested several times with qxl & spice; cannot reproduce the host lockup, only the guest call trace.

Comment 27 Ronen Hod 2011-06-09 12:59:37 UTC
Opened https://bugzilla.redhat.com/show_bug.cgi?id=712076 instead.
Reasons:
1. The original bug was fixed.
2. Since then, other bugs and solutions have been mentioned in the comments.
3. The relevant one is tracked in the new bug (the need to disable timer watchdogs in guests).
4. The leftover issue (host lockup) cannot be reproduced.

