Bug 620512
Summary: | RHEL5 KVM guest crashes sometimes | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Rik van Riel <riel> |
Component: | kernel | Assignee: | Karen Noel <knoel> |
Status: | CLOSED WONTFIX | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 5.5 | CC: | dgibson, gcosta |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-08-11 09:01:56 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Rik van Riel
2010-08-02 18:17:07 UTC
Ok, my first wild hunch on this, is that it is interrupt-related, and not directly kvmclock. On RHEL5-64, due to lack of clocksources impl, we still call kvmclock only when we get timer interrupts. Are you seeing any other strange stuff around it? Does it happen, by any chance, with other clocks ? I got another hang overnight. And yes, I can believe this is related to interrupts, or a memory crunch, or something along those lines. BUG: soft lockup - CPU#1 stuck for 10s! [swapper:0] CPU 1: Modules linked in: lockd sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy virtio_net ata_piix libata scsi_mod i2c_piix4 i2c_core ide_cd serio_raw cdrom pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 0, comm: swapper Not tainted 2.6.18-164.11.1.el5 #1 RIP: 0010:[<ffffffff8006b31d>] [<ffffffff8006b31d>] default_idle+0x29/0x50 RSP: 0018:ffff810037c9fef0 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff80302698 RBP: ffff810037c802f0 R08: ffff810037c9e000 R09: 000000000000003f R10: ffff81000277c008 R11: ffff81006f2b2d30 R12: 0000000005f2c07d R13: 00025e7d77b5531f R14: ffff8100776fb040 R15: ffff810037c80100 FS: 0000000000000000(0000) GS:ffff81007ffbc7c0(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002b56f3f17e34 CR3: 00000000761da000 CR4: 00000000000006e0 Call Trace: [<ffffffff8004947b>] cpu_idle+0x95/0xb8 [<ffffffff80077474>] start_secondary+0x498/0x4a7 BUG: soft lockup - CPU#0 stuck for 15s! [kjournald:1219] Kernel panic - not syncing: softlockup: hung tasks CPU 0: Modules linked in: lockd sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy virtio_net ata_piix libata scsi_mod i2c_piix4 i2c_core ide_cd serio_raw cdrom pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 1219, comm: kjournald Not tainted 2.6.18-164.11.1.el5 #1 RIP: 0010:[<ffffffff800122e6>] [<ffffffff800122e6>] __do_softirq+0x51/0x133 RSP: 0000:ffffffff8043df60 EFLAGS: 00000206 RAX: 0000000000000022 RBX: 0000000000000022 RCX: 7ffffffffffffffe RDX: ffff8100784bffd8 RSI: 0000000000000046 RDI: ffff810077972860 RBP: ffffffff8043dee0 R08: ffffffff8005e2fc R09: ffff810081f42000 R10: ffffffff8043df98 R11: ffffffff8808b53d R12: ffffffff8005dc8e R13: 0000000000000046 R14: ffffffff80077d08 R15: ffffffff8043dee0 FS: 0000000000000000(0000) GS:ffffffff803c1000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002b56f3d13000 CR3: 0000000078bce000 CR4: 00000000000006e0 Call Trace: <IRQ> [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006cb3c>] do_softirq+0x2c/0x85 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> [<ffffffff8808b53d>] :virtio_pci:vp_notify+0x0/0x1c [<ffffffff8005aa91>] generic_unplug_device+0x31/0x32 [<ffffffff880a5c2c>] :dm_mod:dm_table_unplug_all+0x3f/0x83 [<ffffffff880a4955>] :dm_mod:dm_request+0x11d/0x124 [<ffffffff880a3d80>] :dm_mod:dm_unplug_all+0x1d/0x28 [<ffffffff800153c1>] sync_buffer+0x36/0x3f [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e [<ffffffff8001538b>] sync_buffer+0x0/0x3f [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff800a00e5>] wake_bit_function+0x0/0x23 [<ffffffff88033da4>] :jbd:journal_commit_transaction+0x942/0x106a [<ffffffff8003dad1>] lock_timer_base+0x1b/0x3c [<ffffffff8803758c>] :jbd:kjournald+0xc1/0x213 [<ffffffff800a00b7>] autoremove_wake_function+0x0/0x2e [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4 [<ffffffff880374cb>] :jbd:kjournald+0x0/0x213 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032950>] kthread+0xfe/0x132 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032852>] kthread+0x0/0x132 [<ffffffff8005dfa7>] child_rip+0x0/0x11 Kernel panic - not syncing: softlockup: hung tasks BUG: soft lockup - CPU#2 stuck for 13s! [swapper:0] CPU 2: Modules linked in: lockd sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy virtio_net ata_piix libata scsi_mod i2c_piix4 i2c_core ide_cd serio_raw cdrom pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 0, comm: swapper Not tainted 2.6.18-164.11.1.el5 #1 RIP: 0010:[<ffffffff8006b31d>] [<ffffffff8006b31d>] default_idle+0x29/0x50 RSP: 0018:ffff810002757ef0 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff80302698 RBP: ffff810037c8f270 R08: ffff810002756000 R09: 000000000000003e R10: ffff81000277c038 R11: ffff81006c971c90 R12: 000000001ae5412b R13: 00025e7d6f25c33b R14: ffff810037fee860 R15: ffff810037c8f080 FS: 0000000000000000(0000) GS:ffff810037c8ee40(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002b56f3c3775d CR3: 000000006bc7d000 CR4: 00000000000006e0 Call Trace: [<ffffffff8004947b>] cpu_idle+0x95/0xb8 [<ffffffff80077474>] start_secondary+0x498/0x4a7 Kernel panic - not syncing: softlockup: hung tasks <3>BUG: scheduling while atomic: kjournald/0x10010100/1219 Call Trace: <IRQ> [<ffffffff8006243d>] __sched_text_start+0x7d/0xbd6 [<ffffffff80091d2d>] printk+0x52/0xbd [<ffffffff800914dd>] __call_console_drivers+0x5b/0x69 [<ffffffff8008df39>] __cond_resched+0x1c/0x44 [<ffffffff800630cb>] cond_resched+0x37/0x42 [<ffffffff801ac263>] do_unblank_screen+0x16/0x133 [<ffffffff80081cba>] bust_spinlocks+0x1c/0x46 [<ffffffff8009131f>] panic+0x88/0x1eb [<ffffffff8009e654>] kernel_text_address+0x1a/0x26 [<ffffffff8006b921>] dump_trace+0x206/0x22f [<ffffffff8006b97e>] show_trace+0x34/0x47 [<ffffffff800b9f8c>] softlockup_tick+0x15b/0x167 [<ffffffff8001538b>] sync_buffer+0x0/0x3f [<ffffffff80097a6a>] update_process_times+0x4e/0x78 [<ffffffff800776cb>] smp_local_timer_interrupt+0x2f/0x66 [<ffffffff8001538b>] sync_buffer+0x0/0x3f [<ffffffff80077d08>] smp_apic_timer_interrupt+0x41/0x47 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c [<ffffffff8808b53d>] :virtio_pci:vp_notify+0x0/0x1c [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff800122e6>] __do_softirq+0x51/0x133 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28 [<ffffffff8006cb3c>] do_softirq+0x2c/0x85 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c <EOI> [<ffffffff8808b53d>] :virtio_pci:vp_notify+0x0/0x1c [<ffffffff8005aa91>] generic_unplug_device+0x31/0x32 [<ffffffff880a5c2c>] :dm_mod:dm_table_unplug_all+0x3f/0x83 [<ffffffff880a4955>] :dm_mod:dm_request+0x11d/0x124 [<ffffffff880a3d80>] :dm_mod:dm_unplug_all+0x1d/0x28 [<ffffffff800153c1>] sync_buffer+0x36/0x3f [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e [<ffffffff8001538b>] sync_buffer+0x0/0x3f [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff800a00e5>] wake_bit_function+0x0/0x23 [<ffffffff88033da4>] :jbd:journal_commit_transaction+0x942/0x106a [<ffffffff8003dad1>] lock_timer_base+0x1b/0x3c [<ffffffff8803758c>] :jbd:kjournald+0xc1/0x213 [<ffffffff800a00b7>] autoremove_wake_function+0x0/0x2e [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4 [<ffffffff880374cb>] :jbd:kjournald+0x0/0x213 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032950>] kthread+0xfe/0x132 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032852>] kthread+0x0/0x132 [<ffffffff8005dfa7>] child_rip+0x0/0x11 Likely a Fedora 12 bug; unsupported. Could this be a duplicate of bug 750237? |