Bug 1019584 - apic_timer_interrupt kernel panic in kvm [NEEDINFO]
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Vivek Goyal
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-10-16 02:19 EDT by Attila Fazekas
Modified: 2014-03-10 10:40 EDT

Doc Type: Bug Fix
Last Closed: 2014-03-10 10:40:05 EDT
Type: Bug
jforbes: needinfo?


Attachments
serial_console.log (66.27 KB, text/octet-stream), 2013-10-16 02:19 EDT, Attila Fazekas
libvirt.xml (1.51 KB, text/xml), 2013-10-16 02:20 EDT, Attila Fazekas

Description Attila Fazekas 2013-10-16 02:19:27 EDT
Created attachment 812768: serial_console.log

Description of problem:

Kernel panic on IRQ handling.

The instance was running an OpenStack test suite (tempest) and performed various operations,
including iSCSI operations and running soft qemu processes.

I do not have a reproducer.

The full serial console log is attached.

Version-Release number of selected component (if applicable):
Not tainted 3.11.3-201.fc19.x86_64.debug

Additional info:
Host system:
kernel: 3.10.14-100.fc18.x86_64 
qemu-kvm-1.2.2-14.fc18.x86_64
Comment 1 Attila Fazekas 2013-10-16 02:20:19 EDT
Created attachment 812770: libvirt.xml
Comment 2 Attila Fazekas 2013-10-16 06:19:59 EDT
I created a dump of the guest with virsh dump.

It has the full memory content of the virtual machine.

The original 3.8 GiB file was compressed to 557,864,352 bytes with xz.
https://docs.google.com/file/d/0B7DSkY_fWI88RzNjR1JET2pRbnc

Note:
The xfs_buf_iodone_work trace in the serial console is probably unrelated. I tried to mount an unformatted loopback file, and it predictably failed.
Comment 3 Josh Boyer 2013-10-16 08:32:16 EDT
[ 2252.329039] general protection fault: 0000 [#1] SMP 
[ 2252.330008] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ebt_arp ebt_ip xt_nat xfs libcrc32c iptable_mangle kvm nbd iptable_nat nf_nat_ipv4 tun ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE nf_nat xt_CHECKSUM bridge stp llc joydev microcode nf_conntrack_ipv4 nf_defrag_ipv4 virtio_balloon xt_conntrack serio_raw virtio_net nf_conntrack cirrus ttm drm_kms_helper drm mperf i2c_piix4 i2c_core nfsd auth_rpcgss nfs_acl lockd binfmt_misc sunrpc crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_blk ata_generic pata_acpi [last unloaded: iptable_mangle]
[ 2252.330008] CPU: 0 PID: 20896 Comm: nova-conductor Not tainted 3.11.3-201.fc19.x86_64.debug #1
[ 2252.340295] Hardware name: Fedora Project OpenStack Nova, BIOS Bochs 01/01/2011
[ 2252.340295] task: ffff88005266a4b0 ti: ffff88003ad72000 task.ti: ffff88003ad72000
[ 2252.340295] RIP: 0010:[<ffffffff810e9754>]  [<ffffffff810e9754>] __lock_acquire+0x54/0x1b20
[ 2252.340295] RSP: 0000:ffff88011b203d20  EFLAGS: 00010046
[ 2252.340295] RAX: 0000000000000046 RBX: 0000000000000002 RCX: 0000000000000000
[ 2252.340295] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b83
[ 2252.340295] RBP: ffff88011b203dd0 R08: 0000000000000002 R09: 0000000000000001
[ 2252.340295] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88005266a4b0
[ 2252.340295] R13: 0000000000000000 R14: 6b6b6b6b6b6b6b83 R15: 0000000000000000
[ 2252.340295] FS:  00007fbf1d0b3740(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
[ 2252.340295] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2252.340295] CR2: 00007ffdad4d2000 CR3: 000000003cb63000 CR4: 00000000000407f0
[ 2252.359358] Stack:
[ 2252.359358]  ffff88005266a4b0 0000000000000269 ffffffff81c47d40 0000000000000000
[ 2252.359358]  ffff88011b203df8 ffffffff810e99f5 ffffffff810b77ff ffff88005266a4b0
[ 2252.359358]  000000035266abb0 ffff88005266a4b0 ffffffff81731876 ffff88011b203d88
[ 2252.359358] Call Trace:
[ 2252.359358]  <IRQ> 
[ 2252.359358]  [<ffffffff810e99f5>] ? __lock_acquire+0x2f5/0x1b20
[ 2252.359358]  [<ffffffff810b77ff>] ? local_clock+0x5f/0x70
[ 2252.359358]  [<ffffffff81731876>] ? _raw_spin_unlock_irqrestore+0x36/0x70
[ 2252.359358]  [<ffffffff810565bf>] ? kvm_clock_read+0x2f/0x50
[ 2252.359358]  [<ffffffff81021859>] ? sched_clock+0x9/0x10
[ 2252.359358]  [<ffffffff810b752d>] ? sched_clock_local+0x1d/0x80
[ 2252.359358]  [<ffffffff810eba12>] lock_acquire+0xa2/0x1f0
[ 2252.359358]  [<ffffffff8135d549>] ? __blkg_release_rcu+0x79/0x280
[ 2252.359358]  [<ffffffff81731772>] _raw_spin_lock_irq+0x52/0x90
[ 2252.359358]  [<ffffffff8135d549>] ? __blkg_release_rcu+0x79/0x280
[ 2252.359358]  [<ffffffff8135d549>] __blkg_release_rcu+0x79/0x280
[ 2252.359358]  [<ffffffff8135d5c0>] ? __blkg_release_rcu+0xf0/0x280
[ 2252.359358]  [<ffffffff81132e52>] rcu_process_callbacks+0x202/0x7d0
[ 2252.359358]  [<ffffffff8107b3c7>] __do_softirq+0x107/0x410
[ 2252.359358]  [<ffffffff8107b8a5>] irq_exit+0xc5/0xd0
[ 2252.359358]  [<ffffffff8173dd85>] smp_apic_timer_interrupt+0x45/0x60
[ 2252.359358]  [<ffffffff8173c6f2>] apic_timer_interrupt+0x72/0x80
[ 2252.359358]  <EOI> 
[ 2252.359358]  [<ffffffff81732598>] ? retint_swapgs+0x13/0x1b
[ 2252.359358] Code: 85 c0 8b 05 ef b6 bb 00 41 0f 45 d8 85 c0 0f 84 0b 01 00 00 8b 05 55 12 ff 00 49 89 fe 41 89 f7 41 89 d3 85 c0 0f 84 0c 01 00 00 <49> 8b 06 ba 01 00 00 00 48 3d 60 bf 13 82 0f 44 da 41 83 ff 01 
[ 2252.410538] RIP  [<ffffffff810e9754>] __lock_acquire+0x54/0x1b20
[ 2252.410538]  RSP <ffff88011b203d20>
[ 2252.410538] ---[ end trace 125adbb6be183141 ]---
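When reading a trace like the one above, it helps to separate the frames the unwinder trusts from the ones prefixed with `?`, which are addresses merely found on the stack. The following is a minimal Python sketch of that filtering, not part of the original report. The names `FRAME_RE` and `parse_frames` are hypothetical, and the sample is an excerpt of the trace above.

```python
import re

# Matches call-trace lines such as
#   [ 2252.359358]  [<ffffffff810eba12>] lock_acquire+0xa2/0x1f0
#   [ 2252.359358]  [<ffffffff8135d549>] ? __blkg_release_rcu+0x79/0x280
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(oops_text):
    """Return (symbol, offset, reliable) for each call-trace line."""
    frames = []
    for line in oops_text.splitlines():
        if "RIP" in line:  # the RIP line is the fault site, not a trace frame
            continue
        m = FRAME_RE.search(line)
        if m:
            reliable = m.group("guess") is None  # no '?' prefix
            frames.append((m.group("sym"), int(m.group("off"), 16), reliable))
    return frames


SAMPLE = """\
[ 2252.359358]  [<ffffffff810b752d>] ? sched_clock_local+0x1d/0x80
[ 2252.359358]  [<ffffffff810eba12>] lock_acquire+0xa2/0x1f0
[ 2252.359358]  [<ffffffff8135d549>] ? __blkg_release_rcu+0x79/0x280
[ 2252.359358]  [<ffffffff81731772>] _raw_spin_lock_irq+0x52/0x90
[ 2252.359358]  [<ffffffff8135d549>] __blkg_release_rcu+0x79/0x280
[ 2252.359358]  [<ffffffff81132e52>] rcu_process_callbacks+0x202/0x7d0
[ 2252.410538] RIP  [<ffffffff810e9754>] __lock_acquire+0x54/0x1b20
"""

frames = parse_frames(SAMPLE)
reliable = [sym for sym, _off, ok in frames if ok]
print(" <- ".join(reliable))
```

On this excerpt the reliable frames read, innermost first, as lock_acquire, _raw_spin_lock_irq, __blkg_release_rcu, rcu_process_callbacks: the lock is taken from an RCU callback running in softirq context after the timer interrupt.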
Comment 4 Attila Fazekas 2013-11-25 03:04:37 EST
The host CPU is an Ivy Bridge, "Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz".
(family 6, model 58, stepping 9, microcode 0x19)

qemu does not support PEBS emulation, so the guest's kernel log shows:

'perf_event_intel: PEBS disabled due to CPU errata, please upgrade microcode'

On the host system the microcode is up to date.

Last time I used the 'debug' kernel only by accident; I have not yet seen the issue with a non-debug kernel.
Comment 5 Marcelo Tosatti 2013-12-19 14:41:52 EST


        /*
         * Lockdep should run with IRQs disabled, otherwise we could
         * get an interrupt which would want to take locks, which would
         * end up in lockdep and have you got a head-ache already?
         */
        if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
                return 0;

        if (lock->key == &__lockdep_no_validate__)
                check = 1;

   0xffffffff810e9754 <+84>:    mov    (%r14),%rax      <--- OOPS
   0xffffffff810e9757 <+87>:    mov    $0x1,%edx
   0xffffffff810e975c <+92>:    cmp    $0xffffffff8213bf60,%rax

ffffffff8213bf60 B __lockdep_no_validate__
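The disassembly above can be cross-checked against the `Code:` line of the oops in comment 3, where the byte in angle brackets marks the instruction at RIP. The Python sketch below hand-decodes just the two encodings involved (a REX-prefixed `8b` load and a `3d` compare with a 32-bit immediate). It is an illustration with hypothetical helper names, not a general x86 decoder.

```python
# Tail of the "Code:" line from the oops; <49> marks the byte at RIP.
CODE_TAIL = "Code: 85 c0 0f 84 0c 01 00 00 <49> 8b 06 ba 01 00 00 00 48 3d 60 bf 13 82"

REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + [
    f"r{n}" for n in range(8, 16)
]


def bytes_from_fault(code_line):
    """Return the instruction bytes starting at the <xx> marker."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_load(insn):
    """Decode REX + 8b + ModRM for a plain register-indirect 64-bit load."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex & 0xF0 != 0x40 or opcode != 0x8B:
        raise ValueError("not a REX-prefixed mov r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm in (4, 5):
        raise ValueError("SIB or displacement forms are not handled here")
    reg |= (rex & 0x4) << 1  # REX.R extends the destination register
    rm |= (rex & 0x1) << 3   # REX.B extends the base register
    return REGS[reg], REGS[rm]


fault = bytes_from_fault(CODE_TAIL)
dst, base = decode_mov_load(fault)

# cmp $imm32,%rax is REX.W (48) + 3d + little-endian imm32, sign-extended to 64 bits.
if fault[8:10] != [0x48, 0x3D]:
    raise ValueError("expected cmp $imm32,%rax after the 5-byte mov $0x1,%edx")
imm = int.from_bytes(bytes(fault[10:14]), "little", signed=True) & 0xFFFFFFFFFFFFFFFF

print(f"mov (%{base}),%{dst}")
print(f"cmp ${imm:#x},%rax")
```

Both decoded instructions agree with the listing: the faulting load dereferences r14, and the compared immediate is the address of __lockdep_no_validate__.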

R14: 6b6b6b6b6b6b6b83 
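
One reading of this R14 value: 0x6b is POISON_FREE, the byte that slab debugging writes over freed objects, so the lock pointer looks like it was computed from already-freed memory. The Python sketch below checks that interpretation arithmetically. The function name, the 0x1000 cutoff, and the assumption of 48-bit virtual addresses are illustrative choices, not taken from the report.

```python
POISON_FREE = 0x6B  # include/linux/poison.h: fill byte for freed slab objects


def classify_pointer(value):
    """Return (looks_poisoned, offset_from_poison, is_canonical)."""
    pattern = int.from_bytes(bytes([POISON_FREE]) * 8, "little")
    offset = value - pattern
    # Assumption: a small positive offset means "field inside a poisoned object".
    looks_poisoned = 0 <= offset < 0x1000
    # Assumption: 48-bit virtual addresses, so bits 63..47 must be all 0 or all 1.
    is_canonical = (value >> 47) in (0, 0x1FFFF)
    return looks_poisoned, offset, is_canonical


r14 = 0x6B6B6B6B6B6B6B83
looks_poisoned, offset, is_canonical = classify_pointer(r14)
print(f"poisoned={looks_poisoned} offset={offset:#x} canonical={is_canonical}")
```

Under these assumptions R14 is the poison pattern plus 0x18, and the address is non-canonical, which is consistent with the oops being reported as a general protection fault rather than a page fault.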

Kernel packages at 

http://kojipkgs.fedoraproject.org//packages/kernel/3.11.3/201.fc19/x86_64/kernel-debug-debuginfo-3.11.3-201.fc19.x86_64.rpm
Comment 6 Justin M. Forbes 2014-01-03 17:08:32 EST
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.
Comment 7 Justin M. Forbes 2014-03-10 10:40:05 EDT
*********** MASS BUG UPDATE **************

This bug has been in a needinfo state for more than 1 month and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 19, please feel free to reopen the bug and provide the additional information requested.
