Description of problem: Using openvswitch 1.7.1 & Ovirt 3.2 nightly: create bridge ovirtmgmt add port p2p1 to bridge ovirtmgmt restart system soft lockup Additional info: BUG: soft lockup - CPU#0 stuck for 23s! [sh:1268] Modules linked in: dm_service_time ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables rfcomm bnep xfs vfat fat vhost_net tun macvtap macvlan snd_hda_codec_hdmi snd_hda_codec_realtek arc4 snd_hda_intel snd_hda_codec snd_hwdep uvcvideo ath9k snd_seq videobuf2_vmalloc videobuf2_memops snd_seq_device videobuf2_core ath9k_common ath9k_hw snd_pcm videodev btusb iTCO_wdt coretemp ath dell_wmi media dell_laptop sparse_keymap kvm_intel iTCO_vendor_support dcdbas bluetooth mac80211 kvm snd_page_alloc snd_timer cfg80211 snd mei lpc_ich rfkill soundcore microcode i2c_i801 mfd_core uinput nouveau mxm_wmi i915 i2c_algo_bit ttm drm_kms_helper crc32c_intel sdhci_pci ghash_clmulni_intel drm sdhci r8169 mmc_core mii i2c_core wmi video dm_multipath sunrpc CPU 0 Pid: 1268, comm: sh Not tainted 3.7.2-204.fc18.x86_64 #1 Dell Inc. XPS L521X/029M77 RIP: 0010:[<ffffffff8106e397>] [<ffffffff8106e397>] lock_timer_base.isra.37+0x27/0x70 RSP: 0018:ffff88024cfc5cd8 EFLAGS: 00000246 RAX: ffff88024cfc5fd8 RBX: 0000000000000000 RCX: 0000000000000080 RDX: 0000000000000000 RSI: ffff88024cfc5d10 RDI: ffff88024c5f94e8 RBP: ffff88024cfc5cf8 R08: ffff88025f216a40 R09: 00000000ffffffff R10: 0000000000016a40 R11: ffffffff812dd0e2 R12: ffff88025f5edb08 R13: 0000000000000000 R14: ffffffff00000000 R15: 0000000000000000 FS: 00007f7e580b2740(0000) GS:ffff88025f200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000038c8a74100 CR3: 000000024b86e000 CR4: 00000000001407f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process sh (pid: 1268, threadinfo ffff88024cfc4000, task ffff88024cbd4560) Stack: ffff88024c5f94d0 ffff88024c5f9400 ffff88024ce807c8 ffff88024c5f9520 ffff88024cfc5d28 ffffffff8106eca0 ffff88024cfc5e18 ffffffff8113901d ffff88024c5f94d0 ffff88024c5f9400 ffff88024cfc5d48 ffffffff8106ed4a Call Trace: [<ffffffff8106eca0>] try_to_del_timer_sync+0x20/0x70 [<ffffffff8113901d>] ? __alloc_pages_nodemask+0x18d/0x990 [<ffffffff8106ed4a>] del_timer_sync+0x5a/0x70 [<ffffffff812c145e>] ? elevator_alloc.isra.7+0x8e/0x100 [<ffffffff812e23b6>] cfq_exit_queue+0x36/0xf0 [<ffffffff812c11b8>] elevator_exit+0x38/0x60 [<ffffffff812c1a78>] elevator_change+0x148/0x230 [<ffffffff812c265b>] elv_iosched_store+0x2b/0x60 [<ffffffff812ca7f4>] queue_attr_store+0x64/0xc0 [<ffffffff81207558>] sysfs_write_file+0xd8/0x150 [<ffffffff811950fc>] vfs_write+0xac/0x180 [<ffffffff81195442>] sys_write+0x52/0xa0 [<ffffffff8163989e>] ? do_page_fault+0xe/0x10 [<ffffffff8163ded9>] system_call_fastpath+0x16/0x1b Code: 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 4c 89 6d f0 4c 89 75 f8 49 89 fd 48 89 5d e0 4c 89 65 e8 49 89 f6 49 8b 5d 00 <49> 89 dc 49 83 e4 fc 74 31 4c 89 e7 e8 f8 74 5c 00 49 89 06 49
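For reference, the reproduction steps above map roughly onto the following Open vSwitch commands (a sketch only; the port name p2p1 comes from the report and will differ on other hardware):

# rough sketch of the reported reproduction steps
ovs-vsctl add-br ovirtmgmt
ovs-vsctl add-port ovirtmgmt p2p1
reboot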
Boot hangs with various kernels: 3.7.5-201.fc18.x86_64 and older ones also. The boot process can stop during plymouth or after X starts.
Package: kernel
OS Release: Fedora release 18 (Spherical Cow)
Seems to be fixed with kernel 3.7.8-202.fc18.x86_64.
I plugged a USB key into the USB 3.0 port.
Package: kernel
OS Release: Fedora release 18 (Spherical Cow)
*** Bug 923840 has been marked as a duplicate of this bug. ***
I experience it in kernel-3.8.3-203.fc18.x86_64
I hit a similar problem today, while copying 500GB of data with scp. The bug appeared on the VM receiving the data. Strangely (for me!) the process didn't crash and the copy still continues. (I'll check the files once the copy ends) I run kernel 3.8.5-201.fc18.x86_64 on both host and vm. I run a single VM with 20 CPU, the host has 4. Here is the trace: [56008.198033] BUG: soft lockup - CPU#1 stuck for 29s! [scp:6597] [56008.198033] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables virtio_net i2c_piix4 i2c_core virtio_balloon microcode crc32c_intel ghash_clmulni_intel virtio_blk [56008.198033] CPU 1 [56008.198033] Pid: 6597, comm: scp Not tainted 3.8.5-201.fc18.x86_64 #1 Bochs Bochs [56008.198033] RIP: 0010:[<ffffffff813019e0>] [<ffffffff813019e0>] copy_user_generic_string+0x30/0x40 [56008.198033] RSP: 0018:ffff8804ff641b58 EFLAGS: 00010246 [56008.198033] RAX: 00007f68c3c51680 RBX: ffff8804fe34b500 RCX: 0000000000000200 [56008.198033] RDX: 0000000000000000 RSI: 00007f68c3c51680 RDI: ffff880062600000 [56008.198033] RBP: ffff8804ff641b68 R08: ffff880062600000 R09: 00000000fffff000 [56008.198033] R10: ffffea0001898000 R11: 0000000000000000 R12: 000000001a29c000 [56008.198033] R13: ffff8804d93c6aa0 R14: 0000000000000004 R15: ffff8804ff641ad8 [56008.198033] FS: 00007f68c1a30800(0000) GS:ffff88050a640000(0000) knlGS:0000000000000000 [56008.198033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [56008.198033] CR2: 00007f1d0d2e8000 CR3: 00000004fe287000 CR4: 00000000000407e0 [56008.198033] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [56008.198033] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [56008.198033] Process scp (pid: 6597, threadinfo ffff8804ff640000, task ffff8804fdc1dd80) [56008.198033] Stack: [56008.198033] ffffffff81131b9a 000000001a29c000 ffff8804ff641c38 ffffffff81131d98 [56008.198033] ffff8804ff641be0 ffffffff811c627f ffff8804ff641e58 000000001a29c000 [56008.198033] 0000000000001000 ffff8804fdc1dd80 ffff8804ff641fd8 00000000811b6591 [56008.198033] Call Trace: [56008.198033] [<ffffffff81131b9a>] ? iov_iter_copy_from_user_atomic+0x6a/0xa0 [56008.198033] [<ffffffff81131d98>] generic_file_buffered_write+0x148/0x280 [56008.198033] [<ffffffff811c627f>] ? __mark_inode_dirty+0x3f/0x200 [56008.198033] [<ffffffff81133da1>] __generic_file_aio_write+0x1d1/0x3d0 [56008.198033] [<ffffffff8113401f>] generic_file_aio_write+0x7f/0x100 [56008.198033] [<ffffffff8121c509>] ext4_file_write+0xa9/0x430 [56008.198033] [<ffffffff8119d347>] do_sync_write+0xa7/0xe0 [56008.198033] [<ffffffff8119d9fc>] vfs_write+0xac/0x180 [56008.198033] [<ffffffff8119dd42>] sys_write+0x52/0xa0 [56008.198033] [<ffffffff810df1ac>] ? __audit_syscall_exit+0x20c/0x2c0 [56008.198033] [<ffffffff81658c19>] system_call_fastpath+0x16/0x1b [56008.198033] Code: 74 30 83 fa 08 72 27 89 f9 83 e1 07 74 15 83 e9 08 f7 d9 29 ca 8a 06 88 07 48 ff c6 48 ff c7 ff c9 75 f2 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 31 c0 66 66 90 c3 0f 1f 00 66 66 90 21 d2 [56496.033889] [sched_delayed] sched: RT throttling activated
Copying/pasting some discussion which is happening in a mail thread. There's not much to go on, but my first guess would be someone who plays in the timer code. Not sure why a _try_ function would hang! -Jeff
Yep. Looks like lock_timer_base() is hung for a long time. Maybe it is spinning on spin_lock_irqsave(&base->lock, *flags);

So Tom, it looks like it might not even be a CFQ issue but a timer-related thing. Somebody who knows the timer code will be a better person to look at it.

Thanks
Vivek
Nope. But this isn't likely a timer_base issue IMO.

The code path to get here is (AFAICT)

cfq_exit_queue()

which attempts to shut down the timer by calling

cfq_shutdown_timer_wq(cfqd)

which does

del_timer_sync(&cfqd->idle_slice_timer);

del_timer_sync() waits for the timer to become idle, that is, that the timer is no longer executing its function and is in a "sleep" state, so in theory it is safe to kill.

cfqd->idle_slice_timer's function is cfq_idle_slice_timer(), which appears to call spin_lock_irqsave(cfqd->queue->queue_lock, flags). That's the lock I would look at. If the process has acquired this lock, then the next function I would look at is cfq_clear_cfqq_deep(), which is called later on in cfq_idle_slice_timer().

IMO.

... I can't remember if RHEL7/upstream has /proc/sys/kernel/softlockup_panic or not. It would be useful to flip that on and get a vmcore.

Prarit
(In reply to comment #9)
> Nope. But this isn't likely a timer_base issue IMO.
>
> The code path to get here is (AFAICT)
>
> cfq_exit_queue()
>
> which attempts to shutdown the timer by calling
>
> cfq_shutdown_timer_wq(cfqd)
>
> which does
>
> del_timer_sync(&cfqd->idle_slice_timer);
>
> del_timer_sync() waits for the timer to become idle, that is that the timer
> is no longer executing it's function and is an a "sleep" state so in theory
> it is safe to kill.

True. But the backtrace also shows lock_timer_base() as the top-level function.

[<ffffffff8106e397>] lock_timer_base.isra.37+0x27/0x70

And the wait for the handler to finish happens outside of that. Maybe it is just by chance that we are waiting for the idle_slice_timer handler to finish in a loop, calling lock_timer_base() in a loop, and the soft lockup detector caught it at the wrong time. So I am not sure. But it opens two possibilities.

- The idle slice handler is taking too long to finish
- Something is wrong with lock_timer_base()

> cfqd->idle_slice_timer's function is cfq_idle_slice_timer() which appears to
> call spin_lock_irqsave(cfqd->queue->queue_lock, flags). That's the lock I
> would look at. If the process has acquired this lock, then the next function
> I would look at is cfq_clear_cfqq_deep() which is called later on in
> cfq_idle_slice_timer().

In the cfq_exit_queue() path we deliberately don't take the queue lock, so that if the handler is running somewhere, it can take the queue lock and finish. And by now we have flushed the queue, so there should not be any requests left for CFQ to process. So it is strange why the handler would take such a long time to finish.

> IMO.
>
> ... I can't remember if RHEL7/upstream has /proc/sys/kernel/softlockup_panic
> or not. It would be useful to flip that on and get a vmcore.

I checked that softlockup_panic is present in F18. Reporter of the bz, can you please set that and capture the vmcore? Also, are there easy steps to reproduce this problem?

I have regular F18 installed in a kvm virtual machine. No ovirt, openvswitch stuff. I am able to change the scheduler.

Vivek
Just upgraded kernel to version 3.9.2-200.fc18.x86_64 problem persists...
(In reply to Ohad Basan from comment #11) > Just upgraded kernel to version 3.9.2-200.fc18.x86_64 > problem persists... Confirmed. Problem persists. Is there an older kernel where this bug doesn't occur?
This bug triggers for me on Fedora 18 ovirt nodes when I tried to add an iSCSI domain.
(In reply to Dax Kelson from comment #13)
> This bug triggers for me on Fedora 18 ovirt nodes when I tried to add an
> iSCSI domain.

Any chance you can capture the kernel core dump? You need to set softlockup_panic to make sure the softlockup triggers a panic. Of course, you also need to start the kdump service.
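For anyone reproducing this, a minimal sketch of those two steps (assuming kexec-tools/kdump is already installed and configured):

# make a soft lockup panic the kernel so kdump can capture a vmcore
sysctl -w kernel.softlockup_panic=1
# make sure the kdump service is running now and on the next boot
systemctl start kdump.service
systemctl enable kdump.service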
(In reply to Vivek Goyal from comment #14) > (In reply to Dax Kelson from comment #13) > > This bug triggers for me on Fedora 18 ovirt nodes when I tried to add an > > iSCSI domain. > > Any change you can capture the kernel core dump. You need to set > softlockup_panic to make sure softlockup triggers panic. Of course also need > to start kdump service. KERNEL: /usr/lib/debug/lib/modules/3.9.2-200.fc18.x86_64/vmlinux DUMPFILE: /var/crash/127.0.0.1-2013.05.23-23:53:44/vmcore [PARTIAL DUMP] CPUS: 4 DATE: Thu May 23 17:53:43 2013 UPTIME: 00:07:24 LOAD AVERAGE: 2.50, 0.70, 0.30 TASKS: 198 NODENAME: station6.example.com RELEASE: 3.9.2-200.fc18.x86_64 VERSION: #1 SMP Mon May 13 13:59:47 UTC 2013 MACHINE: x86_64 (3092 Mhz) MEMORY: 9.9 GB PANIC: "Kernel panic - not syncing: softlockup: hung tasks" PID: 1730 COMMAND: "sh" TASK: ffff88029fffc650 [THREAD_INFO: ffff88029fc1c000] CPU: 0 STATE: TASK_RUNNING (PANIC) crash> bt PID: 1730 TASK: ffff88029fffc650 CPU: 0 COMMAND: "sh" #0 [ffff8802afa03ce0] machine_kexec at ffffffff8103f79b #1 [ffff8802afa03d50] crash_kexec at ffffffff810c95c8 #2 [ffff8802afa03e20] panic at ffffffff81656f57 #3 [ffff8802afa03ea0] watchdog_timer_fn at ffffffff810eecd5 #4 [ffff8802afa03ed0] __run_hrtimer at ffffffff81086263 #5 [ffff8802afa03f20] hrtimer_interrupt at ffffffff81086b87 #6 [ffff8802afa03f90] smp_apic_timer_interrupt at ffffffff8166b789 #7 [ffff8802afa03fb0] apic_timer_interrupt at ffffffff8166a65d --- <IRQ stack> --- #8 [ffff88029fc1dc28] apic_timer_interrupt at ffffffff8166a65d [exception RIP: lock_timer_base.isra.37+35] RIP: ffffffff8106ecf3 RSP: ffff88029fc1dcd8 RFLAGS: 00000246 RAX: ffff88029fc1dfd8 RBX: 0000000000000000 RCX: 0000000000000080 RDX: 0000000000000000 RSI: ffff88029fc1dd10 RDI: ffff88029f842ce8 RBP: ffff88029fc1dcf8 R8: ffff8802afa16e00 R9: 00000000ffffffff R10: 0000000000016e00 R11: ffffffff812f6332 R12: 0000000000000000 R13: ffff880200000041 R14: ffff8802afde3b08 R15: 0000000000000002 ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 #9 [ffff88029fc1dd00] try_to_del_timer_sync at ffffffff8106f600 #10 [ffff88029fc1dd30] del_timer_sync at ffffffff8106f6aa #11 [ffff88029fc1dd50] cfq_exit_queue at ffffffff812fb6a6 #12 [ffff88029fc1dd90] elevator_exit at ffffffff812d6f18 #13 [ffff88029fc1ddb0] elevator_change at ffffffff812d7788 #14 [ffff88029fc1de20] elv_iosched_store at ffffffff812d838b #15 [ffff88029fc1de50] queue_attr_store at ffffffff812e0d24 #16 [ffff88029fc1dea0] sysfs_write_file at ffffffff81214768 #17 [ffff88029fc1df00] vfs_write at ffffffff8119fd5c #18 [ffff88029fc1df30] sys_write at ffffffff811a01f2 #19 [ffff88029fc1df80] system_call_fastpath at ffffffff81669a59 RIP: 00007fba5b287740 RSP: 00007fffe8217e78 RFLAGS: 00010206 RAX: 0000000000000001 RBX: ffffffff81669a59 RCX: 0000000000000008 RDX: 0000000000000009 RSI: 00007fba5bba5000 RDI: 0000000000000001 RBP: 00007fba5bba5000 R8: 000000000000000a R9: 00007fba5bb9b740 R10: 00007fffe8217c10 R11: 0000000000000246 R12: 0000000000000001 R13: 0000000000000009 R14: 00007fba5b554280 R15: 0000000000000009 ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b crash>
(In reply to Vivek Goyal from comment #14) > (In reply to Dax Kelson from comment #13) > > This bug triggers for me on Fedora 18 ovirt nodes when I tried to add an > > iSCSI domain. > > Any change you can capture the kernel core dump. You need to set > softlockup_panic to make sure softlockup triggers panic. Of course also need > to start kdump service. http://go.gurulabs.com/vmcore
Thanks for the vmcore. Did "bt -a" and rest of the 3 cpus seem to be idling. So it does not look like the case of contention on spinlock. crash> bt -a PID: 1730 TASK: ffff88029fffc650 CPU: 0 COMMAND: "sh" #0 [ffff8802afa03ce0] machine_kexec at ffffffff8103f79b #1 [ffff8802afa03d50] crash_kexec at ffffffff810c95c8 #2 [ffff8802afa03e20] panic at ffffffff81656f57 #3 [ffff8802afa03ea0] watchdog_timer_fn at ffffffff810eecd5 #4 [ffff8802afa03ed0] __run_hrtimer at ffffffff81086263 #5 [ffff8802afa03f20] hrtimer_interrupt at ffffffff81086b87 #6 [ffff8802afa03f90] smp_apic_timer_interrupt at ffffffff8166b789 #7 [ffff8802afa03fb0] apic_timer_interrupt at ffffffff8166a65d --- <IRQ stack> --- #8 [ffff88029fc1dc28] apic_timer_interrupt at ffffffff8166a65d [exception RIP: lock_timer_base.isra.37+35] RIP: ffffffff8106ecf3 RSP: ffff88029fc1dcd8 RFLAGS: 00000246 RAX: ffff88029fc1dfd8 RBX: 0000000000000000 RCX: 0000000000000080 RDX: 0000000000000000 RSI: ffff88029fc1dd10 RDI: ffff88029f842ce8 RBP: ffff88029fc1dcf8 R8: ffff8802afa16e00 R9: 00000000ffffffff R10: 0000000000016e00 R11: ffffffff812f6332 R12: 0000000000000000 R13: ffff880200000041 R14: ffff8802afde3b08 R15: 0000000000000002 ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 #9 [ffff88029fc1dd00] try_to_del_timer_sync at ffffffff8106f600 #10 [ffff88029fc1dd30] del_timer_sync at ffffffff8106f6aa #11 [ffff88029fc1dd50] cfq_exit_queue at ffffffff812fb6a6 #12 [ffff88029fc1dd90] elevator_exit at ffffffff812d6f18 #13 [ffff88029fc1ddb0] elevator_change at ffffffff812d7788 #14 [ffff88029fc1de20] elv_iosched_store at ffffffff812d838b #15 [ffff88029fc1de50] queue_attr_store at ffffffff812e0d24 #16 [ffff88029fc1dea0] sysfs_write_file at ffffffff81214768 #17 [ffff88029fc1df00] vfs_write at ffffffff8119fd5c #18 [ffff88029fc1df30] sys_write at ffffffff811a01f2 #19 [ffff88029fc1df80] system_call_fastpath at ffffffff81669a59 RIP: 00007fba5b287740 RSP: 00007fffe8217e78 RFLAGS: 00010206 RAX: 0000000000000001 RBX: ffffffff81669a59 RCX: 0000000000000008 RDX: 0000000000000009 RSI: 00007fba5bba5000 RDI: 0000000000000001 RBP: 00007fba5bba5000 R8: 000000000000000a R9: 00007fba5bb9b740 R10: 00007fffe8217c10 R11: 0000000000000246 R12: 0000000000000001 R13: 0000000000000009 R14: 00007fba5b554280 R15: 0000000000000009 ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b PID: 0 TASK: ffff8802a0cd1770 CPU: 1 COMMAND: "swapper/1" #0 [ffff8802afa87e60] crash_nmi_callback at ffffffff81037352 #1 [ffff8802afa87e70] nmi_handle.isra.0 at ffffffff816626f9 #2 [ffff8802afa87ec0] do_nmi at ffffffff81662810 #3 [ffff8802afa87ef0] end_repeat_nmi at ffffffff81661d01 [exception RIP: intel_idle+168] RIP: ffffffff81365aa8 RSP: ffff8802a0ccde08 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffff8802a0ccde08 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81365aa8 R8: ffffffff81365aa8 R9: 0000000000000018 R10: ffff8802a0ccde08 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000000000001 R14: ffff8802a0ccdfd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #4 [ffff8802a0ccde08] intel_idle at ffffffff81365aa8 #5 [ffff8802a0ccde40] cpuidle_enter at ffffffff815076d9 #6 [ffff8802a0ccde50] cpuidle_wrap_enter at ffffffff81508011 #7 [ffff8802a0ccdeb0] cpuidle_enter_tk at ffffffff81508080 #8 [ffff8802a0ccdec0] cpuidle_idle_call at ffffffff81507d45 #9 [ffff8802a0ccdf00] cpu_idle at ffffffff8101d12f PID: 0 TASK: ffff8802a0cd2ee0 CPU: 2 COMMAND: "swapper/2" #0 [ffff8802afb07e60] crash_nmi_callback at ffffffff81037352 #1 
[ffff8802afb07e70] nmi_handle.isra.0 at ffffffff816626f9 #2 [ffff8802afb07ec0] do_nmi at ffffffff81662810 #3 [ffff8802afb07ef0] end_repeat_nmi at ffffffff81661d01 [exception RIP: intel_idle+168] RIP: ffffffff81365aa8 RSP: ffff8802a0ccfe08 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffff8802a0ccfe08 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81365aa8 R8: ffffffff81365aa8 R9: 0000000000000018 R10: ffff8802a0ccfe08 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000000000002 R14: ffff8802a0ccffd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #4 [ffff8802a0ccfe08] intel_idle at ffffffff81365aa8 #5 [ffff8802a0ccfe40] cpuidle_enter at ffffffff815076d9 #6 [ffff8802a0ccfe50] cpuidle_wrap_enter at ffffffff81508011 #7 [ffff8802a0ccfeb0] cpuidle_enter_tk at ffffffff81508080 #8 [ffff8802a0ccfec0] cpuidle_idle_call at ffffffff81507d45 #9 [ffff8802a0ccff00] cpu_idle at ffffffff8101d12f PID: 0 TASK: ffff8802a0cd4650 CPU: 3 COMMAND: "swapper/3" #0 [ffff8802afb87e60] crash_nmi_callback at ffffffff81037352 #1 [ffff8802afb87e70] nmi_handle.isra.0 at ffffffff816626f9 #2 [ffff8802afb87ec0] do_nmi at ffffffff81662810 #3 [ffff8802afb87ef0] end_repeat_nmi at ffffffff81661d01 [exception RIP: intel_idle+168] RIP: ffffffff81365aa8 RSP: ffff8802a0cd9e08 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffff8802a0cd9e08 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81365aa8 R8: ffffffff81365aa8 R9: 0000000000000018 R10: ffff8802a0cd9e08 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000000000003 R14: ffff8802a0cd9fd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #4 [ffff8802a0cd9e08] intel_idle at ffffffff81365aa8 #5 [ffff8802a0cd9e40] cpuidle_enter at ffffffff815076d9 #6 [ffff8802a0cd9e50] cpuidle_wrap_enter at ffffffff81508011 #7 [ffff8802a0cd9eb0] cpuidle_enter_tk at ffffffff81508080 #8 [ffff8802a0cd9ec0] cpuidle_idle_call at ffffffff81507d45 #9 [ffff8802a0cd9f00] cpu_idle at ffffffff8101d12f
I suspect that the timer->base field is null and that's why we are looping in lock_timer_base().

static struct tvec_base *lock_timer_base(struct timer_list *timer,
					 unsigned long *flags)
	__acquires(timer->base->lock)
{
	struct tvec_base *base;

	for (;;) {
		struct tvec_base *prelock_base = timer->base;
		base = tbase_get_base(prelock_base);
		if (likely(base != NULL)) {
			spin_lock_irqsave(&base->lock, *flags);
			if (likely(prelock_base == timer->base))
				return base;
			/* The timer has migrated to another CPU */
			spin_unlock_irqrestore(&base->lock, *flags);
		}
		cpu_relax();
	}
}
Following is the disassembly of the code around the RIP (ffffffff8106ecf3) at the exception time.

crash> disassemble /m 0xffffffff8106ecf3
Dump of assembler code for function lock_timer_base:
700	static struct tvec_base *lock_timer_base(struct timer_list *timer,
   0xffffffff8106ecd0 <+0>:	data32 data32 data32 xchg %ax,%ax
   0xffffffff8106ecd5 <+5>:	push   %rbp
   0xffffffff8106ecd6 <+6>:	mov    %rsp,%rbp
   0xffffffff8106ecd9 <+9>:	sub    $0x20,%rsp
   0xffffffff8106ecdd <+13>:	mov    %r13,-0x10(%rbp)
   0xffffffff8106ece1 <+17>:	mov    %r14,-0x8(%rbp)
   0xffffffff8106ece5 <+21>:	mov    %rdi,%r13
   0xffffffff8106ece8 <+24>:	mov    %rbx,-0x20(%rbp)
   0xffffffff8106ecec <+28>:	mov    %r12,-0x18(%rbp)
   0xffffffff8106ecf0 <+32>:	mov    %rsi,%r14

701		unsigned long *flags)
702	__acquires(timer->base->lock)
703	{
704		struct tvec_base *base;
705
706		for (;;) {
707			struct tvec_base *prelock_base = timer->base;
   0xffffffff8106ecf3 <+35>:	mov    0x0(%r13),%rbx

708			base = tbase_get_base(prelock_base);
709			if (likely(base != NULL)) {
   0xffffffff8106ecf7 <+39>:	mov    %rbx,%r12
   0xffffffff8106ecfa <+42>:	and    $0xfffffffffffffffc,%r12
   0xffffffff8106ecfe <+46>:	je     0xffffffff8106ed31 <lock_timer_base+97>

710				spin_lock_irqsave(&base->lock, *flags);
   0xffffffff8106ed00 <+48>:	mov    %r12,%rdi
   0xffffffff8106ed03 <+51>:	callq  0xffffffff81661340 <_raw_spin_lock_irqsave>
   0xffffffff8106ed08 <+56>:	mov    %rax,(%r14)

711				if (likely(prelock_base == timer->base))
   0xffffffff8106ed0b <+59>:	cmp    0x0(%r13),%rbx
   0xffffffff8106ed0f <+63>:	jne    0xffffffff8106ed26 <lock_timer_base+86>

712					return base;
713				/* The timer has migrated to another CPU */
714				spin_unlock_irqrestore(&base->lock, *flags);
715			}
716			cpu_relax();
717		}
718	}
   0xffffffff8106ed11 <+65>:	mov    %r12,%rax

I am not an assembly code expert, but I suspect that R13 contains the pointer to the timer. Look at line 707 (the RIP) and we do the following.

707			struct tvec_base *prelock_base = timer->base;
   0xffffffff8106ecf3 <+35>:	mov    0x0(%r13),%rbx

The backtrace says RBX is 0000000000000000, and RBX should contain the value of timer->base. So I suspect that timer->base is null and that's why we are looping.

One thing I don't understand is that if R13 is the pointer to the timer, then why is the offset of the base field 0? It is the 3rd field in the timer_list structure. Maybe R13 contains a pointer to timer->base instead of timer?

R13 is ffff880200000041 and I tried to access the contents at this memory location, but it is not part of the vmcore. Strange, even filtering should not have removed kernel memory buffers.
The bug does appear to me in 3.6.10-4 since I upgraded that the problem showed up
The bug does not appear to me in 3.6.10-4 since I upgraded that the problem showed up
Can we try a simple patch like the following? This is just to verify that timer->base is null for whatever reason and that's why we are spinning in the for loop.

 kernel/timer.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6/kernel/timer.c
===================================================================
--- linux-2.6.orig/kernel/timer.c	2013-05-03 22:05:19.000000000 -0400
+++ linux-2.6/kernel/timer.c	2013-05-28 23:43:38.142900050 -0400
@@ -713,6 +713,9 @@ static struct tvec_base *lock_timer_base
 				return base;
 			/* The timer has migrated to another CPU */
 			spin_unlock_irqrestore(&base->lock, *flags);
+		} else {
+			printk("timer->base is null. timer=%p timer->base=%p."
+			       "Will retry.\n", timer, &timer->base);
 		}
 		cpu_relax();
 	}
(In reply to Ohad Basan from comment #21) > The bug does not appear to me in 3.6.10-4 > since I upgraded that the problem showed up I confirm that we have several ovirt F 18 nodes and have downgraded all the kernels to 3.6.10-4 to have them boot with no hangs. With newer kernel sometimes (I mean most of the times like 10 times on 11 boots) it hangs on boot and sometimes it doesn't. It detects soft lock on one random CPU and then it soft locks ALL the CPU's (8 core and 4 core servers), so the boot never ends and the server never gets accessible. To get passed with kernel higher than 3.6.10, we need to manually reset the server several times (no fixed number of resets unfortunately) till it gets passed. With 3.6.10 it always boots without any problem After booting this way, the node works fine without any anomaly till it is rebooted Booting with 3.6.10 it always boots without any problem Hope this bug gets fixed since it prevents unattended boots of the servers. Thank you and best regards
Sorry for reposting, but I forgot to suggest to raise the severity level of this bug to something like CRITICAL since it doesn't allow nodes to become operational Best regards
(In reply to Alessandro from comment #24)
> Sorry for reposting, but I forgot to suggest to raise the severity level of
> this bug to something like CRITICAL since it doesn't allow nodes to become
> operational
>
> Best regards

That would be a great idea, IMHO. It's really problematic.
(In reply to Vivek Goyal from comment #22) > can we try a simple patch like following. This is just to verify that > timer->base is null for whatever reason and that's why we are spinning in > for loop. I applied the patch to the latest F18 kernel as of today 3.9.4-200 Here is some dmesg output: ... [ 735.571449] scsi6 : iSCSI Initiator over TCP/IP [ 736.578398] scsi 6:0:0:0: Direct-Access LIO-ORG ovirt 4.0 PQ: 0 ANSI: 5 [ 736.578809] sd 6:0:0:0: Attached scsi generic sg2 type 0 [ 736.579566] sd 6:0:0:0: [sdb] 629137408 512-byte logical blocks: (322 GB/299 GiB) [ 736.581363] sd 6:0:0:0: [sdb] Write Protect is off [ 736.581369] sd 6:0:0:0: [sdb] Mode Sense: 43 00 10 08 [ 736.583205] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA [ 736.598929] sdb: unknown partition table [ 736.601265] sd 6:0:0:0: [sdb] Attached SCSI disk [ 736.700529] bio: create slab <bio-2> at 2 [ 736.704973] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.704977] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.704979] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.704980] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [snip many many lines] [ 736.705912] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.705913] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.705914] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.705916] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.705917] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.705918] timer->base is null. timer=ffff88029bdc0cd0 timer->base=ffff88029bdc0ce8.Will retry. [ 736.786968] FS-Cache: Loaded [ 736.800225] Key type dns_resolver registered [ 736.820192] FS-Cache: Netfs 'nfs' registered for caching [ 736.837758] SELinux: initialized (dev 0:34, type nfs), uses genfs_contexts ...
(In reply to Dax Kelson from comment #26)
> (In reply to Vivek Goyal from comment #22)
> > can we try a simple patch like following. This is just to verify that
> > timer->base is null for whatever reason and that's why we are spinning in
> > for loop.
>
> I applied the patch to the latest F18 kernel as of today 3.9.4-200
>
> Here is some dmesg output:

So this time the softlockup does not happen? I was expecting that if timer->base continues to be NULL, then we would see the timer->base null message for a long time and then the softlockup warning. That would confirm that timer->base remains null for a long time, that we are spinning in a loop, and that's why the softlockup warning shows up.

After that, the question will be why timer->base is null for so long.
(In reply to Alessandro from comment #23)
> It detects soft lock on one random CPU and then it soft locks ALL the CPU's
> (8 core and 4 core servers), so the boot never ends and the server never
> gets accessible.
>
> To get passed with kernel higher than 3.6.10, we need to manually reset the
> server several times (no fixed number of resets unfortunately) till it gets
> passed.

Try elevator=deadline on the kernel command line. That might help at least as a workaround for the problem until a solution is found.
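A concrete sketch of that workaround on Fedora (grubby is part of the default install; this only changes the default I/O scheduler, it does not fix the underlying race):

# add elevator=deadline to the kernel command line of all installed kernels
grubby --update-kernel=ALL --args="elevator=deadline"

You can also test it for a single boot by appending elevator=deadline to the kernel line at the GRUB prompt.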
(In reply to Vivek Goyal from comment #27)
> (In reply to Dax Kelson from comment #26)
> > (In reply to Vivek Goyal from comment #22)
> > > can we try a simple patch like following. This is just to verify that
> > > timer->base is null for whatever reason and that's why we are spinning in
> > > for loop.
> >
> > I applied the patch to the latest F18 kernel as of today 3.9.4-200
> >
> > Here is some dmesg output:
>
> So this time softlock does not happen? I was expecting that if timer->base
> continues to be NULL, then we will see timer base null message for long time
> and then we will see softlock warning. That will confirm that timer->base
> remains null for a long time and we are spinning in a loop and that's why
> softlockup warning shows up.
>
> After that question will arise why timer->base is null for so long.

Or is it possible that printk() can force scheduling and hence avoid the soft lockup situation? Try trace_printk() instead, and then we can read the traces from the debugfs tracing files.
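For reference, trace_printk() output goes to the ftrace ring buffer rather than dmesg; assuming debugfs is mounted in the usual place, it can be read with:

# debugfs is normally mounted here already on Fedora
mount -t debugfs nodev /sys/kernel/debug 2>/dev/null
cat /sys/kernel/debug/tracing/trace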
(In reply to Alessandro from comment #23)
> It detects soft lock on one random CPU and then it soft locks ALL the CPU's
> (8 core and 4 core servers), so the boot never ends and the server never
> gets accessible.

So when all cores detect the softlockup, where are the rest of the cores spinning? Are they spinning at the same place? Do you have backtraces of the softlockup warnings on the other CPUs?
(In reply to Vivek Goyal from comment #30)
> So when all core detect softlockup, where are rest of the cores spinning?
> Are they spinning at same place? Do you have backtraces of softlockup
> warnings on other cpus.

I don't have a backtrace since the system never becomes usable, even using a different console: the only resort is to reboot over and over again until it boots. It loops saying CPU#1 was stuck, then #2, #3, #4, and #1 again.

I'll try the proposed kernel parameter on the next reboot. For now I have all the nodes "stuck" on kernel 3.6.

Thank you for answering.
Best regards
(In reply to Vivek Goyal from comment #27) > > So this time softlock does not happen? Yes. These messages occured during bootup. Here is a softlockup that occured with my patched kernel: KERNEL: /usr/lib/debug/lib/modules/3.9.4-200.timerdebug.fc18.x86_64/vmlinux DUMPFILE: /var/tmp/abrt/vmcore-127.0.0.1-2013.06.06-21:20:29/vmcore [PARTIAL DUMP] CPUS: 4 DATE: Thu Jun 6 15:20:27 2013 UPTIME: 00:01:48 LOAD AVERAGE: 5.05, 1.36, 0.47 TASKS: 214 NODENAME: station2.example.com RELEASE: 3.9.4-200.timerdebug.fc18.x86_64 VERSION: #1 SMP Thu Jun 6 12:48:54 MDT 2013 MACHINE: x86_64 (3093 Mhz) MEMORY: 9.9 GB PANIC: "Kernel panic - not syncing: softlockup: hung tasks" PID: 1602 COMMAND: "sh" TASK: ffff88029f38c650 [THREAD_INFO: ffff88029f34c000] CPU: 1 STATE: TASK_RUNNING (PANIC) crash> bt -a PID: 36 TASK: ffff8802a0e8ddc0 CPU: 0 COMMAND: "kworker/0:1" #0 [ffff8802afa07e60] crash_nmi_callback at ffffffff81037342 #1 [ffff8802afa07e70] nmi_handle.isra.0 at ffffffff81662bf9 #2 [ffff8802afa07ec0] do_nmi at ffffffff81662d10 #3 [ffff8802afa07ef0] end_repeat_nmi at ffffffff81662201 [exception RIP: cfb_imageblit+1197] RIP: ffffffff813625ad RSP: ffff8802a08b7a98 RFLAGS: 00000006 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000006 RDX: ffff8802a08b7a98 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffffffff813625ad R8: ffffffff813625ad R9: 0000000000000000 R10: ffff8802a08b7a98 R11: 0000000000000006 R12: ffffffffffffffff R13: 0000000000000001 R14: 0000000000000000 R15: 000000000000020a ORIG_RAX: 000000000000020a CS: 0010 SS: 0000 --- <NMI exception stack> --- #4 [ffff8802a08b7a98] cfb_imageblit at ffffffff813625ad #5 [ffff8802a08b7aa0] cpumask_next_and at ffffffff813011f3 #6 [ffff8802a08b7ac0] update_sd_lb_stats at ffffffff8109d473 #7 [ffff8802a08b7bf0] vsnprintf at ffffffff8130c001 #8 [ffff8802a08b7ce0] vt_console_print at ffffffff813cc483 #9 [ffff8802a08b7d40] call_console_drivers.constprop.18 at ffffffff8105fec9 #10 [ffff8802a08b7d70] console_unlock at ffffffff81061373 #11 [ffff8802a08b7db0] fb_flashcursor at ffffffff81355ca8 #12 [ffff8802a08b7e00] process_one_work at ffffffff8107b7c3 #13 [ffff8802a08b7e50] worker_thread at ffffffff8107d0ef #14 [ffff8802a08b7ec0] kthread at ffffffff81082c90 #15 [ffff8802a08b7f50] ret_from_fork at ffffffff81669eac PID: 1602 TASK: ffff88029f38c650 CPU: 1 COMMAND: "sh" #0 [ffff8802afa83ce0] machine_kexec at ffffffff8103f78b #1 [ffff8802afa83d50] crash_kexec at ffffffff810c9708 #2 [ffff8802afa83e20] panic at ffffffff81657417 #3 [ffff8802afa83ea0] watchdog_timer_fn at ffffffff810eefb5 #4 [ffff8802afa83ed0] __run_hrtimer at ffffffff810862b3 #5 [ffff8802afa83f20] hrtimer_interrupt at ffffffff81086bd7 #6 [ffff8802afa83f90] smp_apic_timer_interrupt at ffffffff8166bc89 #7 [ffff8802afa83fb0] apic_timer_interrupt at ffffffff8166ab5d --- <IRQ stack> --- #8 [ffff88029f34db28] apic_timer_interrupt at ffffffff8166ab5d [exception RIP: vprintk_emit+468] RIP: ffffffff810617e4 RSP: ffff88029f34dbd8 RFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000000e3 RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000246 RBP: ffff88029f34dc58 R8: ffffffff81e97160 R9: ffffffff81eac344 R10: 000000000002ae80 R11: 0000000000040000 R12: 0000000000000096 R13: ffff88029f34db58 R14: 0000000000000000 R15: 0000000000000054 ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 #9 [ffff88029f34dc60] printk at ffffffff81657580 #10 [ffff88029f34dcc0] lock_timer_base at ffffffff8106e6a8 #11 [ffff88029f34dd00] try_to_del_timer_sync at ffffffff8106f641 #12 [ffff88029f34dd30] 
del_timer_sync at ffffffff8106f6ea #13 [ffff88029f34dd50] cfq_exit_queue at ffffffff812fb9d6 #14 [ffff88029f34dd90] elevator_exit at ffffffff812d7248 #15 [ffff88029f34ddb0] elevator_change at ffffffff812d7ab8 #16 [ffff88029f34de20] elv_iosched_store at ffffffff812d86bb #17 [ffff88029f34de50] queue_attr_store at ffffffff812e1054 #18 [ffff88029f34dea0] sysfs_write_file at ffffffff81214a78 #19 [ffff88029f34df00] vfs_write at ffffffff811a005c #20 [ffff88029f34df30] sys_write at ffffffff811a04f2 #21 [ffff88029f34df80] system_call_fastpath at ffffffff81669f59 RIP: 00007f90c59c6740 RSP: 00007fff7eb5fea8 RFLAGS: 00010202 RAX: 0000000000000001 RBX: ffffffff81669f59 RCX: 0000000000000008 RDX: 0000000000000009 RSI: 00007f90c62e4000 RDI: 0000000000000001 RBP: 00007f90c62e4000 R8: 000000000000000a R9: 00007f90c62d8740 R10: 00007fff7eb5fc40 R11: 0000000000000246 R12: 0000000000000001 R13: 0000000000000009 R14: 00007f90c5c93280 R15: 0000000000000009 ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b PID: 402 TASK: ffff88029b259770 CPU: 2 COMMAND: "systemd-journal" #0 [ffff8802afb07e60] crash_nmi_callback at ffffffff81037342 #1 [ffff8802afb07e70] nmi_handle.isra.0 at ffffffff81662bf9 #2 [ffff8802afb07ec0] do_nmi at ffffffff81662d10 #3 [ffff8802afb07ef0] end_repeat_nmi at ffffffff81662201 [exception RIP: finish_open+57] RIP: ffffffff8119e2e9 RSP: ffff88029a54bcb8 RFLAGS: 00000246 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000246 RDX: ffff88029a54bcb8 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff8119e2e9 R8: ffffffff8119e2e9 R9: 0000000000000018 R10: ffff88029a54bcb8 R11: 0000000000000246 R12: ffffffffffffffff R13: ffff88029eb1ca78 R14: ffff88029f6408b8 R15: ffff88029b335400 ORIG_RAX: ffff88029b335400 CS: 0010 SS: 0018 --- <NMI exception stack> --- #4 [ffff88029a54bcb8] finish_open at ffffffff8119e2e9 #5 [ffff88029a54bcd0] do_last at ffffffff811ae909 #6 [ffff88029a54bdb0] path_openat at ffffffff811af117 #7 [ffff88029a54be50] do_filp_open at ffffffff811af7d1 #8 [ffff88029a54bf10] do_sys_open at ffffffff8119f634 #9 [ffff88029a54bf70] sys_open at ffffffff8119f741 #10 [ffff88029a54bf80] system_call_fastpath at ffffffff81669f59 RIP: 00007f36d7da04f0 RSP: 00007fff67037568 RFLAGS: 00000246 RAX: 0000000000000002 RBX: ffffffff81669f59 RCX: ffffffffffffffff RDX: 00000000000001b6 RSI: 0000000000080000 RDI: 00007f36d9e530b0 RBP: 00007fff67037570 R8: 0000000000000000 R9: 00007f36d95bb01d R10: 0000000000000238 R11: 0000000000000246 R12: ffffffff8119f741 R13: ffff88029a54bf78 R14: 00007fff67038200 R15: 00007fff67037e28 ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b PID: 638 TASK: ffff88029d02aee0 CPU: 3 COMMAND: "rs:main Q:Reg" #0 [ffff8802afb87e60] crash_nmi_callback at ffffffff81037342 #1 [ffff8802afb87e70] nmi_handle.isra.0 at ffffffff81662bf9 #2 [ffff8802afb87ec0] do_nmi at ffffffff81662d10 #3 [ffff8802afb87ef0] end_repeat_nmi at ffffffff81662201 [exception RIP: path_get+51] RIP: ffffffff811aa9c3 RSP: ffff88029d01fd18 RFLAGS: 00000257 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000257 RDX: ffff88029d01fd18 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff811aa9c3 R8: ffffffff811aa9c3 R9: 0000000000000018 R10: ffff88029d01fd18 R11: 0000000000000257 R12: ffffffffffffffff R13: ffff8802a4f99e9c R14: 0000000000000000 R15: 00000000000000ff ORIG_RAX: 00000000000000ff CS: 0010 SS: 0018 --- <NMI exception stack> --- #4 [ffff88029d01fd18] path_get at ffffffff811aa9c3 #5 [ffff88029d01fd30] fsnotify_create_event at ffffffff811e0718 #6 [ffff88029d01fd90] send_to_group at ffffffff811df812 
#7 [ffff88029d01fe30] fsnotify at ffffffff811dfbb5 #8 [ffff88029d01ff00] vfs_write at ffffffff811a00b5 #9 [ffff88029d01ff30] sys_write at ffffffff811a04f2 #10 [ffff88029d01ff80] system_call_fastpath at ffffffff81669f59 RIP: 00007f6d751310cd RSP: 00007f6d723b4c78 RFLAGS: 00000246 RAX: 0000000000000001 RBX: ffffffff81669f59 RCX: 00007f6d759f4430 RDX: 0000000000000085 RSI: 00007f6d6c000c40 RDI: 0000000000000005 RBP: 00007f6d6c000c40 R8: 0000000000000000 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f6d6c000c40 R13: 00007f6d723b48e0 R14: 0000000000000000 R15: 0000000000000085 ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b crash> Here are the files you can use to do your own analysis: http://go.gurulabs.com/vmlinux.xz http://go.gurulabs.com/vmcore.xz
I am hitting something very similar when connecting iSCSI storage to an oVirt node:

Message from syslogd@tigris04 at Jul 30 10:19:24 ...
kernel:[75827.673018] BUG: soft lockup - CPU#27 stuck for 23s! [sh:6189]

This is an AMD processor and the kernel is the latest in Fedora 19; it's 3.10.
I submitted a patch to resolve a similar issue:

http://marc.info/?l=linux-kernel&m=137752512914423&w=2

This patch fixes a race between I/O scheduler initialization and switching via the sysfs knob. I would appreciate it if anybody hitting this bug could try whether the patch solves it.

Thanks,
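For context, the sysfs knob in question is the per-device scheduler attribute; writing to it is the elv_iosched_store path seen in the backtraces above (sda is just an example device name):

# show the available schedulers and switch one device to deadline
cat /sys/block/sda/queue/scheduler
echo deadline > /sys/block/sda/queue/scheduler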
I think the above patch should fix the issue. Without it, an elevator switch can call del_timer_sync() on a timer on which init_timer() has not been done yet, which would explain why timer->base is NULL and why lock_timer_base() is spinning infinitely.
I submitted patch v2 that may fix this issue. https://lkml.org/lkml/2013/8/30/551 Any tests/reviews/comments are appreciated.
What's the difference between v1 and v2? I compiled my kernel with v1 and so far it's working.
(In reply to Ohad Basan from comment #37)

Thanks for testing! There are no big changes in the logic; I split it into 2 patches for ease of review, added some comments and an assertion for locking, and fixed a minor bug in the memory allocation failure path.
I have the same issue with 3.10.10-200.fc19.x86_64 on an Intel CPU. The trigger seems to be adding iSCSI targets.
I confirm that this patch solves the problem
Does anyone know when this will be merged into the Fedora kernel?
Good question, #41.

The problem is still here in Fedora 19 with 3.11.1-200, and we cannot force anyone to rebuild the kernel from source to apply the patch or insert elevator=deadline in the kernel line.

Since the patch seems to work, it would be really nice to have it merged in the kernel tree.

Hope someone gets this annoying thing done.

Best regards
(In reply to Alessandro from comment #42)
> Good question #41.
>
> The problem is still here in Fedora 19 3.11.1-200 and we cannot force anyone
> to rebuild kernel from sorce to apply the patch or insert elevator=deadline
> in the kernel line.
>
> Since the patch seems to work, it wold be really nice to have it merged in
> the kernel tree.

As soon as it seems set upstream, we'll backport it. Right now, there's a standing question from yesterday. It should also be CC'd to stable, I would think.
Uh, Josh, this is WONDERFUL news. This bug is long-standing and really problematic for unattended boots. Thank you very much.
The patches got acks upstream. I've added them to all releases.
Thank you! It's a long awaited fix coming!
kernel-3.11.2-201.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/kernel-3.11.2-201.fc19
kernel-3.10.13-101.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/kernel-3.10.13-101.fc18
Package kernel-3.11.2-201.fc19: * should fix your issue, * was pushed to the Fedora 19 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing kernel-3.11.2-201.fc19' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-17865/kernel-3.11.2-201.fc19 then log in and leave karma (feedback).
kernel-3.11.2-301.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.11.2-301.fc20
kernel-3.11.2-201.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.11.2-301.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.10.13-101.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.