Bug 247470 - soft cpu lockups with heavy cpu + io
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 7
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2007-07-09 10:59 EDT by Ra P.
Modified: 2008-01-09 10:59 EST (History)
Last Closed: 2008-01-09 10:59:36 EST
Description Ra P. 2007-07-09 10:59:46 EDT
Running shred on a partition (unmounted, of course) gives me a ton of soft lockups.

Well, not a ton, but three so far today.
The box hangs for a minute or so then continues. The disk whirrs away happily.

dmesg says:

BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c0403281>] mwait_idle_with_hints+0x3b/0x3f
 [<c04033d6>] cpu_idle+0xa3/0xc4
 =======================
BUG: soft lockup detected on CPU#0!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<f88c6788>] scsi_request_fn+0x2ce/0x31d [scsi_mod]
 [<c04d9e72>] blk_run_queue+0x37/0x63
 [<f88c5421>] scsi_next_command+0x25/0x2f [scsi_mod]
 [<f88c55e1>] scsi_end_request+0xa1/0xab [scsi_mod]
 [<f88c5791>] scsi_io_completion+0x163/0x328 [scsi_mod]
 [<f88525ca>] sd_rw_intr+0x218/0x242 [sd_mod]
 [<f903a39e>] rm_check_pci_config_space+0x1ce/0x5a0 [nvidia]
 [<f88c13db>] scsi_finish_command+0x81/0x88 [scsi_mod]
 [<f92b1798>] nv_verify_pci_config+0x47/0x91 [nvidia]
 [<c04da574>] blk_done_softirq+0x49/0x54
 [<c042b2e5>] __do_softirq+0x5d/0xba
 [<c04071b7>] do_softirq+0x59/0xb1
 [<c042b1c7>] ksoftirqd+0x0/0xc1
 [<c042b226>] ksoftirqd+0x5f/0xc1
 [<c0436da8>] kthread+0xb0/0xd8
 [<c0436cf8>] kthread+0x0/0xd8
 [<c0405b3f>] kernel_thread_helper+0x7/0x10
 =======================
Clocksource tsc unstable (delta = 60925666023 ns)
Time: acpi_pm clocksource has been installed.

*****

BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c042de8b>] process_timeout+0x0/0x5
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c042e748>] lock_timer_base+0x19/0x35
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c06009e6>] schedule_timeout+0x6b/0x8d
 [<c042de8b>] process_timeout+0x0/0x5
 [<c042007b>] find_busiest_group+0x207/0x4c5
 [<c042dcee>] run_timer_softirq+0x10a/0x17b
 [<c042de8b>] process_timeout+0x0/0x5
 [<c042a588>] it_real_fn+0x12/0x16
 [<c042b2e5>] __do_softirq+0x5d/0xba
 [<c04071b7>] do_softirq+0x59/0xb1
 [<c042b1c7>] ksoftirqd+0x0/0xc1
 [<c042b226>] ksoftirqd+0x5f/0xc1
 [<c0436da8>] kthread+0xb0/0xd8
 [<c0436cf8>] kthread+0x0/0xd8
 [<c0405b3f>] kernel_thread_helper+0x7/0x10
 =======================
BUG: soft lockup detected on CPU#0!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c060007b>] __sched_text_start+0x5eb/0x89e
 [<c047672b>] fget_light+0x36/0x66
 [<c048007d>] do_sys_poll+0x173/0x327
 [<c0480ba3>] __pollwait+0x0/0xac
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c042229f>] default_wake_function+0x0/0xc
 [<c0422295>] try_to_wake_up+0x3aa/0x3b4
 [<c0598781>] sock_aio_write+0xf6/0x102
 [<c0420a23>] __wake_up+0x32/0x43
 [<c05f8655>] unix_ioctl+0x8b/0x93
 [<c0598b47>] sock_ioctl+0x19f/0x1be
 [<c0450fcd>] audit_syscall_exit+0x294/0x2b0
 [<c0450d0f>] audit_syscall_entry+0x10d/0x137
 [<c0480265>] sys_poll+0x34/0x37
 [<c0404f70>] syscall_call+0x7/0xb
 =======================

*****

BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c0403281>] mwait_idle_with_hints+0x3b/0x3f
 [<c04033d6>] cpu_idle+0xa3/0xc4
 =======================
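A note on the clocksource message in the first dmesg excerpt: the reported delta is in nanoseconds, so converting it gives a rough idea of how long the stall lasted. A small sketch of the arithmetic (the helper name is made up for illustration):

```python
# Delta copied from the "Clocksource tsc unstable" line in the dmesg output above.
DELTA_NS = 60925666023


def ns_to_seconds(ns):
    """Convert a nanosecond count, as printed by the kernel, to seconds."""
    return ns / 1_000_000_000


print(f"{ns_to_seconds(DELTA_NS):.1f} s")  # prints "60.9 s"
```

That is roughly a minute, which seems consistent with the hang described above, though the log alone does not prove the two are the same event.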
Comment 1 Ra P. 2007-07-09 11:01:00 EDT
This is not a cutting-edge kernel: 2.6.21-1.3228.fc7, with the nvidia binary driver loaded.

Not sure if that is at all relevant though.
Comment 2 Ra P. 2007-07-10 04:06:33 EDT
and again..

BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c041fab5>] __wake_up_common+0x32/0x55
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c0601895>] _spin_unlock_irqrestore+0x8/0x9
 [<c05f9679>] unix_write_space+0x47/0x71
 [<c059c620>] sock_wfree+0x21/0x36
 [<c059de78>] __kfree_skb+0xb5/0x113
 [<c059f345>] memcpy_toiovec+0x27/0x4a
 [<c05f804a>] unix_stream_recvmsg+0x336/0x4aa
 [<c0598889>] sock_aio_read+0xfc/0x108
 [<c047555c>] do_sync_read+0xc7/0x10a
 [<c0436e71>] autoremove_wake_function+0x0/0x35
 [<c059868b>] sock_aio_write+0x0/0x102
 [<c0475dfd>] vfs_read+0xba/0x152
 [<c047623f>] sys_read+0x41/0x67
 [<c0404f70>] syscall_call+0x7/0xb
 =======================
Comment 3 Ra P. 2007-07-10 07:43:10 EDT
and again:

BUG: soft lockup detected on CPU#0!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c042007b>] find_busiest_group+0x207/0x4c5
 [<c0404f36>] system_call+0x16/0x32
 =======================
Comment 4 Ra P. 2007-07-10 08:41:18 EDT
once more!

BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c0601895>] _spin_unlock_irqrestore+0x8/0x9
 [<f92b67d2>] os_release_sema+0x3f/0x6c [nvidia]
 [<f902fe2a>] _nv002658rm+0x12/0x1c [nvidia]
 [<f900f058>] _nv003624rm+0xa4/0xd0 [nvidia]
 [<f903a7ad>] _nv001996rm+0x3d/0x770 [nvidia]
 [<f903aaf3>] _nv001996rm+0x383/0x770 [nvidia]
 [<f92b62e5>] os_pci_read_dword+0x2b/0x34 [nvidia]
 [<f903a624>] rm_check_pci_config_space+0x454/0x5a0 [nvidia]
 [<f90383cc>] rm_ioctl+0x1c/0x24 [nvidia]
 [<c04e7070>] copy_from_user+0x3a/0x66
 [<f92b3663>] nv_kern_ioctl+0x2f4/0x365 [nvidia]
 [<f92b3709>] nv_kern_unlocked_ioctl+0x18/0x1d [nvidia]
 [<f92b36f1>] nv_kern_unlocked_ioctl+0x0/0x1d [nvidia]
 [<c047f713>] do_ioctl+0x1f/0x62
 [<c047f99a>] vfs_ioctl+0x244/0x256
 [<c047f9f8>] sys_ioctl+0x4c/0x64
 [<c0404f70>] syscall_call+0x7/0xb
 =======================
Comment 5 Ra P. 2007-07-10 08:52:03 EDT
If a bug falls in bugzilla, and there are no kernel maintainers around to
comment on it, is it still a bug?

Another spectacular lockup!

BUG: soft lockup detected on CPU#0!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c0403281>] mwait_idle_with_hints+0x3b/0x3f
 [<c04033d6>] cpu_idle+0xa3/0xc4
 [<c071bb5f>] start_kernel+0x435/0x43d
 [<c071b25a>] unknown_bootoption+0x0/0x202
 =======================
Comment 6 Ra P. 2007-07-10 09:13:54 EDT
BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c0403281>] mwait_idle_with_hints+0x3b/0x3f
 [<c04033d6>] cpu_idle+0xa3/0xc4
 =======================
Comment 7 Ra P. 2007-07-10 10:05:36 EDT
A shorter one this time:

BUG: soft lockup detected on CPU#1!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 =======================
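With this many traces, it may help to summarise them by CPU and by what appears to have been running when the timer interrupt fired. Below is a minimal sketch, assuming the dmesg text is available as a string in the format shown in this report. The function names are hypothetical, and the sample is the trace from comment 3. Frames in these traces can include stale stack entries, so the "interrupted" symbol is only a hint.

```python
import re

# Header line printed by the soft lockup detector, as seen in this report.
HEADER = re.compile(r"BUG: soft lockup detected on CPU#(\d+)!")
# One stack frame, e.g. " [<f88c6788>] scsi_request_fn+0x2ce/0x31d [scsi_mod]"
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_lockups(text):
    """Split dmesg text into reports: (cpu, [(symbol, module_or_None), ...])."""
    reports = []
    current = None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            current = (int(header.group(1)), [])
            reports.append(current)
            continue
        frame = FRAME.search(line)
        if frame and current is not None:
            current[1].append((frame.group(2), frame.group(3)))
        elif line.strip().startswith("====="):
            current = None  # separator line ends the current trace
    return reports


def interrupted_symbol(frames):
    """Return the first frame listed after apic_timer_interrupt, if any."""
    symbols = [sym for sym, _ in frames]
    if "apic_timer_interrupt" in symbols:
        idx = symbols.index("apic_timer_interrupt")
        if idx + 1 < len(symbols):
            return symbols[idx + 1]
    return None


# Trace copied from comment 3 of this bug.
SAMPLE = """\
BUG: soft lockup detected on CPU#0!
 [<c0451ea2>] softlockup_tick+0xa5/0xb4
 [<c042e930>] update_process_times+0x3b/0x5e
 [<c043d298>] tick_sched_timer+0x57/0x9a
 [<c0439df5>] hrtimer_interrupt+0x12b/0x1b6
 [<c043d241>] tick_sched_timer+0x0/0x9a
 [<c0419c40>] smp_apic_timer_interrupt+0x6f/0x80
 [<c04059bc>] apic_timer_interrupt+0x28/0x30
 [<c042007b>] find_busiest_group+0x207/0x4c5
 [<c0404f36>] system_call+0x16/0x32
 =======================
"""

for cpu, frames in parse_lockups(SAMPLE):
    uses_nvidia = any(module == "nvidia" for _, module in frames)
    print(cpu, interrupted_symbol(frames), uses_nvidia)
```

Run over the full dmesg output, this kind of summary would show at a glance that several traces were taken while the CPU was idle (mwait_idle_with_hints) and that only some of them involve the nvidia module.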
Comment 8 Christopher Brown 2007-09-17 17:13:32 EDT
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

There hasn't been much activity on this bug for a while. Could you tell me if
you are still having problems with the latest kernel? From a brief look, could
you also try booting with nohz=off as a boot parameter and tell me if you see
the same errors?

If the problem no longer exists then please close this bug or I'll do so in a
few days if there is no additional information lodged.

Cheers
Chris
Comment 9 Christopher Brown 2007-09-18 10:30:46 EDT
Please could you also boot with:

nosoftlockup

as a boot parameter.

Cheers
Chris
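
To confirm that the parameters requested in comments 8 and 9 actually took effect after a reboot, the running kernel's command line can be inspected. A minimal sketch, assuming a Linux system that exposes /proc/cmdline; the function names and the example command line are made up for illustration:

```python
from pathlib import Path


def parse_cmdline(text):
    """Map each kernel boot parameter to its value (None for bare flags)."""
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def triage_flags(text):
    """Report whether the two parameters requested in this bug are present."""
    params = parse_cmdline(text)
    return {
        "nohz=off": params.get("nohz") == "off",
        "nosoftlockup": "nosoftlockup" in params,
    }


def running_cmdline():
    """Read the command line of the running kernel (Linux only)."""
    return Path("/proc/cmdline").read_text()


# Hypothetical command line for illustration, not taken from the reporter's box.
EXAMPLE = "ro root=/dev/sda1 rhgb quiet nohz=off nosoftlockup"
print(triage_flags(EXAMPLE))
# prints {'nohz=off': True, 'nosoftlockup': True}
```

On the affected machine, `triage_flags(running_cmdline())` would give the same report for the live kernel.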
Comment 10 Christopher Brown 2008-01-09 10:59:36 EST
As indicated previously, there has been no update on the progress of this bug,
so I am closing it as INSUFFICIENT_DATA. Please re-open it if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.
