Bug 682661 - WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback()
Summary: WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback()
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-03-07 07:24 UTC by Albert Strasheim
Modified: 2011-05-03 06:27 UTC
CC List: 6 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-05-03 06:27:06 UTC
Type: ---
Embargoed:


Description Albert Strasheim 2011-03-07 07:24:57 UTC
Description of problem:

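The kernel log shows WARNINGs from watchdog_overflow_callback() reporting hard lockups (on CPU 13 and CPU 22 in the traces under Actual results).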

Version-Release number of selected component (if applicable):

kernel-2.6.38-0.rc5.git1.1.fc15.x86_64

How reproducible:

Don't know.

Actual results:

[ 4057.148476] ------------[ cut here ]------------
[ 4057.153305] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[ 4057.161446] Hardware name: X8DTH-i/6/iF/6F
[ 4057.165742] Watchdog detected hard LOCKUP on cpu 13
[ 4057.170447] Modules linked in: crc32c_intel w83627ehf hwmon_vid adm1021 ipmi_devintf ipmi_si ipmi_msghandler dm_round_robin dm_multipath ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en microcode ghes hed i2c_i801 igb i7core_edac iTCO_wdt edac_core i2c_core ioatdma ses enclosure mlx4_core iTCO_vendor_support dca uas usb_storage mpt2sas scsi_transport_sas raid_class [last unloaded: scsi_wait_scan]
[ 4057.213686] Pid: 13735, comm: ext4lazyinit Not tainted 2.6.38-0.rc5.git1.1.fc15.x86_64 #1
[ 4057.222272] Call Trace:
[ 4057.224923]  <NMI>  [<ffffffff8105538e>] ? warn_slowpath_common+0x83/0x9b
[ 4057.231944]  [<ffffffff81055449>] ? warn_slowpath_fmt+0x46/0x48
[ 4057.238079]  [<ffffffff810ac241>] ? watchdog_overflow_callback+0x9b/0xa6
[ 4057.244986]  [<ffffffff810d3a2e>] ? __perf_event_overflow+0x135/0x191
[ 4057.251640]  [<ffffffff810162c2>] ? paravirt_write_msr+0xf/0x13
[ 4057.257756]  [<ffffffff810d407e>] ? perf_event_overflow+0x14/0x16
[ 4057.264057]  [<ffffffff81019865>] ? intel_pmu_handle_irq+0x38c/0x3ef
[ 4057.270617]  [<ffffffff814721f8>] ? perf_event_nmi_handler+0x67/0xb3
[ 4057.277174]  [<ffffffff81473eb3>] ? notifier_call_chain.isra.0+0x38/0x65
[ 4057.284081]  [<ffffffff81473f0c>] ? atomic_notifier_call_chain+0x18/0x1a
[ 4057.290985]  [<ffffffff81473f3c>] ? notify_die+0x2e/0x30
[ 4057.296506]  [<ffffffff81471681>] ? do_nmi+0x6d/0x217
[ 4057.301756]  [<ffffffff81471390>] ? nmi+0x20/0x30
[ 4057.306661]  [<ffffffff81080b85>] ? do_raw_spin_lock+0x21/0x25
[ 4057.312698]  <<EOE>>  [<ffffffff814708d3>] ? _raw_spin_lock_irq+0x1c/0x1e
[ 4057.319719]  [<ffffffff8146f39b>] ? wait_for_common+0x43/0x101
[ 4057.325759]  [<ffffffff8121563a>] ? submit_bio+0xde/0xfd
[ 4057.331278]  [<ffffffff81147e9e>] ? bio_alloc_bioset+0x4c/0xc3
[ 4057.337317]  [<ffffffff8146f50d>] ? wait_for_completion+0x1d/0x1f
[ 4057.343609]  [<ffffffff81217e0f>] ? blkdev_issue_flush+0x97/0xcf
[ 4057.349822]  [<ffffffff811939d2>] ? ext4_init_inode_table+0x1bd/0x243
[ 4057.356474]  [<ffffffff811a2a2c>] ? ext4_lazyinit_thread+0x173/0x343
[ 4057.363036]  [<ffffffff8106f295>] ? autoremove_wake_function+0x0/0x37
[ 4057.369680]  [<ffffffff811a28b9>] ? ext4_lazyinit_thread+0x0/0x343
[ 4057.376057]  [<ffffffff8106ebd1>] ? kthread+0x84/0x8c
[ 4057.381308]  [<ffffffff8100a9e4>] ? kernel_thread_helper+0x4/0x10
[ 4057.387607]  [<ffffffff8106eb4d>] ? kthread+0x0/0x8c
[ 4057.392772]  [<ffffffff8100a9e0>] ? kernel_thread_helper+0x0/0x10
[ 4057.399070] ---[ end trace 40b579601cdaf3c4 ]---

[ 4153.744367] ------------[ cut here ]------------
[ 4153.749226] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[ 4153.757426] Hardware name: X8DTH-i/6/iF/6F
[ 4153.761749] Watchdog detected hard LOCKUP on cpu 22
[ 4153.766464] Modules linked in: crc32c_intel w83627ehf hwmon_vid adm1021 ipmi_devintf ipmi_si ipmi_msghandler dm_round_robin dm_multipath ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en microcode ghes hed i2c_i801 igb i7core_edac iTCO_wdt edac_core i2c_core ioatdma ses enclosure mlx4_core iTCO_vendor_support dca uas usb_storage mpt2sas scsi_transport_sas raid_class [last unloaded: scsi_wait_scan]
[ 4153.809945] Pid: 13758, comm: loop2 Tainted: G        W   2.6.38-0.rc5.git1.1.fc15.x86_64 #1
[ 4153.818841] Call Trace:
[ 4153.821521]  <NMI>  [<ffffffff8105538e>] ? warn_slowpath_common+0x83/0x9b
[ 4153.828565]  [<ffffffff81055449>] ? warn_slowpath_fmt+0x46/0x48
[ 4153.834717]  [<ffffffff810ac241>] ? watchdog_overflow_callback+0x9b/0xa6
[ 4153.841650]  [<ffffffff810d3a2e>] ? __perf_event_overflow+0x135/0x191
[ 4153.848324]  [<ffffffff810162c2>] ? paravirt_write_msr+0xf/0x13
[ 4153.854475]  [<ffffffff810d407e>] ? perf_event_overflow+0x14/0x16
[ 4153.860798]  [<ffffffff81019865>] ? intel_pmu_handle_irq+0x38c/0x3ef
[ 4153.867386]  [<ffffffff814721f8>] ? perf_event_nmi_handler+0x67/0xb3
[ 4153.873980]  [<ffffffff81473eb3>] ? notifier_call_chain.isra.0+0x38/0x65
[ 4153.880919]  [<ffffffff81473f0c>] ? atomic_notifier_call_chain+0x18/0x1a
[ 4153.887852]  [<ffffffff81473f3c>] ? notify_die+0x2e/0x30
[ 4153.893396]  [<ffffffff81471681>] ? do_nmi+0x6d/0x217
[ 4153.898682]  [<ffffffff81471390>] ? nmi+0x20/0x30
[ 4153.903622]  [<ffffffff812ef051>] ? do_lo_send_aops+0x0/0x16c
[ 4153.909600]  [<ffffffff814708af>] ? _raw_spin_lock_irqsave+0x27/0x2f
[ 4153.916183]  <<EOE>>  [<ffffffff810425e2>] ? complete+0x1f/0x50
[ 4153.922363]  [<ffffffff81217e6c>] ? bio_end_flush+0x25/0x31
[ 4153.928171]  [<ffffffff81146e81>] ? bio_endio+0x2d/0x2f
[ 4153.933635]  [<ffffffff812ef62d>] ? loop_thread+0x200/0x22c
[ 4153.939441]  [<ffffffff8106f295>] ? autoremove_wake_function+0x0/0x37
[ 4153.946112]  [<ffffffff812ef42d>] ? loop_thread+0x0/0x22c
[ 4153.951744]  [<ffffffff8106ebd1>] ? kthread+0x84/0x8c
[ 4153.957033]  [<ffffffff8100a9e4>] ? kernel_thread_helper+0x4/0x10
[ 4153.963363]  [<ffffffff8106eb4d>] ? kthread+0x0/0x8c
[ 4153.968556]  [<ffffffff8100a9e0>] ? kernel_thread_helper+0x0/0x10
[ 4153.974878] ---[ end trace 40b579601cdaf3c5 ]---
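
For triage, the header lines of traces like the two above can be summarized mechanically. The following is a minimal Python sketch, not part of the original report: the function name summarize_warnings and the regular expressions are assumptions based only on the log format shown here, and SAMPLE is a shortened copy of the lines above.

```python
import re

# Header line of a kernel WARNING, as it appears in the log above, e.g.
# "[ 4057.153305] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()"
WARNING_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: at (?P<file>\S+):(?P<line>\d+)\s+(?P<func>[\w.]+)\+"
)
# Line printed by the hard-lockup watchdog in these traces.
LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (?P<cpu>\d+)")


def summarize_warnings(log_text):
    """Return one dict per WARNING header found in log_text (hypothetical helper)."""
    results = []
    current = None
    for line in log_text.splitlines():
        m = WARNING_RE.search(line)
        if m:
            current = {
                "timestamp": float(m.group("ts")),
                "site": f"{m.group('file')}:{m.group('line')}",
                "function": m.group("func"),
                "lockup_cpu": None,  # filled in if a LOCKUP line follows
            }
            results.append(current)
            continue
        m = LOCKUP_RE.search(line)
        if m and current is not None:
            current["lockup_cpu"] = int(m.group("cpu"))
    return results


# Shortened copy of the two warning headers from this report.
SAMPLE = """\
[ 4057.153305] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[ 4057.161446] Hardware name: X8DTH-i/6/iF/6F
[ 4057.165742] Watchdog detected hard LOCKUP on cpu 13
[ 4153.749226] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[ 4153.757426] Hardware name: X8DTH-i/6/iF/6F
[ 4153.761749] Watchdog detected hard LOCKUP on cpu 22
"""

if __name__ == "__main__":
    for w in summarize_warnings(SAMPLE):
        print(w["timestamp"], w["site"], w["function"], w["lockup_cpu"])
```

Grouping the results by "site" makes it easy to see that both traces come from the same warning location but different CPUs.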

Comment 1 Albert Strasheim 2011-03-08 09:11:23 UTC
I was using perf a few hours before this happened, so it might be a perf issue.

Comment 2 Chuck Ebbert 2011-03-08 15:38:29 UTC
I'm pretty sure this was fixed in a more recent kernel. Please try 2.6.38-rc8, which was submitted for testing today:
https://admin.fedoraproject.org/updates/kernel-2.6.38-0.rc8.git0.1.fc15

Comment 3 Albert Strasheim 2011-03-08 15:40:41 UTC
Thanks, I'm already running it. No problems yet, and I've perfed quite a bit.

Comment 4 Albert Strasheim 2011-03-10 06:30:16 UTC
I am running 2.6.38-0.rc8.git0.2 now and just had:


[17609.564972] ------------[ cut here ]------------
[17609.569598] WARNING: at arch/x86/kernel/dumpstack_64.c:129 dump_trace+0x2bc/0x308()
[17609.577246] Hardware name: X8DTH-i/6/iF/6F
[17609.581342] Perf: bad frame pointer =           (null) in callchain
[17609.587608] Modules linked in: fuse crc32c_intel oprofile w83627ehf hwmon_vid adm1021 ipmi_devintf ipmi_si ipmi_msghandler dm_round_robin dm_multipath ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en microcode ghes hed ses i2c_i801 i7core_edac ioatdma igb i2c_core enclosure edac_core dca iTCO_wdt mlx4_core iTCO_vendor_support uas mpt2sas usb_storage scsi_transport_sas raid_class [last unloaded: oprofile]
[17609.629722] Pid: 11410, comm: sysprof Not tainted 2.6.38-0.rc8.git0.2.fc14.x86_64 #1
[17609.637459] Call Trace:
[17609.639908]  <IRQ>  [<ffffffff81059357>] ? warn_slowpath_common+0x85/0x9d
[17609.646714]  [<ffffffff81059412>] ? warn_slowpath_fmt+0x46/0x48
[17609.652633]  [<ffffffff81490a26>] ? bad_to_user+0x70/0x764
[17609.658119]  [<ffffffff8100d5b4>] ? dump_trace+0x2bc/0x308
[17609.663605]  [<ffffffff8107b578>] ? timekeeping_get_ns+0x1b/0x3d
[17609.669608]  [<ffffffff8101b4a8>] ? perf_callchain_kernel+0x5a/0x5c
[17609.675874]  [<ffffffff810db732>] ? perf_prepare_sample+0xf2/0x1d2
[17609.682049]  [<ffffffff810db96f>] ? __perf_event_overflow+0x15d/0x1ab
[17609.688489]  [<ffffffff81010f89>] ? paravirt_read_tsc+0x9/0xd
[17609.694233]  [<ffffffff810114a7>] ? native_sched_clock+0x35/0x37
[17609.700235]  [<ffffffff810114b2>] ? sched_clock+0x9/0xd
[17609.705461]  [<ffffffff81078ac8>] ? sched_clock_cpu+0x42/0xc6
[17609.711206]  [<ffffffff810dbfdc>] ? perf_event_overflow+0x14/0x16
[17609.717298]  [<ffffffff810dc070>] ? perf_swevent_hrtimer+0x92/0xde
[17609.723477]  [<ffffffff810a6591>] ? audit_syscall_exit+0x0/0x14c
[17609.729482]  [<ffffffff8123f2bb>] ? timerqueue_del+0x59/0x6a
[17609.735141]  [<ffffffff810765d8>] ? __remove_hrtimer+0x62/0x6e
[17609.740969]  [<ffffffff81076854>] ? __run_hrtimer+0xb9/0x13f
[17609.746626]  [<ffffffff810dbfde>] ? perf_swevent_hrtimer+0x0/0xde
[17609.752715]  [<ffffffff81077046>] ? hrtimer_interrupt+0xd1/0x1b0
[17609.758713]  [<ffffffff81049126>] ? account_system_vtime+0x6f/0x8c
[17609.764892]  [<ffffffff8148f026>] ? smp_apic_timer_interrupt+0x79/0x8c
[17609.771418]  [<ffffffff8100b613>] ? apic_timer_interrupt+0x13/0x20
[17609.777592]  <EOI>  [<ffffffff8112bfd4>] ? vfs_read+0xbc/0xfc
[17609.783357]  [<ffffffff810a6591>] ? audit_syscall_exit+0x0/0x14c
[17609.789360]  [<ffffffff8100ad74>] ? sysret_audit+0x16/0x20
[17609.794836] ---[ end trace 14962c7de7a584a6 ]---

I was running sysprof for a while, which might have triggered it.
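
To check which perf code paths appear in a trace like the one above, the symbol names can be pulled out of the frame lines. This is a rough Python sketch under stated assumptions: frame_symbols and FRAME_RE are hypothetical names, the pattern is derived only from the frame format in this log, and SAMPLE is a four-line excerpt of the trace. The "?" prefix is treated as optional, since it only marks frames the unwinder could not verify.

```python
import re

# A stack frame line in this log looks like:
# "[17609.652633]  [<ffffffff81490a26>] ? bad_to_user+0x70/0x764"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_symbols(trace_text):
    """Return the symbol names of all frames, in the order they appear."""
    return [m.group("sym") for m in FRAME_RE.finditer(trace_text)]


# Four frames copied from the trace above (order preserved).
SAMPLE = """\
[17609.639908]  <IRQ>  [<ffffffff81059357>] ? warn_slowpath_common+0x85/0x9d
[17609.658119]  [<ffffffff8100d5b4>] ? dump_trace+0x2bc/0x308
[17609.669608]  [<ffffffff8101b4a8>] ? perf_callchain_kernel+0x5a/0x5c
[17609.717298]  [<ffffffff810dc070>] ? perf_swevent_hrtimer+0x92/0xde
"""

if __name__ == "__main__":
    symbols = frame_symbols(SAMPLE)
    # Keep only frames whose symbol starts with "perf_".
    perf_frames = [s for s in symbols if s.startswith("perf_")]
    print(symbols)
    print(perf_frames)
```

In this excerpt the perf frames sit directly below dump_trace, which is consistent with the "Perf: bad frame pointer" message in the warning.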

Comment 5 Chuck Ebbert 2011-03-12 20:41:52 UTC
(In reply to comment #4)
> I am running 2.6.38-0.rc8.git0.2 now and just had:
> 
> 
> [17609.564972] ------------[ cut here ]------------
> [17609.569598] WARNING: at arch/x86/kernel/dumpstack_64.c:129

Reported upstream:
http://lkml.org/lkml/2011/3/12/62

Comment 6 Chuck Ebbert 2011-05-03 06:27:06 UTC
I'm going to close this since the original issue reported was fixed. Bug 700718 reports the same issue as comment #4.

