Bug 745436 - missing warn_on() in 'Call Trace's
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 728194
Blocks:
Reported: 2011-10-12 11:12 UTC by Nikola Pajkovsky
Modified: 2014-02-02 22:15 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 728194
Environment:
Last Closed: 2011-11-08 00:46:45 UTC


Description Nikola Pajkovsky 2011-10-12 11:12:29 UTC
+++ This bug was initially created as a clone of Bug #728194 +++

abrt doesn't offer to report some of the stack traces in the attached dmesg. That is probably because they don't have the expected 'cut here' headers.

abrt-addon-kerneloops-2.0.4-1.fc16.i686

--- Additional comment from mads@kiilerich.com on 2011-08-04 07:39:59 EDT ---

Created attachment 516689 [details]
dmesg

--- Additional comment from mads@kiilerich.com on 2011-08-08 07:32:36 EDT ---

Created attachment 517180 [details]
another dmesg

Another one:

[    7.966658] irq 17: nobody cared (try booting with the "irqpoll" option)
[    7.966699] Pid: 151, comm: usb-storage Not tainted 3.0.0-3.fc16.x86_64 #1
[    7.966735] Call Trace:
[    7.966749]  <IRQ>  [<ffffffff810bef8a>] __report_bad_irq+0x37/0xc4
[    7.966790]  [<ffffffff810bf22d>] note_interrupt+0x179/0x1fc
[    7.966820]  [<ffffffff810bd736>] handle_irq_event_percpu+0x15d/0x1bc
[    7.966854]  [<ffffffff810bd7dc>] handle_irq_event+0x47/0x67
[    7.968321]  [<ffffffff814ddc74>] ? _raw_spin_lock+0x62/0x6a
[    7.969769]  [<ffffffff810bf9a3>] ? handle_fasteoi_irq+0x1e/0xad
[    7.971212]  [<ffffffff810bfa0c>] handle_fasteoi_irq+0x87/0xad
[    7.972651]  [<ffffffff8100ab60>] handle_irq+0x8b/0x91
[    7.974078]  [<ffffffff814e6b3d>] do_IRQ+0x4d/0xa5
[    7.975493]  [<ffffffff814de7d3>] common_interrupt+0x13/0x13
[    7.976897]  [<ffffffff81091106>] ? arch_local_irq_restore+0x6/0xd
[    7.978296]  [<ffffffff814de490>] ? _raw_spin_unlock_irqrestore+0x4d/0x52
[    7.979698]  [<ffffffff813222af>] ? scsi_run_queue+0x25d/0x274
[    7.981095]  [<ffffffff813233e5>] ? scsi_next_command+0x38/0x48
[    7.982473]  [<ffffffff813238a6>] ? scsi_io_completion+0x45d/0x4d7
[    7.983842]  [<ffffffff8131ba58>] ? scsi_finish_command+0xe4/0xed
[    7.985206]  [<ffffffff8132338d>] ? scsi_softirq_done+0x109/0x112
[    7.986566]  [<ffffffff8122be8f>] ? blk_done_softirq+0x79/0x8d
[    7.987915]  [<ffffffff8105f1a8>] ? __do_softirq+0xdb/0x1ec
[    7.989285]  [<ffffffff8108aafa>] ? lock_release+0x173/0x19c
[    7.989288]  [<ffffffff8100e9fd>] ? paravirt_read_tsc+0x9/0xd
[    7.989290]  [<ffffffff814e62dc>] ? call_softirq+0x1c/0x30
[    7.989292]  [<ffffffff8100abb1>] ? do_softirq+0x4b/0xa2
[    7.989305]  [<ffffffff8105f4c9>] ? irq_exit+0x5d/0xc0
[    7.989307]  [<ffffffff814e6b7e>] ? do_IRQ+0x8e/0xa5
[    7.989309]  [<ffffffff814de7d3>] ? common_interrupt+0x13/0x13
[    7.989310]  <EOI>  [<ffffffff81091122>] ? arch_local_irq_enable+0x8/0xd
[    7.989314]  [<ffffffff814de43f>] ? _raw_spin_unlock_irq+0x32/0x36
[    7.989320]  [<ffffffffa0052e96>] ? usb_stor_control_thread+0x1cd/0x237 [usb_storage]
[    7.989323]  [<ffffffffa0052cc9>] ? fill_inquiry_response+0xf3/0xf3 [usb_storage]
[    7.989326]  [<ffffffff81075e19>] ? kthread+0xa8/0xb0
[    7.989328]  [<ffffffff814e61e4>] ? kernel_thread_helper+0x4/0x10
[    7.989331]  [<ffffffff814de894>] ? retint_restore_args+0x13/0x13
[    7.989333]  [<ffffffff81075d71>] ? __init_kthread_worker+0x5a/0x5a
[    7.989335]  [<ffffffff814e61e0>] ? gs_change+0x13/0x13
[    7.989337] handlers:
[    7.989339] [<ffffffffa00ce036>] sdhci_irq
[    7.989341] Disabling IRQ #17

--- Additional comment from mads@kiilerich.com on 2011-10-11 15:54:04 EDT ---

Wouldn't it be possible for you to push the kernel team towards consistently using some kind of more structured reporting of "Call Traces" with begin/end markers? That would ensure that you are ready for the rare and unexpected bug too.

Comment 1 Josh Boyer 2011-11-02 16:59:26 UTC
Moving this discussion to rawhide.  It's unlikely to be fixed in f16, as this is really an upstream issue.

That traceback is because the function calls dump_stack instead of WARN_ON or BUG_ON.  There are numerous places in the kernel that just call dump_stack for various reasons.

Moving the "------------[ cut here ]------------\n" inside of dump_stack isn't really feasible either, because often other information is desired in the cut portion that dump_stack doesn't provide (like the list of modules, or a register dump, or some other informational message).
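
As a rough illustration of the distinction described above (this is not ABRT's actual parser), the Python sketch below reports, for every "Call Trace:" line in a log, whether it sits inside a region opened by the cut-here banner. The function name and the assumption that a region ends at a "---[ end trace" line are illustrative only.

```python
import re

CUT_HERE = "------------[ cut here ]------------"
# Assumption: WARN_ON-style output is closed by a line like "---[ end trace ... ]---".
END_TRACE = re.compile(r"---\[ end trace")


def traces_with_header(lines):
    """Yield (line_index, has_cut_here) for every 'Call Trace:' line.

    has_cut_here is True only when the trace follows a cut-here banner
    that has not yet been closed by an end-trace line.
    """
    inside = False
    for i, line in enumerate(lines):
        if CUT_HERE in line:
            inside = True
        elif END_TRACE.search(line):
            inside = False
        elif "Call Trace:" in line:
            yield i, inside
```

A trace printed by a bare dump_stack call, such as the "irq 17: nobody cared" one quoted in the description, would come back with has_cut_here set to False.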

Comment 2 Josh Boyer 2011-11-08 00:46:45 UTC
I think we're going to close this out as CANTFIX because there doesn't really seem to be a suitable all-around solution for the various usages.

Comment 3 Mads Kiilerich 2011-11-08 11:17:28 UTC
What do you recommend abrt should do? How can it generate reports that are (slightly) useful for you as a kernel guy?

Should it just add handling of new kinds of traces as they are reported incorrectly? And perhaps fall back to report 20 lines before an unknown "Call Trace:"?

I assume there is a kind of a privacy issue here and it thus can't report too much of the log automatically.
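
The fallback suggested above could look roughly like the following Python sketch. It is only an illustration of the idea, not something ABRT implements; the function name and the `before` parameter are hypothetical, with the default of 20 taken from the suggestion.

```python
def context_before_traces(lines, before=20):
    """Collect each 'Call Trace:' line together with the lines leading up to it.

    Returns one list per trace, holding at most `before` preceding lines
    followed by the 'Call Trace:' line itself.
    """
    reports = []
    for i, line in enumerate(lines):
        if "Call Trace:" in line:
            start = max(0, i - before)  # clamp at the start of the log
            reports.append(lines[start:i + 1])
    return reports
```

Because the captured context is taken blindly, it may include unrelated log lines, which is where the privacy concern comes in.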

Comment 4 Jiri Moskovcak 2011-11-08 13:59:39 UTC
I discussed it with another kernel developer and I was told that if there is no "BUG: " or "WARNING: " or other known oops delimiter we (ABRT) should just ignore it.
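
A minimal Python sketch of that rule, under stated assumptions: only the two delimiters named above are listed (any real list of known oops delimiters would be longer), and the `lookback` window of 5 lines is a hypothetical choice, not a value from ABRT.

```python
# Only the delimiters named in the comment above; a real list would be longer.
KNOWN_DELIMITERS = ("BUG: ", "WARNING: ")


def should_report(lines, trace_index, lookback=5):
    """Return True if a known delimiter appears shortly before the trace.

    trace_index is the index of the 'Call Trace:' line; lookback is how
    many preceding lines are searched (an arbitrary illustrative value).
    """
    window = lines[max(0, trace_index - lookback):trace_index]
    return any(d in line for line in window for d in KNOWN_DELIMITERS)
```

Under this rule the "irq 17: nobody cared" trace from the description would be ignored, since no listed delimiter precedes it.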

Comment 5 Mads Kiilerich 2011-11-08 14:07:25 UTC
It seems strange that the kernel should emit stack traces as a part of its normal operation. I would assume that a stack trace implies that something should be fixed.

Ok, there is no reason to report an 'initrd not found' trace, but I guess that is a strange exception.

