Bug 745436

Summary: missing warn_on() in 'Call Trace's
Product: [Fedora] Fedora
Reporter: Nikola Pajkovsky <npajkovs>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CANTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: anton, dhoward, dvlasenk, gansalmon, iprikryl, itamar, jmoskovc, jonathan, kernel-maint, kklic, madhu.chinakonda, mads, mtoman, npajkovs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Clone Of: 728194
Last Closed: 2011-11-08 00:46:45 UTC
Bug Depends On: 728194

Description Nikola Pajkovsky 2011-10-12 11:12:29 UTC
+++ This bug was initially created as a clone of Bug #728194 +++

abrt doesn't offer to report some of the stack traces in the attached dmesg. That is probably because they don't have the expected 'cut here' headers.

abrt-addon-kerneloops-2.0.4-1.fc16.i686

--- Additional comment from mads on 2011-08-04 07:39:59 EDT ---

Created attachment 516689 [details]
dmesg

--- Additional comment from mads on 2011-08-08 07:32:36 EDT ---

Created attachment 517180 [details]
another dmesg

Another one:

[    7.966658] irq 17: nobody cared (try booting with the "irqpoll" option)
[    7.966699] Pid: 151, comm: usb-storage Not tainted 3.0.0-3.fc16.x86_64 #1
[    7.966735] Call Trace:
[    7.966749]  <IRQ>  [<ffffffff810bef8a>] __report_bad_irq+0x37/0xc4
[    7.966790]  [<ffffffff810bf22d>] note_interrupt+0x179/0x1fc
[    7.966820]  [<ffffffff810bd736>] handle_irq_event_percpu+0x15d/0x1bc
[    7.966854]  [<ffffffff810bd7dc>] handle_irq_event+0x47/0x67
[    7.968321]  [<ffffffff814ddc74>] ? _raw_spin_lock+0x62/0x6a
[    7.969769]  [<ffffffff810bf9a3>] ? handle_fasteoi_irq+0x1e/0xad
[    7.971212]  [<ffffffff810bfa0c>] handle_fasteoi_irq+0x87/0xad
[    7.972651]  [<ffffffff8100ab60>] handle_irq+0x8b/0x91
[    7.974078]  [<ffffffff814e6b3d>] do_IRQ+0x4d/0xa5
[    7.975493]  [<ffffffff814de7d3>] common_interrupt+0x13/0x13
[    7.976897]  [<ffffffff81091106>] ? arch_local_irq_restore+0x6/0xd
[    7.978296]  [<ffffffff814de490>] ? _raw_spin_unlock_irqrestore+0x4d/0x52
[    7.979698]  [<ffffffff813222af>] ? scsi_run_queue+0x25d/0x274
[    7.981095]  [<ffffffff813233e5>] ? scsi_next_command+0x38/0x48
[    7.982473]  [<ffffffff813238a6>] ? scsi_io_completion+0x45d/0x4d7
[    7.983842]  [<ffffffff8131ba58>] ? scsi_finish_command+0xe4/0xed
[    7.985206]  [<ffffffff8132338d>] ? scsi_softirq_done+0x109/0x112
[    7.986566]  [<ffffffff8122be8f>] ? blk_done_softirq+0x79/0x8d
[    7.987915]  [<ffffffff8105f1a8>] ? __do_softirq+0xdb/0x1ec
[    7.989285]  [<ffffffff8108aafa>] ? lock_release+0x173/0x19c
[    7.989288]  [<ffffffff8100e9fd>] ? paravirt_read_tsc+0x9/0xd
[    7.989290]  [<ffffffff814e62dc>] ? call_softirq+0x1c/0x30
[    7.989292]  [<ffffffff8100abb1>] ? do_softirq+0x4b/0xa2
[    7.989305]  [<ffffffff8105f4c9>] ? irq_exit+0x5d/0xc0
[    7.989307]  [<ffffffff814e6b7e>] ? do_IRQ+0x8e/0xa5
[    7.989309]  [<ffffffff814de7d3>] ? common_interrupt+0x13/0x13
[    7.989310]  <EOI>  [<ffffffff81091122>] ? arch_local_irq_enable+0x8/0xd
[    7.989314]  [<ffffffff814de43f>] ? _raw_spin_unlock_irq+0x32/0x36
[    7.989320]  [<ffffffffa0052e96>] ? usb_stor_control_thread+0x1cd/0x237 [usb_storage]
[    7.989323]  [<ffffffffa0052cc9>] ? fill_inquiry_response+0xf3/0xf3 [usb_storage]
[    7.989326]  [<ffffffff81075e19>] ? kthread+0xa8/0xb0
[    7.989328]  [<ffffffff814e61e4>] ? kernel_thread_helper+0x4/0x10
[    7.989331]  [<ffffffff814de894>] ? retint_restore_args+0x13/0x13
[    7.989333]  [<ffffffff81075d71>] ? __init_kthread_worker+0x5a/0x5a
[    7.989335]  [<ffffffff814e61e0>] ? gs_change+0x13/0x13
[    7.989337] handlers:
[    7.989339] [<ffffffffa00ce036>] sdhci_irq
[    7.989341] Disabling IRQ #17

--- Additional comment from mads on 2011-10-11 15:54:04 EDT ---

Wouldn't it be possible for you to push the kernel team towards consistently using some kind of more structured reporting of "Call Traces" with begin/end markers? That would ensure that you are ready for the rare and unexpected bug too.

Comment 1 Josh Boyer 2011-11-02 16:59:26 UTC
Moving this discussion to rawhide.  It's likely to not be fixed in f16, as this is really an upstream issue.

That traceback has no 'cut here' header because the function calls dump_stack directly instead of WARN_ON or BUG_ON.  There are numerous places in the kernel that just call dump_stack for various reasons.

Moving the "------------[ cut here ]------------\n" inside of dump_stack isn't really feasible either, because often other information is desired in the cut portion that dump_stack doesn't provide (like the list of modules, or a register dump, or some other informational message).
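To make the distinction concrete, here is a minimal sketch (not abrt's actual parser) that walks dmesg lines and notes, for each "Call Trace:" line, whether it sits inside a 'cut here' block. The function name `classify_call_traces` and the exact shape of the "end trace" marker regex are assumptions made for illustration.

```python
import re

# Marker printed ahead of WARN_ON/BUG_ON style reports.
CUT_HERE = "------------[ cut here ]------------"
# Assumed shape of the closing marker; the id is a hex string.
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def classify_call_traces(dmesg_lines):
    """Return (line_index, inside_cut_here) for every 'Call Trace:' line."""
    results = []
    inside_cut = False
    for index, line in enumerate(dmesg_lines):
        if CUT_HERE in line:
            inside_cut = True
        elif END_TRACE.search(line):
            inside_cut = False
        elif line.rstrip().endswith("Call Trace:"):
            results.append((index, inside_cut))
    return results
```

A trace produced by a bare dump_stack call, like the "irq 17: nobody cared" one above, would come back with `inside_cut_here` set to False, which is the case a parser keyed on the header never sees.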

Comment 2 Josh Boyer 2011-11-08 00:46:45 UTC
I think we're going to close this out as CANTFIX because there doesn't really seem to be a suitable all-around solution for the various usages.

Comment 3 Mads Kiilerich 2011-11-08 11:17:28 UTC
What do you recommend abrt should do? How can it generate reports that are (slightly) useful for you as a kernel guy?

Should it just add handling of new kinds of traces as they are reported incorrectly? And perhaps fall back to report 20 lines before an unknown "Call Trace:"?

I assume there is a kind of a privacy issue here and it thus can't report too much of the log automatically.
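The fallback idea could look roughly like the sketch below: for each "Call Trace:" line, keep up to 20 preceding lines plus the stack frames that follow. This is only an illustration of the suggestion, not abrt code; `extract_with_context`, the `context` parameter and the frame regex are hypothetical. Note that the preceding lines are copied verbatim, so anything unrelated in that window ends up in the report, which is exactly the privacy concern.

```python
import re

# A stack frame line carries an address such as [<ffffffff810bef8a>].
FRAME = re.compile(r"\[<[0-9a-f]+>\]")


def extract_with_context(dmesg_lines, context=20):
    """Return one list of lines per 'Call Trace:' found in dmesg_lines."""
    reports = []
    for index, line in enumerate(dmesg_lines):
        if not line.rstrip().endswith("Call Trace:"):
            continue
        # Up to `context` lines before the trace header.
        start = max(0, index - context)
        # Extend past the header while lines still look like frames.
        end = index + 1
        while end < len(dmesg_lines) and FRAME.search(dmesg_lines[end]):
            end += 1
        reports.append(dmesg_lines[start:end])
    return reports
```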

Comment 4 Jiri Moskovcak 2011-11-08 13:59:39 UTC
I discussed it with another kernel developer and I was told that if there is no "BUG: " or "WARNING: " or other known oops delimiter we (ABRT) should just ignore it.
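As a sketch of that rule (an illustration only, not the code in abrt), a block of log lines is reported only if it contains one of the known delimiters. The two strings come from the advice above; the name `should_report` is hypothetical, and any further delimiters would have to be added to the tuple.

```python
# Delimiters named in the comment above; the list is meant to be extended.
KNOWN_DELIMITERS = ("BUG: ", "WARNING: ")


def should_report(trace_lines, delimiters=KNOWN_DELIMITERS):
    """Return True if any line of the block contains a known oops delimiter."""
    return any(marker in line for line in trace_lines for marker in delimiters)
```

Under this rule the "irq 17: nobody cared" trace quoted earlier would be ignored, since none of its lines contain "BUG: " or "WARNING: ".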

Comment 5 Mads Kiilerich 2011-11-08 14:07:25 UTC
It seems strange that the kernel should emit stack traces as a part of its normal operation. I would assume that a stack trace implies that something should be fixed.

Ok, there is no reason to report an 'initrd not found' trace, but I guess that is a strange exception.