Bug 572533
| Summary: | [abrt] informational WARN_ON in mtrr generates backtrace | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Brock Organ <borgan> |
| Component: | kernel | Assignee: | Prarit Bhargava <prarit> |
| Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 6.0 | CC: | arozansk, esandeen, james.leddy |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | abrt_hash:62439878 | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2010-04-06 13:56:10 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Brock Organ
2010-03-11 13:43:31 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Major release. This request is not yet committed for inclusion.

It's not a crash, it's the kernel fixing up a BIOS problem:
> mtrr: your BIOS has set up an incorrect mask, fixing it up.
Hi Eric, if the error is informational, do you think it is still proper behaviour to oops? Can the detection just be set to notify, or is there something else going on here that warrants the kernel to behave this way? (I'm trying to figure out if this is really a bug, and if not, how to keep the oops from triggering abrt so that many other users don't report a similar problem)

As far as I can tell, it did not oops. What makes you say it was an oops? (backtrace != oops ... the kernel can dump_stack() for "how'd we get here" information any time it pleases)

I'm referring to the event that triggered abrt ... if the error is only informational, then what changes make sense to keep this error from being repeatedly reported ...?

I'm not sure ... I don't know what heuristics abrt uses. But AFAIK this is just an informational message.
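
As a rough illustration of the backtrace-vs-oops distinction discussed above, the following Python sketch sorts kernel log lines into fatal events and plain warnings. It is not abrt's actual heuristic, which nobody in this thread claims to know. The function name `classify` and the marker patterns are assumptions chosen for the example, and the sample log lines are made up rather than taken from this report.

```python
import re

# Assumed marker patterns, for illustration only (not abrt's real rules).
WARNING_RE = re.compile(r"^WARNING: at ")
FATAL_RE = re.compile(r"Oops: |kernel BUG at |Kernel panic - not syncing")
# Optional printk timestamp prefix such as "[   12.345678] ".
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def classify(log_lines):
    """Return 'fatal', 'warning', or 'none' for a chunk of kernel log."""
    kind = "none"
    for line in log_lines:
        text = TIMESTAMP_RE.sub("", line)
        if FATAL_RE.search(text):
            # An oops, BUG() or panic outranks any warning seen so far.
            return "fatal"
        if WARNING_RE.search(text):
            kind = "warning"
    return kind


if __name__ == "__main__":
    # Hypothetical log excerpt: a WARN-style trace followed by the mtrr message.
    sample = [
        "[    0.000000] WARNING: at some/file.c:1 some_function()",
        "[    0.000000] mtrr: your BIOS has set up an incorrect mask, fixing it up.",
    ]
    print(classify(sample))
```

With a split like this, a reporting tool could file only the "fatal" class automatically and leave WARN-style traces to the user's discretion.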
Well, it appears that we're not the only ones to notice:
commit 942fa3b63eb525aa0512ba28c42e656d8efc6787
Author: Alan Cox <alan.com>
Date: Mon Feb 8 10:03:17 2010 +0000
x86, mtrr: Kill over the top warn
Fixes bugzilla: http://bugzilla.kernel.org/show_bug.cgi?id=12558
Fixes bugzilla: http://bugzilla.kernel.org/show_bug.cgi?id=12317
(and if this really needed to be a warn you'd be responding to the bugs left
in bugzilla from it...)
Signed-off-by: Alan Cox <alan.com>
LKML-Reference: <20100208100239.2568.2940.stgit>
Signed-off-by: H. Peter Anvin <hpa>
and from that bug:
"It's not an oops - it's just a noisy warning. The kernel is boasting that
your bios is busted, and we fixed it up.
That warning should be toned down a bit - it just misleads people."
So I guess perhaps we should look at pulling back that fix/change.
-Eric
Brock, Eric is 100% correct here. A trace != oops/panic.
If abrt is only supposed to report panics or oopses it should differentiate between BUG() warnings and panics. I'm not sure if it has the smarts to do so ...
Having said that -- what HW was this seen on?
P.

(In reply to comment #7)
> I'm not sure ... I don't know what heuristics abrt uses. But AFAIK this is
> just an informational message.
Ok, I'm not sure either ... I just don't want a flood of these informational messages cluttering the bug lists ... I'll have to check with the abrt folks to see what they are looking for ...

(In reply to comment #8)
> Brock, Eric is 100% correct here. A trace != oops/panic.
>
> If abrt is only supposed to report panics or oopses it should differentiate
> between BUG() warnings and panics. I'm not sure if it has the smarts to do so
> ...
>
> Having said that -- what HW was this seen on?
Hi Prarit, a Lenovo Thinkstation D10 649325U Desktop Server ...
Regards,
Brock

I took a look at the code, and this is definitely not a bug. The message is a valid warning about the BIOS on the system. The BIOS has set an incorrect MTRR Physical Mask value, and the OS has detected the problem.
Bottom line -- if we see this in the field it would let customers and partners know that there was something wrong with their BIOSes.
P.

Prarit, maybe we should change this to not WARN_ON, and just printk, as suggested in bug 579563 ?
Backtraces scare people (and abrt) and it's not useful in this case...
-Eric

(In reply to comment #12)
> Prarit, maybe we should change this to not WARN_ON, and just printk, as
> suggested in bug 579563 ?
>
> Backtraces scare people (and abrt) and it's not useful in this case...
>
> -Eric
This makes sense, we have a couple of bugs of this, perhaps we should mark them all dups of bug 579563.
bug 614021
bug 607246
bug 579563
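
For context on what an "incorrect MTRR Physical Mask" can look like, here is a small Python model of one plausible case: a BIOS that builds the mask for fewer physical address bits than the CPU actually has, leaving the top mask bits clear. The thread does not say which bits were wrong on this machine, so the 36-bit and 40-bit widths, the function name `fixup_mask`, and the repair rule (set every bit above the highest set bit, up to the address width) are all assumptions for illustration. This is a userspace sketch, not the kernel's code.

```python
PAGE_SHIFT = 12  # 4 KiB pages; masks below are expressed in page units


def fixup_mask(mask_pfn, phys_bits):
    """Return (fixed_mask, was_fixed) for a page-granular range mask.

    mask_pfn  -- the mask as programmed by firmware, shifted right by PAGE_SHIFT
    phys_bits -- assumed physical address width of the CPU
    """
    width = phys_bits - PAGE_SHIFT
    full = (1 << width) - 1           # all mask bits the CPU implements
    mask_pfn &= full
    if mask_pfn == 0:
        return 0, False               # nothing to expand
    hi = mask_pfn.bit_length()        # 1-based position of the highest set bit
    # Set every implemented bit from the highest set bit upwards.
    fixed = mask_pfn | (full & ~((1 << (hi - 1)) - 1))
    return fixed, fixed != mask_pfn


if __name__ == "__main__":
    # Hypothetical 2 GiB range whose mask was built for 36 address bits
    # but is evaluated on a CPU assumed to have 40.
    fixed, was_fixed = fixup_mask(0xF80000, 40)
    if was_fixed:
        # A plain message, with no stack dump, is all the situation needs.
        print("mask fixed up: %#x" % fixed)
```

Reporting the repair with a single message line, as the sketch does, matches the printk-only approach suggested in the last comments.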