Bug 1279004

Summary:	WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2c5/0x380()
Product:	[Fedora] Fedora	Reporter:	Edward O'Callaghan <eocallaghan>
Component:	kernel	Assignee:	Kernel Maintainer List <kernel-maint>
Status:	CLOSED INSUFFICIENT_DATA	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	23	CC:	eocallaghan, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-10-26 16:48:16 UTC	Type:	Bug

Description Edward O'Callaghan 2015-11-07 01:43:40 UTC

Description of problem:

Kernel oops at boot :/

Version-Release number of selected component (if applicable):

[edward@foo ~]$ uname -a
Linux foo.bar 4.2.5-300.fc23.x86_64 #1 SMP Tue Oct 27 04:29:56 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Fedora 23, fully patched as of the date of this bug report.

How reproducible:

Boot Fedora 23 on a Lenovo T450.

Actual results:

[    1.112486] software IO TLB [mem 0xab0aa000-0xaf0aa000] (64MB) mapped at [ffff8800ab0aa000-ffff8800af0a9fff]
[    1.112676] RAPL PMU detected, API unit is 2^-32 Joules, 4 fixed counters 655360 ms ovfl timer
[    1.112680] hw unit of domain pp0-core 2^-14 Joules
[    1.112683] hw unit of domain package 2^-14 Joules
[    1.112685] hw unit of domain dram 2^-14 Joules
[    1.112687] hw unit of domain pp1-gpu 2^-14 Joules
[    1.112770] resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
[    1.112772] ------------[ cut here ]------------
[    1.112782] WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2c5/0x380()
[    1.112785] Info: mapping multiple BARs. Your kernel is fine.
[    1.112787] Modules linked in:

[    1.112796] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.5-300.fc23.x86_64 #1
[    1.112799] Hardware name: LENOVO 20BV0020AU/20BV0020AU, BIOS JBET51WW (1.16 ) 07/08/2015
[    1.112803]  0000000000000000 00000000abfe5e8b ffff88023460ba78 ffffffff8177280a
[    1.112810]  0000000000000000 ffff88023460bad0 ffff88023460bab8 ffffffff8109e4b6
[    1.112815]  ffff88023460bae8 00000000fed10000 ffffc90000e90000 0000000000006000
[    1.112821] Call Trace:
[    1.112831]  [<ffffffff8177280a>] dump_stack+0x45/0x57
[    1.112839]  [<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0
[    1.112845]  [<ffffffff8109e545>] warn_slowpath_fmt+0x55/0x70
[    1.112852]  [<ffffffff81065b95>] __ioremap_caller+0x2c5/0x380
[    1.112858]  [<ffffffff81065c67>] ioremap_nocache+0x17/0x20
[    1.112867]  [<ffffffff81039c89>] snb_uncore_imc_init_box+0x79/0xb0
[    1.112873]  [<ffffffff81038434>] uncore_pci_probe+0xd4/0x1a0
[    1.112881]  [<ffffffff813e61a5>] local_pci_probe+0x45/0xa0
[    1.112890]  [<ffffffff8129a67d>] ? sysfs_do_create_link_sd.isra.2+0x6d/0xb0
[    1.112897]  [<ffffffff813e739d>] pci_device_probe+0xed/0x140
[    1.112905]  [<ffffffff814d07f4>] driver_probe_device+0x1f4/0x450
[    1.112911]  [<ffffffff814d0ae0>] __driver_attach+0x90/0xa0
[    1.112918]  [<ffffffff814d0a50>] ? driver_probe_device+0x450/0x450
[    1.112924]  [<ffffffff814ce2bc>] bus_for_each_dev+0x6c/0xc0
[    1.112930]  [<ffffffff814cfffe>] driver_attach+0x1e/0x20
[    1.112936]  [<ffffffff814cfb4b>] bus_add_driver+0x1eb/0x280
[    1.112943]  [<ffffffff81d65149>] ? uncore_cpu_setup+0x12/0x12
[    1.112949]  [<ffffffff814d1350>] driver_register+0x60/0xe0
[    1.112956]  [<ffffffff813e5a8c>] __pci_register_driver+0x4c/0x50
[    1.112961]  [<ffffffff81d6521b>] intel_uncore_init+0xd2/0x2be
[    1.112967]  [<ffffffff81d65149>] ? uncore_cpu_setup+0x12/0x12
[    1.112973]  [<ffffffff81002123>] do_one_initcall+0xb3/0x200
[    1.112979]  [<ffffffff810bbee1>] ? parse_args+0x271/0x4a0
[    1.112986]  [<ffffffff81778c00>] ? ldsem_down_write+0x170/0x199
[    1.112993]  [<ffffffff81d571dc>] kernel_init_freeable+0x18e/0x228
[    1.112999]  [<ffffffff81768f50>] ? rest_init+0x80/0x80
[    1.113003]  [<ffffffff81768f5e>] kernel_init+0xe/0xe0
[    1.113010]  [<ffffffff817795df>] ret_from_fork+0x3f/0x70
[    1.113014]  [<ffffffff81768f50>] ? rest_init+0x80/0x80
[    1.113023] ---[ end trace a6c58dc9f39c03db ]---
[    1.113175] microcode: CPU0 sig=0x306d4, pf=0x40, revision=0x21
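
The arithmetic behind the "resource sanity check" line in the log above can be sketched as follows. This is a simplified illustration in Python, not the kernel's actual check; the helper names `spans_beyond` and `size_of` are hypothetical, and the addresses are copied from the log.

```python
# Simplified sketch of the range comparison reported by the
# "resource sanity check" log line; NOT the kernel's actual code.
# Helper names are hypothetical, chosen for illustration only.

def spans_beyond(request, resource):
    """Return True if `request` starts inside `resource` but ends past it.

    Both arguments are (start, end) tuples with inclusive end addresses,
    matching the [mem start-end] notation used in the kernel log.
    """
    req_start, req_end = request
    res_start, res_end = resource
    starts_inside = res_start <= req_start <= res_end
    return starts_inside and req_end > res_end


def size_of(rng):
    """Size in bytes of an inclusive (start, end) range."""
    start, end = rng
    return end - start + 1


# Addresses copied from the log.
requested = (0xFED10000, 0xFED15FFF)   # range requested for mapping
pnp_00_01 = (0xFED10000, 0xFED13FFF)   # resource claimed by pnp 00:01

print(hex(size_of(requested)))             # 0x6000
print(hex(size_of(pnp_00_01)))             # 0x4000
print(spans_beyond(requested, pnp_00_01))  # True
```

The requested mapping is 0x6000 bytes while the pnp 00:01 resource covers only the first 0x4000, which is the overlap the message complains about. The 0x6000 size appears to match the 0000000000006000 word that sits next to 00000000fed10000 in the stack dump.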


Additional info:

Let me know what other information I can provide.

Kind Regards,

Comment 1 Josh Boyer 2015-11-10 13:13:59 UTC

Is anything not working here?  As far as we know, this is just an unhelpful informational message.

Comment 2 Edward O'Callaghan 2015-11-10 23:21:53 UTC

Josh, it _is_ helpful if you read it carefully.

From bisecting, I believe it was introduced around commit 15c1247953e8a45232ed5a5540f291d2d0a77665

The issue looks to be caused by uncore box initialization; the box init certainly should not run in the IPI context. This patch [1] moves the box init into the uncore event init, which seems like the way to go.

 $ cat /proc/cpuinfo | grep model
model           : 61
model name      : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
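
The model number above can be cross-checked against the `sig=0x306d4` value on the microcode line of the boot log. A minimal sketch in Python, assuming that value is the standard CPUID leaf 1 signature; the function name `decode_cpuid_signature` is hypothetical.

```python
def decode_cpuid_signature(sig):
    """Split a CPUID leaf 1 EAX signature into (family, model, stepping).

    Assumes the standard x86 layout: stepping in bits 3:0, model in 7:4,
    family in 11:8, extended model in 19:16, extended family in 27:20.
    """
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family
    if base_family == 0xF:
        family += ext_family

    # The extended model is only used for base families 0x6 and 0xF.
    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4

    return family, model, stepping


family, model, stepping = decode_cpuid_signature(0x306D4)
print(family, model, stepping)  # 6 61 4
```

The decoded model, 61 (0x3d), agrees with the `model : 61` line reported by /proc/cpuinfo.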

Kind Regards,
Edward.

[1] https://lkml.org/lkml/2015/4/28/21

Comment 3 Josh Boyer 2015-11-11 01:22:19 UTC

(In reply to Edward O'Callaghan from comment #2)
> Josh, it _is_ helpful if you read it carefully.
> 
> bisecting I believe it occurs around commit
> 15c1247953e8a45232ed5a5540f291d2d0a77665
> 
> The issue looks to be caused by uncore box initialization, certainly the box
> init should not be in the IPI context. This patch^[1] moves the box init
> into the uncore event init which seems like the way to go.

The patch you pointed to was never upstreamed, because the commit you bisected to (I think?) was reverted with 15c1247953e8a45232ed5a5540f291d2d0a77665, which is in 4.1-rc8.  So the kernel you hit this with already has a fix for that issue.

It's certainly possible something else is still needed, but it isn't that patch.

You also did not answer my question.  Is anything actually not working on your machine?  Does the boot continue after the oops?
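
Whether the kernel kept running after the trace can be read from the log itself. A minimal sketch in Python, assuming the dmesg output is available as text; the function name `lines_after_trace` is hypothetical, and the sample lines are copied from the description.

```python
import re

# Matches the marker printed at the end of a warning or oops trace,
# e.g. "[    1.113023] ---[ end trace a6c58dc9f39c03db ]---".
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def lines_after_trace(log_text):
    """Return the non-empty log lines that follow the last end-trace marker.

    A non-empty list means the kernel kept logging after the trace was
    printed. Returns None if the log contains no end-trace marker.
    """
    lines = log_text.splitlines()
    last = None
    for i, line in enumerate(lines):
        if END_TRACE.search(line):
            last = i
    if last is None:
        return None
    return [line for line in lines[last + 1:] if line.strip()]


# Sample lines copied from the log in the description.
sample = """\
[    1.113014]  [<ffffffff81768f50>] ? rest_init+0x80/0x80
[    1.113023] ---[ end trace a6c58dc9f39c03db ]---
[    1.113175] microcode: CPU0 sig=0x306d4, pf=0x40, revision=0x21
"""

print(lines_after_trace(sample))
```

For the excerpt in the description, the microcode line follows the end-trace marker, so the kernel did continue past the warning at least that far.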

Comment 4 Edward O'Callaghan 2015-11-30 16:04:28 UTC

OK, so the problem though is that the commit that reverts the previous commit brings back the original problem. The point of the revert was that the commit that attempted to fix the original problem was confused.

Naturally the oops causes erratic system behavior, as would of course be expected. The kernel does not oops for giggles; however, you are right that an oops isn't a full panic, so the machine can obviously boot.

I'm not sure I can provide any further information.

Comment 5 Edward O'Callaghan 2015-11-30 16:06:57 UTC

This is a potential duplicate: bug 1083853

Comment 6 Laura Abbott 2016-09-23 19:32:42 UTC

*********** MASS BUG UPDATE **************
 
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 23 kernel bugs.
 
Fedora 23 has now been rebased to 4.7.4-100.fc23.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 24 or 25, and are still experiencing this issue, please change the version to Fedora 24 or 25.
 
If you experience different issues, please open a new bug report for those.

Comment 7 Laura Abbott 2016-10-26 16:48:16 UTC

*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

Comment 8 Red Hat Bugzilla 2023-09-14 03:12:41 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days