Bug 1551285

Summary: Kernel: Booting with module ipmi_si crashes the kernel
Product: [Fedora] Fedora
Reporter: Dieter Stolte <dstolte>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 26
CC: airlied, bskeggs, ewk, hdegoede, ichavero, itamar, jarodwilson, jcline, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linville, mchehab, mjg59, rkudyba, steved
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Linux
Last Closed: 2018-05-29 11:24:42 UTC
Type: Bug
Attachments:
Kernel Trace with kernel 4.15.9 (flags: none)

Description Dieter Stolte 2018-03-04 12:20:09 UTC
Description of problem:
After updating from kernel-4.14.14-200.fc26.x86_64 to kernel-4.15.4-200.fc26.x86_64 the system hangs on boot while loading/initializing the module ipmi_si. The system is an HP ProLiant MicroServer N40L.

I get the following message on screen and in the logs:

Mär 04 09:18:45 nas kernel: ipmi message handler version 39.2
Mär 04 09:18:45 nas kernel: ipmi device interface
Mär 04 09:18:45 nas kernel: IPMI System Interface driver.
Mär 04 09:18:45 nas kernel: ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
Mär 04 09:18:45 nas kernel: ipmi_si: SMBIOS: io 0xca8 regsize 1 spacing 1 irq 0
Mär 04 09:18:45 nas kernel: ipmi_si: Adding SMBIOS-specified kcs state machine
Mär 04 09:18:45 nas kernel: ipmi_platform: probing via SPMI
Mär 04 09:18:45 nas kernel: ipmi_si: SPMI: mem 0x0 regsize 1 spacing 1 irq 0
Mär 04 09:18:45 nas kernel: ipmi_si: Adding SPMI-specified smic state machine
Mär 04 09:18:45 nas kernel: ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 0
Mär 04 09:18:45 nas kernel: ipmi_si dmi-ipmi-si.0: Interface detection failed
Mär 04 09:18:45 nas kernel: ipmi_si: Trying SPMI-specified smic state machine at mem address 0x0, slave address 0x0, irq 0
Mär 04 09:18:45 nas kernel: ipmi_si ipmi_si.0: Could not set up I/O space
Mär 04 09:18:45 nas kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
Mär 04 09:18:45 nas kernel: IP: kernfs_find_ns+0x11/0xb0
Mär 04 09:18:45 nas kernel: PGD 0 P4D 0 
Mär 04 09:18:46 nas kernel: Oops: 0000 [#1] SMP NOPTI
Mär 04 09:18:46 nas kernel: Modules linked in: ipmi_si(+) ipmi_devintf ipmi_msghandler k10temp ptp pps_core sp5100_tco i2c_piix4 shpchp acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ast i2c_algo_bit drm_kms_helper uas ttm usb_storage ata_generic drm pata_acpi p
Mär 04 09:18:46 nas kernel: CPU: 1 PID: 503 Comm: systemd-udevd Not tainted 4.15.4-200.fc26.x86_64 #1
Mär 04 09:18:46 nas kernel: Hardware name: HP ProLiant MicroServer, BIOS O41     07/29/2011
Mär 04 09:18:46 nas kernel: RIP: 0010:kernfs_find_ns+0x11/0xb0
Mär 04 09:18:46 nas kernel: RSP: 0018:ffffad0fc0937b28 EFLAGS: 00010246
Mär 04 09:18:46 nas kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000004e00
Mär 04 09:18:46 nas kernel: RDX: 0000000000000000 RSI: ffffffff99ea6dc8 RDI: 0000000000000000
Mär 04 09:18:46 nas kernel: RBP: ffffffff99ea6dc8 R08: 0000000000025120 R09: ffffffffc03bc828
Mär 04 09:18:46 nas kernel: R10: ffffef9c045d33c0 R11: 0000000000000001 R12: 0000000000000000
Mär 04 09:18:46 nas kernel: R13: ffff8a8d97ad4120 R14: 0000000000000000 R15: 0000000000000000
Mär 04 09:18:46 nas kernel: FS:  00007fe6bae61200(0000) GS:ffff8a8d9fc80000(0000) knlGS:0000000000000000
Mär 04 09:18:46 nas kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mär 04 09:18:46 nas kernel: CR2: 0000000000000070 CR3: 0000000119bce000 CR4: 00000000000006e0
Mär 04 09:18:46 nas kernel: Call Trace:
Mär 04 09:18:46 nas kernel:  kernfs_find_and_get_ns+0x2c/0x50
Mär 04 09:18:46 nas kernel:  sysfs_unmerge_group+0x18/0x60
Mär 04 09:18:46 nas kernel:  dpm_sysfs_remove+0x1d/0x60
Mär 04 09:18:46 nas kernel:  device_del+0x56/0x350
Mär 04 09:18:46 nas kernel:  platform_device_del.part.10+0x1e/0x70
Mär 04 09:18:46 nas kernel:  platform_device_unregister+0x13/0x20
Mär 04 09:18:46 nas kernel:  try_smi_init+0x8c9/0x104e [ipmi_si]
Mär 04 09:18:46 nas kernel:  init_ipmi_si+0x180/0x1b0 [ipmi_si]
Mär 04 09:18:46 nas kernel:  ? ipmi_si_add_smi+0x250/0x250 [ipmi_si]
Mär 04 09:18:46 nas kernel:  do_one_initcall+0x4e/0x190
Mär 04 09:18:46 nas kernel:  ? free_unref_page_commit+0x9f/0x110
Mär 04 09:18:46 nas kernel:  ? _cond_resched+0x15/0x40
Mär 04 09:18:46 nas kernel:  ? kmem_cache_alloc_trace+0xac/0x1b0
Mär 04 09:18:46 nas kernel:  ? do_init_module+0x22/0x201
Mär 04 09:18:46 nas kernel:  do_init_module+0x5b/0x201
Mär 04 09:18:46 nas kernel:  load_module+0x26b1/0x2b60
Mär 04 09:18:46 nas kernel:  ? alloc_vmap_area+0x7c/0x350
Mär 04 09:18:46 nas kernel:  ? SYSC_init_module+0x160/0x190
Mär 04 09:18:46 nas kernel:  ? _cond_resched+0x15/0x40
Mär 04 09:18:46 nas kernel:  SYSC_init_module+0x160/0x190
Mär 04 09:18:46 nas kernel:  do_syscall_64+0x75/0x180
Mär 04 09:18:46 nas kernel:  entry_SYSCALL_64_after_hwframe+0x21/0x86
Mär 04 09:18:46 nas kernel: RIP: 0033:0x7fe6b9ab611a
Mär 04 09:18:46 nas kernel: RSP: 002b:00007fff0d984fa8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Mär 04 09:18:46 nas kernel: RAX: ffffffffffffffda RBX: 000055e5c0185830 RCX: 00007fe6b9ab611a
Mär 04 09:18:46 nas kernel: RDX: 00007fe6ba5f0e8d RSI: 000000000001dc53 RDI: 000055e5c01e7dc0
Mär 04 09:18:46 nas kernel: RBP: 00007fe6ba5f0e8d R08: 000055e5c0182c70 R09: 0000000000000050
Mär 04 09:18:46 nas kernel: R10: 00007fe6b9d74b00 R11: 0000000000000246 R12: 000055e5c01e7dc0
Mär 04 09:18:46 nas kernel: R13: 000055e5c017ece0 R14: 0000000000020000 R15: 000055e5be7ffdf7
Mär 04 09:18:46 nas kernel: Code: 18 b8 01 00 00 00 5b 5d c3 48 83 6e 40 01 48 8b 77 08 eb da 66 0f 1f 44 00 00 0f 1f 44 00 00 41 55 41 54 48 85 d2 55 53 0f 95 c1 <0f> b7 47 70 49 89 d4 49 89 f5 66 83 e0 20 0f 95 c2 38 d1 75 4f 
Mär 04 09:18:46 nas kernel: RIP: kernfs_find_ns+0x11/0xb0 RSP: ffffad0fc0937b28
Mär 04 09:18:46 nas kernel: CR2: 0000000000000070
Mär 04 09:18:46 nas kernel: ---[ end trace 5c918daa4fe81ee6 ]---


Version-Release number of selected component (if applicable):
Linux version 4.15.4-200.fc26.x86_64 (mockbuild@bkernel02.phx2.fedoraproject.org) (gcc version 7.3.1 20180130 (Red Hat 7.3.1-2) (GCC)) #1 SMP Mon Feb 19 19:43:32 UTC 2018

How reproducible:
100%

Steps to Reproduce:
1. Boot kernel 4.15.4-200.fc26.x86_64; the crash occurs while the module ipmi_si is being loaded.

Actual results:
The kernel oopses while initializing ipmi_si and the boot hangs.

Expected results:
The kernel boots normally.

Additional info:
1. Updating to 4.15.6-200.fc26.x86_64 doesn't help.
2. Someone else has the same problem: https://bugs.archlinux.org/task/57429
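
For readers triaging the trace above: the register dump shows RDI = 0 and CR2 = 0x70, which is consistent with kernfs_find_ns reading a field at offset 0x70 from a NULL pointer, and the first ipmi_si frame on the stack is try_smi_init calling platform_device_unregister. The following is a minimal Python sketch, not part of the original report, that pulls those facts out of an excerpt of the log. The function name summarize_oops and the regular expressions are illustrative assumptions that only target the line layout seen in this report.

```python
import re

# Excerpt of the trace from this report; journal prefixes are kept as logged.
OOPS = """\
Mär 04 09:18:45 nas kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
Mär 04 09:18:45 nas kernel: IP: kernfs_find_ns+0x11/0xb0
Mär 04 09:18:46 nas kernel: Call Trace:
Mär 04 09:18:46 nas kernel:  kernfs_find_and_get_ns+0x2c/0x50
Mär 04 09:18:46 nas kernel:  sysfs_unmerge_group+0x18/0x60
Mär 04 09:18:46 nas kernel:  dpm_sysfs_remove+0x1d/0x60
Mär 04 09:18:46 nas kernel:  device_del+0x56/0x350
Mär 04 09:18:46 nas kernel:  platform_device_del.part.10+0x1e/0x70
Mär 04 09:18:46 nas kernel:  platform_device_unregister+0x13/0x20
Mär 04 09:18:46 nas kernel:  try_smi_init+0x8c9/0x104e [ipmi_si]
Mär 04 09:18:46 nas kernel:  init_ipmi_si+0x180/0x1b0 [ipmi_si]
Mär 04 09:18:46 nas kernel:  ? ipmi_si_add_smi+0x250/0x250 [ipmi_si]
Mär 04 09:18:46 nas kernel:  do_one_initcall+0x4e/0x190
Mär 04 09:18:46 nas kernel: RIP: 0033:0x7fe6b9ab611a
"""

# A stack frame line: leading space, optional "? " (unreliable frame),
# symbol+offset/size, optional "[module]".
FRAME = re.compile(r"^ (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?$")
FAULT = re.compile(r"BUG: unable to handle kernel NULL pointer dereference at ([0-9a-f]+)")
IP = re.compile(r"IP: ([\w.]+)\+")


def summarize_oops(text):
    """Return fault address, faulting symbol and the reliable call-trace frames."""
    fault_addr = None
    fault_symbol = None
    frames = []
    in_trace = False
    for line in text.splitlines():
        # Drop the journal prefix ("<date> <host> kernel:") and one separator space.
        _, sep, msg = line.partition("kernel:")
        if not sep:
            continue
        msg = msg[1:]
        m = FAULT.match(msg)
        if m:
            fault_addr = int(m.group(1), 16)
            continue
        m = IP.match(msg)
        if m:
            fault_symbol = m.group(1)
            continue
        if msg.startswith("Call Trace:"):
            in_trace = True
            continue
        if in_trace:
            m = FRAME.match(msg)
            if m is None:
                in_trace = False  # first non-frame line ends the trace
                continue
            unreliable, symbol, module = m.groups()
            if not unreliable:
                frames.append((symbol, module))
    # First frame that belongs to a loadable module, if any.
    first_module_frame = next(((s, mod) for s, mod in frames if mod), None)
    return {
        "fault_addr": fault_addr,
        "fault_symbol": fault_symbol,
        "frames": frames,
        "first_module_frame": first_module_frame,
    }


summary = summarize_oops(OOPS)
print(hex(summary["fault_addr"]), summary["fault_symbol"], summary["first_module_frame"])
```

Frames marked with "?" are skipped because the kernel itself flags them as unreliable stack entries.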

Comment 1 Laura Abbott 2018-03-05 22:45:13 UTC
Please try the scratch build at https://koji.fedoraproject.org/koji/taskinfo?taskID=25506144

*** This bug has been marked as a duplicate of bug 1549316 ***

Comment 2 Dieter Stolte 2018-03-09 21:22:52 UTC
Some questions:
1. Is there also a kernel for Fedora 26 to test?
2. I installed the RPM kernel-core-4.15.7-200.fc26.x86_64 from updates-testing. Is this the proposed fixed kernel? If so, it still fails with the error message above.
3. How do I install a scratch build kernel?

Comment 3 Jeremy Cline 2018-03-14 14:09:20 UTC
Hi Dieter,

According to the bug this was marked a duplicate of, this got fixed in 4.15.8. I think that build failed for F26, but 4.15.9 should be in updates-testing now.

Comment 4 Dieter Stolte 2018-03-19 10:39:56 UTC
Hi Jeremy,

Thanks for your reply. I installed kernel 4.15.9-200.fc26.x86_64, but it still crashes in the ipmi module:

Mär 19 10:43:08 nas kernel: Linux version 4.15.9-200.fc26.x86_64 (mockbuild@bkernel01.phx2.fedoraproject.org) (gcc version 7.3.1 20180130 (Red Hat 7.3.1-2) (GCC)) #1 SMP Mon Mar 12 17:11:43 UTC 2018

...

Mär 19 10:33:42 nas kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030

That's all there is in the log files. I took a screenshot of the kernel trace but can't transfer it from my phone because Bluetooth is also broken in Fedora 26 at the moment. I'm trying to fix that.

I reopened the bug nevertheless because it is clearly not fixed.

Comment 5 Dieter Stolte 2018-03-19 10:53:00 UTC
Created attachment 1409783 [details]
Kernel Trace with kernel 4.15.9

Screenshot of kernel trace

Comment 6 Laura Abbott 2018-03-19 15:21:30 UTC
Can you try a rawhide kernel? The maintainer has been fixing things so it's possible something was missed from stable.

Comment 7 Dieter Stolte 2018-03-19 16:10:58 UTC
Installed kernel-4.16.0-0.rc5.git3.1.fc29.x86_64.rpm from rawhide. It doesn't work; same trace.

Comment 8 Fedora End Of Life 2018-05-03 07:56:10 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 9 RobbieTheK 2018-05-15 19:14:17 UTC
We're on Fedora 27 with 4.16.7-200.fc27.x86_64 and the BMC/IPMI no longer responds on an HP DL180 G6. Not sure if this is the same issue; if not, let me know and I'll create a new bug report.

[Tue May 15 09:03:30 2018] ipmi message handler version 39.2
[Tue May 15 09:03:30 2018] ipmi device interface
[Tue May 15 09:03:30 2018] IPMI System Interface driver.
[Tue May 15 09:03:30 2018] ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
[Tue May 15 09:03:30 2018] ipmi_si: SMBIOS: io 0xca2 regsize 1 spacing 1 irq 0
[Tue May 15 09:03:30 2018] ipmi_si: Adding SMBIOS-specified kcs state machine
[Tue May 15 09:03:30 2018] ipmi_si IPI0001:00: ipmi_platform: probing via ACPI
[Tue May 15 09:03:30 2018] ipmi_si IPI0001:00: [io  0x0ca2] regsize 1 spacing 1 irq 0
[Tue May 15 09:03:30 2018] ipmi_si dmi-ipmi-si.0: Removing SMBIOS-specified kcs state machine in favor of ACPI
[Tue May 15 09:03:30 2018] ipmi_si: Adding ACPI-specified kcs state machine
[Tue May 15 09:03:30 2018] ipmi_platform: probing via SPMI
[Tue May 15 09:03:30 2018] ipmi_si: SPMI: mem 0x0 regsize 1 spacing 1 irq 0
[Tue May 15 09:03:30 2018] ipmi_si: Adding SPMI-specified kcs state machine
[Tue May 15 09:03:30 2018] ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca2, slave address 0x20, irq 0
[Tue May 15 09:03:30 2018] ipmi_si IPI0001:00: There appears to be no BMC at this location
[Tue May 15 09:03:30 2018] ipmi_si: Trying SPMI-specified kcs state machine at mem address 0x0, slave address 0x0, irq 0
[Tue May 15 09:03:30 2018] ipmi_si ipmi_si.0: Could not set up I/O space
[Tue May 15 09:03:30 2018] IPMI SSIF Interface driver

I've tried these settings:
cat /sys/module/ipmi_si/parameters/kipmid_max_busy_us
100

cat /etc/modprobe.d/ipmi_si.conf
options ipmi_si type=kcs ports=0xca8 regspacings=4
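
One thing a reader can check from the material in this comment: the options line sets ports=0xca8 and regspacings=4, while the ACPI probe line in the log reports io 0x0ca2 with spacing 1. Whether the options were in effect when that log was captured is not stated, so the following Python sketch only compares the two strings and draws no conclusion about the cause. The helper names (parse_modprobe_options, parse_probe_line, compare) are hypothetical and the parsing assumes a single port value.

```python
import re

# Both strings are copied from this comment.
MODPROBE_LINE = "options ipmi_si type=kcs ports=0xca8 regspacings=4"
DMESG_LINE = "ipmi_si IPI0001:00: [io  0x0ca2] regsize 1 spacing 1 irq 0"


def parse_modprobe_options(line):
    """Split an 'options <module> key=value ...' line into (module, params)."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "options":
        raise ValueError("not a modprobe options line: %r" % line)
    params = dict(p.split("=", 1) for p in parts[2:])
    return parts[1], params


def parse_probe_line(line):
    """Extract port, regsize and spacing from an ipmi_si probe log line."""
    m = re.search(r"\[io\s+(0x[0-9a-f]+)\]\s+regsize\s+(\d+)\s+spacing\s+(\d+)", line)
    if m is None:
        raise ValueError("unrecognized probe line: %r" % line)
    return {
        "port": int(m.group(1), 16),
        "regsize": int(m.group(2)),
        "spacing": int(m.group(3)),
    }


def compare(params, probed):
    """List the settings that differ between the config and the probe line."""
    diffs = []
    # Assumption: only the first comma-separated entry is compared.
    port = int(params["ports"].split(",")[0], 0)
    if port != probed["port"]:
        diffs.append("ports: configured %#x, firmware reports %#x" % (port, probed["port"]))
    spacing = int(params["regspacings"].split(",")[0], 0)
    if spacing != probed["spacing"]:
        diffs.append("regspacings: configured %d, firmware reports %d" % (spacing, probed["spacing"]))
    return diffs


module, params = parse_modprobe_options(MODPROBE_LINE)
probed = parse_probe_line(DMESG_LINE)
diffs = compare(params, probed)
for d in diffs:
    print(d)
```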

Comment 10 Laura Abbott 2018-05-15 19:20:27 UTC
This bug was about a kernel panic; what you have is a different issue. If you have a working and a non-working kernel, your best bet is to do a bisect or report the issue to the upstream maintainer directly.
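
As a rough illustration of what a bisect does, here is a minimal Python sketch of a binary search over an ordered list of builds. Comment 9 does not name a working kernel, so the example uses the builds from the original report in this bug instead (4.14.14 worked, the 4.15.x builds did not). The function first_bad and the is_bad predicate are hypothetical; a real bisect would normally be run with git bisect on the upstream kernel tree and would test commits rather than packaged builds.

```python
def first_bad(builds, is_bad):
    """Return the first bad build in an oldest-to-newest list.

    Assumes builds[0] is good, builds[-1] is bad, and that every build
    after the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1
    if is_bad(builds[lo]) or not is_bad(builds[hi]):
        raise ValueError("need a good first build and a bad last build")
    # Invariant: builds[lo] is good, builds[hi] is bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Builds mentioned in this bug, oldest first, with the results reported here.
BUILDS = ["4.14.14", "4.15.4", "4.15.6", "4.15.7", "4.15.9"]
REPORTED_BAD = {"4.15.4", "4.15.6", "4.15.7", "4.15.9"}

result = first_bad(BUILDS, lambda version: version in REPORTED_BAD)
print(result)
```

In practice is_bad would mean "boot this build and check whether the problem reproduces", which is why halving the search range at each step matters.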

Comment 11 Fedora End Of Life 2018-05-29 11:24:42 UTC
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.