Bug 1607901

Summary: Kernel 4.18 causing systemd to coredump on boot on i686
Product: [Fedora] Fedora
Reporter: Jeff Backus <jeff.backus>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 28
CC: airlied, bskeggs, dominik, ewk, hdegoede, ichavero, itamar, jarodwilson, jeff.backus, jforbes, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linville, mchehab, mjg59, steved
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-05-28 22:44:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1489998    
Attachments:
  Portion of systemd log showing coredump (flags: none)
  Patch to fix udev segfault issue. (flags: none)

Description Jeff Backus 2018-07-24 14:13:12 UTC
Created attachment 1470305
Portion of systemd log showing coredump

Description of problem:
System fails to boot with kernel 4.18 on i686.

Version-Release number of selected component (if applicable):
Rawhide and F28, using any of the kernel release candidates after rc0.git2.1. On F28, systemd is the version available from the official repos.

How reproducible:
With small sample size, able to reproduce 66% of time. Able to reproduce in a VM.
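
A rough way to read the 66% figure: if each boot of an affected kernel fails independently with probability p, then n clean boots in a row still leave a (1 - p)^n chance that the kernel is actually bad. The Python sketch below computes how many clean boots are needed before calling a kernel good. Independence between boots and the helper name boots_needed are assumptions made for illustration, not something measured in this report.

```python
import math


def boots_needed(p_repro: float, max_miss: float) -> int:
    """Return the smallest n with (1 - p_repro) ** n <= max_miss.

    p_repro:  assumed per-boot probability that a bad kernel crashes.
    max_miss: acceptable chance of wrongly calling a bad kernel good.
    Assumes boots are independent trials (an illustrative assumption).
    """
    if not 0.0 < p_repro < 1.0:
        raise ValueError("p_repro must be strictly between 0 and 1")
    if not 0.0 < max_miss < 1.0:
        raise ValueError("max_miss must be strictly between 0 and 1")
    return math.ceil(math.log(max_miss) / math.log(1.0 - p_repro))


if __name__ == "__main__":
    for miss in (0.05, 0.01):
        print(miss, boots_needed(0.66, miss))
```

Under those assumptions, three clean boots in a row leave under a 5% chance of mislabelling a bad kernel as good, and five leave under 1%.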

Steps to Reproduce:
1. Install one of the kernel 4.18 release candidates after rc0.git2.1 on a 32bit Fedora 28 install
2. Reboot and select 4.18 kernel

Actual results:
Booting halts, dropping to dracut prompt. Log indicates coredump. Last systemd msg prior to coredump:
Jul 24 04:41:18 localhost.localdomain systemd[661]: var-lib-nfs-rpc_pipefs.mount: Executing: /usr/bin/mount sunrpc /var/lib/nfs/rpc_pipefs -t rpc_pipefs

I've attached a portion of the log from boot. Full log available on request.

Expected results:
Successful boot, leaving user at graphical login

Additional info:
Was able to narrow down when this showed up to somewhere between kernel dist-git commits dc16ce7d36f (good, produced rc0.git2.1) and 6a5d7f80f2 (bad, produced rc0.git5.1). The following commits would not compile, probably due to a soundwire issue (fix introduced in 6a5d7f80f2):
  - 037431cf9
  - 6cf9fb960
  - 9382c1533
  - 4b8512e91
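
The narrowing described above is a bisection in which some commits cannot be tested. A minimal Python sketch of that search follows; the function name first_bad_candidates, the test callback, and the assumption that the commits are listed oldest to newest are all illustrative and not taken from the report. A callback result of None stands for "would not compile".

```python
from typing import Callable, List, Optional, Sequence


def first_bad_candidates(
    commits: Sequence[str],
    test: Callable[[str], Optional[bool]],
) -> List[str]:
    """Bisect commits (oldest first) where commits[0] is known good
    and commits[-1] is known bad.

    test(commit) returns True if the bug shows, False if it does not,
    and None if the commit cannot be tested (for example, build failure).
    Returns every commit that could still be the first bad one.
    """
    lo, hi = 0, len(commits) - 1  # last known good, first known bad
    skipped = set()
    while True:
        untested = [i for i in range(lo + 1, hi) if i not in skipped]
        if not untested:
            break
        middle = (lo + hi) / 2
        # Pick the testable commit closest to the midpoint.
        i = min(untested, key=lambda k: abs(k - middle))
        verdict = test(commits[i])
        if verdict is None:
            skipped.add(i)
        elif verdict:
            hi = i
        else:
            lo = i
    return list(commits[lo + 1 : hi + 1])


if __name__ == "__main__":
    # Ordering assumed to match the list in this report.
    history = ["dc16ce7d36f", "037431cf9", "6cf9fb960",
               "9382c1533", "4b8512e91", "6a5d7f80f2"]
    # All four intermediate commits fail to build: nothing can be ruled out.
    print(first_bad_candidates(history, lambda c: None))
```

With every intermediate commit untestable, the search can only report the whole range as suspect, which is why getting those commits to build is the next step.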

Comment 1 Justin M. Forbes 2018-07-24 15:30:29 UTC
Adding it to the i686 blocker, the SIG should track this down and fix it.

Comment 2 Jeff Backus 2018-07-24 15:39:57 UTC
(In reply to Justin M. Forbes from comment #1)
> Adding it to the i686 blocker, the SIG should track this down and fix it.

Thanks, Justin. Working on it.

Comment 3 Jeff Backus 2018-07-25 16:27:49 UTC
Cherry-picked dist-git commit 6a5d7f80f2 into the commits above that failed to compile. Was able to get 037431cf9 and 6cf9fb960 to compile and they don't show the segfault bug. Commit 9382c1533 compiles and does show the segfault issue, so I suspect the bug was introduced there. I believe this dist-git commit corresponds to kernel commit 1c8c5a9d38f6...

Comment 4 Jeff Backus 2018-08-09 12:34:49 UTC
Finally managed to track it down to commit 24dea04767e6 in Linus's tree. Interestingly, the commit message is:

> Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from x32 JIT.

I'll try contacting upstream and keep digging...

Comment 5 Jeff Backus 2018-08-09 21:33:45 UTC
Created attachment 1474835
Patch to fix udev segfault issue.

Comment 6 Jeff Backus 2018-08-09 21:38:47 UTC
Found the issue. Looks like upstream reduced the size of the stack when making the above-mentioned JIT improvements. It appears that the JIT is overrunning its stack, inducing random crashes.

The patch I just added fixes the issue. Next week I'll work on submitting to upstream.

Comment 7 Laura Abbott 2018-08-10 12:08:53 UTC
That's a great find!

Comment 8 Laura Abbott 2018-10-01 21:17:47 UTC
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs.
 
Fedora 28 has now been rebased to 4.18.10-300.fc28.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 29, and are still experiencing this issue, please change the version to Fedora 29.
 
If you experience different issues, please open a new bug report for those.

Comment 9 Dominik 'Rathann' Mierzejewski 2018-10-04 17:59:01 UTC
I can confirm that this isn't occurring with 4.18.10-200.fc28.i686 running on Intel Atom N270.

Comment 10 Ben Cotton 2019-05-02 19:22:39 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Ben Cotton 2019-05-28 22:44:58 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 13 Red Hat Bugzilla 2023-09-14 04:32:01 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.