Bug 1612416 - kernel-4.17.11-1 on fc28 x86_64 fails to boot on AMD Ryzen can't mount /boot/efi which is vfat
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-04 05:27 UTC by Andrew Roberts
Modified: 2018-10-02 14:18 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-10-02 14:18:53 UTC
Type: Bug
Embargoed:


Attachments
journalctl entries for failed boot. (132.13 KB, text/plain)
2018-08-04 05:27 UTC, Andrew Roberts
gnome-shell core dump (4.60 KB, text/plain)
2018-08-05 07:35 UTC, Andrew Roberts

Description Andrew Roberts 2018-08-04 05:27:09 UTC
Created attachment 1473243 [details]
journalctl entries for failed boot.

Description of problem:

dnf update, installed kernel-4.17.11-1.fc28.x86_64
noted kernel update and set system to boot into text mode:

systemctl set-default -f multi-user.target

booted new kernel to text mode

installed NVIDIA x86_64-396.45 driver (manually) - NVIDIA GeForce GTX 1060 6Gb (GP106-A) 
rebuilt and installed it87 driver for new kernel

no errors building and installing either driver

set system to boot into graphical mode:

systemctl set-default -f graphical.target

rebooted

The system hung during reboot, so I reset it and tried to boot the new kernel. Got a systemd message saying it had booted into emergency mode, and it requested a password. Unable to continue the boot; rebooting into the same kernel gave the same result.

Booted old kernel in text mode and all was well again, rebuilt drivers and
booted old kernel into graphical mode (again ok). Old kernel was 4.17.9-200

Version-Release number of selected component (if applicable):

kernel-4.17.11-1.fc28.x86_64

How reproducible:

Always with this system


Steps to Reproduce:

See above

Actual results:

Boots to emergency mode:
Aug 04 05:11:44 ryzen systemd[1]: Mounting /boot/efi...
Aug 04 05:11:44 ryzen mount[864]: mount: /boot/efi: unknown filesystem type 'vfat'.
Aug 04 05:11:44 ryzen systemd[1]: boot-efi.mount: Mount process exited, code=exited status=32
Aug 04 05:11:44 ryzen systemd[1]: boot-efi.mount: Failed with result 'exit-code'.
Aug 04 05:11:44 ryzen systemd[1]: Failed to mount /boot/efi.
...
Aug 04 05:11:44 ryzen systemd[1]: Started Emergency Shell.
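
The mount failure above can be pulled out of a longer journal with a small filter. This is only a sketch: the function name `unknown_fs_mounts` is hypothetical, and it assumes the `mount:` error wording shown in the log lines above.

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and print
# "<mount point> <filesystem type>" for every mount that failed with
# "unknown filesystem type". Lines that do not match are dropped.
unknown_fs_mounts() {
    sed -n "s/.*mount: \(.*\): unknown filesystem type '\([^']*\)'.*/\1 \2/p"
}
```

Fed the journal of the failed boot (for example via `journalctl -b -1`, assuming the failed boot was the previous one), the mount error quoted above would be reduced to `/boot/efi vfat`.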


Expected results:

Normal operation

Additional info:

/boot/efi is:

/dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro)

did dracut fail to install vfat driver??
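
One way to narrow this down is to check whether the running kernel has vfat registered, and if not, whether a vfat module exists for it at all. The sketch below was not run as part of this report. The helper names `fs_registered` and `module_available` are hypothetical; `/proc/filesystems`, `modinfo -k` and `uname -r` are the standard interfaces.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a filesystem type is listed in a
# /proc/filesystems-style file (defaults to /proc/filesystems itself).
fs_registered() {
    fstype=$1
    list=${2:-/proc/filesystems}
    # Each line is "[nodev]<TAB>name", so the name is always the last field.
    awk -v t="$fstype" '$NF == t { found = 1 } END { exit !found }' "$list"
}

# Hypothetical helper: succeed if a loadable module with this name exists
# for the given kernel release (defaults to the running kernel).
module_available() {
    modinfo -k "${2:-$(uname -r)}" "$1" >/dev/null 2>&1
}

if fs_registered vfat; then
    echo "vfat is registered with the running kernel"
elif module_available vfat; then
    echo "vfat module exists for $(uname -r) but is not loaded"
else
    echo "no vfat support found for $(uname -r)"
fi
```

If the last branch is taken, the modules installed under /lib/modules probably do not match the booted kernel, which would point at the kernel package install rather than at the hardware. That is an assumption, not something this report confirms.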

System hardware:

Processor: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
Motherboard: Asus PRIME B350-PLUS, BIOS 4011 04/19/2018
Graphics card: NVIDIA GeForce GTX 1060 6Gb (GP106-A) 

Kernel command line:
Aug 04 06:11:42 ryzen dracut-cmdline[318]: Using kernel command line parameters: BOOT_IMAGE=/vmlinuz-4.17.11-200.fc28.x86_64 root=/dev/mapper/VG_LinSSD-root ro rd.lvm.lv=VG_LinSSD/root rhgb quiet rd.driver.blacklist=nouveau LANG=en_GB.UTF-8

Comment 1 Andrew Roberts 2018-08-05 03:33:03 UTC
Tried again with updated kernel this time using new NVIDIA 396.51 driver.

The reboot from the old kernel to the new kernel in text mode took 15 secs from reboot command to bios splash screen.

After rebuilding the drivers (nvidia,it87) the reboot this time took 15 secs again. The previous attempt took far longer than this (over a minute) before I reset the machine. I have previously seen very long reboot times on this machine on occasion, with various kernels in the past.

After rebooting to graphical mode, the reboot again only took 15 secs, and this time it worked ok.

Not sure if the change of the NVIDIA driver is relevant, or if the initial install of the kernel failed to build the proper module support.

After the failure of the kernel the 1st time, I had uninstalled it. So for this new attempt I reinstalled the kernel using dnf update.

journalctl shows /boot/efi mounted without issues:

Aug 05 04:18:17 ryzen systemd[1]: Mounting /boot/efi...
Aug 05 04:18:17 ryzen systemd[1]: Mounted /boot/efi.

Comment 2 Andrew Roberts 2018-08-05 07:33:09 UTC
Ok, so this kernel (4.17.11-200.fc28.x86_64) still seems to be creating issues.
After reinstalling the kernel along with the latest NVIDIA 396.51 driver, I left the system logged in. Within 5 minutes the desktop had crashed. I found it three hours later. I had to reset the system again as neither keyboard nor mouse was functional.
The crash is in the gnome desktop.

Aug 05 04:18:19 ryzen systemd-coredump[1455]: Process 1382 (gnome-shell) of user 42 dumped core.

I've attached the full stack trace.

I've now rebooted back to the old (4.17.9-200.fc28.x86_64) kernel, and so far no crashes. I've been running FC 26-28 continually on this hardware with no crashes with the latest (at the time) NVIDIA driver since Apr 2017.

Comment 3 Andrew Roberts 2018-08-05 07:35:31 UTC
Created attachment 1473380 [details]
gnome-shell core dump

gnome-shell core dump from the system journal.

Comment 4 Andrew Roberts 2018-08-09 05:24:31 UTC
Kernel 4.17.12-200 seems to resolve these issues so far.

Comment 5 Laura Abbott 2018-10-01 21:19:31 UTC
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs.
 
Fedora 28 has now been rebased to 4.18.10-300.fc28.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 29, and are still experiencing this issue, please change the version to Fedora 29.
 
If you experience different issues, please open a new bug report for those.

Comment 6 Andrew Roberts 2018-10-02 03:14:08 UTC
I haven't seen this issue reoccur except with that specific kernel (4.17.11-200.fc28.x86_64). So ok to close.

Comment 7 Laura Abbott 2018-10-02 14:18:53 UTC
Thanks for letting us know

