Bug 1985598

Summary: Kernel won't boot - AMDGPU
Product: [Fedora] Fedora Reporter: Kurt Heine <kheine7>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 34 CC: acaringi, adscvr, airlied, alciregi, bskeggs, fhirtz, hdegoede, jarodwilson, jeremy, jglisse, jonathan, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, ptalbert, steved
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-06-07 22:42:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description: Kernel boot message Flags: none
Description: Picture of kernel boot Flags: none
Description: Boot kernel journal message for kernel 5.13.9 Flags: none
Description: lspci for system Flags: none
Description: Boot kernel journal message for kernel 5.13.9 - drm.debug=0x1fff Flags: none

Description Kurt Heine 2021-07-24 08:08:28 UTC
1. Please describe the problem:

When booting, the kernel stops at "fb0: switching to amdgpudrmfb from EFI VGA".

2. What is the Version-Release number of the kernel:

Happens with the latest kernel 5.13.4 but works fine with 5.12.x releases.

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Happens with the latest kernel 5.13.4 but works fine with 5.12.x releases.
Reboot with kernel 5.13.4.


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

I don't want to install rawhide kernel as I can use the 5.12 kernels to boot.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 Kurt Heine 2021-07-24 08:09:27 UTC
Created attachment 1805087 [details]
Kernel boot message

Comment 2 Kurt Heine 2021-07-25 03:57:31 UTC
Comment on attachment 1805087 [details]
Kernel boot message

I have also tried the following command, which didn't fix the problem, although I was still able to boot the 5.12.x kernels:

sudo dracut --regenerate-all --force --install "/usr/lib/firmware/amdgpu/*"

Comment 3 Kurt Heine 2021-07-25 23:09:11 UTC
The following packages are installed:

linux-firmware-20210716-121.fc34.noarch
dracut-055-3.fc34.x86_64

I also tried sudo dracut --regenerate-all --force --install "/lib/firmware/amdgpu/*" and that didn't work either.
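One way to confirm whether a regenerated initramfs actually contains the amdgpu firmware is to filter its file listing. The following is a minimal sketch, assuming the listing comes from dracut's lsinitrd tool; the function name list_amdgpu_fw is hypothetical, and it only filters text on standard input.

```shell
#!/bin/sh
# Hypothetical helper: read an initramfs file listing on stdin and print
# only the entries under firmware/amdgpu/ whose file name starts with the
# given prefix (no prefix: print every amdgpu firmware entry).
list_amdgpu_fw() {
    grep -F -- "firmware/amdgpu/${1:-}"
}

# Example usage (assumes lsinitrd is installed and the initramfs lives at
# the usual Fedora path; adjust both for your system):
#   sudo lsinitrd "/boot/initramfs-$(uname -r).img" | list_amdgpu_fw navi10
```

If the function prints nothing, the firmware files were not picked up by the --install pattern.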

Comment 4 Kurt Heine 2021-07-28 04:14:49 UTC
Created attachment 1806476 [details]
Picture of kernel boot

Screen capture of issue.

Comment 5 Kurt Heine 2021-07-28 04:15:45 UTC
Arch has the same issue logged via:

https://bugs.archlinux.org/task/71605

Comment 6 Kurt Heine 2021-07-29 22:57:38 UTC
Updated to the latest kernel from Fedora - kernel-5.13.5-200.fc34.x86_64 and still no joy.

I did notice that when it booted it couldn't find the amdgpu/navi10_smc.bin file and returned an error -22 on firmware loading.  I uncompressed (xz) the firmware file and regenerated the initrd file with 

sudo dracut --regenerate-all --force --install "/lib/firmware/amdgpu/*" 

and this fixed the firmware loading issue, but the boot still gets 'stuck' at the framebuffer switch:

fb0: switching to amdgpudrmfb from EFI VGA

So what steps are there to look into this issue?
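Since the firmware error in this comment went away once the file was uncompressed, it may help to see which firmware files exist only as .xz archives. This is a minimal sketch; the function name compressed_only_fw is hypothetical, and it assumes the firmware files follow the *.bin / *.bin.xz naming seen here.

```shell
#!/bin/sh
# Hypothetical helper: list firmware files in a directory that exist only
# as .xz archives, i.e. there is no uncompressed file of the same name.
compressed_only_fw() {
    dir="$1"
    for xz_file in "$dir"/*.bin.xz; do
        [ -e "$xz_file" ] || continue      # the glob matched nothing
        plain="${xz_file%.xz}"             # e.g. .../navi10_smc.bin
        [ -e "$plain" ] || basename "$xz_file"
    done
}

# Example usage (directory taken from the dracut command above):
#   compressed_only_fw /lib/firmware/amdgpu
# To decompress one file while keeping the archive:
#   sudo xz --decompress --keep /lib/firmware/amdgpu/navi10_smc.bin.xz
```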

Comment 7 Kurt Heine 2021-07-29 23:02:26 UTC
The /etc/sysconfig/grub file

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="acpi_backlight=vendor"
GRUB_CMDLINE_LINUX_DEFAULT="acpi_backlight=vendor amdgpu.dpm=0 amdgpu.runpm=0 pcie_port_pm=force mitigations=off amdgpu.ppfeaturemask=0xffffffff"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

Comment 8 Kurt Heine 2021-07-29 23:36:58 UTC
I can bypass the issue if I add the
   nomodeset
parameter to the kernel command line in grub. Whilst that gets me around the booting issue, it means the 5.12 -> 5.13 kernel update costs me functionality, so it would be good to understand the reason why.
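To confirm that a given boot really ran with nomodeset, the running kernel's command line can be inspected. This is a minimal sketch; the function name has_karg is hypothetical, and it assumes /proc/cmdline is readable, as it normally is on Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line (second
# argument) contains the exact option given as the first argument.
has_karg() {
    wanted="$1"
    # Intentional word splitting: each space-separated option is one word.
    for arg in $2; do
        [ "$arg" = "$wanted" ] && return 0
    done
    return 1
}

# Example usage: report whether the running kernel was booted with nomodeset.
if [ -r /proc/cmdline ] && has_karg nomodeset "$(cat /proc/cmdline)"; then
    echo "nomodeset is active: kernel mode setting is disabled"
fi
```

The comparison is on whole options, so a parameter that merely contains the string "nomodeset" does not count as a match.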

Comment 9 Kurt Heine 2021-08-11 03:33:04 UTC
I have tried Fedora kernels from 5.13.4 through to 5.13.8 and still there is no fix.

Comment 10 Kurt Heine 2021-08-11 07:43:14 UTC
I have included boot logs from the 5.13.9-200 Fedora kernel update.

Comment 11 Kurt Heine 2021-08-11 07:44:23 UTC
Created attachment 1813004 [details]
Boot kernel journal message for kernel 5.13.9

Comment 12 Kurt Heine 2021-08-11 07:44:49 UTC
Created attachment 1813005 [details]
lspci for system

Comment 13 Kurt Heine 2021-08-11 07:59:55 UTC
Created attachment 1813007 [details]
Boot kernel journal message for kernel 5.13.9 - drm.debug=0x1fff

Comment 14 Kurt Heine 2021-08-15 22:52:08 UTC
I have now tried kernel-5.13.9-200 and it works, but I cannot get the second external monitor working with DisplayPort.

EFI grub2.conf
   BOOT_IMAGE=(hd3,gpt2)/vmlinuz-5.12.17-300.fc34.x86_64 root=UUID=8cfd63b2-f18e-437b-b1c7-a8abafb8e245 ro rootflags=subvol=root acpi_backlight=vendor mitigations=off pcie_port_pm=force amdgpu.ppfeaturemask=0xffffffff drm.debug=0x1ff

I have removed all options and added them back one per reboot, but cannot get it working yet.
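The one-option-per-reboot approach can be written down as a list of cumulative option sets, which makes it easier to record which boot first misbehaves. This is a minimal sketch; the function name incremental_kargs is hypothetical, and the options are the ones from the command line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the cumulative option sets to try, one per
# reboot, so the first failing line points at the option just added.
incremental_kargs() {
    current=""
    step=0
    for opt in "$@"; do
        step=$((step + 1))
        current="${current:+$current }$opt"
        printf 'boot %d: %s\n' "$step" "$current"
    done
}

# Options copied from the kernel command line quoted above.
incremental_kargs acpi_backlight=vendor mitigations=off pcie_port_pm=force \
    amdgpu.ppfeaturemask=0xffffffff drm.debug=0x1ff
```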

Comment 15 Ben Cotton 2022-05-12 15:49:48 UTC
This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 34 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 16 Ben Cotton 2022-06-07 22:42:13 UTC
Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07.

Fedora Linux 34 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release.

Thank you for reporting this bug and we are sorry it could not be fixed.