Bug 1772313 - 5.3.11 update broke AMDGPU Raven Ridge
Summary: 5.3.11 update broke AMDGPU Raven Ridge
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: 31
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: dracut-maint-list
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-14 05:50 UTC by Luya Tshimbalanga
Modified: 2020-11-24 19:12 UTC
CC List: 20 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-11-24 19:12:57 UTC
Type: Bug
Embargoed:


Attachments
Boot report (220.95 KB, text/plain)
2019-11-14 05:50 UTC, Luya Tshimbalanga
dmesg report with kernel rawhide (322.92 KB, text/plain)
2019-11-14 06:06 UTC, Luya Tshimbalanga


Links
System ID Private Priority Status Summary Last Updated
Linux Kernel 205521 0 None None None 2019-11-14 06:19:50 UTC

Description Luya Tshimbalanga 2019-11-14 05:50:15 UTC
Created attachment 1635997 [details]
Boot report

1. Please describe the problem:

The latest stable kernel update, 5.3.11, has broken firmware for AMD Raven Ridge, resulting in a blank screen on boot and at the login session.

2. What is the Version-Release number of the kernel:

5.3.11

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Yes, 5.3.9 

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

- Update to kernel 5.3.11 on AMD Raven Ridge hardware


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 Luya Tshimbalanga 2019-11-14 06:06:30 UTC
Created attachment 1636000 [details]
dmesg report with kernel rawhide

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Same result. Root cause is the broken Raven Ridge firmware on these lines:

nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Direct firmware load for amdgpu/ra>
nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Failed to load gpu_info firmware ">
nov 13 13:53:55 kernel: amdgpu 0000:03:00.0: Fatal error during GPU init
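
To pull only these lines out of a full kernel log, a small filter along the following lines may help. This is a minimal sketch: the function name amdgpu_fw_errors is hypothetical, and the patterns are taken from the three log lines quoted above, so other firmware failures may be worded differently.

```shell
#!/bin/sh
# Hypothetical helper: print amdgpu firmware-load failures from a kernel log
# read on stdin. The patterns are based only on the messages quoted in this
# report; they are an assumption, not an exhaustive list.
amdgpu_fw_errors() {
    grep -E 'amdgpu .*(Direct firmware load for|Failed to load .* firmware|Fatal error during GPU init)'
}
```

It could be fed from the journal for the current boot, for example with `journalctl --no-hostname -k -b | amdgpu_fw_errors`.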

Comment 2 Luya Tshimbalanga 2019-11-14 06:19:50 UTC
Including the upstream report, as the issue affects all Raven Ridge hardware.

Comment 3 Justin M. Forbes 2019-11-14 17:58:15 UTC
Do you have /usr/lib/firmware/amdgpu/raven_gpu_info.bin installed on your system, and if so, is it listed in the initramfs? 'sudo lsinitrd /boot/initramfs-5.3.11-300.fc31.x86_64.img | grep raven'

Comment 4 Luya Tshimbalanga 2019-11-15 01:37:28 UTC
(In reply to Justin M. Forbes from comment #3)
> Do you have /usr/lib/firmware/amdgpu/raven_gpu_info.bin installed on your
> system,

Yes.
ls /usr/lib/firmware/amdgpu/ | grep raven_gpu
raven_gpu_info.bin

> and if so, is it listed in the initramfs? 'sudo lsinitrd
> /boot/initramfs-5.3.11-300.fc31.x86_64.img | grep raven'

'sudo lsinitrd /boot/initramfs-5.3.11-300.fc31.x86_64.img | grep raven' returned nothing. Contrast with

'sudo lsinitrd /boot/initramfs-5.3.9-300.fc31.x86_64.img | grep raven', which confirms that raven_gpu_info.bin is included:

sudo lsinitrd /boot/initramfs-5.3.9-300.fc31.x86_64.img | grep raven
-rw-r--r--   2 root     root        86528 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_asd.bin
-rw-r--r--   1 root     root         9344 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_ce.bin
-rw-r--r--   1 root     root          316 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_gpu_info.bin
-rw-r--r--   1 root     root        17536 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_me.bin
-rw-r--r--   2 root     root       268048 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_mec2.bin
-rw-r--r--   2 root     root            0 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_mec.bin
-rw-r--r--   1 root     root        21632 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_pfp.bin
-rw-r--r--   1 root     root        38324 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_rlc.bin
-rw-r--r--   1 root     root        17408 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_sdma.bin
-rw-r--r--   1 root     root       343456 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_vcn.bin
-rw-r--r--   1 root     root        78336 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_asd.bin
-rw-r--r--   1 root     root         9344 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_ce.bin
-rw-r--r--   1 root     root        23152 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_dmcu.bin
-rw-r--r--   2 root     root          316 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_gpu_info.bin
-rw-r--r--   1 root     root        39084 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_kicker_rlc.bin
-rw-r--r--   1 root     root        17536 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_me.bin
-rw-r--r--   2 root     root       268048 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_mec2.bin
-rw-r--r--   2 root     root            0 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_mec.bin
-rw-r--r--   1 root     root        21632 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_pfp.bin
-rw-r--r--   1 root     root        39084 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_rlc.bin
-rw-r--r--   2 root     root        17408 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_sdma.bin
-rw-r--r--   2 root     root       341728 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_vcn.bin

The same issue with the latest rawhide kernel.
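
The comparison between the two images can be reduced to a yes/no check for one file. The following is a minimal sketch, assuming the listing format shown above, where the path is the last field on each line. The helper name initramfs_has_file is hypothetical and not part of dracut.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the lsinitrd listing on stdin
# contains an entry whose last field is exactly the given path.
# Assumption: regular-file lines end with the path, as in the listing above;
# symlink lines ("name -> target") would not match on the link name.
initramfs_has_file() {
    awk -v p="$1" '$NF == p { found = 1 } END { exit !found }'
}
```

A possible use is `sudo lsinitrd /boot/initramfs-5.3.11-300.fc31.x86_64.img | initramfs_has_file usr/lib/firmware/amdgpu/raven_gpu_info.bin && echo present || echo missing`.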

Comment 5 Luya Tshimbalanga 2019-11-15 08:10:00 UTC
It looks like dracut was the culprit when using the command "dracut initramfs-5.3.11.fc31.x86_64". I made the mistake of regenerating with the command "dracut --regenerate", which resulted in the firmware not being installed properly. I am reinstalling Fedora 31 to see whether the issue reappears.
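
For reference, dracut's usual invocation takes an image path followed by a kernel version. The sketch below only prints such a command and does not run it. The helper name rebuild_cmd is hypothetical, and the /boot/initramfs-<version>.img path is an assumption based on the file names used in this report. Whether rebuilding this way would have restored the firmware here was not tested.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) a dracut command that would
# rebuild the initramfs for one kernel version.
# Assumption: images live at /boot/initramfs-<kver>.img, as in this report.
rebuild_cmd() {
    kver=$1
    # --force tells dracut to overwrite an existing image file.
    printf 'dracut --force /boot/initramfs-%s.img %s\n' "$kver" "$kver"
}
```

For example, `rebuild_cmd "$(uname -r)"` prints the command for the running kernel, which can then be reviewed and run with sudo.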

Comment 6 Luya Tshimbalanga 2019-11-15 16:05:04 UTC
I reinstalled Fedora 31 and updated to the current kernel 5.3.11. The boot proceeded normally and led to the login screen. Based on this test, it seems dracut somehow failed to include the firmware, and I am not sure how that happened.

sudo lsinitrd /boot/initramfs-5.3.11-300.fc31.x86_64.img | grep raven                                                        
-rw-r--r--   2 root     root        86528 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_asd.bin
-rw-r--r--   1 root     root         9344 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_ce.bin
-rw-r--r--   1 root     root          316 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_gpu_info.bin
-rw-r--r--   1 root     root        17536 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_me.bin
-rw-r--r--   2 root     root       268048 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_mec2.bin
-rw-r--r--   2 root     root            0 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_mec.bin
-rw-r--r--   1 root     root        21632 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_pfp.bin
-rw-r--r--   1 root     root        38324 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_rlc.bin
-rw-r--r--   1 root     root        17408 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_sdma.bin
-rw-r--r--   1 root     root       343456 Jul 24 15:24 usr/lib/firmware/amdgpu/raven2_vcn.bin
-rw-r--r--   1 root     root        78336 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_asd.bin
-rw-r--r--   1 root     root         9344 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_ce.bin
-rw-r--r--   1 root     root        23152 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_dmcu.bin
-rw-r--r--   2 root     root          316 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_gpu_info.bin
-rw-r--r--   1 root     root        39084 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_kicker_rlc.bin
-rw-r--r--   1 root     root        17536 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_me.bin
-rw-r--r--   2 root     root       268048 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_mec2.bin
-rw-r--r--   2 root     root            0 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_mec.bin
-rw-r--r--   1 root     root        21632 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_pfp.bin
-rw-r--r--   1 root     root        39084 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_rlc.bin
-rw-r--r--   2 root     root        17408 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_sdma.bin
-rw-r--r--   2 root     root       341728 Jul 24 15:24 usr/lib/firmware/amdgpu/raven_vcn.bin

Comment 7 Justin M. Forbes 2019-11-15 16:23:07 UTC
Yup, definitely a dracut error.

Comment 8 Ben Cotton 2020-11-03 17:17:12 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Ben Cotton 2020-11-24 19:12:57 UTC
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

