Bug 1414025

Summary: Kernel 4.9.3-200 breaks frame buffer
Product: [Fedora] Fedora
Component: xorg-x11-drv-ati
Version: 25
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Reporter: Steven A. Falco <stevenfalco>
Assignee: X/OpenGL Maintenance List <xgl-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: cz172638, gansalmon, ichavero, itamar, jonathan, kernel-maint, labbott, madhu.chinakonda, mchehab, xgl-maint
Target Milestone: ---
Target Release: ---
Story Points: ---
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Last Closed: 2017-06-01 14:34:55 UTC
Type: Bug
Attachments:
Description          Flags
xorg log file        none
dmesg log            none
/var/log/messages    none

Description Steven A. Falco 2017-01-17 15:00:04 UTC
Created attachment 1241807
xorg log file

Description of problem: After upgrading to kernel 4.9.3-200, I no longer see kernel messages during boot; I just get a black screen.  Xorg.0.log also contains errors like:

[    33.444] (EE) open /dev/dri/card0: No such file or directory
[    33.445] (EE) open /dev/fb0: No such file or directory
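
Both errors say that a device node is absent, which can be confirmed from a text console or over ssh. A minimal Python sketch of that check follows; the helper name missing_device_nodes is made up for illustration, and the default paths are simply the two named in the log lines above.

```python
import os

# Device nodes named in the Xorg errors above.
DEFAULT_NODES = ("/dev/dri/card0", "/dev/fb0")


def missing_device_nodes(paths=DEFAULT_NODES):
    """Return the subset of paths that do not exist on this system."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    missing = missing_device_nodes()
    if missing:
        print("missing device nodes:", ", ".join(missing))
    else:
        print("all expected device nodes are present")
```

If both nodes are reported missing, the X server is not at fault; the kernel never created them.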

Additionally, acceleration no longer works in VirtualBox.  I had to turn it off; otherwise the virtual machine showed only a black screen.

The previous kernel (4.8.16-300) did not have this problem, so this is a regression.

My video card is [AMD/ATI] Tonga PRO [Radeon R9 285/380] (rev f1)

I also see the following errors in the logwatch report:

WARNING:  Kernel Errors Present
    WARNING: CPU: 2 PID: 471 at drivers/gpu/drm/ttm/ ...:  1 Time(s)
    [drm:amdgpu_cgs_get_firmware_info [amdgpu]] *ERROR* Failed to reque ...:  1 Time(s)
    [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP b ...:  1 Time(s)
    amdgpu 0000:05:00.0: Direct firmware load for amdgpu/tonga_k_smc.bin failed with error -2 ...:  1 Time(s)
    amdgpu 0000:05:00.0: Fatal error during GPU init ...:  1 Time(s)
    amdgpu: probe of 0000:05:00.0 failed with error -22 ...:  1 Time(s)

It looks like firmware load for the video card is broken in this kernel.
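
The numeric codes in the messages above are negated Linux errno values, so they can be decoded with Python's standard errno module. The sketch below is illustrative only: the helper name decode_errors and the regular expression are assumptions, and the sample lines are copied from the logwatch excerpt above.

```python
import errno
import re

# Matches the "... with error -N" suffix printed on failure.
ERROR_RE = re.compile(r"with error -(\d+)")


def decode_errors(lines):
    """Return (errno number, symbolic name) for each line that carries a code."""
    decoded = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            number = int(match.group(1))
            decoded.append((number, errno.errorcode.get(number, "UNKNOWN")))
    return decoded


# Sample lines taken from the logwatch report in this bug.
SAMPLE = [
    "amdgpu 0000:05:00.0: Direct firmware load for amdgpu/tonga_k_smc.bin failed with error -2",
    "amdgpu: probe of 0000:05:00.0 failed with error -22",
]

if __name__ == "__main__":
    for number, name in decode_errors(SAMPLE):
        print(f"-{number}: {name}")
```

Error -2 is ENOENT (no such file or directory), which is consistent with the firmware file not being found; -22 is EINVAL (invalid argument).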

I attached the full xorg log file and dmesg log.

Version-Release number of selected component (if applicable): 4.9.3-200


How reproducible: 100%


Steps to Reproduce:
1. Boot the machine with kernel 4.9.3-200.

Actual results:
no framebuffer
video card firmware not loaded


Expected results:
usable framebuffer
video card firmware properly loaded

Additional info:

Comment 1 Steven A. Falco 2017-01-17 15:00:38 UTC
Created attachment 1241808
dmesg log

Comment 2 Laura Abbott 2017-01-17 16:47:45 UTC
Moving this for the graphics team to take a look

Comment 3 Laura Abbott 2017-01-17 21:42:09 UTC
I pulled a couple of 'low-hanging' ATI graphics fixes into this build:

https://koji.fedoraproject.org/koji/taskinfo?taskID=17314239

Can you please test it? This update will also be available in Bodhi.

Comment 4 Steven A. Falco 2017-01-17 23:02:16 UTC
I tried 4.9.4-201.fc25 but it has the same problem.

I went back to 4.8.16-300.fc25.  It behaves properly.

Comment 5 Steven A. Falco 2017-01-25 13:35:20 UTC
I tried 4.9.5-200.fc25.  It too is bad.  In fact, I got no video and no way to log in.  I eventually needed to power cycle and reboot into the 4.8.16-300.fc25 kernel.

I'll add /var/log/messages from the failed boot.

Comment 6 Steven A. Falco 2017-01-25 13:38:05 UTC
Created attachment 1244265
/var/log/messages

Note that we get some suspicious messages:

Jan 25 08:17:44 saf kernel: amdgpu 0000:05:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

Jan 25 08:17:44 saf kernel: fbcon: amdgpudrmfb (fb0) is primary device
Jan 25 08:18:44 saf systemd-udevd: seq 2983 '/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0' is taking a long time
Jan 25 08:18:45 saf systemd-udevd: seq 3759 '/devices/virtual/vtconsole/vtcon1' is taking a long time

At that point, with a black screen, I tried Ctrl-Alt-Del with no effect.  I finally wound up just power cycling.
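
For context on the "Invalid PCI ROM header signature" message: a PCI expansion ROM image begins with the bytes 0x55 0xAA, which read as the little-endian 16-bit value 0xaa55. A value of 0xffff often means the read returned all ones rather than ROM contents. A minimal Python sketch of that comparison follows; the helper names rom_signature and has_valid_signature are hypothetical and not taken from the kernel source.

```python
import struct

PCI_ROM_SIGNATURE = 0xAA55  # bytes 0x55, 0xAA at offset 0 of the ROM image


def rom_signature(image: bytes) -> int:
    """Return the 16-bit little-endian value at the start of the image."""
    if len(image) < 2:
        raise ValueError("ROM image is shorter than 2 bytes")
    (value,) = struct.unpack_from("<H", image, 0)
    return value


def has_valid_signature(image: bytes) -> bool:
    """True when the image starts with the expected PCI ROM signature."""
    return rom_signature(image) == PCI_ROM_SIGNATURE


if __name__ == "__main__":
    # First image has a valid header; second mimics an all-ones read.
    for image in (b"\x55\xaa\x00\x00", b"\xff\xff\xff\xff"):
        print(f"got {rom_signature(image):#06x}, valid={has_valid_signature(image)}")
```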

Comment 7 Steven A. Falco 2017-01-31 16:37:26 UTC
I tried kernel 4.9.6 with similarly bad results.  I noticed that there is a similar kernel bug here:

https://bugzilla.kernel.org/show_bug.cgi?id=193651

so I cross-linked the bugs and added some new log files there.

Comment 8 Steven A. Falco 2017-06-01 12:27:27 UTC
I just upgraded to kernel 4.11.3-200.fc25.x86_64 and the problem no longer occurs.

Please close the bug.