Bug 1742960 - Frozen display (green) on AMD Ryzen 5 2400G
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 30
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2019-08-18 03:47 UTC by Suvayu
Modified: 2019-08-18 07:01 UTC (History)

Attachments
dmesg log (3.30 MB, text/plain)
2019-08-18 03:47 UTC, Suvayu

Description Suvayu 2019-08-18 03:47:04 UTC
Created attachment 1605403
dmesg log

1. Please describe the problem:

When I boot up, everything seems fine: I can move the mouse, type into the login dialog box, switch to a text terminal and log in, etc.

But the moment I actually log in to a graphical desktop, the screen turns green and everything becomes unresponsive.  I can't even switch to a text terminal.  I have also been unable to log in remotely.  I have to force a hard shutdown of the machine at this point.

2. What is the Version-Release number of the kernel:

This problem happens with the 5.2.x series of kernels.  The latest I have tried is: kernel-5.2.8-200.fc30.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
      https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

I first experienced this in: kernel-5.2.5-200.fc30.x86_64

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

1. Boot a machine with a Ryzen 5 2400G
2. Log in to a graphical desktop session

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
      ``sudo dnf update --enablerepo=rawhide kernel``:

I couldn't install the kernel from Rawhide due to GPG errors:

# dnf update --releasever=rawhide --enablerepo=rawhide kernel
...
Key imported successfully
Import of key(s) didn't help, wrong key(s)?
...

I tried looking for the keys on https://getfedora.org/en/security/, but importing https://getfedora.org/static/fedora.gpg makes no difference.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
      issue occurred on a previous boot, use the journalctl ``-b`` flag.

I have attached the full output of `journalctl -b -1 --no-hostname -k > dmesg.txt` (the previous boot, `-b -1`, was with the problematic kernel), but here are some excerpts:

------------[ cut here ]------------
WARNING: CPU: 2 PID: 475 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1401 dcn_bw_update_from_pplib.cold+0x73/0x9c [amdgpu]
Modules linked in: amdgpu(+) hid_logitech_hidpp(+) amd_iommu_v2 gpu_sched i2c_algo_bit ttm drm_kms_helper crc32c_intel drm r8169 uas usb_storage hid_logitech_dj>
CPU: 2 PID: 475 Comm: systemd-udevd Not tainted 5.2.8-200.fc30.x86_64 #1
Hardware name: Gigabyte Technology Co., Ltd. AB350M-Gaming 3/AB350M-Gaming 3-CF, BIOS F23d 04/17/2018
RIP: 0010:dcn_bw_update_from_pplib.cold+0x73/0x9c [amdgpu]
Code: 48 8b 93 e0 02 00 00 db 42 78 83 f9 02 77 37 b8 02 00 00 00 8d 71 ff e9 1a 67 f7 ff 48 c7 c7 f8 e3 74 c0 31 c0 e8 9b 62 a9 ef <0f> 0b e9 94 67 f7 ff 48 c7>
RSP: 0018:ffffab0fc36f76b0 EFLAGS: 00010246
RAX: 0000000000000024 RBX: ffff94f56f8b6000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff94f580a97900
RBP: ffff94f56fa2e980 R08: 0000000000000001 R09: 00000000000003a8
R10: ffffffffb1bec958 R11: 0000000000000003 R12: ffffab0fc36f7750
R13: 0000000000000001 R14: 000000000000000a R15: ffffab0fc36f78d8
FS:  00007ff98b60d940(0000) GS:ffff94f580a80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005631e9403d68 CR3: 00000007f2f52000 CR4: 00000000003406e0
Call Trace:
 dcn10_create_resource_pool+0x975/0xa30 [amdgpu]
 ? lock_timer_base+0x61/0x80
 ? _cond_resched+0x15/0x30
 ? kmem_cache_alloc_trace+0x154/0x1c0
 ? firmware_parser_create+0x17e/0x5e0 [amdgpu]
 dc_create_resource_pool+0x188/0x230 [amdgpu]
 ? dal_gpio_service_create+0x95/0xe0 [amdgpu]
 dc_create+0x219/0x5e0 [amdgpu]
 ? amdgpu_cgs_create_device+0x23/0x50 [amdgpu]
 amdgpu_dm_init+0xeb/0x160 [amdgpu]
 dm_hw_init+0xe/0x20 [amdgpu]
 amdgpu_device_init.cold+0x128d/0x161f [amdgpu]
 amdgpu_driver_load_kms+0x88/0x270 [amdgpu]
 drm_dev_register+0x111/0x150 [drm]
 amdgpu_pci_probe+0xbd/0x120 [amdgpu]
 ? __pm_runtime_resume+0x58/0x80
 local_pci_probe+0x42/0x80
 pci_device_probe+0xfd/0x190
 really_probe+0xf0/0x380
 driver_probe_device+0x59/0xd0
 device_driver_attach+0x53/0x60
 __driver_attach+0x8a/0x150
 ? device_driver_attach+0x60/0x60
 bus_for_each_dev+0x78/0xc0
 bus_add_driver+0x14a/0x1e0
 driver_register+0x6c/0xb0
 ? 0xffffffffc089c000
 do_one_initcall+0x46/0x1f4
 ? _cond_resched+0x15/0x30
 ? kmem_cache_alloc_trace+0x154/0x1c0
 ? do_init_module+0x23/0x230
 load_module+0x233b/0x2930
 ? __do_sys_init_module+0x16e/0x1a0
 ? _cond_resched+0x15/0x30
 __do_sys_init_module+0x16e/0x1a0
 do_syscall_64+0x5f/0x1a0
 ? page_fault+0x8/0x30
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff98c60fd5e
Code: 48 8b 0d 2d 41 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3>
RSP: 002b:00007ffdc2c09908 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
RAX: ffffffffffffffda RBX: 00005631e842a090 RCX: 00007ff98c60fd5e
RDX: 00007ff98c26484d RSI: 00000000007058e6 RDI: 00005631e8cfe480
RBP: 00005631e8cfe480 R08: 00005631e8433700 R09: 0000000000000006
R10: 0000000000000007 R11: 0000000000000246 R12: 00007ff98c26484d
R13: 0000000000000001 R14: 00005631e8416580 R15: 00005631e8445e70
---[ end trace e68627bbb9265691 ]---

Despite this warning, amdgpu still manages to load, since I subsequently see this:

[drm] Initialized amdgpu 3.32.0 20150101 for 0000:06:00.0 on minor 0

However, further down, there is an unending list of backtraces that continues to the very end of the log, which I assume is the point where I forced the hard shutdown.  An example from these backtraces:

------------[ cut here ]------------
WARNING: CPU: 4 PID: 206 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hw_sequencer.c:854 dcn10_verify_allow_pstate_change_high.cold+0xc/0x229 [amdgpu]
Modules linked in: xt_CHECKSUM xt_MASQUERADE tun bridge stp llc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6t>
 crc32c_intel drm r8169 uas usb_storage hid_logitech_dj wmi video pinctrl_amd i2c_dev
CPU: 4 PID: 206 Comm: kworker/u32:7 Tainted: G        W         5.2.8-200.fc30.x86_64 #1
Hardware name: Gigabyte Technology Co., Ltd. AB350M-Gaming 3/AB350M-Gaming 3-CF, BIOS F23d 04/17/2018
Workqueue: events_unbound commit_work [drm_kms_helper]
RIP: 0010:dcn10_verify_allow_pstate_change_high.cold+0xc/0x229 [amdgpu]
Code: 83 c8 ff e9 01 b1 f9 ff 48 c7 c7 08 01 75 c0 e8 24 51 a9 ef 0f 0b 83 c8 ff e9 eb b0 f9 ff 48 c7 c7 08 01 75 c0 e8 0e 51 a9 ef <0f> 0b 80 bb 93 01 00 00 00>
RSP: 0018:ffffab0fc35e7b58 EFLAGS: 00010246
RAX: 0000000000000024 RBX: ffff94f56f8b6000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff94f580b17900
RBP: ffff94f56f8b6000 R08: 0000000000000001 R09: 0000000000000493
R10: ffffffffb1bf268c R11: 0000000000000003 R12: ffff94f550a081b8
R13: 0000000000000000 R14: ffff94f550a081b8 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff94f580b00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3ef9429000 CR3: 00000007a655a000 CR4: 00000000003406e0
Call Trace:
 dcn10_pipe_control_lock.part.0+0x69/0x70 [amdgpu]
 dc_commit_updates_for_stream+0x84c/0xc10 [amdgpu]
 amdgpu_dm_atomic_commit_tail+0xa79/0x1940 [amdgpu]
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? _cond_resched+0x15/0x30
 ? wait_for_completion_timeout+0x38/0x170
 ? finish_task_switch+0x7a/0x2a0
 ? commit_tail+0x3c/0x70 [drm_kms_helper]
 commit_tail+0x3c/0x70 [drm_kms_helper]
 process_one_work+0x19d/0x380
 worker_thread+0x50/0x3b0
 kthread+0xfb/0x130
 ? process_one_work+0x380/0x380
 ? kthread_park+0x80/0x80
 ret_from_fork+0x22/0x40
---[ end trace e68627bbb9265692 ]---
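
Since the full log is large, one way to get an overview of where the repeated warnings come from is to tally the `WARNING:` lines by source location. This is only a minimal sketch, assuming the log was saved as `dmesg.txt` as described above; the function name `summarize_warnings` is made up for illustration and is not part of any tool.

```shell
# Tally kernel WARNING lines by source location (file:line).
# summarize_warnings is a hypothetical helper name; $1 is a saved kernel log.
summarize_warnings() {
    # Keep only the "WARNING: CPU: N PID: N at <file:line>" part of each line,
    # then print the last field (the source location) and count duplicates.
    grep -o 'WARNING: CPU: [0-9]* PID: [0-9]* at [^ ]*' "$1" \
        | awk '{ print $NF }' \
        | sort | uniq -c | sort -rn
}

# Usage: summarize_warnings dmesg.txt
```

Each output line is a count followed by a location such as `drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hw_sequencer.c:854`, with the most frequent location first.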

Hardware info:
$ sudo inxi -C -G
CPU:       Topology: Quad Core model: AMD Ryzen 5 2400G with Radeon Vega Graphics bits: 64 type: MT MCP L2 cache: 2048 KiB
           Speed: 2015 MHz min/max: 1600/3600 MHz Core speeds (MHz): 1: 1477 2: 1483 3: 1419 4: 1444 5: 1454 6: 1425 7: 1596
           8: 1597
Graphics:  Device-1: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] driver: amdgpu v: kernel
           Display: server: Fedora Project X.org 1.20.5 driver: amdgpu,ati unloaded: fbdev,modesetting,vesa
           resolution: 1920x1080~60Hz
           OpenGL: renderer: AMD RAVEN (DRM 3.30.0 5.1.20-300.fc30.x86_64 LLVM 8.0.0) v: 4.5 Mesa 19.1.4

Extra info:
I use lightdm as my login manager, and XFCE as my desktop environment.

$ rpm -q lightdm
lightdm-1.28.0-7.fc30.x86_64

$ rpm -q xfwm4
xfwm4-4.13.4-1.fc30.x86_64
$ rpm -q xfdesktop
xfdesktop-4.13.6-1.fc30.x86_64

Comment 1 Suvayu 2019-08-18 07:01:47 UTC
Correction: after it hangs, I can log in remotely and shut down.

I also tried vanilla kernels from Thorsten's repos, and they show the same issue:
- 5.2.9 from kernel-vanilla-stable
- 5.3.0 (rc4) from kernel-vanilla-mainline (I believe this is the closest to Rawhide)

