Bug 1892785 - [abrt] amdgpu_device_ip_suspend_phase1: WARNING: CPU: 4 PID: 261 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1656 dm_suspend+0xa6/0xb0 [amdgpu] [amdgpu]
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 33
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:c80571eba50810d5cfb45f0f044...
Duplicates: 1951820
Depends On:
Blocks:
Reported: 2020-10-29 16:17 UTC by Andrew Thurman
Modified: 2021-11-30 18:52 UTC (History)
CC List: 22 users

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-30 18:52:09 UTC
Type: ---
Embargoed:


Attachments
File: dmesg (108.66 KB, text/plain), 2020-10-29 16:17 UTC, Andrew Thurman
journalctl log around the crash (16.82 KB, text/plain), 2021-06-08 11:56 UTC, Gilles Duboscq

Description Andrew Thurman 2020-10-29 16:17:08 UTC
Description of problem:
Very reproducible crash in AMDGPU:

1. Let GNOME go idle
2. Display deactivation
3. Wait ~5min (waking the system earlier will actually prevent the crash)
4. System graphics are unrecoverable, at least with my troubleshooting skills. The monitor does not wake up. TTYs cannot be graphically accessed. SysRq doesn't work (which I think is extra strange). I might also try testing SSH from a different machine.
5. The only form of recovery I have found is a hardware shutdown (holding the power button).

Reproducible every time I have tried, with an ABRT error on recovery.

LSHW also provided. Feel free to ask for any more info because this seems pretty critical.

Additional info:
reporter:       libreport-2.14.0
WARNING: CPU: 4 PID: 261 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1656 dm_suspend+0xa6/0xb0 [amdgpu]
Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp tun bridge stp llc nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc vfat fat snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg eeepc_wmi snd_hda_codec asus_wmi kvm snd_usb_audio uvcvideo sparse_keymap snd_usbmidi_lib snd_hda_core snd_rawmidi rfkill irqbypass videobuf2_vmalloc snd_hwdep videobuf2_memops rapl video mxm_wmi wmi_bmof videobuf2_v4l2 sp5100_tco pcspkr snd_seq k10temp i2c_piix4 videobuf2_common snd_seq_device snd_pcm
 videodev xpad snd_timer joydev mc ff_memless snd soundcore gpio_amdpt gpio_generic acpi_cpufreq zram ip_tables amdgpu iommu_v2 gpu_sched ttm drm_kms_helper crct10dif_pclmul crc32_pclmul cec crc32c_intel drm uas ghash_clmulni_intel igb usb_storage ccp dca i2c_algo_bit wmi pinctrl_amd fuse
CPU: 4 PID: 261 Comm: kworker/4:3 Not tainted 5.8.16-300.fc33.x86_64 #1
Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 3103 06/17/2020
Workqueue: pm pm_runtime_work
RIP: 0010:dm_suspend+0xa6/0xb0 [amdgpu]
Code: 48 89 ef 48 89 85 10 53 01 00 48 89 c6 e8 12 b9 ff ff 48 8b bd 88 38 01 00 e8 56 fe ff ff 48 89 ef e8 3e 1d 00 00 31 c0 5d c3 <0f> 0b e9 73 ff ff ff 0f 1f 00 0f 1f 44 00 00 48 8b 47 08 8b 57 38
RSP: 0018:ffffa9c441297cd8 EFLAGS: 00010282
RAX: ffffffffc0a80630 RBX: ffff995bfd735d00 RCX: 0000000000000000
RDX: 000000000000000a RSI: 0000000000003fe0 RDI: ffff995bfd720000
RBP: ffff995bfd720000 R08: ffff995bfd720608 R09: ffff995c02580eac
R10: 0000000000000018 R11: 0000000000000018 R12: ffff995bfd720000
R13: ffff995c0b9320b0 R14: ffff995bfd720000 R15: ffff995c0b932238
FS:  0000000000000000(0000) GS:ffff995c0e900000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcb5e5b59c0 CR3: 00000003ff010000 CR4: 00000000003406e0
Call Trace:
 amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
 amdgpu_device_suspend+0x7b/0x290 [amdgpu]
 ? get_cpu_iowait_time_us+0x3a/0x110
 amdgpu_pmops_runtime_suspend+0x9e/0x140 [amdgpu]
 pci_pm_runtime_suspend+0x5e/0x170
 ? pci_dev_put+0x20/0x20
 ? pci_dev_put+0x20/0x20
 __rpm_callback+0x81/0x140
 ? pci_dev_put+0x20/0x20
 rpm_callback+0x1f/0x70
 ? pci_dev_put+0x20/0x20
 rpm_suspend+0x138/0x660
 ? __switch_to+0x152/0x420
 ? __switch_to_asm+0x36/0x70
 pm_runtime_work+0x8e/0x90
 process_one_work+0x1b4/0x370
 worker_thread+0x53/0x3e0
 ? process_one_work+0x370/0x370
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30
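
The WARNING header line in this report packs the CPU, PID, source location, and faulting symbol into one string. As a minimal sketch for comparing reports across kernel versions, the fields can be pulled apart as follows. The helper name `parse_warning` and the regular expression are illustrative assumptions derived from the lines quoted here, not part of ABRT or an official kernel log grammar.

```python
import re

# Hypothetical pattern: modelled on the WARNING lines quoted in this report,
# not on any official specification of the kernel log format.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_warning(text):
    """Return the fields of a kernel WARNING header as a dict, or None."""
    match = WARNING_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; offsets stay as hex strings.
    for key in ("cpu", "pid", "line"):
        fields[key] = int(fields[key])
    return fields


sample = (
    "WARNING: CPU: 4 PID: 261 at drivers/gpu/drm/amd/amdgpu/../display/"
    "amdgpu_dm/amdgpu_dm.c:1656 dm_suspend+0xa6/0xb0 [amdgpu]"
)
info = parse_warning(sample)
print(info["file"], info["line"], info["symbol"], info["module"])
```

Grouping reports by `symbol` rather than `line` is probably the more useful comparison, since the line number in `amdgpu_dm.c` shifts between the kernel versions quoted in this bug.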

Comment 1 Andrew Thurman 2020-10-29 16:17:12 UTC
Created attachment 1725105 [details]
File: dmesg

Comment 2 Aurélien Rouëné 2020-11-23 15:26:38 UTC
Same problem here. In the hope that it helps, here is my information.

OS: Fedora 33
Kernel: 5.9.8-200
Graphics card: AMD 5700 XT
Stock amdgpu graphics driver.

Same steps to reproduce:
1. Go idle
2. Screen locks, then shuts off
3. System freezes

dmesg excerpt:

 [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62    
 [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-62).    
 snd_hda_intel 0000:23:00.1: refused to change power state from D3hot to D0    
 snd_hda_intel 0000:23:00.1: CORB reset timeout#2, CORBRP = 65535    
 ------------[ cut here ]------------    
 WARNING: CPU: 4 PID: 53359 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1668 dm_suspend+0x175/0x190 [amdgpu]    
 Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp tun bridge stp llc rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace nfs_ssc fscache nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_rej>    
  videobuf2_common snd_pcm irqbypass videodev drm_kms_helper cdc_acm rapl ecdh_generic snd_timer snd sp5100_tco ecc wmi_bmof mc rfkill cec soundcore i2c_piix4 k10temp acpi_cpufreq auth_rpcgss drm sunrpc zram ip_tables crct10dif_pclmul crc32_pclmul mxm_wmi crc32c_intel ghash_clmulni_intel igb atlantic nvme dca macsec i2c_algo_bit ccp >    
[...]    
 Call Trace:    
  ? nv_common_set_clockgating_state+0xc2/0x210 [amdgpu]    
  amdgpu_device_ip_suspend_phase1+0x73/0xd0 [amdgpu]    
  amdgpu_device_suspend+0x7b/0x290 [amdgpu]    
  amdgpu_pmops_runtime_suspend+0x9e/0x140 [amdgpu]    
  pci_pm_runtime_suspend+0x5e/0x170    
  ? update_load_avg+0x7a/0x610    
  ? pci_dev_put+0x20/0x20    
  __rpm_callback+0xce/0x180    
  ? pci_dev_put+0x20/0x20    
  rpm_callback+0x1f/0x70    
  ? pci_dev_put+0x20/0x20    
  rpm_suspend+0x138/0x660    
  ? __switch_to+0x118/0x470    
  ? __switch_to_asm+0x36/0x70    
  pm_runtime_work+0x8e/0x90    
  process_one_work+0x1b4/0x370    
  worker_thread+0x53/0x3e0    
  ? process_one_work+0x370/0x370    
  kthread+0x11b/0x140    
  ? __kthread_bind_mask+0x60/0x60    
  ret_from_fork+0x22/0x30    
 ---[ end trace 7bbc151ecc3ff39e ]---    
 ------------[ cut here ]------------    
 kernel BUG at mm/slub.c:304!    
 invalid opcode: 0000 [#1] SMP NOPTI    
[...]    
 Call Trace:    
  ? free_one_page+0x57/0xd0    
  kfree+0x397/0x3c0    
  ? kernel_queue_uninit+0x84/0xf0 [amdgpu]    
  kernel_queue_uninit+0x84/0xf0 [amdgpu]      
  stop_cpsch+0xa2/0xc0 [amdgpu]    
  kgd2kfd_suspend.part.0+0x2f/0x40 [amdgpu]    
  amdgpu_device_suspend+0x87/0x290 [amdgpu]    
  amdgpu_pmops_runtime_suspend+0x9e/0x140 [amdgpu]    
[...]
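
The `-62` in the resume errors of this excerpt is a negated Linux errno value. A small sketch, assuming a Linux host where Python's `errno` table matches the kernel's numbering, turns such codes into names. `decode_errno` is a hypothetical helper name, not an existing tool.

```python
import errno
import os


def decode_errno(code):
    """Return (symbolic name, description) for a kernel-style negative errno."""
    number = abs(code)
    # errno.errorcode maps numbers to names for the platform Python runs on.
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# -62 is the code reported for the failed resume of the <smu> IP block.
# On Linux, errno 62 is ETIME (timer expired).
print(decode_errno(-62))
```

Read this way, the log suggests the SMU block timed out while resuming, which is at least consistent with the audio function of the same card then refusing to leave D3hot.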

Comment 3 Aurélien Rouëné 2020-11-29 15:10:13 UTC
(In reply to Aurélien Rouëné from comment #2)

Seems to be resolved for now?! (kernel 5.9.10)

Comment 4 Andrew Thurman 2020-11-29 16:58:21 UTC
5.9.10 seems to have also resolved the issue for now. Going to do a little more testing when I have time, but a quick 10-minute screen lock seemed to have no issue!

Comment 5 Andrew Thurman 2020-11-29 21:44:11 UTC
(In reply to Andrew Thurman from comment #4)
> 5.9.10 seems to have also resolved the issue for now. Going to do a little
> more testing when I have time, but a quick 10-minute screen lock seemed to
> have no issue!

Scratch this. The problem is still prevalent.

Comment 6 Aurélien Rouëné 2020-11-30 09:06:02 UTC
(In reply to Andrew Thurman from comment #5)
> (In reply to Andrew Thurman from comment #4)
> > 5.9.10 seems to have also resolved the issue for now. Going to do a little
> > more testing when I have time, but a quick 10-minute screen lock seemed to
> > have no issue!
> 
> Scratch this. The problem is still prevalent.

My former graphics card was an NVIDIA one, and I had some leftover drivers that I removed after the problem first appeared for me (I removed xorg-x11-drv-nvidia-libs.i686 and xorg-x11-drv-nvidia-cuda-libs.i686).
You could try "dnf list installed | grep -i nvidia" to find NVIDIA leftovers and remove them, if you have any.
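
The "dnf list installed | grep -i nvidia" check can also be scripted. The sketch below is only an illustration: `find_nvidia_packages` is a hypothetical helper that filters text already captured from `dnf list installed`, and the version and repository columns in the sample rows are made up for the example.

```python
def find_nvidia_packages(listing):
    """Return package names from `dnf list installed` text that mention nvidia."""
    names = []
    for row in listing.splitlines():
        columns = row.split()
        # Package rows look like: name.arch  version  @repo
        # Header lines such as "Installed Packages" never contain "nvidia"
        # in their first column, so they are skipped by the name check.
        if len(columns) >= 2 and "nvidia" in columns[0].lower():
            names.append(columns[0])
    return names


# Sample input; versions and repository names are placeholders.
sample = """Installed Packages
xorg-x11-drv-nvidia-libs.i686       1.0-1.fc33    @example-repo
xorg-x11-drv-nvidia-cuda-libs.i686  1.0-1.fc33    @example-repo
kernel-core.x86_64                  5.9.10-200.fc33  @updates
"""
print(find_nvidia_packages(sample))
```

Unlike the `grep -i` pipeline, this matches only on the package name column, so a package that merely comes from a repository with "nvidia" in its name would not be listed.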

Comment 7 Andrew Thurman 2020-11-30 14:56:54 UTC
I've never had an NVIDIA card, and I'm on a Silverblue system, so I have nothing but Nouveau. How long did you give your system? Mine didn't crash immediately on lock; it took about 10-15 minutes (until the card reached a power save state?).
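
The call traces in this report all run through `pm_runtime_work` and `amdgpu_pmops_runtime_suspend`, i.e. PCI runtime power management, which fits the guess that the hang coincides with the card entering a power save state. As a hedged sketch, the runtime PM state of a PCI device can be read from the `power/` directory under its sysfs node. `read_runtime_pm` is a hypothetical helper, and the PCI address `0000:23:00.0` in the usage line is an assumption; substitute the address that `lspci -D` reports for the GPU.

```python
from pathlib import Path


def read_runtime_pm(device_dir):
    """Read runtime PM attributes from a device's sysfs directory."""
    power = Path(device_dir) / "power"
    state = {}
    for attr in ("control", "runtime_status"):
        path = power / attr
        # Attributes may be missing, e.g. on a kernel without runtime PM
        # support or when the device path is wrong; report None in that case.
        state[attr] = path.read_text().strip() if path.exists() else None
    return state


if __name__ == "__main__":
    # Assumed PCI address; replace it with the GPU's address on your system.
    print(read_runtime_pm("/sys/bus/pci/devices/0000:23:00.0"))
```

A `control` value of `auto` means the kernel may runtime-suspend the device when idle, and `runtime_status` shows whether it is currently `active` or `suspended`. Logging these shortly before the freeze could help confirm whether the timing matches.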

Comment 8 Aurélien Rouëné 2020-11-30 17:55:12 UTC
(In reply to Andrew Thurman from comment #7)
> I've never had an Nvidia card, and am on a silverblue system so have nothing
> but Nouveau. How long did you give your system? Mine didn't crash
> immediately on lock, and took about 10-15 (until the card reached a power
> save state(?)).

It's definitely fixed on mine: I left my PC on the lock screen overnight, so at least 8 hours.

I only got these errors in dmesg:

[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
... x18

but no crash.


If it's not the NVIDIA drivers, I don't know what could cause the problem, or how it was fixed on my system!?

Comment 9 Andrew Thurman 2020-12-01 13:59:58 UTC
On kernel 5.9.11. Will do a little more testing when I get a chance. I have no clue why the Nvidia drivers would even cause this, as they shouldn't even be loaded if they don't need to be, at least to my knowledge.

Comment 10 Andrew Thurman 2020-12-01 20:39:07 UTC
Just tested with lock screen and issue seems to be resolved on 5.9.11. Going to test auto-lock soon!

Comment 11 Andrew Thurman 2021-02-08 17:49:52 UTC
The issue still seems to be present :(

Comment 12 Andrew Thurman 2021-02-17 22:39:45 UTC
Description of problem:
1. Turn on computer and leave GDM idle.

Expected Result:

GDM goes to sleep, and can be woken up again with a key press.

Actual Result: GDM went to sleep, and never came back. Graphics were unrecoverable.

Happens every time.

Version-Release number of selected component:
kernel-core-5.10.15-200.fc33

Additional info:
reporter:       libreport-2.14.0
cmdline:        BOOT_IMAGE=(hd0,gpt2)/ostree/fedora-6d5375245bf1789376c3827e71dc24cc0691a64bcf3097fa09a6e275f83788a6/vmlinuz-5.10.15-200.fc33.x86_64 rhgb quiet root=UUID=5a74d1ad-e74b-4e95-b270-35e0f11264f4 rootflags=subvol=root ostree=/ostree/boot.1/fedora/6d5375245bf1789376c3827e71dc24cc0691a64bcf3097fa09a6e275f83788a6/0
crash_function: nv_common_set_clockgating_state
kernel:         5.10.15-200.fc33.x86_64
runlevel:       N 5
type:           Kerneloops

Truncated backtrace:
WARNING: CPU: 7 PID: 260 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1751 dm_suspend+0x178/0x190 [amdgpu]
Modules linked in: nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation snd_soc_core snd_compress snd_pcm_dmaengine soundwire_cadence vfat fat snd_usb_audio snd_hda_codec uvcvideo snd_hda_core snd_usbmidi_lib ac97_bus snd_rawmidi snd_hwdep videobuf2_vmalloc snd_seq edac_mce_amd videobuf2_memops snd_seq_device kvm_amd videobuf2_v4l2 snd_pcm videobuf2_common kvm xpad eeepc_wmi igb videodev ff_memless asus_wmi irqbypass snd_timer sparse_keymap joydev snd sp5100_tco
 rfkill mc soundcore video wmi_bmof i2c_piix4 dca rapl k10temp pcspkr gpio_amdpt gpio_generic acpi_cpufreq zram ip_tables amdgpu mxm_wmi iommu_v2 gpu_sched i2c_algo_bit ttm crct10dif_pclmul drm_kms_helper crc32_pclmul crc32c_intel cec ghash_clmulni_intel drm ccp wmi pinctrl_amd fuse
CPU: 7 PID: 260 Comm: kworker/7:3 Not tainted 5.10.15-200.fc33.x86_64 #1
Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 3103 06/17/2020
Workqueue: pm pm_runtime_work
RIP: 0010:dm_suspend+0x178/0x190 [amdgpu]
Code: c3 31 d2 4c 89 e6 4c 89 ef e8 14 17 13 00 83 f8 01 74 1e 89 c2 48 c7 c6 40 68 84 c0 48 c7 c7 c8 a8 8d c0 e8 fa 5a cc ff eb b8 <0f> 0b e9 ad fe ff ff 4c 89 e6 4c 89 ef e8 86 5f 12 00 eb a4 0f 1f
RSP: 0018:ffffa8bb012a7c78 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff9bfc11336118 RCX: 0000000000000000
RDX: 000000000000000a RSI: 0000000000000ff8 RDI: ffff9bfc11320000
RBP: ffff9bfc11320000 R08: 0000000001c0ca00 R09: ffff9bfc0114836c
R10: 0000000000000018 R11: 0000000000000018 R12: ffff9bfc11320000
R13: 0000000000000001 R14: ffff9bfc016441ac R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9bff0e9c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f03639237f0 CR3: 000000018ba92000 CR4: 00000000003506e0
Call Trace:
 ? nv_common_set_clockgating_state+0xc5/0x210 [amdgpu]
 amdgpu_device_ip_suspend_phase1+0x73/0xd0 [amdgpu]
 amdgpu_device_suspend+0x6b/0x280 [amdgpu]
 ? update_blocked_averages+0x529/0x610
 amdgpu_pmops_runtime_suspend+0x9d/0x140 [amdgpu]
 pci_pm_runtime_suspend+0x5e/0x170
 ? update_load_avg+0x7a/0x5e0
 ? pci_dev_put+0x20/0x20
 __rpm_callback+0xce/0x180
 ? pci_dev_put+0x20/0x20
 rpm_callback+0x1f/0x70
 ? pci_dev_put+0x20/0x20
 rpm_suspend+0x137/0x640
 ? __switch_to_asm+0x42/0x70
 ? __switch_to+0x114/0x450
 pm_runtime_work+0x8e/0x90
 process_one_work+0x1b6/0x350
 worker_thread+0x53/0x3e0
 ? process_one_work+0x350/0x350
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

Comment 13 Andrew Thurman 2021-02-17 22:40:38 UTC
TL;DR: Problem still present.

Comment 14 silentis.butonis 2021-03-17 13:10:24 UTC
Description of problem:
Screen doesn't wake up after suspend/sleep mode.

Version-Release number of selected component:
kernel-core-5.10.22-200.fc33

Additional info:
reporter:       libreport-2.14.0
cmdline:        BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.10.22-200.fc33.x86_64 root=UUID=5e6dc6f0-8df7-43a1-ae4f-4a86922cbaf0 ro rootflags=subvol=root rhgb quiet
crash_function: nv_common_set_clockgating_state
kernel:         5.10.22-200.fc33.x86_64
runlevel:       N 5
type:           Kerneloops

Truncated backtrace:
WARNING: CPU: 3 PID: 75 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1751 dm_suspend+0x178/0x190 [amdgpu]
Modules linked in: uinput xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp tun bridge stp llc nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security rfkill ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter snd_hda_codec_realtek snd_hda_codec_generic sunrpc snd_hda_codec_hdmi ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation snd_soc_core snd_compress snd_pcm_dmaengine vfat ppdev soundwire_cadence fat snd_hda_codec snd_usb_audio snd_hda_core snd_usbmidi_lib ac97_bus snd_hwdep snd_rawmidi edac_mce_amd snd_seq kvm_amd snd_seq_device snd_pcm kvm snd_timer mc joydev snd irqbypass rapl pcspkr soundcore
 sp5100_tco parport_pc i2c_piix4 k10temp wmi_bmof parport gpio_amdpt gpio_generic acpi_cpufreq zram ip_tables amdgpu iommu_v2 gpu_sched ttm i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul cec crc32c_intel drm ghash_clmulni_intel ccp nvme r8169 nvme_core wmi pinctrl_amd fuse
CPU: 3 PID: 75 Comm: kworker/3:1 Not tainted 5.10.22-200.fc33.x86_64 #1
Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.OO 10/02/2019
Workqueue: pm pm_runtime_work
RIP: 0010:dm_suspend+0x178/0x190 [amdgpu]
Code: c3 31 d2 4c 89 e6 4c 89 ef e8 f4 18 13 00 83 f8 01 74 1e 89 c2 48 c7 c6 40 18 a5 c0 48 c7 c7 b0 58 ae c0 e8 3a 2a bf ff eb b8 <0f> 0b e9 ad fe ff ff 4c 89 e6 4c 89 ef e8 46 61 12 00 eb a4 0f 1f
RSP: 0018:ffffb54a80447c78 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8c5e51fb6118 RCX: 0000000000000000
RDX: 000000000000000a RSI: 0000000000000ff8 RDI: ffff8c5e51fa0000
RBP: ffff8c5e51fa0000 R08: 0000000001c0ca00 R09: ffff8c5e41084f6c
R10: 0000000000000018 R11: 0000000000000018 R12: ffff8c5e51fa0000
R13: 0000000000000001 R14: ffff8c5e4190d1ac R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8c614e8c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00001110083e6000 CR3: 000000018adf2000 CR4: 00000000003506e0
Call Trace:
 ? nv_common_set_clockgating_state+0xc5/0x210 [amdgpu]
 amdgpu_device_ip_suspend_phase1+0x73/0xd0 [amdgpu]
 amdgpu_device_suspend+0x6b/0x280 [amdgpu]
 amdgpu_pmops_runtime_suspend+0x9d/0x140 [amdgpu]
 pci_pm_runtime_suspend+0x5e/0x170
 ? update_load_avg+0x7a/0x5e0
 ? pci_dev_put+0x20/0x20
 __rpm_callback+0x4c/0x240
 ? pci_dev_put+0x20/0x20
 rpm_callback+0x1f/0x70
 ? pci_dev_put+0x20/0x20
 rpm_suspend+0x137/0x640
 ? __switch_to_asm+0x42/0x70
 ? __switch_to+0x114/0x450
 pm_runtime_work+0x8e/0x90
 process_one_work+0x1b6/0x350
 worker_thread+0x53/0x3e0
 ? process_one_work+0x350/0x350
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

Comment 15 tothaa 2021-03-24 05:57:50 UTC
Description of problem:
I copied files to USB drives, then ran the sync command as root and waited for it to finish.

Version-Release number of selected component:
kernel-core-5.11.7-200.fc33

Additional info:
reporter:       libreport-2.14.0
cmdline:        BOOT_IMAGE=(hd2,gpt2)/boot/vmlinuz-5.11.7-200.fc33.x86_64 root=UUID=21952c56-5e98-4956-b005-b939e111ab5b ro rhgb quiet
crash_function: vi_common_set_clockgating_state
kernel:         5.11.7-200.fc33.x86_64
runlevel:       N 5
type:           Kerneloops

Truncated backtrace:
WARNING: CPU: 2 PID: 9593 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1792 dm_suspend+0x178/0x190 [amdgpu]
Modules linked in: ses enclosure scsi_transport_sas uinput hid_logitech_hidpp joydev hid_logitech_dj xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp tun bridge stp llc nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation vfat fat snd_soc_core intel_rapl_msr intel_rapl_common snd_compress snd_pcm_dmaengine soundwire_cadence x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp kvm_intel snd_hda_core iTCO_wdt ac97_bus
 snd_hwdep intel_pmc_bxt ee1004 iTCO_vendor_support mei_hdcp kvm snd_seq snd_seq_device eeepc_wmi irqbypass snd_pcm rapl intel_cstate asus_wmi intel_uncore sparse_keymap snd_timer i2c_i801 rfkill pcspkr wmi_bmof mxm_wmi i2c_smbus snd mei_me soundcore mei acpi_pad zram ip_tables amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched crct10dif_pclmul i2c_algo_bit drm_kms_helper crc32_pclmul crc32c_intel nvme nvme_core cec drm r8169 ghash_clmulni_intel uas usb_storage wmi video fuse
CPU: 2 PID: 9593 Comm: kworker/2:2 Not tainted 5.11.7-200.fc33.x86_64 #1
Hardware name: System manufacturer System Product Name/Z170-K, BIOS 3805 05/16/2018
Workqueue: pm pm_runtime_work
RIP: 0010:dm_suspend+0x178/0x190 [amdgpu]
Code: c3 31 d2 4c 89 e6 4c 89 ef e8 d4 15 13 00 83 f8 01 74 1e 89 c2 48 c7 c6 40 f5 96 c0 48 c7 c7 b8 8c a1 c0 e8 5a d7 d0 ff eb b8 <0f> 0b e9 ad fe ff ff 4c 89 e6 4c 89 ef e8 b6 59 12 00 eb a4 0f 1f
RSP: 0018:ffffa40948a17c70 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8d2c4d036920 RCX: 0000000000000000
RDX: 0000000000000009 RSI: ffff8d2f66c98ac0 RDI: ffff8d2c4d020000
RBP: ffff8d2c4d020000 R08: 0000000000000000 R09: ffffa40948a17a40
R10: ffffa40948a17a38 R11: ffffffff99b44f08 R12: ffff8d2c4d020000
R13: 0000000000000001 R14: ffff8d2c4160d1ac R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8d2f66c80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdf0041a00c CR3: 00000003f9a10006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? vi_common_set_clockgating_state+0x227/0x2f0 [amdgpu]
 amdgpu_device_ip_suspend_phase1+0x8e/0x100 [amdgpu]
 amdgpu_device_suspend+0x6b/0x2b0 [amdgpu]
 ? update_blocked_averages+0x23c/0x650
 amdgpu_pmops_runtime_suspend+0x9d/0x130 [amdgpu]
 pci_pm_runtime_suspend+0x5e/0x170
 ? pci_dev_put+0x20/0x20
 __rpm_callback+0x4c/0x240
 ? pci_dev_put+0x20/0x20
 rpm_callback+0x1f/0x70
 ? pci_dev_put+0x20/0x20
 rpm_suspend+0x137/0x640
 ? __switch_to_asm+0x42/0x70
 ? __switch_to+0x114/0x450
 pm_runtime_work+0x8e/0x90
 process_one_work+0x1ec/0x380
 worker_thread+0x53/0x3e0
 ? process_one_work+0x380/0x380
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

Comment 16 Andrew Thurman 2021-04-14 23:29:16 UTC
This appears to be fixed in kernel-5.11.13-300.fc34.x86_64. Haven't tested in a while, and god knows who/what/when/where/why it was fixed, but seems to work perfectly fine now. I'll close this bug once I get one more confirmation.

Comment 17 mcronen 2021-04-20 22:08:14 UTC
*** Bug 1951820 has been marked as a duplicate of this bug. ***

Comment 18 silentis.butonis 2021-05-01 18:05:53 UTC
Description of problem:
When the monitor goes into sleep mode, the computer doesn't wake up.

Version-Release number of selected component:
kernel-core-5.11.16-200.fc33

Additional info:
reporter:       libreport-2.14.0
cmdline:        BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.11.16-200.fc33.x86_64 root=UUID=939234ac-b2e0-4210-8ce8-1300e8046610 ro rhgb quiet
crash_function: nv_common_set_clockgating_state
kernel:         5.11.16-200.fc33.x86_64
runlevel:       N 5
type:           Kerneloops

Truncated backtrace:
WARNING: CPU: 5 PID: 167 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:1792 dm_suspend+0x178/0x190 [amdgpu]
Modules linked in: uinput xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp tun bridge stp llc nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security rfkill ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter snd_hda_codec_realtek snd_hda_codec_generic sunrpc ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation snd_soc_core intel_rapl_msr vfat intel_rapl_common fat snd_compress snd_pcm_dmaengine edac_mce_amd soundwire_cadence snd_hda_codec kvm_amd snd_usb_audio snd_hda_core ppdev snd_usbmidi_lib kvm snd_rawmidi ac97_bus snd_hwdep snd_seq snd_seq_device irqbypass rapl snd_pcm mc pcspkr
 snd_timer snd joydev sp5100_tco wmi_bmof k10temp i2c_piix4 soundcore parport_pc parport gpio_amdpt gpio_generic acpi_cpufreq zram ip_tables amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched i2c_algo_bit drm_kms_helper cec crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel r8169 ccp nvme nvme_core wmi pinctrl_amd fuse
CPU: 5 PID: 167 Comm: kworker/5:1 Not tainted 5.11.16-200.fc33.x86_64 #1
Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.OO 10/02/2019
Workqueue: pm pm_runtime_work
RIP: 0010:dm_suspend+0x178/0x190 [amdgpu]
Code: c3 31 d2 4c 89 e6 4c 89 ef e8 e4 1b 13 00 83 f8 01 74 1e 89 c2 48 c7 c6 60 15 96 c0 48 c7 c7 e0 ae a0 c0 e8 2a 43 c9 ff eb b8 <0f> 0b e9 ad fe ff ff 4c 89 e6 4c 89 ef e8 c6 5f 12 00 eb a4 0f 1f
RSP: 0018:ffffb0f4c04ffc68 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8e2794756910 RCX: 0000000000000000
RDX: 000000000000000a RSI: 0000000000000ff8 RDI: ffff8e2794740000
RBP: ffff8e2794740000 R08: 0000000001c0ca00 R09: ffff8e2780bbfaec
R10: 0000000000000003 R11: 0000000000000000 R12: ffff8e2794740000
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8e2a8e940000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efd48024018 CR3: 000000010615e000 CR4: 00000000003506e0
Call Trace:
 ? nv_common_set_clockgating_state+0xd2/0x220 [amdgpu]
 amdgpu_device_ip_suspend_phase1+0x8f/0x100 [amdgpu]
 amdgpu_device_suspend+0x6f/0x2b0 [amdgpu]
 ? update_blocked_averages+0x54d/0x650
 amdgpu_pmops_runtime_suspend+0x9d/0x130 [amdgpu]
 pci_pm_runtime_suspend+0x5e/0x170
 ? pci_dev_put+0x20/0x20
 ? pci_dev_put+0x20/0x20
 __rpm_callback+0x81/0x140
 ? pci_dev_put+0x20/0x20
 rpm_callback+0x1f/0x70
 ? pci_dev_put+0x20/0x20
 rpm_suspend+0x137/0x6c0
 ? __switch_to_asm+0x42/0x70
 ? __switch_to+0x114/0x450
 pm_runtime_work+0x8e/0x90
 process_one_work+0x1ec/0x380
 worker_thread+0x53/0x3e0
 ? process_one_work+0x380/0x380
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

Comment 19 Gilles Duboscq 2021-06-08 11:56:24 UTC
Created attachment 1789364 [details]
journalctl log around the crash

I can see the same thing with 5.12.8-300.fc34.x86_64 when the computer goes to sleep.
Note in the logs the amdgpu errors that are being logged just before the crash.

Comment 20 Ben Cotton 2021-11-04 13:56:24 UTC
This message is a reminder that Fedora 33 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 33 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 23 Ben Cotton 2021-11-30 18:52:09 UTC
Fedora 33 changed to end-of-life (EOL) status on 2021-11-30. Fedora 33 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

