S0i3 (modern standby) suspend is broken on AMD Ryzen AI 9 HX 370 (Strix Point, Radeon 890M) after upgrading from amd-gpu-firmware-20260110 to amd-gpu-firmware-20260221. Every suspend attempt fails with:

    amd_pmc:amd_pmc_idlemask_read: amd_pmc: SMU idlemask s0i3: 0xfeff9afd
    [...]
    amd_pmc AMDI000A:00: Last suspend didn't reach deepest state

The result is a warm laptop and abnormal battery drain in suspend.

Hardware:
- CPU: AMD Ryzen AI 9 HX 370 (Strix Point)
- GPU: AMD Radeon 890M (PCI device 1002:150e)
- Laptop: Tuxedo Infinity Book Pro 14 Gen 10 AMD
- BIOS: N.1.20A13

I did my best to isolate the root cause. Here's what I did:

I tested these kernels: 6.18.9, 6.18.12, 6.18.16 (F43), 6.19.2 (F44), 6.19.6 (F43). All exhibit the same behaviour with amd-gpu-firmware-20260221 installed. Yes, I installed an F44 kernel on F43: I booted an F44 Beta live ISO to see whether the issue exists there. It did not, so I installed the F44 kernel on F43 to see if that changes anything. It did not.

After downgrading from amd-gpu-firmware-20260221 to amd-gpu-firmware-20260110, the message about the suspend not reaching the deepest state went away on all kernels. The SMU idlemask changed to:

    amd_pmc:amd_pmc_idlemask_read: amd_pmc: SMU idlemask s0i3: 0xffff1afd

The DMUB firmware version also changed, from (broken):

    amdgpu: [drm] Loading DMUB firmware via PSP: version=0x09003D00
    amdgpu: [drm] DMUB hardware initialized: version=0x09003D00

to (working):

    amdgpu: [drm] Loading DMUB firmware via PSP: version=0x09003600
    amdgpu: [drm] DMUB hardware initialized: version=0x09003600

Reproducible: Always

Steps to Reproduce:
1. Install amd-gpu-firmware-20260221 on a Strix Point system.
2. Try to suspend.
3. Wake up the system and check the logs, for example via:
       journalctl -b | grep -E "idlemask|deepest|DMUB|dmub" | head -10
4. Downgrade the package:
       sudo rpm -Uvh --oldpackage amd-gpu-firmware-20260110-1.fc43.noarch.rpm
5. Update the initramfs:
       sudo dracut --force --regenerate-all
6. Reboot and try to suspend again.
7. Check the logs again: the issue is gone.

Actual Results:
The system suspends, the fans turn off and the system "seems" to sleep. But the sleep is not deep enough: the system remains warm and shows abnormal battery drain.

Expected Results:
The system should enter deep sleep. It should not log "amd_pmc AMDI000A:00: Last suspend didn't reach deepest state", stay warm, or eat through the battery.

Additional Information:
Note: To see the idlemask, the kernel parameter amd_pmc.dyndbg=+pmf is required at boot time.

I think the 0xffff in the idlemask 0xffff1afd indicates a successful power-off of all AMDGPU components. As far as I understand it, bit 24 is the one affected (it is cleared in the broken mask; bit 15 also differs between the two masks), which might be the VPE (Video Processing Engine) failing to suspend. But I'm on thin ice here because I have no idea how to read this mask; it's just a guess.

I did not manage to isolate the individual file in the firmware package that is causing this. I hope this bug report is useful without that.
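To make the idlemask comparison above concrete, the two values can be XORed to list the bit positions where they differ. This is only a sketch: `differing_bits` is a helper name made up for illustration, and which hardware block each bit corresponds to is not something this report can confirm.

```python
from typing import List

# Idlemask values copied from the two log lines in the report.
BROKEN = 0xFEFF9AFD   # with amd-gpu-firmware-20260221
WORKING = 0xFFFF1AFD  # with amd-gpu-firmware-20260110


def differing_bits(a: int, b: int) -> List[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [pos for pos in range(diff.bit_length()) if (diff >> pos) & 1]


if __name__ == "__main__":
    for pos in differing_bits(BROKEN, WORKING):
        broken_bit = (BROKEN >> pos) & 1
        working_bit = (WORKING >> pos) & 1
        print(f"bit {pos}: broken={broken_bit} working={working_bit}")
```

The comparison only says which bits changed between the two firmware versions. Mapping a bit to a specific IP block (VPE or otherwise) would need the SMU bit definitions for this platform, which are not part of this report.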
Already fixed in updates-testing, please check for duplicate bugs before wasting people's time.

*** This bug has been marked as a duplicate of bug 2445615 ***
> Already fixed in updates-testing, please check for duplicate bugs before wasting people's time.

Glad to hear it is fixed.

I did just that. I used the guided form as outlined here https://fedoraproject.org/wiki/Bugzilla#guided and read the documentation on how to file bugs. I searched for "amd-gpu-firmware", "amdgpu" and a couple of other terms in the guided form to ensure that this is new. Once I was sure this had not been filed, as your form clearly showed it hadn't, I spent hours compiling this information to make it as comprehensive and actionable as possible.

Being accused of "wasting people's time" feels inappropriate here. Please get your own submission system working properly to avoid such issues in the future.
Also, after reading the other bug: that seems to be an entirely separate issue. I don't see why this would be a duplicate:

1. The system there never resumes. Mine does, but it does not go into deep sleep.
2. Their issue is related to the NPU firmware, mine is related to the GPU.
3. The log output has absolutely no resemblance to what I mentioned in my report.

Why do you think this is a duplicate? And if it is: how should this be obvious to someone who is not a firmware developer?
Please look over your kernel log: did amdxdna initialize successfully? If not, then it's a duplicate of that issue. Entering s0i3 requires all components to be in the right state.
Dear Mario,

I've checked the logs of an affected boot and indeed found the log lines you suspected.

    journalctl -b 825462ba50b64f1789a1ca0956c59637 -g "amdxdna" -o cat
    amdxdna 0000:66:00.1: enabling device (0000 -> 0002)
    amdxdna 0000:66:00.1: [drm] *ERROR* aie2_check_protocol: Incompatible firmware protocol major 7 minor 2
    amdxdna 0000:66:00.1: [drm] *ERROR* aie2_hw_start: firmware is not alive
    amdxdna 0000:66:00.1: [drm] *ERROR* aie2_smu_exec: smu cmd 4 failed, 0xff
    amdxdna 0000:66:00.1: [drm] *ERROR* aie2_smu_fini: Power off failed, ret -22
    amdxdna 0000:66:00.1: [drm] *ERROR* aie2_init: start npu failed, ret -22
    amdxdna 0000:66:00.1: [drm] *ERROR* amdxdna_probe: Hardware init failed, ret -22
    amdxdna 0000:66:00.1: probe with driver amdxdna failed with error -22
    drwxr-xr-x 2 root root 0 Oct 14 02:00 usr/lib/modules/6.19.6-200.fc43.x86_64/kernel/drivers/accel/amdxdna
    -rw-r--r-- 1 root root 101092 Oct 14 02:00 usr/lib/modules/6.19.6-200.fc43.x86_64/kernel/drivers/accel/amdxdna/amdxdna.ko.xz

I also tested with the latest amd-gpu-firmware-20260309 package that was released today and can confirm that the issue is gone. So I agree that this is indeed a duplicate, just as Peter suspected initially.

@Peter: I apologize for causing duplicate work. I did not connect those errors to the suspend issue because they happened so early in boot; amdxdna did not appear anywhere near the actual suspend messages. Also, since the firmware package I downgraded was called "amd-gpu-firmware" and the idlemask seemingly related to the video engine, I simply could not figure out that https://bugzilla.redhat.com/show_bug.cgi?id=2445615 is the root cause of the issues I observed. I will try to do more research next time.

Thanks to everyone for fixing this so quickly!
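For anyone else trying to tell whether they are hitting the same duplicate, the check Mario suggested can be sketched roughly as below. It looks for the "probe with driver amdxdna failed" line from the log excerpt above. The function name `amdxdna_probe_error` and the regular expression are illustrative assumptions based only on that excerpt, not an official diagnostic.

```python
import re
from typing import Optional

# Assumption: the failure line has the same wording as in the excerpt above.
PROBE_FAILED = re.compile(r"probe with driver amdxdna failed with error (-?\d+)")


def amdxdna_probe_error(log_text: str) -> Optional[int]:
    """Return the probe error code if the log shows amdxdna failing, else None."""
    match = PROBE_FAILED.search(log_text)
    return int(match.group(1)) if match else None


# Three lines copied from the journalctl output quoted in this comment.
SAMPLE = """\
amdxdna 0000:66:00.1: enabling device (0000 -> 0002)
amdxdna 0000:66:00.1: [drm] *ERROR* aie2_check_protocol: Incompatible firmware protocol major 7 minor 2
amdxdna 0000:66:00.1: probe with driver amdxdna failed with error -22
"""

if __name__ == "__main__":
    code = amdxdna_probe_error(SAMPLE)
    if code is not None:
        print(f"amdxdna probe failed with error {code}")
    else:
        print("no amdxdna probe failure found in this log")
```

The input would be the text printed by a command such as `journalctl -b -g "amdxdna" -o cat`. A missing failure line does not prove the NPU initialized correctly; it only means this particular message was not logged.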