The kernel was updated today to version kernel-6.4.11-200.fc38.x86_64. After the update, the system enters emergency mode on reboot. Selecting the previous kernel at boot works fine. Reproducible: Always
Created attachment 1984571 [details] The journalctl -xb log I am attaching the log of journalctl -xb
Encountering the same condition here. Also, in my case, I am unable to log in with the root password when prompted.

Rebooting the same kernel without "rhgb quiet" on the kernel command line reveals an NVMe issue (typed from a photo, not pasted; excuse the typos):

nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
nvme nvme0: Does your device have a faulty power saving mode enabled?
nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
nvme0n1: I/O Cmd(0x2) @ LBA 231710983, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
I/O error, dev nvme0n1, sector 231710983 op 0x0:(READ) flags 0x80700 phys_segf 32 prio cla...(cut off from the photo of my display)
nvme 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
nvme nvme0: Disabling device after reset failure: -19

After rebooting kernel 6.4.11-200 with the recommended settings on the kernel command line, I encountered the same condition: emergency mode with the NVMe messages displayed. So it appears to not be a "faulty power saving mode"?

Since the kernel cannot access the NVMe device, it can't write to the system journal and I'm unable to provide a copy of it.
Created attachment 1985397 [details] Log of journalctl -xb against kernel-6.4.12-200.fc38.x86_64
Hello, I'm running Fedora 38 and I'm experiencing the same bug. When I boot with kernel 6.4.11 or 6.4.12 I end up in emergency mode. Everything is OK with kernel 6.4.10.

I'm posting this message based on a suggestion I received from the Fedora community (https://discussion.fedoraproject.org/t/kernel-6-4-11-200-emergency-mode-during-boot/88088/8).

If there are tests I can do to help you diagnose or resolve the issue, let me know! Thank you in advance and have a great day!
Created attachment 1985687 [details] Full log of journal -xb Here is the full log returned by journalctl -xb
There's a similar looking bug report in the kernel bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=217849
Does booting with `pcie_aspm=off` help?
(In reply to David Klann from comment #2) > Since the kernel cannot access the NVME, it can't write to the system journal and I'm unable to provide a copy of it. Please remove `quiet` and `rhgb` from your kernel boot options, boot, take a picture of the screen and upload it here.
(In reply to Artem S. Tashkinov from comment #8) > (In reply to David Klann from comment #2) > > > Since the kernel cannot access the NVME, it can't write to the system journal and I'm unable to provide a copy of it. > > Please remove `quiet` and `rhgb` from your kernel boot options, boot, take a > picture of the screen and upload it here. This is a composite photo (taken from an external monitor because the laptop display is HiDPI and the text is larger on this display). This is with the following kernel command line: root=UUID=4d0f3832-b5ba-46ef-b143-06a76341463d ro resume=UUID=1b7eae4f-5cbc-4bbe-8e87-384aa1549910 pcie_aspm=off iommu=1 I have also attempted to boot using the kernel parm nvme_core.default_ps_max_latency_us=0 added to the above with no different results. Please let me know if there are other kernel command line parameters I might try when booting. I also tried today booting the 6.4.13-200 kernel (in testing) and it results in the same behavior.
David, you've not actually uploaded the photo. There's "Add an attachment" link above to do that.
Sigh... I blame the "multi tasking" ... :)
Created attachment 1986479 [details] Photo of a physical display showing errors during attempted boot of kernel 6.4.13-200
Created attachment 1986510 [details] 6.4.11.200 no 'quiet rhgb' This is the result of booting kernel 6.4.11.200 removing the 'quiet rhgb' option
Created attachment 1986523 [details] 6.4.11.200 without 'quiet rhgb' and with 'pcie_aspm=off' And this is the result of booting kernel 6.4.11.200 removing the 'quiet rhgb' option and adding 'pcie_aspm=off' (result of command journalctl -xe typed into emergency mode)
From the picture:

nvme nvme0: allocated 64 MiB host memory buffer.
nvme nvme0: 4/0/0 default/read/poll queues
...
nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
nvme nvme0: Does your device have a faulty power saving mode enabled?
nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
nvme0n1: I/O Cmd(0x2) @ LBA 976772992, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
I/O error, dev nvme0n1, sector 976772992 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
nvme 0000:04:00.0: enabling device (0000 -> 0002)
nvme nvme0: Disabling device after reset failure: -19
nvme0n1: detected capacity change from 976773168 to 0

That could be a regression. git bisect is advised: https://docs.kernel.org/admin-guide/bug-bisect.html
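For anyone comparing their own boot messages against the ones transcribed above, here is a minimal Python sketch that pulls the register values out of the "controller is down" warning. The function name `parse_controller_down` is made up for this example. Treating an all-ones CSTS value as a sign that reads from the device are failing is a common interpretation, but it is an assumption here and not something confirmed in this report.

```python
import re

# Matches the "controller is down" warning quoted in this report.
# Accepts ';' or ':' after "down" because hand-typed transcriptions vary.
CONTROLLER_DOWN = re.compile(
    r"nvme (?P<ctrl>nvme\d+): controller is down[;:] will reset: "
    r"CSTS=0x(?P<csts>[0-9a-fA-F]+), PCI_STATUS=0x(?P<pci>[0-9a-fA-F]+)"
)


def parse_controller_down(line):
    """Return (controller, csts, pci_status, csts_all_ones), or None if no match."""
    match = CONTROLLER_DOWN.search(line)
    if match is None:
        return None
    csts = int(match.group("csts"), 16)
    pci_status = int(match.group("pci"), 16)
    # Assumption: an all-ones 32-bit read usually means the register read
    # itself failed (device not responding), not a real status value.
    return match.group("ctrl"), csts, pci_status, csts == 0xFFFFFFFF


if __name__ == "__main__":
    sample = ("nvme nvme0: controller is down; will reset: "
              "CSTS=0xffffffff, PCI_STATUS=0x10")
    print(parse_controller_down(sample))
```

This only classifies a log line; it does not say which commit or device is responsible.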
The reported bug still persists in kernel-6.4.13-200.fc38.x86_64 Thanks, Paul
This seems to affect a non-trivial number of people:
https://discussion.fedoraproject.org/t/kernel-6-4-11-200-emergency-mode-during-boot/88088
https://discussion.fedoraproject.org/t/laptop-does-not-show-luks-password-prompt-as-nvme-fails-with-kernel-6-4-11-kernel-6-4-12/88622
https://bugzilla.kernel.org/show_bug.cgi?id=217802
https://bugzilla.kernel.org/show_bug.cgi?id=217849

The problematic commit (at least for one of those people) has been identified here:
https://lkml.org/lkml/2023/8/16/1363

Examples of affected devices: Dell XPS 15 9560 (many reports), Dell Precision 5520 (several reports), Dell Latitude 5530 (one report).

Proposing for an F39 blocker discussion.
Is this an issue with 6.5.x kernels? Everything I have seen on it mentions 6.4.11+ kernels in the 6.4 series. It is quite possible that a patch was backported which had some other dependency.
In https://bugzilla.kernel.org/show_bug.cgi?id=217802#c0 it says "6.5-rc6 - fails". But it would of course be great to test with the latest version. Anyone affected, can you please test latest 6.5 kernel from https://koji.fedoraproject.org/koji/packageinfo?packageID=8 ? Thanks!
(In reply to Kamil Páral from comment #19) > In https://bugzilla.kernel.org/show_bug.cgi?id=217802#c0 it says "6.5-rc6 - > fails". But it would of course be great to test with the latest version. > > Anyone affected, can you please test latest 6.5 kernel from > https://koji.fedoraproject.org/koji/packageinfo?packageID=8 ? Thanks! Confirming that this behavior exists in kernel 6.5.2-200 here on Dell XPS 15 9560 (07BE) with the Realtek RTS525A PCI Express Card Reader.
Thanks for the confirmation there.
(In reply to David Klann from comment #9) > (In reply to Artem S. Tashkinov from comment #8) > > (In reply to David Klann from comment #2) > > > > > Since the kernel cannot access the NVME, it can't write to the system journal and I'm unable to provide a copy of it. > > > > Please remove `quiet` and `rhgb` from your kernel boot options, boot, take a > > picture of the screen and upload it here. > > This is a composite photo (taken from an external monitor because the laptop > display is HiDPI and the text is larger on this display). This is with the > following kernel command line: > > root=UUID=4d0f3832-b5ba-46ef-b143-06a76341463d ro > resume=UUID=1b7eae4f-5cbc-4bbe-8e87-384aa1549910 pcie_aspm=off iommu=1 > > I have also attempted to boot using the kernel parm > nvme_core.default_ps_max_latency_us=0 added to the above with no different > results. > > Please let me know if there are other kernel command line parameters I might > try when booting. > > I also tried today booting the 6.4.13-200 kernel (in testing) and it results > in the same behavior. I have had a similar issue I believe is the same bug, however booting with nvme_core.default_ps_max_latency_us=0 did fix the issue. I had it when first installing Fedora on my HP LAPTOP 14-cf1015cl. I first checked to make sure my SSD was fully updated, which it was. My plan is to use grubby to make it so that is added permanently.
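For anyone wanting to make the same parameter permanent, a minimal Python sketch of the grubby invocation follows. `--update-kernel=ALL` and `--args=` are existing grubby options; the helper name `grubby_add_args` is hypothetical. The script only builds and prints the command and does not run it, so you would still execute the printed command yourself with root privileges.

```python
import shlex


def grubby_add_args(args, kernel="ALL"):
    """Build (but do not run) a grubby command that adds kernel arguments.

    args   -- list of kernel command line parameters to add
    kernel -- kernel selector passed to --update-kernel (ALL = every entry)
    """
    return ["grubby", f"--update-kernel={kernel}", f"--args={' '.join(args)}"]


if __name__ == "__main__":
    cmd = grubby_add_args(["nvme_core.default_ps_max_latency_us=0"])
    # Print a shell-safe version of the command for review before running it.
    print(shlex.join(cmd))
```

Building the command as a list keeps the quoting explicit when more than one parameter is passed.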
Proposing for Beta Blocker as well, worth discussing at least, in my opinion.
Upstream ML thread, where it looks like they're preparing to revert the problematic commit: https://lore.kernel.org/linux-nvme/c766f724-709d-42c1-b0eb-a7a543d28bd6@gmail.com/T/#t

As SUSE has already reverted it in Tumbleweed, I'd suggest we do the same across all Fedora releases, at least until there's a better suggestion.
Discussed during the 2023-09-11 blocker review meeting: [0] The decision to classify this bug as an "AcceptedBlocker (Beta)" was made as it violates the following criterion: "The installer must be able to complete an installation to a single disk" and/or "A system installed with a release-blocking desktop must boot to a log in screen" on affected systems (whichever point it fails at). [0] https://meetbot.fedoraproject.org/fedora-blocker-review/2023-09-11/f39-blocker-review.2023-09-11-16.00.txt
FEDORA-2023-ac464517f7 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-ac464517f7
FEDORA-2023-ac464517f7 has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-ac464517f7` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-ac464517f7 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-ac464517f7 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-c8e7a96e54 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-c8e7a96e54
FEDORA-2023-0ae1162af6 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-0ae1162af6
FEDORA-2023-0ae1162af6 has been pushed to the Fedora 38 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-0ae1162af6` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-0ae1162af6 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-c8e7a96e54 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-c8e7a96e54` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-c8e7a96e54 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
Can anyone who was affected by the exact original issue here - with the "nvme0: controller is down; will reset" messages - confirm whether the update fixed this?
(In reply to Adam Williamson from comment #34) > Can anyone who was affected by the exact original issue here - with the > "nvme0: controller is down; will reset" messages - confirm whether the > update fixed this? Sigh of relief! Confirming that kernel 6.4.16-200 on Fedora 38 is working on Dell XPS 15 9560. Thank you all for getting the Realtek Express card reader driver changes reverted and the kernel working again!
awesome, thanks. If it's not too much trouble, it'd be awesome if you could possibly check whether the Beta candidate compose works OK. You should be able to test by booting https://dl.fedoraproject.org/pub/alt/stage/39_Beta-1.1/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-39_Beta-1.1.iso and seeing if you see the telltale errors (and if your drive is actually visible in the installer, I guess). No need to actually install. I understand if you can't :) thanks again!
(In reply to Adam Williamson from comment #36)
> awesome, thanks. If it's not too much trouble, it'd be awesome if you could
> possibly check whether the Beta candidate compose works OK. You should be
> able to test by booting
> https://dl.fedoraproject.org/pub/alt/stage/39_Beta-1.1/Everything/x86_64/iso/
> Fedora-Everything-netinst-x86_64-39_Beta-1.1.iso and seeing if you see the
> telltale errors (and if your drive is actually visible in the installer, I
> guess). No need to actually install. I understand if you can't :) thanks
> again!

Booted from a USB device: the kernel loaded and the installer started, but right away the installer presented me with an error. It appears to be unrelated to this bug, because I was able to see the NVMe device. I saved the system journal (journalctl > foo) on a partition on the NVMe device.

I was able to successfully start the installer on a different computer (Lenovo ThinkBook 15).

I'll also attach the system journal output showing the Python backtrace from anaconda (and some additional journal lines). I'm happy to open a separate bug for the Beta if folks think it's unrelated to this bug.
Created attachment 1988900 [details] Journal output showing anaconda backtrace, and additional info
well, that's definitely something different, but it's...worrying. can you try again and see if it happens reliably, and either way, file a separate bug for it? thanks!
Thanks, Adam, for your efforts to fix the reported kernel issue! I have just tested kernel 6.4.16-200 on Fedora 38 and, unfortunately, the reported problem persists.
(In reply to Adam Williamson from comment #34) > Can anyone who was affected by the exact original issue here - with the > "nvme0: controller is down; will reset" messages - confirm whether the > update fixed this? I'm one of the users who experienced this issue (Dell XPS 9560): upgrading to kernel 6.4.16 using the suggested command (sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-0ae1162af6) solved the issue. Thank you very much for your support!
Paul: I didn't do any of the work, I'm just trying to clarify if it worked. :D

Looking back at your logs, you do not actually seem to have the same problem as anyone else. Your logs have no NVMe-related errors. This seems to be your problem:

Aug 28 19:05:39 localhost systemctl[626]: Failed to switch root: Specified switch root path '/sysroot' does not seem to be an OS tree. os-release file is missing.

Can you mount your system partitions in a live environment, or use the Fedora rescue environment or something, and see if /etc/os-release exists in there?
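The check asked for above can be sketched in Python as follows, assuming the installed system's root filesystem has been mounted somewhere from a live or rescue environment. The mount point `/mnt/sysroot` and the function name `find_os_release` are hypothetical, and looking in both `etc/` and `usr/lib/` is an assumption about where an os-release file may live.

```python
import os
import tempfile

# Assumed candidate locations, relative to the mounted root filesystem.
OS_RELEASE_CANDIDATES = ("etc/os-release", "usr/lib/os-release")


def find_os_release(root):
    """Return the os-release paths that exist under the given mounted root."""
    found = []
    for rel in OS_RELEASE_CANDIDATES:
        path = os.path.join(root, rel)
        # os.path.exists follows symlinks. A relative symlink resolves inside
        # the mounted root; an absolute one would resolve against the live
        # system instead, so treat results for absolute links with care.
        if os.path.exists(path):
            found.append(path)
    return found


if __name__ == "__main__":
    # Demo against a throwaway directory; on a real system pass the
    # (hypothetical) mount point, e.g. find_os_release("/mnt/sysroot").
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "etc"))
        open(os.path.join(root, "etc", "os-release"), "w").close()
        print(find_os_release(root))
```

An empty result for the mounted root would be consistent with the "os-release file is missing" message, although it would not explain why the file is absent.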
Thanks, Adam, for your reply. Fortunately, I can still boot my computer, as I have kept kernel-6.4.10-200.fc38.x86_64 installed.

The file /etc/os-release exists:

# dir /etc/os-release
/etc/os-release
#

Paul
Huh. Well, your case is certainly weird, but it doesn't seem like the nvme bug we wound up addressing here (even though you originally filed the bug, sorry for that). So it makes sense that the update doesn't fix it. Looking at your logs again, on the 6.4.10 logs I see this: Using kernel command line parameters: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.4.10-200.fc38.x86_64 root=/dev/mapper/fedora_localhost--live-root ro resume=/dev/mapper/fedora_localhost--live-swap rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap rhgb quiet initcall_blacklist=simpledrm_platform_driver_init rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 but on the 6.4.12 logs I see: Using kernel command line parameters: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.4.12-200.fc38.x86_64 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 it seems like, somehow, you've lost a bunch of your boot parameters there. Did you maybe do anything to cause that, between when you installed 6.4.10 and when you installed 6.4.11?
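To make the difference between the two command lines explicit, here is a small Python sketch that lists the parameters present in the 6.4.10 entry but absent from the 6.4.12 one. The function name `missing_params` is made up for this example, and the plain whitespace split assumes no parameter value contains quoted spaces, which holds for the lines quoted above.

```python
def missing_params(old_cmdline, new_cmdline):
    """Return parameters found in old_cmdline but not in new_cmdline.

    BOOT_IMAGE= is skipped because it always differs between kernel versions.
    """
    new = set(new_cmdline.split())
    return [
        param
        for param in old_cmdline.split()
        if param not in new and not param.startswith("BOOT_IMAGE=")
    ]


OLD = (
    "BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.4.10-200.fc38.x86_64 "
    "root=/dev/mapper/fedora_localhost--live-root ro "
    "resume=/dev/mapper/fedora_localhost--live-swap "
    "rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap "
    "rhgb quiet initcall_blacklist=simpledrm_platform_driver_init "
    "rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1"
)
NEW = (
    "BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.4.12-200.fc38.x86_64 "
    "rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1"
)

if __name__ == "__main__":
    for param in missing_params(OLD, NEW):
        print(param)
```

For these two lines the missing set includes `root=`, `ro`, `resume=` and both `rd.lvm.lv=` entries, so the 6.4.12 entry has no root device specified at all.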
Thanks, Adam, for your reply. I have never touched the boot parameters while using Fedora 38!
(In reply to Adam Williamson from comment #34) > Can anyone who was affected by the exact original issue here - with the > "nvme0: controller is down; will reset" messages - confirm whether the > update fixed this? I installed kernel 6.4.16-200 on Fedora 38 and even though it booted, it would not activate my WiFi connection. I then installed all waiting updates (I hadn't installed any since I did the upgrade from 6.4.9->6.4.12) and after reboot everything worked again. Do I have to do anything to remove the upgrade-testing/advisory thing once 6.4.16 is released? Thanks everyone!
Thanks for the feedback! I'm not sure what you mean by "Do I have to do anything to remove the upgrade-testing/advisory thing once 6.4.16 is released?", can you elaborate? Thanks!
(In reply to Adam Williamson from comment #44) > Huh. Well, your case is certainly weird, but it doesn't seem like the nvme > bug we wound up addressing here (even though you originally filed the bug, > sorry for that). So it makes sense that the update doesn't fix it. > > Looking at your logs again, on the 6.4.10 logs I see this: > > Using kernel command line parameters: > BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.4.10-200.fc38.x86_64 > root=/dev/mapper/fedora_localhost--live-root ro > resume=/dev/mapper/fedora_localhost--live-swap > rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap > rhgb quiet initcall_blacklist=simpledrm_platform_driver_init > rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 > > but on the 6.4.12 logs I see: > > Using kernel command line parameters: > BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.4.12-200.fc38.x86_64 > rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 > > it seems like, somehow, you've lost a bunch of your boot parameters there. > Did you maybe do anything to cause that, between when you installed 6.4.10 > and when you installed 6.4.11? Is there anything that I could do to restore the missing boot parameters? Thanks!
Just in case you are interested: This never showed up on this Dell XPS 15 95*50*. (The original power save problem did.)

$ uname -a
Linux caol-ila 6.4.15-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Sep 7 00:25:01 UTC 2023 x86_64 GNU/Linux

(13 and 14 worked, too.)

$ lspci
...
03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader (rev 01)
04:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 (rev 01)
Paul: er, there probably is, but off the top of my head I'm not sure what. Could you file a new bug and we can discuss it there? I know you filed this bug originally, but the course of events has sort of meant this one turned into a bug for the *other* problem, so it's probably best to have a clean slate. sorry for the trouble.
(In reply to Adam Williamson from comment #50) > Paul: er, there probably is, but off the top of my head I'm not sure what. > Could you file a new bug and we can discuss it there? I know you filed this > bug originally, but the course of events has sort of meant this one turned > into a bug for the *other* problem, so it's probably best to have a clean > slate. sorry for the trouble. Thanks, Adam. I have meanwhile filed another bug, as requested: https://bugzilla.redhat.com/show_bug.cgi?id=2239542
(In reply to Adam Williamson from comment #47)
> Thanks for the feedback!
>
> I'm not sure what you mean by "Do I have to do anything to remove the
> upgrade-testing/advisory thing once 6.4.16 is released?", can you elaborate?
> Thanks!

I guess I'm not familiar enough with the command I executed. I tried reading the docs (https://dnf.readthedocs.io/en/latest/command_ref.html) but I'm still unsure.

I used `--enablerepo=updates-testing`. Is this something I need to disable again, or does it only enable the repository for the current upgrade? The docs mention 'temporarily', so probably just for that one upgrade.

I also used `--advisory=FEDORA-2023-0ae1162af6`. Will this inhibit future upgrades to the kernel? In other words, am I now on some sort of branch that I need to revert to get back on the main branch, or will future upgrades just get applied if I run `sudo dnf upgrade`? Googling "fedora advisory" does not yield many results.
Ahhh, I see, you're asking about the instructions from Bodhi. No, all those options are one-time things, they apply only to that single run of dnf. Your permanent configuration has not been changed at all.
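If you want to double-check that the permanent configuration still has the testing repository switched off, here is a minimal Python sketch that reads a repo definition and reports whether a repository is enabled. The function name `repo_enabled` and the sample text are made up for this example. The assumptions are that the definition lives in a file such as /etc/yum.repos.d/fedora-updates-testing.repo under a section named `updates-testing`, and that a missing `enabled=` line means the repository is enabled.

```python
import configparser

# Illustrative sample only; on a real system read the text from the repo file.
SAMPLE_REPO = """\
[updates-testing]
name=Fedora $releasever - $basearch - Test Updates
enabled=0
"""


def repo_enabled(repo_text, repo_id="updates-testing"):
    """Return True if repo_id is defined in repo_text and enabled."""
    # interpolation=None keeps '%' and '$' in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    if not parser.has_section(repo_id):
        return False
    # Assumption: a repo without an explicit "enabled" key counts as enabled.
    return parser.getboolean(repo_id, "enabled", fallback=True)


if __name__ == "__main__":
    print(repo_enabled(SAMPLE_REPO))
```

A result of False for the real file would match the point above: `--enablerepo` on the command line did not change the stored setting.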
FEDORA-2023-4c8291ba6a has been pushed to the Fedora 38 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-4c8291ba6a` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-4c8291ba6a See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-3100e4d61c has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-3100e4d61c` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-3100e4d61c See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-3100e4d61c has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-4c8291ba6a has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.
Just to close the loop on this: confirming that kernel 6.5.5-200.fc38 works as expected. Thank you to all who worked to solve this mystery!