1. Please describe the problem:
No output after GRUB and no boot. Removing any output-suppressing options in GRUB does not change anything: no output whatsoever.

2. What is the Version-Release number of the kernel:
6.1.5

3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
<= 6.0.18 works fine

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:
Yes, permanent problem.

5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:
Not checked yet.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
nvidia (rpmfusion)
virtualbox (rpmfusion)

7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.
There are no logs, as initramfs doesn't come up.
Update: rawhide kernel 6.2.0.... does not work either.

CPU: AMD Ryzen 5600X

BIOS Information
    Vendor: American Megatrends Inc.
    Version: P1.20
    Release Date: 08/13/2020
    Address: 0xF0000
    Runtime Size: 64 kB
    ROM Size: 16 MB
    Characteristics:
        PCI is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        Boot from CD is supported
        Selectable boot is supported
        BIOS ROM is socketed
        EDD is supported
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 kB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        Print screen service is supported (int 5h)
        8042 keyboard services are supported (int 9h)
        Serial services are supported (int 14h)
        Printer services are supported (int 17h)
        ACPI is supported
        USB legacy is supported
        BIOS boot specification is supported
        Targeted content distribution is supported
        UEFI is supported
    BIOS Revision: 5.17

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
    Manufacturer: ASRock
    Product Name: B550M Pro4
    Version:
    Asset Tag:
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis:
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Memory Device
    Array Handle: 0x000E
    Error Information Handle: 0x0017
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 8 GB
    Form Factor: DIMM
    Set: None
    Locator: DIMM 1
    Bank Locator: P0 CHANNEL A
    Type: DDR4
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 3200 MT/s
    Manufacturer: Unknown
    Serial Number: 00000000
    Asset Tag: Not Specified
    Part Number: F4-3200C16-8GIS
    Rank: 1
    Configured Memory Speed: 3200 MT/s
    Minimum Voltage: 1.2 V
    Maximum Voltage: 1.2 V
    Configured Voltage: 1.2 V
btw.. the system boots via NVME.
Update: further investigation shows:

The system boots into initramfs and requests the password for LUKS decryption. If it is entered blindly, the system boots into runlevel 3, as X does not start due to lack of screens.

NO PCI hardware is initialized after GRUB is displayed.

# cat 7e390913b33b4e5ba8f960a9ba97aeee-6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64.conf
title Fedora Linux (6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64) 36 (Thirty Six)
version 6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64
linux /vmlinuz-6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64
initrd /initramfs-6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64.img
options root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init
grub_users $grub_users
grub_arg --unrestricted
grub_class fedora
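For anyone who wants to check which kernel parameters a boot entry actually carries, here is a minimal sketch that parses an entry like the one above. It assumes the simple "key value" line layout shown in this report; the function names `parse_bls_entry` and `kernel_options` are made up for illustration, and the sample text is a shortened copy of the entry (most options omitted).

```python
def parse_bls_entry(text):
    """Parse a boot entry of 'key value' lines into a dict.

    Assumption: one key per line, key and value separated by whitespace.
    If a key repeats, the last occurrence wins in this sketch.
    """
    entry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        entry[parts[0]] = parts[1] if len(parts) > 1 else ""
    return entry


def kernel_options(entry):
    """Split the 'options' value into individual kernel parameters."""
    return entry.get("options", "").split()


# Shortened copy of the entry from this report (hypothetical subset of options).
sample = """\
title Fedora Linux (6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64) 36 (Thirty Six)
version 6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64
linux /vmlinuz-6.2.0-0.rc3.20230113gitd9fc1511728c.28.fc38.x86_64
options root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro rhgb quiet nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init
grub_class fedora
"""

if __name__ == "__main__":
    for option in kernel_options(parse_bls_entry(sample)):
        print(option)
```

Feeding it the contents of a real entry file instead of `sample` should list the parameters one per line, which makes it easier to spot whether `nvidia-drm.modeset=1` is present.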
Created attachment 1938262 [details] kernel 6.2.0 boot log ( failing )
Created attachment 1938263 [details] successful boot log ( 6.0.18 )
Comparing the logs shows that the NVIDIA driver is not loaded/used.
Nvidia is working fine here.

$ inxi -GSxx
System:
  Host: leigh-pc Kernel: 6.1.5-200.fc37.x86_64 arch: x86_64 bits: 64 compiler: gcc v: 2.38-25.fc37
  Desktop: Cinnamon v: 5.6.5 tk: GTK v: 3.24.36 wm: muffin dm: LightDM
  Distro: Fedora release 37 (Thirty Seven)
Graphics:
  Device-1: NVIDIA TU117 [GeForce GTX 1650] driver: nvidia v: 525.78.01 arch: Turing pcie: speed: 2.5 GT/s lanes: 8
  ports: active: none off: DP-2 empty: DP-1,HDMI-A-1 bus-ID: 01:00.0 chip-ID: 10de:1f82
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.7 driver: X: loaded: nvidia unloaded: fbdev,modesetting,nouveau,vesa alternate: nv gpu: nvidia,nvidia-nvswitch display-ID: :0 screens: 1
  Screen-1: 0 s-res: 3840x2160 s-dpi: 157
  Monitor-1: DP-2 note: disabled model: Idek Iiyama PL2888UH res: 3840x2160 dpi: 157 diag: 708mm (27.9")
  API: OpenGL v: 4.6.0 NVIDIA 525.78.01 renderer: NVIDIA GeForce GTX 1650/PCIe/SSE2 direct render: Yes
This exact thing happens on f37, see https://bugzilla.redhat.com/show_bug.cgi?id=2161029
VT switching is broken. It looks like the kernel devs have forgotten to compile efifb support for 6.1.x stable release again! :-(
Thanks to Scott for finding this so fast.
On F37, I can see this:

$ grep FB_EFI /boot/config-6.0.15-300.fc37.x86_64
CONFIG_FB_EFI=y
$ grep FB_EFI /boot/config-6.1.5-200.fc37.x86_64
# CONFIG_FB_EFI is not set

So, Scott is right.
Guys, then the entire FB section is unset:

6.1.5:
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_EFI is not set

6.0.18:
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
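The comparison between the two configs can also be done mechanically. Below is a rough sketch that parses two kernel config excerpts and reports the `CONFIG_FB_*` options whose state differs. The helper names `parse_kernel_config` and `changed_options` are invented for this example; the two sample strings are the excerpts quoted in this comment, so options that appear in only one excerpt show up as "absent" rather than as a real change.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set".
_OPTION_RE = re.compile(r"^(?:# )?(CONFIG_\w+)(?:=(.+)| is not set)$")


def parse_kernel_config(text):
    """Map option name -> value ('y', 'm', ...) or 'n' for 'is not set'."""
    options = {}
    for line in text.splitlines():
        match = _OPTION_RE.match(line.strip())
        if match:
            options[match.group(1)] = match.group(2) or "n"
    return options


def changed_options(old, new, prefix="CONFIG_FB_"):
    """Return {name: (old_state, new_state)} for options that differ."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        if not name.startswith(prefix):
            continue
        before = old.get(name, "absent")
        after = new.get(name, "absent")
        if before != after:
            changes[name] = (before, after)
    return changes


config_6_0_18 = """\
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
"""

config_6_1_5 = """\
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_EFI is not set
"""

if __name__ == "__main__":
    old = parse_kernel_config(config_6_0_18)
    new = parse_kernel_config(config_6_1_5)
    for name, (before, after) in changed_options(old, new).items():
        print(f"{name}: {before} -> {after}")
```

On a real system the two inputs would be the full `/boot/config-<version>` files rather than these excerpts.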
I can confirm this issue. NVIDIA is broken on 6.1.x due to unset CONFIG_FB_EFI config option.
It seems that the NVIDIA driver still has not yet fixed their code for dealing with no fbdev drivers at all (pure simpledrm). :(
I have no idea what the release process for Fedora looks like or who would be the best person to approach about this, so please point me to another channel if there's a better place to discuss this.

Would it make sense to add two additional steps to the release process:
- at some early point of the release process, check if the new kernel has support for VESAFB+EFIFB. If it doesn't, raise a flag and troubleshoot it, instead of releasing the new version into the updates repo (where it will break hundreds of thousands of systems and drive people away from the Fedora project)
- have a test system which is using official nvidia drivers + Wayland and if the new version of the kernel doesn't work for whatever reason (might be due to missing EFIFB in the kernel or for some other reason), don't release it
Javier, you were dealing with nvidia+simpledrm/efifb issues the last time at:
https://ask.fedoraproject.org/t/common-issues/22440
Can you please look at this issue, whether this is the same/similar problem, and whether we can do something reasonable about it? Is the missing efifb in latest kernels intentional? Thanks!

(In reply to Michal Wasilewski from comment #15)
> - have a test system which is using official nvidia drivers + Wayland and if
> the new version of the kernel doesn't work for whatever reason (might be due
> to missing EFIFB in the kernel or for some other reason), don't release it

Michal, I understand this is inconvenient, but Fedora doesn't support the proprietary Nvidia driver. If you decide to use it, you're on your own. It's very problematic (there's a constant catch-up game between kernel and nvidia), and Fedora devs of course try to accommodate if possible (see the link above), but Fedora won't stop shipping latest software just because a proprietary driver is broken. Please consider buying hardware from a more Linux-friendly vendor next time (or possibly at least use the nouveau opensource driver, if it works for you).
*** Bug 2161029 has been marked as a duplicate of this bug. ***
Well, the goal is to *not* ship any fbdev stuff. And we intentionally don't block on the proprietary nvidia driver. I'm hoping to see this become less of a need anyway as nouveau integrates support for the new firmware exposing capabilities for NVIDIA's open source driver they released last year. There's already been some decent progress on that front. :)
> Michal, I understand this is inconvenient, but Fedora doesn't support the proprietary Nvidia driver. If you decide to use it, you're on your own.

Yea, I know, that's why there's a trade-off on our side: security (picking up patches from the updates repo) vs reliability (the gpus actually being operational). This in turn led us to having a complicated, multi-stage update process, but it's costly and it's not working very well.

> (or possibly at least use the nouveau opensource driver, if it works for you)

Unfortunately that's not an option, we need CUDA and it doesn't work with nouveau.

> I'm hoping to see this become less of a need anyway as nouveau integrates support for the new firmware exposing capabilities for NVIDIA's open source driver they released last year.

Looking forward to that!

Thanks everyone for your great work! I just wanted to bring this up as I imagine there are many people impacted. Thanks again for your help :)
(In reply to Neal Gompa from comment #18)
> Well, the goal is to *not* ship any fbdev stuff. And we intentionally don't
> block on the proprietary nvidia driver. I'm hoping to see this become less
> of a need anyway as nouveau integrates support for the new firmware exposing
> capabilities for NVIDIA's open source driver they released last year.
> There's already been some decent progress on that front. :)

Removing efifb or vesafb support isn't an option IMO.

1: There has been no proposal to remove efifb or vesafb, so this issue is a regression.
2: Lack of firmware for nouveau cripples any card newer than Maxwell generation cards.
3: NVIDIA's open source driver doesn't support any card older than Turing (RTX) generation cards.
> Removing efifb or vesafb support isn't an option IMO.
>
> 1: There has been no proposal to remove efifb or vesafb so this issue is a
> regression.

Yes there has, it was a F36 Change:
https://fedoraproject.org/wiki/Releases/36/ChangeSet#Replace_the_fbdev_drivers_with_simpledrm_and_the_DRM_fbdev_emulation_layer
As Peter said, there was a change proposal to disable all fbdev drivers and use simpledrm instead for early console or with "nomodeset".

We have been carrying a downstream patch to avoid using simpledrm and instead use {efi,vesa}fb if the nvidia-drm.modeset=1 cmdline param is used:
https://gitlab.com/cki-project/kernel-ark/-/commit/811fe0e4dcfd86a0db5135d3bfef4936794efdb6

We had that in the Fedora 6.0 kernel version and it seems that was not dropped when rebasing to 6.1.

I agree that this is inconvenient to Nvidia users and maybe even an oversight of the Fedora kernel maintainers, but until when should that workaround be carried? This change has been in place since F36, so for 2 Fedora releases, and the Nvidia driver still doesn't register its own emulated fbdev device.

So I believe this is a decision that the Fedora kernel package maintainers should make, but saying that it's not an option isn't really true. What's not an option IMO is to carry this patch ad infinitum.
(In reply to Peter Robinson from comment #21)
> > Removing efifb or vesefb support isn't an option IMO.
> >
> > 1: There has been no proposal to remove efifb or vesefb so this issue is a
> > regression.
>
> Yes there has, it was a F36 Change:
> https://fedoraproject.org/wiki/Releases/36/ChangeSet#Replace_the_fbdev_drivers_with_simpledrm_and_the_DRM_fbdev_emulation_layer

Which was a year ago and it was reverted for NVIDIA drivers, which NVIDIA devs were aware of:
https://github.com/NVIDIA/open-gpu-kernel-modules/issues/228#issuecomment-1150237242

No such proposal or announcement was made that this change was going to be reverted, crippling NVIDIA users in the process.
Add negative karma to https://bodhi.fedoraproject.org/updates/FEDORA-2023-47cff193ec
(In reply to Javier Martinez Canillas from comment #22)
> This change has been since F36, so 2 Fedora releases and the Nvidia driver
> still doesn't register its own emulated fbdev device.

I assume the problem is that the change affected stable releases in the middle of their life cycle. Javier (or kernel maintainers), would it be possible to keep the patch included in F36+F37 until they're EOL, and drop it only in Rawhide/F38?

(In reply to leigh scott from comment #24)
> Add negative karma to
> https://bodhi.fedoraproject.org/updates/FEDORA-2023-47cff193ec

No, that's not helpful. The change occurred in kernel 6.1.5 which is already stable. It doesn't make sense to downvote 6.1.6.
> Yes there has, it was a F36 Change:
> https://fedoraproject.org/wiki/Releases/36/ChangeSet#Replace_the_fbdev_drivers_with_simpledrm_and_the_DRM_fbdev_emulation_layer

This has been reverted by a downstream patch especially for NVIDIA users.
I don't think it is solely an nvidia issue. I reckon anything that depends on efifb or vesafb will be broken. Thus I really don't understand this discussion and why those configs shouldn't be enabled by default for the foreseeable future (even f38, f39, etc.).
(In reply to nesdeq from comment #27)
> I don't think it is solely an nvidia issue. I reckon anything that depends
> on efifb or vesafb will be broken. Thus I really don't understand this
> discussion and why those configs shouldn't be enabled by default for the
> foreseeable future (even f38, f39, etc.).

Because Fedora wants to force change so that developers migrate to newer technologies.
(In reply to nesdeq from comment #27)
> I don't think it is solely an nvidia issue. I reckon anything that depends
> on efifb or vesafb will be broken. Thus I really don't understand this
> discussion and why those configs shouldn't be enabled by default for the
> foreseeable future (even f38, f39, etc.).

So what exactly depends on efifb and vesafb that isn't the proprietary Nvidia driver?

All the in-tree DRM drivers set up their own emulated fbdev device for fbcon to bind with. And for early console before the DRM drivers are probed, the simpledrm driver can also use a firmware provided framebuffer, just like efifb and vesafb.
(In reply to danielsuarez369 from comment #28)
> (In reply to nesdeq from comment #27)
> > I don't think it is solely an nvidia issue. I reckon anything that depends
> > on efifb or vesafb will be broken. Thus I really don't understand this
> > discussion and why those configs shouldn't be enabled by default for the
> > foreseeable future (even f38, f39, etc.).
>
> Because Fedora wants to force change so that developers migrate to newer
> technologies.

DRM isn't new. The fbdev subsystem has been deprecated and in maintenance only mode for almost a decade now.
It's not our (users) decision to make. What would you advise then? Will this bug be closed, in that case we can just compile our own efifb and vesafb enabled fedora kernel flavors - or will this "workaround" be carried over for the foreseeable future? No hard feelings, just looking for guidance.
(In reply to nesdeq from comment #27)
> I don't think it is solely an nvidia issue. I reckon anything that depends
> on efifb or vesafb will be broken. Thus I really don't understand this
> discussion and why those configs shouldn't be enabled by default for the
> foreseeable future (even f38, f39, etc.).

Nobody has used these for anything meaningful in almost 15 years. Even the NVIDIA driver doesn't actually use them, it just doesn't have the ability to initialize its own framebuffer device and requires handoff from one of these to its own. There's no reason the NVIDIA driver should need them now, other than they haven't fixed their code to stop needing them.
https://www.kernel.org/doc/html/latest/fb/efifb.html
I'm aware what efifb does. SimpleDRM basically replaces it.
(In reply to Neal Gompa from comment #34) > I'm aware what efifb does. SimpleDRM basically replaces it. No basically about it, it does, because it is actively maintained and hasn't suffered from a bunch of hard to fix security issues and the drm stack is actively maintained. efifb is legacy and vendors should have migrated from it long ago.
I see a lot of opinions and zero solutions. What is the guidance on this topic? I would set up a copr repo with nvidia support for all of us who need to do actual ml work with cuda. But that takes work and will not make sense if this efifb and vesafb will be enabled again by rh/fedora maintainers. Awaiting decisions. Eagerly.
(In reply to nesdeq from comment #36) > I see a lot of opinions and zero solutions. What is the guidance on this > topic? I would set up a copr repo with nvidia support for all of us who need > to do actual ml work with cuda. But that takes work and will not make sense > if this efifb and vesafb will be enabled again by rh/fedora maintainers. > Awaiting decisions. Eagerly. CUDA should still work, you just won't get the early console, once the driver loads it should all just work?
(In reply to Peter Robinson from comment #37) > (In reply to nesdeq from comment #36) > > I see a lot of opinions and zero solutions. What is the guidance on this > > topic? I would set up a copr repo with nvidia support for all of us who need > > to do actual ml work with cuda. But that takes work and will not make sense > > if this efifb and vesafb will be enabled again by rh/fedora maintainers. > > Awaiting decisions. Eagerly. > > CUDA should still work, you just won't get the early console, once the > driver loads it should all just work? Yes and so the solution for everyone using LUKS is to just stare at a blank screen and hope they blindly type in the password correctly into the abyss? And get the idea to try that at all in the first place? This is not usable.
From a practical side:

No user will replace their hardware just because the Fedora kernel build stops supporting the proprietary driver. They will switch to a distro that supports it. So this "enforcement" some spoke of will just backfire, and that is the sad result I really want to avoid for you and me.
> Yes and so the solution for everyone using LUKS is to just stare at a blank
> screen and hope they blindly type in the password correctly into the abyss?
> And get the idea to try that at all in the first place? This is not usable.

After you enter the password, the screen stays blank, as no screen for X is found, if X is used. That's no solution at all, except for a server.
(In reply to customercare from comment #39) > From a practical side: > > No user will replace it's hardware, just because Fedora kernel compilation > will stop to support the prop driver. > They will switch to a distro that support it. So this "Enforcement" some > spoke of, will just backfire, and that is the sad result I really want to > avoid for you and me. If they make it default I will switch distro and abandon my fedora packages and my rpmfusion infra duties and packages.
This is not some conspiracy to get users to switch hardware. The simple explanation is nvidia's driver is BROKEN. We know this, and as a result, we carry a nasty hack to make things work. That hack is not, and never will be upstream, and we do not carry it in rawhide. Every rebase is an opportunity to check and see if nvidia has fixed their driver. As you can see, they have not. I had brought in the patch for the hack with 6.1.5, but forgot the config changes to make the hack work. Everything will be working with 6.1.7 when it comes out this week. So no, it has nothing to do with my trying to force users to switch hardware. It is just more hope that nvidia will fix their driver.
Thank you for your response. Could you please clarify your wording "hack"? As I see it, the code is included in the vanilla kernel https://www.kernel.org/doc/html/latest/fb/efifb.html 6.20rc4, so what do you mean by hack? Is hack := setting CONFIG_FB_EFI=y?
(In reply to nesdeq from comment #43)
> Thank you for your response. Could you please clarify your wording "hack".
> As I see the code is included in vanilla kernel
> https://www.kernel.org/doc/html/latest/fb/efifb.html 6.20rc4, so what do you
> mean by hack? Is hack := setting config_efi_fb=y?

Just enabling CONFIG_FB_EFI=y is not enough to make the nvidia binary driver case work.

simpledrm will still take precedence over efifb, so unless for some reason simpledrm cannot use the EFI framebuffer (because of e.g. some weird pixelformat) you will still get simpledrm and not efifb even with CONFIG_FB_EFI=y.

To still make the nvidia binary driver case work, the Fedora kernels carry a Fedora specific patch/hack to skip simpledrm initialization if nvidia-drm.modeset=1 is present on the kernel commandline.
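To make the interplay described above easier to follow, here is a toy model of it in Python. It is only a sketch of the behaviour as described in this comment, not the kernel's actual logic: it ignores the case where simpledrm cannot use the firmware framebuffer, and the names `parse_cmdline` and `expected_firmware_fb_driver` are made up for illustration.

```python
def parse_cmdline(cmdline):
    """Map each kernel parameter name to the list of values given for it.

    Parameters without '=' get the value None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


def expected_firmware_fb_driver(cmdline, fb_efi_enabled, fedora_patch_applied):
    """Toy model: which driver ends up on the firmware framebuffer.

    Assumptions (taken from the description above, simplified):
      - simpledrm takes precedence whenever it is allowed to initialize;
      - the Fedora patch skips simpledrm if nvidia-drm.modeset=1 is set;
      - efifb can only take over if CONFIG_FB_EFI=y.
    Returns 'simpledrm', 'efifb', or None (no early framebuffer driver).
    """
    params = parse_cmdline(cmdline)
    nvidia_modeset = "1" in params.get("nvidia-drm.modeset", [])
    simpledrm_skipped = fedora_patch_applied and nvidia_modeset
    if not simpledrm_skipped:
        return "simpledrm"
    return "efifb" if fb_efi_enabled else None


if __name__ == "__main__":
    cmdline = "ro rhgb quiet nvidia-drm.modeset=1"
    # Patch present but CONFIG_FB_EFI unset: nothing drives the early console.
    print(expected_firmware_fb_driver(cmdline, False, True))
    # Patch present and CONFIG_FB_EFI=y: efifb takes over.
    print(expected_firmware_fb_driver(cmdline, True, True))
```

Under these assumptions, the combination "patch applied, CONFIG_FB_EFI not set" is the one that leaves no early framebuffer driver at all.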
(In reply to Hans de Goede from comment #44)
> (In reply to nesdeq from comment #43)
> > Thank you for your response. Could you please clarify your wording "hack".
> > As I see the code is included in vanilla kernel
> > https://www.kernel.org/doc/html/latest/fb/efifb.html 6.20rc4, so what do you
> > mean by hack? Is hack := setting config_efi_fb=y?
>
> Just enabling CONFIG_EFI_FB=y is not enough to make the nvidia binary driver
> case work .
>
> simpledrm will still take precedence over CONFIG_EFI_FB=y, so unless for
> some reason simpledrm cannot use the EFIFB (because of e.g. some weird
> pixelformat) you will still get simpledrm and not efifb even with
> CONFIG_EFI_FB=y.
>
> To still make the nvidia binary driver case work the Fedora kernels carry a
> Fedora specific patch/hack to skip simpledrm initialization if
> nvidia-drm.modeset=1 is present on the kernel commandline.

Thanks for the insight. So initcall_blacklist=simpledrm_platform_driver_init as kernel-parameter does nothing by itself without the patch?
(In reply to nesdeq from comment #45) > Thanks for the insight. So initcall_blacklist=simpledrm_platform_driver_init > as kernel-parameter does nothing by itself without the patch? I am not familiar enough with the details to answer that.
(In reply to Hans de Goede from comment #46)
> (In reply to nesdeq from comment #45)
> > Thanks for the insight. So initcall_blacklist=simpledrm_platform_driver_init
> > as kernel-parameter does nothing by itself without the patch?
>
> I am not familiar enough with the details to answer that.

Weird. I can build vanilla 6.1.x from kernel.org with CONFIG_FB_EFI=y and everything seems to work fine. That's why I am really interested to learn exactly which patch you and Justin are referring to as the "hack". Could you point me to it?
(In reply to nesdeq from comment #47)
> Weird. I can build vanilla 6.1.x from kernel.org, and CONFIG_EFI_FB=y and
> everything seems to work fine. That's why I am really interested to learn
> what patch exactly is needed that is referenced by you and Justin as "hack".
> Could you point me to it?

Did you also enable SIMPLEDRM / start with Fedora's kernel config and then only add CONFIG_FB_EFI=y?

This is the patch we are carrying for this:

https://gitlab.com/cki-project/kernel-ark/-/commit/811fe0e4dcfd86a0db5135d3bfef4936794efdb6
Thank you for your response. I can double check that tomorrow. Maybe the patch is no longer needed? I took the 6.0.18 config as a starting point, answered n to the new options, and changed FB_EFI to y on 6.1.5 from kernel.org.
(In reply to Hans de Goede from comment #48)
> (In reply to nesdeq from comment #47)
> > Weird. I can build vanilla 6.1.x from kernel.org, and CONFIG_EFI_FB=y and
> > everything seems to work fine. That's why I am really interested to learn
> > what patch exactly is needed that is referenced by you and Justin as "hack".
> > Could you point me to it?
>
> Did you also enable SIMPLEDRM / start with Fedora's kernel config and then
> only add CONFIG_EFI_FB=y?
>
> This is the patch we are carrying for this:
>
> https://gitlab.com/cki-project/kernel-ark/-/commit/811fe0e4dcfd86a0db5135d3bfef4936794efdb6

After reading up on the issue and looking at the source, initcall_blacklist=simpledrm_platform_driver_init should block simpledrm from initializing, effectively blacklisting it.

Can someone verify on rh/fedora's end that the Fedora patch/hack is not needed anymore, and that CONFIG_FB_EFI=y plus initcall_blacklist=simpledrm_platform_driver_init as kernel parameter is enough?

Also, it seems that this kernel parameter is added by the nvidia driver/rpm in addition to the nouveau blacklist and nvidia-drm.modeset=1. So they seem aware but still rely on efifb.
(In reply to Hans de Goede from comment #44)

That's quite weird to hear as I'm using the vanilla kernel and the nvidia proprietary driver works just fine here.

$ grep ^CONFIG_FB .config
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
CONFIG_FB=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_VESA=y
CONFIG_FB_EFI=y
CONFIG_FB_SIMPLE=y

I don't use any kernel patches or special kernel boot options.
A couple of things. First, the hack actually is 2 pieces. We will not carry it in rawhide because it really needs to go away. The first piece is the kernel patch, the second piece is the config change to turn on FB_EFI. Javier pinged me around the 6.1.5 release to mention that the hack was needed for 6.1, so I added the patch and forgot the other piece. That has been added now, but I am not willing to do a special build for it. There is nothing in 6.1.x that is so critical that nvidia users needing vt can't just sit on 6.0.18 until 6.1.7 ships in a couple of days. As to that hack, yes, it can be dropped as long as FB_EFI is on provided that you are willing to add the kernel command line option to make it work. SuSE did this and got it upstream I believe. Fedora went a different route because what we did should not require any interaction on the end user part. That hack keys off of a command line option that should already be present for nvidia driver users.

How do we stop this from being a problem with the 6.2 rebases? Well, someone will likely have to remind me that it is required again when we branch 6.2 provided that nvidia hasn't fixed their driver. I do have notes now pointing to both commits which need to be added, so I won't accidentally add 1 and forget the other as happened this time. And ideally, if some nvidia users participated in test week, this might be caught before it is shipped to users. We really didn't get a clear indication during test week (where we had neither the hack nor the config changes), or even in the 6.1.5 update while it was in updates-testing. The noise came after 6.1.5 was stable and 6.1.6 didn't fix it. Kernels move fast here, and 6.1.6 was already built before the 6.1.5 kernels pushed to stable. Test kernels, it really helps everyone involved.
*** Bug 2161419 has been marked as a duplicate of this bug. ***
(In reply to Justin M. Forbes from comment #52) > A couple of things. First, the hack actually is 2 pieces. We will not carry > it in rawhide because it really needs to go away. The first piece is the > kernel patch, the second piece is the config change to turn on FB_EFI. > Javier pinged me around the 6.1.5 release to mention that the hack was > needed for 6.1, so I added the patch and forgot the other piece. That has > been added now, but I am not willing to do a special build for it. There is > nothing in 6.1.x that is so critical that nvidia users needing vt can't just > sit on 6.0.18 until 6.1.7 ships in a couple of days. As to that hack, yes, > it can be dropped as long as FB_EFI is on provided that you are willing to > add the kernel command line option to make it work. SuSE did this and got it > upstream I believe. Fedora went a different route because what we did should > not require any interaction on the end user part. That hack keys off of a > command line option that should already be present for nvidia driver users. > > How do we stop this from being a problem with the 6.2 rebases? Well, someone > will likely have to remind me that it is required again when we branch 6.2 > provided that nvidia hasn't fixed their driver. I do have notes now > pointing to both commits which need to be added, so I won't accidentally add > 1 and forget the other as happened this time. And ideally, if some nvidia > users participated in test week, this might be caught before it is shipped > to users. We really didn't get a clear indication during test week (where we > had neither the hack nor the config changes), or even in the 6.1.5 update > while it was in updates-testing. The noise came after 6.1.5 was stable and > 6.1.6 didn't fix it. Kernels move fast here, and 6.1.6 was already built > before the 6.1.5 kernels pushed to stable. Test kernels, it really helps > everyone involved. Thx for fixing in 6.1.7. However I cannot follow your explanation. 
The following is true:

1. kernel config CONFIG_FB_EFI=y is needed
2. simpledrm must not be loaded

The patch you have been using is obsolete and has been for some time. This is because the nvidia driver installer and also akmod-nvidia from rpmfusion will add the needed kernel cmdline to grub automatically, i.e. the initcall blacklist for simpledrm. So NO user interaction is needed at all!

You just were not aware of this. So what needs to be done is

1. drop the patch/"hack" forever
2. just enable CONFIG_FB_EFI=y

and everyone is happy.
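The two conditions listed above can be checked with a few lines of Python. This is a minimal sketch under the assumptions stated in this comment (CONFIG_FB_EFI=y in the kernel config, and `simpledrm_platform_driver_init` listed in an `initcall_blacklist=` parameter); the function names are hypothetical and the check says nothing about whether the driver itself loads.

```python
def initcall_blacklist(cmdline):
    """Collect function names from all initcall_blacklist= parameters.

    The kernel parameter takes a comma-separated list of initcall names.
    """
    names = set()
    for token in cmdline.split():
        if token.startswith("initcall_blacklist="):
            value = token.split("=", 1)[1]
            names.update(name for name in value.split(",") if name)
    return names


def meets_both_conditions(config_text, cmdline):
    """True if CONFIG_FB_EFI=y is set and simpledrm init is blacklisted."""
    fb_efi = any(
        line.strip() == "CONFIG_FB_EFI=y" for line in config_text.splitlines()
    )
    simpledrm_blocked = (
        "simpledrm_platform_driver_init" in initcall_blacklist(cmdline)
    )
    return fb_efi and simpledrm_blocked


if __name__ == "__main__":
    cmdline = (
        "ro rhgb quiet nvidia-drm.modeset=1 "
        "initcall_blacklist=simpledrm_platform_driver_init"
    )
    print(meets_both_conditions("CONFIG_FB_EFI=y\n", cmdline))
    print(meets_both_conditions("# CONFIG_FB_EFI is not set\n", cmdline))
```

On a running system the two inputs could be read from `/boot/config-$(uname -r)` and `/proc/cmdline`.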
(In reply to Justin M. Forbes from comment #42)
> This is not some conspiracy to get users to switch hardware. The simple
> explanation is nvidia's driver is BROKEN. We know this, and as a result, we
> carry a nasty hack to make things work. That hack is not, and never will be
> upstream, and we do not carry it in rawhide. Every rebase is an opportunity
> to check and see if nvidia has fixed their driver. As you can see, they have
> not. I had brought in the patch for the hack with 6.1.5, but forgot the
> config changes to make the hack work. Everything will be working with 6.1.7
> when it comes out this week. So no, it has nothing to do with my trying to
> force users to switch hardware. It is just more hope that nvidia will fix
> their driver.

Sorry but I have to disagree here, the only part that is broken here is Fedora's kernel config breaking backwards compatibility and the NVIDIA driver in the process.

If every single other kernel from other distros such as Debian, Arch, Manjaro, PopOS, Ubuntu, RHEL, Gentoo, and likely others work FLAWLESSLY, then it is not NVIDIA's driver that is broken, but Fedora's kernel.

NVIDIA has nothing to fix, their driver works flawlessly, even a driver as old as R340 works all the way to kernel 6.2, which is supported by Debian. If Debian has such an old driver working with the latest kernel, it is clear to me as a user who is at fault here for the NVIDIA driver not working.
(In reply to nesdeq from comment #54)
> The patch you have been using is obsolete and has been for some time. This
> is because nvidia driver installer and also the akmod-nvidia from rpmfusion
> will add the needed kernel cmdline to grub automatically, i.e. initcall
> blacklist simpledrm. So NO user interaction is needed at all!
>
> You just were not aware of this. So what needs to be done is
>
> 1. drop the patch/"hack" forever
> 2. just enable config_fb_efi=y
>
> and everyone is happy.

I was indeed not aware of this, as I am not a user of their driver, I did not know that they had changed it to accommodate the upstream workaround. This is part of the reason that I was not willing to carry this patch in rawhide though. Eventually it wouldn't be needed, and we might not know if we were always carrying that patch. Thanks for the update, I will drop the patch and keep the config updates for 6.1.7 and see how that goes.
(In reply to danielsuarez369 from comment #55)
> (In reply to Justin M. Forbes from comment #42)
> > This is not some conspiracy to get users to switch hardware. The simple
> > explanation is nvidia's driver is BROKEN. We know this, and as a result, we
> > carry a nasty hack to make things work. That hack is not, and never will be
> > upstream, and we do not carry it in rawhide. Every rebase is an opportunity
> > to check and see if nvidia has fixed their driver. As you can see, they have
> > not. I had brought in the patch for the hack with 6.1.5, but forgot the
> > config changes to make the hack work. Everything will be working with 6.1.7
> > when it comes out this week. So no, it has nothing to do with my trying to
> > force users to switch hardware. It is just more hope that nvidia will fix
> > their driver.
>
> Sorry but I have to disagree here, the only part that is broken here is
> Fedora's kernel config breaking backwards compatibility and the NVIDIA
> driver in the process.
>
> If every single other kernel from other distros such as Debian, Arch,
> Manjaro, PopOS, Ubuntu, RHEL, Gentoo, and likely others work FLAWLESSLY,
> then it is not NVIDIA's driver that is broken, but Fedora's kernel.
>

Well, Fedora is also the distribution that engages with all the ecosystems before almost all those you list above, so it's not surprising stuff like this shows up here first.

> NVIDIA has nothing to fix, their driver works flawlessly, even a driver as
> old as R340 works all the way to kernel 6.2, which is supported by Debian.
> If Debian has such an old driver working with the latest kernel, it is clear
> to me as a user who is at fault here for the NVIDIA driver not working.

This is definitely not true. NVIDIA doesn't maintain driver branches that old, which leaves other people to create "fixes" to make it work with newer kernels.
RPM Fusion, for example, maintains a patch set against R340 specifically for this: https://pkgs.rpmfusion.org/cgit/nonfree/nvidia-340xx-kmod.git/log/ Debian does the same thing: https://salsa.debian.org/nvidia-team/nvidia-graphics-drivers/-/blob/340xx/main/debian/changelog Every distribution has to do stuff like this to make the NVIDIA drivers work. It's *not* easy, and it's not *flawless*. You're seeing what happens when part of the hand-holding we do falls out unintentionally.
I've come across this problem and tried to figure out what's going on from the comments above.

I updated to kernel 6.1.5 and the nvidia driver 525.78.01 from rpmfusion on 16 Jan 2023 18:00 Australian Eastern Standard Time. I use LUKS. I get the same symptoms as described above, and they are repeatable. Specifically, I cannot enter the passcode for LUKS. Reverting in grub to kernel 6.0.18 while using nvidia driver 525.78.01 resolved the issue; the attached information was collected using this kernel. I am unable to provide information about kernel 6.1.5 as I am unable to boot it.

Of all the comments above, I sympathise with these:

1/ "I see a lot of opinions and zero solutions. What is the guidance on this topic?" For non-experts there is no simple, clear guidance for getting a broken system up and running other than reverting to 6.0.18.

2/ This issue has now happened twice due to the described issues with the Nvidia driver.

3/ My system is unstable with the nouveau driver. Only the Nvidia driver works, provided kernel updates do not break it.

What I want to know is: will 6.1.7 fix this permanently, given that the problem appears to be understood? If Fedora believes that Nvidia needs to address this, please make this a clear statement of policy. This will enable me to decide whether to move on from Fedora. I have limited time and expertise to go through this again on 6.2. Fix the issue or state that Nvidia drivers are incompatible with Fedora 3X.

Please advise on a definitive way forward for users of Nvidia cards requiring proprietary drivers, as this advice is currently not clear and is buried in varying opinions or political/ideological views in the comments above that do not resolve the problem.
(In reply to Robert Koppelhuber from comment #58) > I've come across this problem and tried to figure out what's going on from > above. > Update to kernel 6.1.5 & nvidia driver from rpmfusion 525.78.01 on 16 Jan > 2023 18:00 Australian Eastern Standard time. > I use LUKS. > I get same symptoms as desribed above. Repeatable. Specifically can not > enter passcode for luks. > Reverting in grub to kernel 6.0.18 while using nvidia driver 525.78.01 > resolved issue with attach information using this kernel. Unable to provide > information re kernel 6.1.5 as unable to boot. > > In all the comments above I sympathise with the comments: > 1/ "I see a lot of opinions and zero solutions. What is the guidance on this > topic?" > For non experts no simple guidance to get broken system up and running other > then revert back to 6.0.18 is not clear. > 2/ This issue has now happened twice due to the desribed issues with the > Nvidia driver. > 3/ My system is unstable with nuveau driver. Only Nvidia driver works > provided kernel updates do not break whatever it is. > > > What I what to know is will 6.1.7 fix this permanently given the problem > appears understood? > If Fedora believes that Nvidia needs to address this please make this a > clear statement of policy. This will enable me to make a decision if to move > on from Fedora. I have limited time or expertise to go through this again on > 6.2. Fix the issue or state that Nvidia drivers are incompatible with Fedora > 3X. > > Please advise definitive way forward for users of Nvidia cards requiring > proprietary drivers as this advice is currently not clear and buried in > varying opinions or political/ideological views in above comments that do > not resolve the problem. I also have limited knowledge and an affected user, but my understanding from reading https://bugzilla.redhat.com/show_bug.cgi?id=2161104#c54 and https://bugzilla.redhat.com/show_bug.cgi?id=2161104#c56 is that the fix should be permanent. 
Please correct me if I am wrong.
(In reply to Justin M. Forbes from comment #56) > (In reply to nesdeq from comment #54) > > The patch you have been using is obsolete and has been for some time. This > > is because nvidia driver installer and also the akmod-nvidia from rpmfusion > > will add the needed kernel cmdline to grub automatically, i.e. initcall > > blacklist simpledrm. So NO user interaction is needed at all! > > > > You just were not aware of this. So what needs to be done is > > > > 1. drop the patch/"hack" forever > > 2. just enable config_fb_efi=y > > > > and everyone is happy. > > I was indeed not aware of this, as I am not a user of their driver, I did > not know that they had changed it to accommodate the upstream workaround. > This is part of the reason that I was not willing to carry this patch in > rawhide though. Eventually it wouldn't be needed, and we might not know if > we were always carrying that patch. Thanks for the update, I will drop the > patch and keep the config updates for 6.1.7 and see how that goes. Thx, that's awesome.
@jforbes: Everything is cool, mistakes happen. I guess we overreacted a bit in this br, as we all love Fedora so much that we could not believe what was written ;) I like your postmortem about how this could have happened. It showed the correct amount of professionalism at the right time & place. So, well done. As the process got one step less to fail ... \o/

(In reply to Justin M. Forbes from comment #52)
> 1 and forget the other as happened this time. And ideally, if some nvidia
> users participated in test week, this might be caught before it is shipped
> to users. We really didn't get a clear indication during test week (where we

If we need to test a new kernel without the "hack"-patch, drop me a line, i.e. in here.

BTW: grubby is not always able to tell BLS to boot the correct kernel. It says it did the right thing, but grub does not respect this. On my Surface tablet it works as it should, but on my main machine it does not, which, in the special case we have right now, is very bad timing :) I will open a new br for this.
(In reply to nesdeq from comment #54)
> (In reply to Justin M. Forbes from comment #52)
[...]
> Thx for fixing in 6.1.7. However I cannot follow your explanation. The
> following is true:
>
> 1. kernel config config_fb_efi=y is needed
> 2. simpledrm must not be loaded
>
> The patch you have been using is obsolete and has been for some time. This
> is because nvidia driver installer and also the akmod-nvidia from rpmfusion
> will add the needed kernel cmdline to grub automatically, i.e. initcall
> blacklist simpledrm. So NO user interaction is needed at all!
>
> You just were not aware of this. So what needs to be done is
>
> 1. drop the patch/"hack" forever
> 2. just enable config_fb_efi=y
>
> and everyone is happy.

You know what? I'm actually getting quite tired of the entitled attitude of some Fedora users about this issue. And no, what you said is not correct. Some facts:

1) This is a bug in the Nvidia proprietary driver, which does not implement its own emulated fbdev device. It relies on efifb and vesafb, and that's just not robust, as shown by this particular fallout (and others). That's not Fedora's (or mainline Linux's) fault, and there is nothing that we can do about it.

2) If CONFIG_FB_EFI=y and the simpledrm driver initcall is disabled with "initcall_blacklist=simpledrm_platform_driver_init", that will only prevent the simpledrm driver from being probed, but *it will not* cause the efifb or vesafb drivers to be probed.

3) The drivers bind against different platform devices: efifb binds against "efi-framebuffer", vesafb binds against "vesa-framebuffer" and simpledrm against "simple-framebuffer". So preventing the simpledrm driver from being loaded is not enough; one also has to make sure that the correct platform device is registered for the needed drivers (efifb, vesafb) to probe.
4) That's exactly what the hack (and *it is* a hack, I'm actually the author of that patch) that we are carrying in the Fedora kernel does: https://gitlab.com/cki-project/kernel-ark/-/commit/811fe0e4dcfd86a0db5135d3bfef4936794efdb6. If the "nvidia-drm.modeset=1" cmdline param is set, it avoids registering a "simple-framebuffer" device and instead registers either an "efi-framebuffer" (for EFI) or a "vesa-framebuffer" (for legacy BIOS with vesa=$mode). All this is explained in detail in the commit message of the hack, so you could have learned it by reading that instead of writing misleading and wrong information here and making me repeat it again.

5) SUSE has a similar patch; they don't have an upstream solution either. What they did is force Nvidia users to use a "nosimpledrm=1" cmdline. Our approach was just different because we decided to use an existing cmdline param that Nvidia users are setting anyway.

6) Another way to get the same behaviour is to disable CONFIG_SYSFB_SIMPLEFB. That will cause the "simple-framebuffer" platform device to not be registered, and instead "efi-framebuffer" or "vesa-framebuffer" will be registered. I guess that's what some have in their kernel config, and that's why they think that just CONFIG_FB_EFI=y would be enough?

7) The Fedora developers actually care about Nvidia users. I find it offensive that people are implying otherwise. We care so much that we went as far as spending time understanding the issue and writing a patch that we have to carry downstream (for at least 3 kernel releases now). But mainline development doesn't care about proprietary out-of-tree drivers; it does what's best for all the drivers that are in Linux mainline.

So please assume that the Fedora developers have good intentions and know what they are doing. Complaining and stating wrong things as facts is not helping; on the contrary, it just wastes other people's time.
Like mine that I had to write this long rant instead of doing more important and useful things.
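The device selection described in points 2) to 6) can be sketched as a small shell function. This is only an illustrative model of the behaviour described in the comment above, not the kernel's sysfb code; the function name `expected_fb_device` and its three arguments are hypothetical, and the legacy BIOS case is simplified (the vesa mode requirement is not modelled).

```shell
# Illustrative model only: mirrors the behaviour described in the comment
# above; it is NOT the kernel's sysfb implementation.
# Arguments (hypothetical names):
#   $1  "y" if CONFIG_SYSFB_SIMPLEFB is enabled, anything else otherwise
#   $2  the kernel command line as a single string
#   $3  "efi" or "bios"
expected_fb_device() {
    simplefb="$1"
    cmdline="$2"
    firmware="$3"

    # With CONFIG_SYSFB_SIMPLEFB enabled, a "simple-framebuffer" device is
    # registered unless the downstream patch sees nvidia-drm.modeset=1.
    if [ "$simplefb" = "y" ]; then
        case " $cmdline " in
            *" nvidia-drm.modeset=1 "*) : ;;   # patch applies, fall through
            *) echo "simple-framebuffer"; return 0 ;;
        esac
    fi

    # Otherwise one of the legacy platform devices is registered instead.
    if [ "$firmware" = "efi" ]; then
        echo "efi-framebuffer"
    else
        echo "vesa-framebuffer"
    fi
}
```

For example, `expected_fb_device y "ro quiet initcall_blacklist=simpledrm_platform_driver_init" efi` still prints `simple-framebuffer` in this model, which is the point made in 2) and 3): blacklisting the simpledrm initcall does not change which platform device is registered, so efifb has nothing to bind to.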
(In reply to Neal Gompa from comment #57) > (In reply to danielsuarez369 from comment #55) > > (In reply to Justin M. Forbes from comment #42) > > > This is not some conspiracy to get users to switch hardware. The simple > > > explanation is nvidia's driver is BROKEN. We know this, and as a result, we > > > carry a nasty hack to make things work. That hack is not, and never will be > > > upstream, and we do not carry it in rawhide. Every rebase is an opportunity > > > to check and see if nvidia has fixed their driver. As you can see, they have > > > not. I had brought in the patch for the hack with 6.1.5, but forgot the > > > config changes to make the hack work. Everything will be working with 6.1.7 > > > when it comes out this week. So no, it has nothing to do with my trying to > > > force users to switch hardware. It is just more hope that nvidia will fix > > > their driver. > > > > Sorry but I have to disagree here, the only part that is broken here is > > Fedora's kernel config breaking backwards compatibility and the NVIDIA > > driver in the process. > > > > If every single other kernel from other distros such as Debian, Arch, > > Manjaro, PopOS, Ubuntu, RHEL, Gentoo, and likely others work FLAWLESSLY, > > then it is not NVIDIA's driver that is broken, but Fedora's kernel. > > > > Well, Fedora is also the distribution that engages with all the ecosystems > before almost all those you list above, so it's not surprising stuff like > this shows up here first. > > > NVIDIA has nothing to fix, their driver works flawlessly, even a driver as > > old as R340 works all the way to kernel 6.2, which is supported by Debian. > > If Debian has such an old driver working with the latest kernel, it is clear > > to me as a user who is at fault here for the NVIDIA driver not working. > > This is definitely not true. NVIDIA doesn't maintain driver branches that > old, which leaves other people to create "fixes" to make it work with newer > kernels. 
> > RPM Fusion, for example, maintains a patch set against R340 specifically for > this: https://pkgs.rpmfusion.org/cgit/nonfree/nvidia-340xx-kmod.git/log/ > > Debian does the same thing: > https://salsa.debian.org/nvidia-team/nvidia-graphics-drivers/-/blob/340xx/ > main/debian/changelog > > Every distribution has to do stuff like this to make the NVIDIA drivers > work. It's *not* easy, and it's not *flawless*. > > You're seeing what happens when part of the hand-holding we do falls out > unintentionally. The end user does not care how a driver works on their system, they only care that it works. Yes, patches are required from distro maintainers, but this specific patch that broke the NVIDIA driver is Fedora specific because **Fedora** decided to break backwards compatibility to a kernel API that has been supported in the Linux kernel for over a decade now. Yes, other distros also ship patches to make the NVIDIA driver work especially in the case of no longer supported branches like the R340, but the reason this specific patch is necessary is not because of NVIDIA not caring about the latest kernel, it's because of Fedora's decision to break backwards compatibility. You could argue it's justified sure, but the difference here is that the kernel maintainer himself said: > I was indeed not aware of this, as I am not a user of their driver, I did not know that they had changed it to accommodate the upstream workaround. Which shows to me that he doesn't even bother testing kernel releases on the NVIDIA driver, meanwhile on other distributions I see the kernel maintainer is also often times the maintainer of the NVIDIA package. I know this sounds very entitled, but I frankly do not think this is a good situation.
(In reply to Javier Martinez Canillas from comment #62) > (In reply to nesdeq from comment #54) > > You just were not aware of this. So what needs to be done is > > > > 1. drop the patch/"hack" forever > > 2. just enable config_fb_efi=y > > 4) That's exactly what the hack (and *it is* a hack, I'm actually the author > of that patch) that we are carrying in the Fedora kernel does: > https://gitlab.com/cki-project/kernel-ark/-/commit/ > 811fe0e4dcfd86a0db5135d3bfef4936794efdb6. If the "nvidia-drm.modeset=1" > cmdline param is set, it avoids registering a "simple-framebuffer" device > and instead registers either an "efi-framebuffer" (for EFI) or > "vesa-framebuffer" (for legacy BIOS with vesa=$mode). All this is explained > in detail in the commit message of the hack, so you could had learn from it > by reading it instead of writing misleading and wrong information here and > make me repeat it again. Excuse my ignorance, but I fail to see what difference your patch versus setting "initcall_blacklist=simpledrm_platform_driver_init" accomplishes? Wouldn't it ease your burden if *your hack* was removed and instead the nvidia driver maintainers (rpmfusion, negativo17 etc) made sure that this kernel-parameter was set (in combination with you building with CONFIG_FB_EFI=y and CONFIG_FB_VESA=y of course)?
I am not and have not been having these problems. I updated to the 6.1.5-200 kernel with no issues... inxi -G -S -xx System: Host: Dingo.home Kernel: 6.1.5-200.fc37.x86_64 arch: x86_64 bits: 64 compiler: gcc v: 2.38-25.fc37 Desktop: GNOME v: 43.2 tk: GTK v: 3.24.36 wm: gnome-shell dm: GDM Distro: Fedora release 37 (Thirty Seven) Graphics: Device-1: NVIDIA TU116 [GeForce GTX 1650 SUPER] vendor: Dell driver: nvidia v: 525.78.01 arch: Turing ports: active: none off: DP-1,HDMI-A-1 empty: DVI-D-1 bus-ID: 0000:01:00.0 chip-ID: 10de:2187 Device-2: EMEET HD Webcam C960 type: USB driver: snd-usb-audio,uvcvideo bus-ID: 1-8.1.4:15 chip-ID: 328f:003f Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.7 compositor: gnome-shell driver: X: loaded: nvidia gpu: nvidia,nvidia-nvswitch display-ID: :1 screens: 1 Screen-1: 0 s-res: 3840x1080 s-dpi: 96 Monitor-1: DP-1 note: disabled pos: primary,right model: Acer T232HL res: 1920x1080 dpi: 96 diag: 584mm (23") Monitor-2: HDMI-A-1 mapped: HDMI-0 note: disabled pos: left model: Samsung LF24T450F res: 1920x1080 dpi: 93 diag: 604mm (23.8") API: OpenGL v: 4.6.0 NVIDIA 525.78.01 renderer: NVIDIA GeForce GTX 1650 SUPER/PCIe/SSE2 direct render: Yes
(In reply to Pierre from comment #64) > (In reply to Javier Martinez Canillas from comment #62) > > (In reply to nesdeq from comment #54) > > > You just were not aware of this. So what needs to be done is > > > > > > 1. drop the patch/"hack" forever > > > 2. just enable config_fb_efi=y > > > > 4) That's exactly what the hack (and *it is* a hack, I'm actually the author > > of that patch) that we are carrying in the Fedora kernel does: > > https://gitlab.com/cki-project/kernel-ark/-/commit/ > > 811fe0e4dcfd86a0db5135d3bfef4936794efdb6. If the "nvidia-drm.modeset=1" > > cmdline param is set, it avoids registering a "simple-framebuffer" device > > and instead registers either an "efi-framebuffer" (for EFI) or > > "vesa-framebuffer" (for legacy BIOS with vesa=$mode). All this is explained > > in detail in the commit message of the hack, so you could had learn from it > > by reading it instead of writing misleading and wrong information here and > > make me repeat it again. > > Excuse my ignorance, but I fail to see what difference your patch versus > setting "initcall_blacklist=simpledrm_platform_driver_init" accomplishes? > Wouldn't it ease your burden if *your hack* was removed and instead the > nvidia driver maintainers (rpmfusion, negativo17 etc) made sure that this > kernel-parameter was set (in combination with you building with > CONFIG_FB_EFI=y and CONFIG_FB_VESA=y of course)? As I mentioned before, "initcall_blacklist=simpledrm_platform_driver_init" and the patch we are carrying are not doing the same. Please go and read Comment 62 or the https://gitlab.com/cki-project/kernel-ark/-/commit/811fe0e4dcfd commit message, in both places I explain in detail why that patch is needed.
FEDORA-2023-58eac2b872 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2023-58eac2b872
FEDORA-2023-0597579983 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-0597579983
F36: WORKS
FEDORA-2023-0597579983 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-0597579983` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-0597579983 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
(In reply to Javier Martinez Canillas from comment #66) > (In reply to Pierre from comment #64) > > (In reply to Javier Martinez Canillas from comment #62) > > > (In reply to nesdeq from comment #54) > > > > You just were not aware of this. So what needs to be done is > > > > > > > > 1. drop the patch/"hack" forever > > > > 2. just enable config_fb_efi=y > > > > > > 4) That's exactly what the hack (and *it is* a hack, I'm actually the author > > > of that patch) that we are carrying in the Fedora kernel does: > > > https://gitlab.com/cki-project/kernel-ark/-/commit/ > > > 811fe0e4dcfd86a0db5135d3bfef4936794efdb6. If the "nvidia-drm.modeset=1" > > > cmdline param is set, it avoids registering a "simple-framebuffer" device > > > and instead registers either an "efi-framebuffer" (for EFI) or > > > "vesa-framebuffer" (for legacy BIOS with vesa=$mode). All this is explained > > > in detail in the commit message of the hack, so you could had learn from it > > > by reading it instead of writing misleading and wrong information here and > > > make me repeat it again. > > > > Excuse my ignorance, but I fail to see what difference your patch versus > > setting "initcall_blacklist=simpledrm_platform_driver_init" accomplishes? > > Wouldn't it ease your burden if *your hack* was removed and instead the > > nvidia driver maintainers (rpmfusion, negativo17 etc) made sure that this > > kernel-parameter was set (in combination with you building with > > CONFIG_FB_EFI=y and CONFIG_FB_VESA=y of course)? > > As I mentioned before, "initcall_blacklist=simpledrm_platform_driver_init" > and > the patch we are carrying are not doing the same. Please go and read Comment > 62 > or the https://gitlab.com/cki-project/kernel-ark/-/commit/811fe0e4dcfd commit > message, in both places I explain in detail why that patch is needed. So from https://gitlab.com/cki-project/kernel-ark/-/blob/v6.1.7/drivers/firmware/sysfb.c this patch seems to be removed and 6.1.7-200 works for nvidia again. 
So patch is not needed or could you kindly nudge me as to what I am missing here?
FEDORA-2023-58eac2b872 has been pushed to the Fedora 36 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-58eac2b872` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-58eac2b872 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-58eac2b872 has been pushed to the Fedora 36 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-0597579983 has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.
*** Bug 2162874 has been marked as a duplicate of this bug. ***
I have been tracking a similar issue in https://bugzilla.redhat.com/show_bug.cgi?id=2161104 since 6.0.18, but when I update to 6.1.7-200.fc37.x86_64 I still get a lockup where the prompt for the LUKS password is not visible and I have to blindly type in my password a few times to get it to work.

I have already reinstalled the akmod-nvidia packages, and my GRUB_CMDLINE_LINUX seems to have the correct things:

GRUB_CMDLINE_LINUX="rd.luks.uuid=luks-eeeeee-d32f-44f0-eee-05e1f676bda8 rd.luks.uuid=luks-eeee-2ee6-eee-892f-21adc34fbc8e rd.luks.options=fido2-device=auto rhgb console=tty0 console=ttyS1,115200n8 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init"

What am I missing?
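When comparing command lines like the one above, it can help to check mechanically which of the parameters discussed in this bug are actually present. A minimal sketch follows; the function name `report_nvidia_params` is hypothetical and the parameter list is simply the set mentioned in this thread.

```shell
# Hypothetical helper: print which of the parameters discussed in this bug
# are present in the kernel command line passed as $1.
report_nvidia_params() {
    for param in \
        nvidia-drm.modeset=1 \
        initcall_blacklist=simpledrm_platform_driver_init \
        rd.driver.blacklist=nouveau \
        modprobe.blacklist=nouveau
    do
        # Pad with spaces so only whole parameters match.
        case " $1 " in
            *" $param "*) echo "present: $param" ;;
            *)            echo "absent:  $param" ;;
        esac
    done
}
```

Running it as `report_nvidia_params "$(cat /proc/cmdline)"` checks what the running kernel actually received, which is not necessarily what is written in GRUB_CMDLINE_LINUX.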
Having just investigated why I didn't have VTs showing up (which was nothing to do with nvidia - my AMD Zen4 iGPU taking them until I disabled it), I've discovered something that may be relevant here. It appears that the 525 nvidia driver branch has now implemented support for using simpledrm - I rebuilt the 6.1.12 kernel with neither the workaround patch nor EFI / VESA drivers compiled in, and all worked fine (booted to X11 running KDE, VTs worked fine):

```
$ grep CONFIG_FB_EFI /boot/config-6.1.12-201.fc37.x86_64
# CONFIG_FB_EFI is not set
$ dmesg | egrep '(efifb|simple|fb[0-9]:)'
[ 0.557075] [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
[ 0.557715] simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device
[ 2.914845] nvme 0000:04:00.0: platform quirk: setting simple suspend
[ 17.691081] simple-framebuffer simple-framebuffer.0: swiotlb buffer is full (sz: 10063872 bytes), total 32768 (slots), used 108 (slots)
$ nvidia-smi
Thu Feb 23 23:36:28 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
```

and this may also explain the people above who reported that it worked for them as in both cases, they were already on the 525 driver. I wonder if the people having the original problems were on 515 or 520 branches which do still need the workarounds in place?

Either way, it _looks_ to me (tho others should also verify somehow!) that the workaround won't be needed for long now.
I checked the logs:

2023-01-13T20:01:51+0100 SUBDEBUG Upgrade: akmod-nvidia-3:525.78.01-1.fc36.x86_64
2023-01-13T20:02:40+0100 SUBDEBUG Upgrade: kmod-nvidia-6.0.18-200.fc36.x86_64-3:525.78.01-1.fc36.x86_64

Kernel 6.1.5 came two days later:

2023-01-15T12:01:31+0100 SUBDEBUG Installed: kernel-core-6.1.5-100.fc36.x86_64

So nvidia 525 was already installed, but it did not work without the FB config options.

BUT: It's possible that, inside that nvidia driver, the newer card models (i.e. the 1600 series) are handled differently. I have a GTX1050 inside, which is ~7 years old, so my card could be handled differently from a newer one.
(In reply to customercare from comment #78) > I checked the logs: > > 2023-01-13T20:01:51+0100 SUBDEBUG Upgrade: > akmod-nvidia-3:525.78.01-1.fc36.x86_64 > 2023-01-13T20:02:40+0100 SUBDEBUG Upgrade: > kmod-nvidia-6.0.18-200.fc36.x86_64-3:525.78.01-1.fc36.x86_64 > > kernel 6.1.5 came a day later: > > 2023-01-15T12:01:31+0100 SUBDEBUG Installed: > kernel-core-6.1.5-100.fc36.x86_64 > > So nvidia 525 was already installed, but it did not work without the FB > config options. > > BUT: > > It's possible, that inside that nvidia driver, the newer cardmodels ie. the > 1600er are handled differently, > I have a GTX1050 inside, which is ~7 years old, which means, my card could > be handled differently than a newer one. Maybe the simpledrm blacklisting prevented it from working. https://pkgs.rpmfusion.org/cgit/nonfree/xorg-x11-drv-nvidia.git/commit/?id=f63b9a1271bf00d8c6b22f1f62a17f5070f31d23
(In reply to customercare from comment #78) > It's possible, that inside that nvidia driver, the newer cardmodels ie. the > 1600er are handled differently, > I have a GTX1050 inside, which is ~7 years old, which means, my card could > be handled differently than a newer one. I actually just upgraded my NVIDIA 1080 Ti (~7 year old card too) to a NVIDIA GeForce RTX 3090 Ti and even with removing the simpledrm blacklisting I am not getting any kind of prompt for typing in my LUKS passphrase. I have to blindly type it in a few times to get fully booted.
> I actually just upgraded my NVIDIA 1080 Ti (~7 year old card too) to a NVIDIA GeForce RTX 3090 Ti and even with removing the simpledrm blacklisting I am not getting any kind of prompt for typing in my LUKS passphrase. I have to blindly type it in a few times to get fully booted.

Could you tell us:

1. kernel command line - you can get this with something like:

```
$ dmesg | grep Command.line
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1.13-200.fc37.x86_64 root=UUID=678cfab0-3f3b-af45-78bd-d4a5eb651834 ro rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
```

2. nvidia driver version, and where you installed it from (rpmfusion, negativo17 repository, or manually by downloading from nvidia.com yourself?)

3. Whether you have any integrated GPU on your CPU (in my case, I have an AMD iGPU and when that was enabled in the bios, the VTs and boot info was appearing on that rather than on my nvidia card)
(In reply to leigh scott from comment #79)
> (In reply to customercare from comment #78)
> > BUT:
> >
> > It's possible, that inside that nvidia driver, the newer cardmodels ie. the
> > 1600er are handled differently,
> > I have a GTX1050 inside, which is ~7 years old, which means, my card could
> > be handled differently than a newer one.
> Maybe the simpledrm blacklisting prevented it from working.
>
> https://pkgs.rpmfusion.org/cgit/nonfree/xorg-x11-drv-nvidia.git/commit/
> ?id=f63b9a1271bf00d8c6b22f1f62a17f5070f31d23

I reinstalled 6.1.5, rebooted and removed the initcall_blacklist options from the cmd line: same result => black screen ... BUT ... it booted. Just the initramfs plymouth phase was dark.

This means that setups without the initcall_blacklist options could boot without noticing the issue, but anyone with a LUKS unlock dialog observed the issue for sure.

My log from right now:

$ journalctl -k --no-hostname | grep -E "(Command|drm|nvidia|nv)"
Feb 28 10:57:26 kernel: Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.5-100.fc36.x86_64 root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
Feb 28 10:57:26 kernel: The simpledrm driver will not be probed
Feb 28 10:57:26 kernel: Kernel command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.5-100.fc36.x86_64 root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
Feb 28 10:57:26 kernel: iommu: DMA domain TLB invalidation policy: lazy mode
Feb 28 10:57:26 kernel: ACPI: bus type drm_connector registered
Feb 28 10:57:26 kernel: rtc_cmos 00:03: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
Feb 28 10:57:26 kernel: with environment:
Feb 28 10:57:36 systemd[1]: Starting modprobe - Load Kernel Module drm...
Feb 28 10:57:37 kernel: nvidia: module license 'NVIDIA' taints kernel.
Feb 28 10:57:37 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 235
Feb 28 10:57:37 kernel: nvidia 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Feb 28 10:57:37 kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
Feb 28 10:57:37 kernel: nvidia-uvm: Loaded the UVM driver, major device number 511.
Feb 28 10:57:37 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 525.89.02 Wed Feb 1 23:09:40 UTC 2023
Feb 28 10:57:37 kernel: [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
Feb 28 10:57:39 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:04:00.0 on minor 0

This line is interesting:

Feb 28 10:57:26 kernel: The simpledrm driver will not be probed

Why not? Nothing in the kernel cmd line stops it.
Thanks for the detailed reply. The experience there does make sense to me - explanation ...

> Interessting is this line:
> Feb 28 10:57:26 kernel: The simpledrm driver will not be probed
> Why not? Nothing in the kernel cmd line that stops it.

This line indicates that the workaround patch from javierm has activated, to prevent simpledrm from registering its FB driver (which would stop the efifb / vesa driver from having a chance). The workaround activates in the presence of `nvidia-drm.modeset=1` on the kernel command line, which is there at the end of yours:

Feb 28 10:57:26 kernel: Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.5-100.fc36.x86_64 root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1

In 6.1.5, the workaround patch to stop simpledrm from activating its framebuffer was present, but the EFI and VESA drivers (which _should_ then have taken over) weren't compiled in, as the config change to enable them had been overlooked; it was re-instated in 6.1.7, I believe. That is why you just got a black screen and no LUKS prompt at all.

HOWEVER - my hope is that as you're using the 525 nvidia driver, we can in fact skip the workaround as it has support for simpledrm - so would you mind doing a test where you boot 6.1.5 again, but edit the command line in the grub menu to remove the `nvidia-drm.modeset=1` part from the end of the line, and report what the experience is like? I'm _hoping_ that everything will work fine - you'll get a LUKS prompt, you'll boot up, and you'll have virtual terminals.

Presuming it does boot at all, would you mind also doing the same `journalctl -k --no-hostname | grep -E "(Command|drm|nvidia|nv)"` to see what it reports as well?
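The two cases described above can be told apart from the kernel log alone. A minimal sketch, assuming only the two log messages quoted in this thread; the function name `classify_fb_boot` and its output labels are hypothetical:

```shell
# Hypothetical helper: read kernel log text on stdin and report which
# early-framebuffer path the boot appears to have taken, based only on
# the two messages quoted in this bug.
classify_fb_boot() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'The simpledrm driver will not be probed'; then
        # Message printed when the downstream workaround has activated.
        echo "workaround-active"
    elif printf '%s\n' "$log" | grep -q 'Initialized simpledrm'; then
        # simpledrm bound to the simple-framebuffer device.
        echo "simpledrm"
    else
        echo "unknown"
    fi
}
```

It could be used as `journalctl -k --no-hostname | classify_fb_boot`. A result of `workaround-active` on a kernel without efifb / vesafb compiled in would match the black-screen case described above.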
That was exactly what I wanted to do next :)

And ... it ... just ... **WORKED** \o/

I will now do some performance tests, but I don't expect any issues there.

$ journalctl -k --no-hostname | grep -E "(Command|drm|nvidia|nv)"
Feb 28 15:01:34 kernel: Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.5-100.fc36.x86_64 root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
Feb 28 15:01:34 kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
Feb 28 15:01:34 kernel: simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device
Feb 28 15:01:34 kernel: with environment:
Feb 28 15:01:50 systemd[1]: Starting modprobe - Load Kernel Module drm...
Feb 28 15:01:51 kernel: nvidia: module license 'NVIDIA' taints kernel.
Feb 28 15:01:52 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 235
Feb 28 15:01:52 kernel: nvidia 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Feb 28 15:01:52 kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
Feb 28 15:01:52 kernel: nvidia-uvm: Loaded the UVM driver, major device number 511.
Feb 28 15:01:52 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 525.89.02 Wed Feb 1 23:09:40 UTC 2023
Feb 28 15:01:52 kernel: [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
Feb 28 15:01:52 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:04:00.0 on minor 1
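The two boots in this thread differ in one kernel message: the failing boot logged "The simpledrm driver will not be probed", while the working one logged "Initialized simpledrm". A rough sketch for telling them apart in a saved log follows. The function name `simpledrm_state` and the labels it prints are made up for illustration; only the two message strings come from the logs quoted here.

```shell
#!/bin/sh
# Hypothetical helper: classify a saved kernel log by which simpledrm
# message it contains. $1 is the path to the log file.
simpledrm_state() {
    if grep -q 'The simpledrm driver will not be probed' "$1"; then
        echo "blocked"      # the workaround suppressed simpledrm
    elif grep -q 'Initialized simpledrm' "$1"; then
        echo "active"       # simpledrm registered its framebuffer
    else
        echo "unknown"      # neither message found
    fi
}

# Typical use on the affected machine (not run here):
#   journalctl -k --no-hostname > dmesg.txt
#   simpledrm_state dmesg.txt
```

An "unknown" result only means neither string was found; it says nothing about whether efifb or the VESA driver took over instead.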
(In reply to Steve Storey from comment #81)
> 2. nvidia driver version, and where you installed it from (rpmfusion,
> negativo17 repository, or manually by downloading from nvidia.com yourself?)

I am using RPMFusion:

# rpm -qa |grep nvidia|sort
akmod-nvidia-525.89.02-1.fc37.x86_64
kmod-nvidia-6.1.11-200.fc37.x86_64-525.89.02-1.fc37.x86_64
kmod-nvidia-6.1.13-200.fc37.x86_64-525.89.02-1.fc37.x86_64
kmod-nvidia-6.1.14-200.fc37.x86_64-525.89.02-1.fc37.x86_64
kmod-nvidia-6.1.7-200.fc37.x86_64-525.78.01-1.fc37.x86_64
kmod-nvidia-6.1.8-200.fc37.x86_64-525.85.05-1.fc37.x86_64
kmod-nvidia-6.1.9-200.fc37.x86_64-525.89.02-1.fc37.x86_64
nvidia-gpu-firmware-20230210-147.fc37.noarch
nvidia-persistenced-525.89.02-1.fc37.x86_64
nvidia-settings-525.89.02-1.fc37.x86_64
xorg-x11-drv-nvidia-525.89.02-1.fc37.x86_64
xorg-x11-drv-nvidia-cuda-525.89.02-1.fc37.x86_64
xorg-x11-drv-nvidia-cuda-libs-525.89.02-1.fc37.i686
xorg-x11-drv-nvidia-cuda-libs-525.89.02-1.fc37.x86_64
xorg-x11-drv-nvidia-kmodsrc-525.89.02-1.fc37.x86_64
xorg-x11-drv-nvidia-libs-525.89.02-1.fc37.i686
xorg-x11-drv-nvidia-libs-525.89.02-1.fc37.x86_64
xorg-x11-drv-nvidia-power-525.89.02-1.fc37.x86_64

> 3. Whether you have any integrated GPU on your CPU (in my case, I have an
> AMD iGPU and when that was enabled in the bios, the VTs and boot info was
> appearing on that rather than on my nvidia card)

Interesting. It turns out I do have an integrated VGA controller on the board:

# lspci|grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090 Ti] (rev a1)
2a:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)

and the only way to turn it off was via dipswitch on the motherboard itself. I did remove nvidia-drm.modeset=1 from my grub and I rebooted but the problems persisted. I then physically disabled the ASPEED VGA controller on the motherboard and rebooted and I am now able to see my LUKS prompt again! Woohoo! Thanks Steve!!
The only downside to turning this ASPEED controller off is it seems that breaks my KVM functions on the built in BMC on this board. My motherboard manual mentioned this, but I can live with it. Here is the output from the journal as requested. # journalctl -k --no-hostname | grep -E "(Command|drm|nvidia|nv)" Feb 28 15:39:38 kernel: Command line: BOOT_IMAGE=(hd1,msdos2)/vmlinuz-6.1.14-200.fc37.x86_64 root=UUID=532c1b37-faf8-4e26-a0e7-3b54a5521c64 ro rootflags=subvol=root rd.luks.uuid=luks-12a79b0a-d32f-44f0-b038-05e1f676bda8 rd.luks.uuid=luks-b88f07d7-2ee6-4339-892f-21adc34fbc8e rd.luks.options=fido2-device=auto rhgb console=tty0 console=ttyS1,115200n8 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau Feb 28 15:39:38 kernel: iommu: DMA domain TLB invalidation policy: lazy mode Feb 28 15:39:38 kernel: ACPI: bus type drm_connector registered Feb 28 15:39:38 kernel: rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram Feb 28 15:39:38 kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0 Feb 28 15:39:38 kernel: simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device Feb 28 15:39:38 kernel: usb 9-2: Manufacturer: American Power Conversion Feb 28 15:39:38 kernel: hid-generic 0003:051D:0002.0003: hiddev97,hidraw2: USB HID v1.00 Device [American Power Conversion Back-UPS RS 1500MS2 FW:969.e2 .D USB FW:e2 ] on usb-0000:03:00.3-2/input0 Feb 28 15:39:38 kernel: with environment: Feb 28 15:39:39 kernel: nvme nvme0: pci function 0000:23:00.0 Feb 28 15:39:39 kernel: nvme nvme1: pci function 0000:2a:00.0 Feb 28 15:39:39 kernel: nvme nvme2: pci function 0000:2b:00.0 Feb 28 15:39:39 kernel: nvme nvme0: Shutdown timeout set to 10 seconds Feb 28 15:39:39 kernel: nvme nvme1: Shutdown timeout set to 10 seconds Feb 28 15:39:39 kernel: nvme nvme2: Shutdown timeout set to 10 seconds Feb 28 15:39:39 kernel: nvme nvme0: 128/0/0 default/read/poll queues Feb 28 15:39:39 kernel: nvme nvme1: 128/0/0 default/read/poll 
queues Feb 28 15:39:39 kernel: nvme nvme2: 128/0/0 default/read/poll queues Feb 28 15:39:39 kernel: nvme0n1: p1 p2 p3 Feb 28 15:39:39 kernel: nvme1n1: p1 Feb 28 15:39:39 kernel: nvme2n1: p1 Feb 28 15:39:39 kernel: nvme2n1: p1 size 1953525168 extends beyond EOD, truncated Feb 28 15:40:17 systemd[1]: Starting modprobe - Load Kernel Module drm... Feb 28 15:40:18 kernel: EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Quota mode: none. Feb 28 15:40:19 kernel: nvidia: module license 'NVIDIA' taints kernel. Feb 28 15:40:19 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 234 Feb 28 15:40:19 kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none Feb 28 15:40:19 kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint. Feb 28 15:40:19 kernel: nvidia-uvm: Loaded the UVM driver, major device number 510. Feb 28 15:40:19 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 525.89.02 Wed Feb 1 23:09:40 UTC 2023 Feb 28 15:40:19 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver Feb 28 15:40:19 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1 Feb 28 15:40:32 kernel: WARNING: CPU: 7 PID: 11724 at drivers/gpu/drm/drm_gem_shmem_helper.c:304 drm_gem_shmem_vmap+0x18d/0x1b0 Feb 28 15:40:32 kernel: Modules linked in: overlay tls wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nf_nat_tftp nf_conntrack_tftp nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter iptable_filter qrtr bnep tun nct6775 
nct6775_core hwmon_vid sunrpc binfmt_misc nvidia_drm(PO) nvidia_modeset(PO) nvidia_uvm(PO) nvidia(PO) vfat fat intel_rapl_msr intel_rapl_common ipmi_ssif amd64_edac iwlmvm edac_mce_amd snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg kvm_amd mac80211 snd_intel_sdw_acpi snd_hda_codec libarc4 snd_usb_audio btusb eeepc_wmi kvm snd_usbmidi_lib snd_hda_core uvcvideo btrtl snd_rawmidi asus_wmi snd_hwdep btbcm Feb 28 15:40:32 kernel: videobuf2_vmalloc snd_seq videobuf2_memops ledtrig_audio btintel snd_seq_device videobuf2_v4l2 btmtk irqbypass sparse_keymap platform_profile wmi_bmof pcspkr videobuf2_common rapl iwlwifi acpi_ipmi snd_pcm videodev ipmi_si snd_timer cfg80211 bluetooth razermouse(O) joydev mc snd ipmi_devintf rfkill i2c_piix4 video soundcore k10temp ipmi_msghandler acpi_cpufreq zram dm_crypt crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 ixgbe nvme uas mxm_wmi mdio nvme_core ccp usb_storage nvme_common dca sp5100_tco wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables dm_multipath fuse Feb 28 15:40:32 kernel: RIP: 0010:drm_gem_shmem_vmap+0x18d/0x1b0 Feb 28 15:40:32 kernel: drm_gem_vmap+0x1e/0x40 Feb 28 15:40:32 kernel: drm_gem_fb_vmap+0x3d/0x110 Feb 28 15:40:32 kernel: drm_atomic_helper_prepare_planes+0x74/0x160 Feb 28 15:40:32 kernel: drm_atomic_helper_commit+0x72/0x140 Feb 28 15:40:32 kernel: drm_atomic_commit+0x86/0xa0 Feb 28 15:40:32 kernel: ? drm_plane_get_damage_clips.cold+0x1c/0x1c Feb 28 15:40:32 kernel: drm_atomic_helper_set_config+0x70/0xb0 Feb 28 15:40:32 kernel: drm_mode_setcrtc+0x4e9/0x7b0 Feb 28 15:40:32 kernel: ? drm_mode_getcrtc+0x180/0x180 Feb 28 15:40:32 kernel: drm_ioctl_kernel+0xa9/0x150 Feb 28 15:40:32 kernel: drm_ioctl+0x22f/0x410 Feb 28 15:40:32 kernel: ? drm_mode_getcrtc+0x180/0x180
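The `lspci | grep VGA` check used in this comment can be wrapped so that a second adapter is flagged explicitly. A small sketch, assuming `lspci` output is piped in on stdin; the function name `vga_report` and its message wording are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: count "VGA compatible controller" lines in lspci
# output read from stdin and flag systems with more than one adapter.
vga_report() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    count=$(grep -c 'VGA compatible controller' || true)
    if [ "$count" -gt 1 ]; then
        echo "$count VGA controllers: boot output may appear on a different adapter"
    else
        echo "$count VGA controller(s)"
    fi
}

# Typical use on the affected machine (not run here):
#   lspci | vga_report
```

With more than one controller reported, the LUKS prompt and virtual terminals may be drawn on the other adapter, as happened with the ASPEED BMC graphics and the AMD iGPU mentioned in this thread.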
(In reply to customercare from comment #82) > (In reply to leigh scott from comment #79) > > (In reply to customercare from comment #78) > > > BUT: > > > > > > It's possible, that inside that nvidia driver, the newer cardmodels ie. the > > > 1600er are handled differently, > > > I have a GTX1050 inside, which is ~7 years old, which means, my card could > > > be handled differently than a newer one. > > Maybe the simpledrm blacklisting prevented it from working. > > > > https://pkgs.rpmfusion.org/cgit/nonfree/xorg-x11-drv-nvidia.git/commit/ > > ?id=f63b9a1271bf00d8c6b22f1f62a17f5070f31d23 > > I reinstalled 6.1.5, rebooted and removed the initcall_blacklist options > from the cmd line: > > same result => black screen ... BUT...It booted. Just the initramfs plymouth > phase was dark. > > Means: > > ## setups without the initcall_blacklist options ## could boot without > noticing the issue, but anyone with a LUKS unlock dialog observed the issue > for sure. > > My Log from right now: > > ]$ journalctl -k --no-hostname | grep -E "(Command|drm|nvidia|nv)" > Feb 28 10:57:26 kernel: Command line: > BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.5-100.fc36.x86_64 > root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro > vconsole.font=latarcyrheb-sun16 > rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 > rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash > audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau > modprobe.blacklist=nouveau nvidia-drm.modeset=1 > Feb 28 10:57:26 kernel: The simpledrm driver will not be probed > Feb 28 10:57:26 kernel: Kernel command line: > BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.5-100.fc36.x86_64 > root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro > vconsole.font=latarcyrheb-sun16 > rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 > rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash > audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau > modprobe.blacklist=nouveau nvidia-drm.modeset=1 > Feb 28 10:57:26 
kernel: iommu: DMA domain TLB invalidation policy: lazy mode > Feb 28 10:57:26 kernel: ACPI: bus type drm_connector registered > Feb 28 10:57:26 kernel: rtc_cmos 00:03: alarms up to one month, y3k, 114 > bytes nvram, hpet irqs > Feb 28 10:57:26 kernel: with environment: > Feb 28 10:57:36 systemd[1]: Starting modprobe - Load Kernel > Module drm... > Feb 28 10:57:37 kernel: nvidia: module license 'NVIDIA' taints kernel. > Feb 28 10:57:37 kernel: nvidia-nvlink: Nvlink Core is being initialized, > major device number 235 > Feb 28 10:57:37 kernel: nvidia 0000:04:00.0: vgaarb: changed VGA decodes: > olddecodes=io+mem,decodes=none:owns=io+mem > Feb 28 10:57:37 kernel: nvidia_uvm: module uses symbols > nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting > taint. > Feb 28 10:57:37 kernel: nvidia-uvm: Loaded the UVM driver, major device > number 511. > Feb 28 10:57:37 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting > Driver for UNIX platforms 525.89.02 Wed Feb 1 23:09:40 UTC 2023 > Feb 28 10:57:37 kernel: [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver > Feb 28 10:57:39 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for > 0000:04:00.0 on minor 0 > > Interessting is this line: > > Feb 28 10:57:26 kernel: The simpledrm driver will not be probed > > Why not? Nothing in the kernel cmd line that stops it. Fedora will need to drop the patch, the f38 kernel works fine. 
$ journalctl -k --no-hostname | grep -E "(Command|drm|nvidia|nv)" Feb 26 16:30:39 kernel: Command line: BOOT_IMAGE=(hd3,gpt2)/vmlinuz-6.2.0-63.fc38.x86_64 root=UUID=fe89b9fa-7cee-40bd-912f-4caa21ccdda0 ro resume=UUID=3a00d179-9c86-43a0-87a2-fed4dad7cdb5 rhgb quiet loglevel=3 libahci.ignore_sss=1 selinux=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 clocksource=tsc tsc=reliable Feb 26 16:30:39 kernel: Kernel command line: BOOT_IMAGE=(hd3,gpt2)/vmlinuz-6.2.0-63.fc38.x86_64 root=UUID=fe89b9fa-7cee-40bd-912f-4caa21ccdda0 ro resume=UUID=3a00d179-9c86-43a0-87a2-fed4dad7cdb5 rhgb quiet loglevel=3 libahci.ignore_sss=1 selinux=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 clocksource=tsc tsc=reliable Feb 26 16:30:39 kernel: iommu: DMA domain TLB invalidation policy: lazy mode Feb 26 16:30:39 kernel: ACPI: bus type drm_connector registered Feb 26 16:30:39 kernel: rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram, hpet irqs Feb 26 16:30:39 kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0 Feb 26 16:30:39 kernel: simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device Feb 26 16:30:39 kernel: with environment: Feb 26 16:30:39 kernel: nvme nvme0: pci function 0000:09:00.0 Feb 26 16:30:39 kernel: nvme nvme1: pci function 0000:0a:00.0 Feb 26 16:30:39 kernel: nvme nvme1: Shutdown timeout set to 10 seconds Feb 26 16:30:39 kernel: nvme nvme0: 7/0/0 default/read/poll queues Feb 26 16:30:39 kernel: nvme nvme1: 32/0/0 default/read/poll queues Feb 26 16:30:39 kernel: nvme1n1: p1 p2 p3 p4 p5 p6 Feb 26 16:30:39 kernel: nvme0n1: p1 p2 p3 p4 Feb 26 16:30:40 kernel: EXT4-fs (nvme0n1p3): mounted filesystem fe89b9fa-7cee-40bd-912f-4caa21ccdda0 with ordered data mode. Quota mode: none. Feb 26 16:30:42 systemd[1]: Starting modprobe - Load Kernel Module drm... Feb 26 16:30:42 kernel: EXT4-fs (nvme0n1p3): re-mounted fe89b9fa-7cee-40bd-912f-4caa21ccdda0. 
Quota mode: none. Feb 26 16:30:42 kernel: audit: type=1130 audit(1677429042.058:10): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=modprobe@drm comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Feb 26 16:30:42 kernel: EXT4-fs (nvme0n1p2): mounted filesystem e49cb48f-6870-4cbd-ac76-e76007ddbcc7 with ordered data mode. Quota mode: none. Feb 26 16:30:42 kernel: EXT4-fs (nvme0n1p4): mounted filesystem 566f7281-3501-4822-80b8-018222a116a2 with ordered data mode. Quota mode: none. Feb 26 16:30:42 kernel: EXT4-fs (nvme1n1p1): mounted filesystem c74fb059-7e23-4055-a52a-9d23eb22be2a with ordered data mode. Quota mode: none. Feb 26 16:30:42 kernel: F2FS-fs (nvme1n1p3): Mounted with checkpoint version = 493a38c2 Feb 26 16:30:43 kernel: EXT4-fs (nvme1n1p4): mounted filesystem 2440c679-aa48-4e02-a400-715d85c1d0f2 with ordered data mode. Quota mode: none. Feb 26 16:30:43 kernel: EXT4-fs (nvme1n1p6): mounted filesystem 6843a9f2-0adb-423f-92ac-241853595344 with ordered data mode. Quota mode: none. Feb 26 16:30:43 kernel: EXT4-fs (nvme1n1p5): mounted filesystem 02f8f890-6870-44b5-9cdc-ddfaf099f135 with ordered data mode. Quota mode: none. Feb 26 16:30:43 kernel: nvidia: loading out-of-tree module taints kernel. Feb 26 16:30:43 kernel: nvidia: module license 'NVIDIA' taints kernel. Feb 26 16:30:43 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 236 Feb 26 16:30:43 kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none Feb 26 16:30:43 kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint. Feb 26 16:30:43 kernel: nvidia-uvm: Loaded the UVM driver, major device number 234. 
Feb 26 16:30:43 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 525.89.02 Wed Feb 1 23:09:40 UTC 2023 Feb 26 16:30:43 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver Feb 26 16:30:44 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1 Feb 26 16:35:19 kernel: WARNING: CPU: 11 PID: 1532 at drivers/gpu/drm/drm_gem_shmem_helper.c:304 drm_gem_shmem_vmap+0x18d/0x1b0 Feb 26 16:35:19 kernel: Modules linked in: snd_seq_dummy snd_hrtimer vhost_net tun vhost vhost_iotlb macvtap macvlan tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_compat bridge stp llc wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel nf_nat_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_tftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct sunrpc nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nvidia_drm(PO) nvidia_modeset(PO) nf_tables nfnetlink nvidia_uvm(PO) qrtr nvidia(PO) vfat fat f2fs crc32_generic lz4hc_compress lz4_compress snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr snd_hda_codec_hdmi intel_rapl_common edac_mce_amd snd_hda_intel snd_intel_dspcfg kvm_amd snd_intel_sdw_acpi snd_hda_codec kvm snd_hda_core eeepc_wmi snd_hwdep snd_seq snd_seq_device asus_wmi snd_pcm snd_timer irqbypass ledtrig_audio sparse_keymap asus_wmi_sensors Feb 26 16:35:19 kernel: platform_profile rapl rfkill wmi_bmof snd mxm_wmi i2c_piix4 k10temp soundcore joydev gpio_amdpt gpio_generic acpi_cpufreq loop zram crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme igb nvme_core ccp ghash_clmulni_intel sha512_ssse3 dca tg3 sp5100_tco nvme_common video wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath fuse Feb 26 16:35:19 kernel: RIP: 0010:drm_gem_shmem_vmap+0x18d/0x1b0 Feb 26 16:35:19 kernel: drm_gem_vmap+0x1e/0x40 Feb 26 16:35:19 kernel: drm_gem_vmap_unlocked+0x26/0x40 Feb 
26 16:35:19 kernel: drm_gem_fb_vmap+0x3d/0x110 Feb 26 16:35:19 kernel: drm_atomic_helper_prepare_planes+0x176/0x210 Feb 26 16:35:19 kernel: drm_atomic_helper_commit+0x74/0x140 Feb 26 16:35:19 kernel: drm_atomic_commit+0x96/0xc0 Feb 26 16:35:19 kernel: ? __pfx___drm_printfn_info+0x10/0x10 Feb 26 16:35:19 kernel: drm_atomic_helper_set_config+0x70/0xb0 Feb 26 16:35:19 kernel: drm_mode_setcrtc+0x3c4/0x7f0 Feb 26 16:35:19 kernel: ? __pfx_drm_mode_setcrtc+0x10/0x10 Feb 26 16:35:19 kernel: drm_ioctl_kernel+0xc9/0x170 Feb 26 16:35:19 kernel: drm_ioctl+0x235/0x410 Feb 26 16:35:19 kernel: ? __pfx_drm_mode_setcrtc+0x10/0x10 Feb 26 16:35:19 kernel: ? nvkms_ioctl+0x135/0x180 [nvidia_modeset] Feb 26 16:35:19 kernel: ? nvidia_frontend_unlocked_ioctl+0x38/0x50 [nvidia]
This does not only happen with NVIDIA graphics, but also on HP PCs with embedded graphics.

Graphics:
  Device-1: Intel Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics vendor: Hewlett-Packard driver: N/A alternate: i915 arch: Gen-7.5 process: Intel 22nm built: 2013 bus-ID: 00:02.0 chip-ID: 8086:0412 class-ID: 0300
  Display: server: X.org v: 1.20.14 with: Xwayland v: 22.1.9 driver: X: loaded: N/A unloaded: fbdev,modesetting,vesa gpu: N/A tty: 80x24
  API: OpenGL Message: GL data unavailable in console for root.

Since kernel 6.1.x there is no graphics output anymore on that machine.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days