Bug 2211591
| Summary: | F37+38: kernel 6.3.4 no support for nvidia while booting |
|---|---|
| Product: | [Fedora] Fedora |
| Component: | grub2 |
| Version: | 37 |
| Hardware: | x86_64 |
| OS: | Unspecified |
| Status: | NEW |
| Severity: | high |
| Priority: | unspecified |
| Type: | Bug |
| Reporter: | customercare |
| Assignee: | Kernel Maintainer List <kernel-maint> |
| QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| CC: | acaringi, adscvr, airlied, alciregi, bskeggs, fmartine, hdegoede, hpa, jarodwilson, josef, kernel-maint, lgoncalv, linville, lkundrak, masami256, mchehab, mlewando, nfrayer, pgnet.dev, pjones, ptalbert, rharwood, steved |

Description
customercare
2023-06-01 07:33:28 UTC
Can someone add "leigh123linux" to the CC list? It does not work for me.

HINT FOUND: this is from the BLS config file:

options root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init

simpledrm is blocked, BUT that argument is not in the active grub defaults, only in a commented-out line:

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
#GRUB_TERMINAL_OUTPUT="console"
#GRUB_CMDLINE_LINUX="vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init"
GRUB_CMDLINE_LINUX="vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 nouveau.modeset=0 rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"
GRUB_DISABLE_RECOVERY="true"
GRUB_VIDEO_BACKEND=vbe
GRUB_FONT_PATH=/boot/grub2/fonts/unicode.pf2
GRUB_GFXMODE=0x11b
GRUB_GFXPAYLOAD_LINUX="keep"
GRUB_TERMINAL_OUTPUT="gfxterm"
#GRUB_GFXMODE="1440x900x32"
GRUB_ENABLE_BLSCFG=true

There is no file in /etc/ that adds this argument to the Linux kernel line. This grub config has been untouched since the kernel 6.1.5 nvidia issue and it worked. AND other PCs do not have this commented-out line at all, yet they can't see the boot screen either.

Rebooting confirmed it: the BLS config caused the issue. After manually removing the wrong options, the system boots again as it should.
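To see which boot entries still carry the unwanted argument, the BLS entry files can be scanned directly. The following is only a minimal sketch: the function name bls_entries_with_arg is hypothetical, and the entry directory (assumed here to be /boot/loader/entries, the usual Fedora location) is passed in as a parameter rather than hard-coded.

```shell
# List BLS entry files whose "options" line contains a given kernel argument.
# $1: directory holding the *.conf entry files
#     (assumption: /boot/loader/entries on a Fedora system)
# $2: argument to look for, matched as a whole space-separated word
bls_entries_with_arg() {
    local dir=$1 arg=$2 f
    for f in "$dir"/*.conf; do
        [ -e "$f" ] || continue
        # Take the options line, split it into one argument per line,
        # then require an exact match on a full line.
        if grep -E '^options[[:space:]]' "$f" \
            | tr -s ' \t' '\n' \
            | grep -xF -- "$arg" >/dev/null; then
            printf '%s\n' "$f"
        fi
    done
}
```

Called as bls_entries_with_arg /boot/loader/entries initcall_blacklist=simpledrm_platform_driver_init, it would print one path per affected entry and nothing if every entry is clean. It only reads files and changes nothing.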
I checked whether grub had used the commented-out CMDLINE with the old arguments, but grub.cfg shows the correct, normal line:

# grep kernelopts= /boot/grub2/grub.cfg
set default_kernelopts="root=UUID=9d2595b2-a35c-48c1-a839-bb54c1a96597 ro rd.driver.blacklist=nouveau modprobe.blacklist=nouveau vconsole.font=latarcyrheb-sun16 rd.luks.uuid=luks-ed009ed3-118c-465d-9b89-9b2a4f5cc3f3 rd.luks.uuid=luks-9d2595b2-a35c-48c1-a839-bb54c1a96597 rhgb quiet splash audit=0 rd.driver.blacklist=nouveau nouveau.modeset=0 modprobe.blacklist=nouveau "

Question: who created the BLS config files from the commented-out CMDLINE instead of the real one?

Switched the component to grub2, as grub2-mkconfig produced clean BLS config files after removing the #GRUB_CMDLINE_LINUX= line from the default file.

grub2-common-2.06-94.fc37.noarch
grub2-efi-x64-2.06-94.fc37.x86_64
grub2-pc-2.06-94.fc37.x86_64
grub2-pc-modules-2.06-94.fc37.noarch
grub2-starfield-theme-2.02-0.43.fc26.x86_64
grub2-tools-2.06-94.fc37.x86_64
grub2-tools-efi-2.06-94.fc37.x86_64
grub2-tools-extra-2.06-94.fc37.x86_64
grub2-tools-minimal-2.06-94.fc37.x86_64

Hi,

If you
# grubby --update-kernel /path/to/kernel --remove-args "the args you want to remove"
does the machine consistently boot as it should?

If not, could you please show the output of the following:
# grubby --info ALL
and mention what steps you took when installing the new kernel? (just dnf update kernel ?)

Also, I understand from your comments that your /etc/default/grub has not been changed in a long time? Is that correct? Specifically which GRUB_CMDLINE_LINUX entry is commented has not changed?

(In reply to Marta Lewandowska from comment #8)
> Hi,
>
> If you
> # grubby --update-kernel /path/to/kernel --remove-args "the args you want to
> remove"
> does the machine consistently boot as it should?

I can't answer this, as neither a BLS config file nor grubenv was changed when I executed the command with one of the older kernels.
It's a problem that some of the Fedora systems I maintain have: you can't select a default kernel via grubby anymore. No clue why this isn't working anymore. Most systems just boot the newest kernel, i.e. the first entry in the list.

> and mention what steps you took when installing the new kernel? (just dnf
> update kernel ?)

"dnf update -y" is run at a 2h interval via cron.

> Also, I understand from your comments that your /etc/default/grub has not
> been changed in a long time? Is that correct? Specifically which
> GRUB_CMDLINE_LINUX entry is commented has not changed?

In January there was the kernel 6.1.5 issue, where Justin forgot to add some old drivers to the kernel. In the course of that, the default/grub file was changed a few days later: I commented the old kernel line out and added the new value, as you can see above. I made the change to test whether removing the simpledrm argument lets nvidia work with kernel 6.1.5, which I got confirmed via a normal boot with that old kernel. Since then, default/grub has been unchanged.

Yesterday, I removed the old commented-out line in default/grub, recreated the config via grub2-mkconfig and, et voilà, it booted normally again.

I checked the kernel logs and found that at LEAST on 26 May the kernel line had been tampered with and carried the old simpledrm block argument again. Unfortunately for us, older boot logs are not available anymore.

Ok, so you're saying you can't use grubby for anything anymore, but it used to work? This seems like a problem.

Is this / these system(s) UEFI or BIOS? If UEFI, could you please
# cat /boot/efi/EFI/fedora/grub.cfg

Also, when you run grub2-mkconfig, how do you run it exactly? What's the output target?
# grub2-mkconfig -o [what's here?]

Sorry, my mistake, I did not recognize "update-kernel", I read "default-kernel".

(In reply to Marta Lewandowska from comment #10)
> Ok, so you're saying you can't use grubby for anything anymore, but it used to work? This seems like a problem.
On my laptop I can't switch the boot kernel with grubby anymore; it sticks to index #2. But that's a different story, I will open a new bug for that.

Grubby CAN switch the default kernel in general. I tested it a few minutes ago with F37 on my Surface tablet. Worked as expected. The laptop issue must be caused by a very special constellation.

> Is this / these system(s) UEFI or BIOS? If UEFI, could you please
> #cat /boot/efi/EFI/fedora/grub.cfg

It's a legacy BIOS boot. /boot/efi/EFI/fedora/grub.cfg does not exist.

# ll /boot/grub2/
insgesamt 68
-rw-r--r--. 1 root root 84 7. Dez 2015 device.map
drwx------. 2 root root 4096 29. Apr 10:17 fonts
-rw-r--r--. 1 root root 4707 24. Jun 2021 grub.cfg
-rw-r--r--. 1 root root 6515 2. Apr 2017 grub.cfg_new
-rw-r--r--. 1 root root 6515 2. Apr 2017 grub.cfg_old
-rw-r--r--. 1 root root 5862 28. Nov 2019 grub.cfg.rpmsave
-rw-------. 1 root root 1024 2. Jun 08:59 grubenv
-rw-r--r--. 1 root root 1024 24. Jan 2018 grubenv.rpmsave
drwxr-xr-x. 2 root root 12288 28. Nov 2019 i386-pc
drwxr-xr-x. 2 root root 4096 2. Apr 2017 locale
drwxr-xr-x. 4 root root 4096 9. Mai 2012 themes

# ls -la /boot/efi/EFI/fedora/
insgesamt 6308
drwx------. 3 root root 4096 29. Apr 10:17 .
drwxr-xr-x. 4 root root 4096 21. Jul 2022 ..
-rwx------. 1 root root 110 7. Jul 2022 BOOTX64.CSV
drwx------. 2 root root 4096 8. Aug 2018 fw
-rwx------. 1 root root 65824 8. Aug 2018 fwupia32.efi
-rwx------. 1 root root 77496 8. Aug 2018 fwupx64.efi
-rw-------. 1 root root 1024 30. Nov 2021 grubenv
-rwx------. 1 root root 3530048 10. Apr 19:08 grubx64.efi
-rwx------. 1 root root 857248 7. Jul 2022 mmx64.efi
-rwx------. 1 root root 946712 7. Jul 2022 shim.efi
-rwx------. 1 root root 946712 7. Jul 2022 shimx64.efi

>
> Also, when you run grub2-mkconfig, how do you run it exactly? What's the
> output target?
> # grub2-mkconfig -o [what's here?]

For the fix of the BLS config files, I only executed "grub2-mkconfig" without any arguments, so it printed the grub.cfg to stdout.
What the kernel install scripts do, I have no clue.

The system in question was installed with Fedora 18 and has been upgraded via dnf ever since. As you can see above, there are some pretty old files in those directories; it's possible that the upgrades did not end up with the same config files you would get from a fresh install. The same goes for the mentioned laptop, which even started with Fedora 15 and now runs Fedora 38.

Thanks for sharing the directory structure. That's actually really helpful. If these are really Legacy BIOS, then your directories should look something like this:

[root@hp-dlg5-01 ~]# ls -l /boot/grub2
total 36
-rw-r--r--. 1 root root 64 Jun 2 13:26 device.map
drwxr-xr-x. 2 root root 4096 Jun 2 13:27 fonts
-rw-------. 1 root root 6441 Jun 2 13:28 grub.cfg
-rw-r--r--. 1 root root 1024 Jun 2 13:27 grubenv
drwxr-xr-x. 2 root root 12288 Jun 2 13:27 i386-pc
drwxr-xr-x. 2 root root 4096 Jun 2 13:26 locale
[root@hp-dlg5-01 ~]# ls -l /boot/efi/EFI/fedora/
total 0

You've had them for a while so you have some old stuff kicking around or maybe installed efi binaries that you don't need, but even if /boot/efi/EFI/fedora has some stuff in it for whatever reason, it shouldn't have a grubenv file. I think that might be the reason grubby is confused and you aren't able to use it properly.

The reason I'm focusing on grubby is because it's the tool you should be using to manipulate BLS entries. And when you install a new kernel, grub and grubby pass the arguments from your default kernel to the new one, so what happened to you (or what I understood anyway) shouldn't happen. If you have the kernel command line set correctly for your present kernel, you should end up with the same command line for the newly installed kernel.

If you need to fix stuff for all the kernels, you can use commands like
# grubby --update-kernel ALL --args "args to add" --remove-args "args to remove"
or you can update kernels one at a time, as you see fit.

I hope this helps..?
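Before handing strings to the grubby call quoted above, it can help to preview what a command line would look like after the removal and addition. This is a rough sketch that only manipulates strings. It does not call grubby, does not read or write any boot file, and is not how grubby itself is implemented. The function name preview_cmdline is hypothetical.

```shell
# Preview a kernel command line after removing and adding arguments.
# $1: current command line
# $2: space-separated arguments to remove (exact word match)
# $3: space-separated arguments to add (skipped if already present)
# The body runs in a subshell so that "set -f" (no glob expansion while
# splitting words) does not leak into the calling shell.
preview_cmdline() (
    set -f
    current=$1 remove=$2 add=$3 out=""
    for word in $current; do
        case " $remove " in
            *" $word "*) ;;              # listed for removal: drop it
            *) out="$out $word" ;;
        esac
    done
    for word in $add; do
        case " $out " in
            *" $word "*) ;;              # already present: no duplicate
            *) out="$out $word" ;;
        esac
    done
    printf '%s\n' "${out# }"
)
```

For example, preview_cmdline "ro rhgb quiet initcall_blacklist=simpledrm_platform_driver_init" "initcall_blacklist=simpledrm_platform_driver_init" "" prints the line without the simpledrm block. If the result looks right, the same remove and add strings can be passed to --remove-args and --args.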
(In reply to Marta Lewandowska from comment #12)
> You've had them for a while so you have some old stuff kicking around or
> maybe installed efi binaries that you don't need, but even if
> /boot/efi/EFI/fedora has some stuff in it for whatever reason, it shouldn't
> have a grubenv file. I think that might be the reason grubby is confused and
> you aren't able to use it properly.

That file is not causing it: I just removed it, changed the kernel, and the chosen kernel got ignored on boot. After the boot, the chosen kernel is reported as the default kernel, although it clearly is not the one that is running:

Last login: Fri Jun 9 09:39:34 2023 from 127.0.0.1
[root@eve ~]# grubby --default-kernel
/boot/vmlinuz-6.2.15-200.fc37.x86_64
[root@eve ~]# uname -a
Linux eve.xxxxxxxxxxxxxxxxxxx 6.3.5-100.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 30 15:43:51 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@eve ~]# cat /boot/grub2/grubenv
# GRUB Environment Block
# WARNING: Do not edit this file by tools other than grub-editenv!!!
saved_entry=7e390913b33b4e5ba8f960a9ba97aeee-6.2.15-200.fc37.x86_64
boot_success=1
#################### [remaining '#' padding omitted]
[root@eve ~]#

Can this be fixed by wiping /boot/efi and /boot/grub2 and reinstalling grub2-pc & grub2-efi-x64?

grub is a protected package (of course you can work around this but...) so removing it or just wiping files is not a great idea. But you can certainly remove (using yum/dnf) grub2-efi* since you don't need those packages on BIOS, and then you can reinstall grub2* and it should install only the packages you need. Your directory structure should look like the one in comment #12, so you shouldn't have grubenv or grub.cfg in /boot/efi/EFI/fedora; there shouldn't really be anything in there. Maybe also check for a soft link from /etc/grub*cfg to /boot/grub2/grub.cfg
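
The two checks discussed here, whether the saved default matches the running kernel and whether the /etc link points at the real grub.cfg, can be scripted. This is a sketch under assumptions: both function names are hypothetical, and the saved_entry format <machine-id>-<kernel release> is inferred from the grubenv listing in this report. If saved_entry holds an index or a title instead, the first function prints nothing.

```shell
# Print the kernel release recorded in saved_entry of a grubenv file.
# Assumes an entry ID of the form <hex machine-id>-<kernel release>.
# $1: path to the grubenv file (for example /boot/grub2/grubenv)
saved_entry_release() {
    sed -n 's/^saved_entry=[0-9a-f]*-//p' "$1"
}

# Succeed only if $1 is a symlink that resolves to the same file as $2.
# Uses readlink -f, which GNU coreutils provides.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}
```

Comparing the output of saved_entry_release /boot/grub2/grubenv with uname -r would show the mismatch reported above (6.2.15-200.fc37.x86_64 saved, 6.3.5-100.fc37.x86_64 running). The link check could be run as link_points_to /etc/grub2.cfg /boot/grub2/grub.cfg, where /etc/grub2.cfg is an assumed name, since the comment only says /etc/grub*cfg.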