Bug 2037703 - Strange messages on boot console
Summary: Strange messages on boot console
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 37
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-06 11:25 UTC by George R. Goffe
Modified: 2023-12-05 21:03 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-12-05 21:03:17 UTC
Type: Bug
Embargoed:


Attachments
tar.gz file with dmesg.txt and 2 screen shots of error (183.99 KB, application/gzip)
2022-01-06 11:25 UTC, George R. Goffe
a screenshot of garbled text that's been right before (18.19 KB, image/png)
2022-08-25 12:33 UTC, David Tardon
another example of kmsg garbled screen (23.37 KB, image/jpeg)
2023-03-23 15:18 UTC, Milan Broz

Description George R. Goffe 2022-01-06 11:25:24 UTC
Created attachment 1849239 [details]
tar.gz file with dmesg.txt and 2 screen shots of error

1. Please describe the problem:
Booting a VM under VirtualBox shows some strange character strings with no content. See attached jpg.

2. What is the Version-Release number of the kernel:
5.16.0-0.rc8.55.fc36.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
 
Problem has appeared on several earlier kernels

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes. Problem appears on console during boot... Requires no other action from me to cause this.

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

Yes, I think... Kernel modules related to virtualization via VirtualBox.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 George R. Goffe 2022-02-17 10:22:33 UTC
This problem existed only in a VirtualBox VM. I have upgraded the host system to the "latest" Fedora Core (Now 37). The problem now appears on the host.

The current kernel is now 5.17.0-0.rc4.96.fc37.x86_64. I DO NOT believe it is kernel related. It is NOT at all clear just which application causes this error to appear. Perhaps an examination of the legible messages surrounding the gibberish messages would prove useful?

Again, the gibberish messages do NOT appear in the dmesg output.

A hint or tip would be MOST helpful to me to identify the specific application(?) that causes this error to appear.

The host AND the guest both seem to be running correctly.

Best regards,

George...

Comment 2 George R. Goffe 2022-02-21 23:25:40 UTC
I think I found the cause of this problem.

In the grub.cfg file there's a line that has "quiet" in it. I remove this so I can see all the boot messages. When I do this, the gibberish messages appear.

Does anyone know where this bug should be placed? Grub?

George...

Comment 3 Gordon Messmer 2022-02-21 23:59:27 UTC
Looking at the screenshots, this is not merely garbled output from a single application; every line of text has some visible corruption.

This is almost certainly not a bug in the kernel (or any other Fedora component), but a bug in the VirtualBox emulated VGA console, similar to the one discussed here:

https://forums.virtualbox.org/viewtopic.php?f=11&t=96739

Comment 4 George R. Goffe 2022-02-22 09:10:35 UTC
Gordon,

Thanks for your response.

The problem exists in the FC37 host I'm running. When I boot "natively" with "quiet" removed, I get these gibberish messages output to the console.

This started happening with the 5.17 kernels.

It is NOT VirtualBox that is causing this problem.

Best regards,

George...

Comment 5 Hans de Goede 2022-02-22 09:54:21 UTC
(In reply to George R. Goffe from comment #4)
> Gordon,
> 
> Thanks for your response.
> 
> The problem exists in the FC37 host I'm running. When I boot "natively" with
> "quiet" removed, I get these gibberish messages output to the console.
> 
> This started happening with the 5.17 kernels.

Thank you for reporting this.

I guess you are booting your host in classic/legacy BIOS mode rather than through EFI?

You can run "ls /sys/firmware/efi/efivars/"; if that gives a "No such file or directory" error, then you are using classic BIOS boot.
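
The check above can be sketched as a small shell function. This is only an illustration: the name `boot_mode` and its optional sysfs-root argument are assumptions made for this example, not part of any Fedora tool.

```shell
# Report "uefi" if the efivars directory exists, otherwise "bios".
# The optional argument (default /sys) is a hypothetical hook that only
# exists so the check can be pointed at a different sysfs root.
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi/efivars" ]; then
        echo "uefi"
    else
        echo "bios"
    fi
}

# Usage: boot_mode
```

A result of "bios" corresponds to the "No such file or directory" case described above.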

VirtualBox guests also default to classic BIOS mode. So this likely means that the recent change to switch the pre native-drm driver boot console to simpledrm is causing issues with the boot messages shown before the native drm driver loads:

https://fedoraproject.org/wiki/Changes/ReplaceFbdevDrivers

I'm assigning this to Javier who is responsible for this change. Hopefully he can figure out a fix.

Javier, maybe we need to keep the vga-console setting for classic BIOS boot? That is not a framebuffer driver, but rather an alternative console implementation, which will get replaced with fbcon on drm when the native drm driver loads.

Comment 6 Javier Martinez Canillas 2022-02-22 15:37:28 UTC
Thanks Hans for pointing me to this issue and also providing the needed context.

I've installed both Fedora 35 and Fedora 36 as VMs, booting with legacy
BIOS, and tested with different kernel command line parameters to figure out
where the problem could be.

tl;dr: please test with "nomodeset vga=0x318" in your kernel command line.

long version:

The simpledrm change ended up being a red herring and the actual bug is in the
VGA console driver (vgacon) AFAICT. What I found is the following:

1) On x86, when booting with legacy BIOS the vgacon driver is always used.

2) Then at some point either a fbdev or DRM driver with fbdev emulation will
   be registered and this will cause fbcon to take over the console and the
   vgacon is unregistered.

3) Using the "vga=" kernel command line parameter causes the vgacon to not
   be enabled; instead the fbcon + vesafb fbdev driver is used.

I'm testing with a virtio video device and the boot looks like the following
for both Fedora 35 and 36 without any additional command line parameters:

$ cat /proc/fb 
0 virtio_gpudrmfb

$ dmesg | grep Console:
[    0.064609] Console: colour VGA+ 80x25
[    1.631950] Console: switching to colour dummy device 80x25
[    1.637304] Console: switching to colour frame buffer device 128x48

the first line is for vgacon, the second one for dummycon (that's set when
no console is registered) and then finally fbcon takes over.

Using nomodeset will cause the virtio_gpu DRM driver to not be probed and
only the vgacon driver will be used (i.e in Fedora 35):

$ cat /proc/fb
$

$ dmesg | grep Console:
[    0.066102] Console: colour VGA+ 80x25

when using a "vga=" parameter, then the vesafb driver will be probed and
the flow is dummycon -> fbcon + vesafb (i.e in Fedora 35 with vga=0x318):

$ cat /proc/fb 
0 VESA VGA

$ dmesg | grep -i console:
[    0.054056] Console: colour dummy device 80x25
[    0.425638] Console: switching to colour frame buffer device 128x48

So there are ways to disable both the DRM driver (nomodeset) and the
vgacon driver (passing a mode with vga=).

The vesafb driver is replaced by the simpledrm driver in Fedora 36, and the
nomodeset param only affects real DRM drivers, not simpledrm. So in Fedora 36
the way to use only the simpledrm driver is "nomodeset vga=0x318":

$ cat /proc/fb 
0 simpledrmdrmfb

$ dmesg | grep Console:
[    0.047019] Console: colour dummy device 80x25
[    0.673643] Console: switching to colour frame buffer device 128x48

Please give it a try on your systems to check that the behavior is the
same as what I see with kvm and SeaBIOS. I just chose 0x318 randomly but you
can check the available modes with vga=ask and use whatever is suitable.
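
To compare boots quickly, the console hand-over described in this comment can be summarised from the "Console:" lines. The following is a minimal sketch; the function name `console_chain` is hypothetical, and the mapping from message text to driver name simply follows the explanation above (VGA+ is vgacon, dummy device is dummycon, frame buffer device is fbcon).

```shell
# Summarise which console drivers took over, based on "Console:" lines
# read from standard input (for example from dmesg).
console_chain() {
    chain=""
    while IFS= read -r line; do
        case "$line" in
            *"Console:"*"VGA+"*)                name="vgacon" ;;
            *"Console:"*"dummy device"*)        name="dummycon" ;;
            *"Console:"*"frame buffer device"*) name="fbcon" ;;
            *) continue ;;
        esac
        # Append the driver name, separated by an arrow after the first entry.
        chain="${chain:+$chain -> }$name"
    done
    printf '%s\n' "$chain"
}

# Usage: dmesg | console_chain
```

A chain that starts with vgacon means the VGA console driver was in use at some point during boot, which is the configuration where the corruption was observed.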

Comment 7 Javier Martinez Canillas 2022-02-22 16:02:42 UTC
Hans suggested that I could test an older kernel build on F36 and see
how it behaves, and surprisingly it exhibited the same issue as
with the F36 kernel package.

That is, with 5.16.9-200.fc35.x86_64 and the F36 user-space the same
wrong characters were shown when booting with "nomodeset" (only the
vgacon driver).

But it wasn't present when booting with "vga=0x318 nomodeset" (only
the fbcon + vesafb drivers).

Hans then said that I could try downgrading the kbd packages, so I
did the following (George, please also test this on your affected system):

$ wget https://kojipkgs.fedoraproject.org//packages/kbd/2.4.0/7.fc35/noarch/kbd-misc-2.4.0-7.fc35.noarch.rpm
$ wget https://kojipkgs.fedoraproject.org//packages/kbd/2.4.0/7.fc35/x86_64/kbd-2.4.0-7.fc35.x86_64.rpm
$ dnf downgrade kbd-*.rpm
$ dracut -f

and then modify your kernel cmdline to just have "nomodeset" without
"vga=" to make the kernel only use the VGA console driver.

Comment 8 Javier Martinez Canillas 2022-02-22 16:18:37 UTC
I was able to reproduce again in the F36 VM even with the downgraded
kbd packages, so the issue doesn't seem to be in that package after all.

I also tested with installing the 5.17.0-0.rc4.96.fc36.x86_64 kernel
in the F35 VM and there are no issues with that combination. The issue
has to be somewhere in user-space, but it isn't clear to me where...

Comment 9 Javier Martinez Canillas 2022-02-22 17:02:32 UTC
I have now tried updating the F35 VM to the latest systemd:

 $ dnf --disablerepo=* --enablerepo=rawhide upgrade systemd

and after that the issue started to appear on the F35 VM, both with
the 5.16.9-200.fc35.x86_64 and 5.17.0-0.rc4.96.fc36.x86_64 kernels.

Comment 10 Javier Martinez Canillas 2022-02-22 17:19:55 UTC
Changing the component to systemd since the issue seems to be related and
not really something that's in the kernel according to my testing.

Comment 11 George R. Goffe 2022-02-27 02:37:53 UTC
Howdy,

Is anyone working this bug?

Best regards and STAY SAFE!

George...

Comment 12 Zbigniew Jędrzejewski-Szmek 2022-02-27 09:50:09 UTC
Javier: so what is the reproduction recipe:
which kernel version,
which systemd version,
libvirt VM?
any special config on the kernel command line?

Comment 13 George R. Goffe 2022-03-01 21:15:13 UTC
Zbigniew,

Thanks for responding to this bug.

I first started seeing the "bug" with an upgrade to the 5.x kernels in VBox VMs but now it's in the "native" host as well. I am currently at the "latest" upgrade to Fedora Core 37. I usually remove "rhgb and quiet" and add "net.ifnames=0 biosdevname=0" to the kernel command line but I don't think it influences the appearance of this "bug"... REMOVING "quiet" seems to be the minimum change required to show the "bug".

The ifnames and biosdevname are added to rename the interface name to "eth0". Here's the "rules" addition to effect the rename, if you're interested.

Best regards and STAY SAFE!

George...


# /etc/udev/rules.d/99-rename-to-eth0.rules
# grubs: net.ifnames=0 biosdevname=0
# vi /etc/default/grub
# grub2-mkconfig -o /boot/grub2/grub.cfg
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="54:04:a6:10:61:87", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

Comment 14 George R. Goffe 2022-03-16 18:05:16 UTC
Howdy,

I hate to be a nag but, has there been any progress on solving this bug?

I DO understand the concept of "More pressing duties" but I just thought I'd ask.

Thanks,

George...

Comment 15 Javier Martinez Canillas 2022-03-23 13:18:17 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #12)
> Javier: so what is the reproduction recipe:

Sorry I missed your comment before. I explained in detail how
to reproduce in comment 6.

> which kernel version,

5.17.0-0.rc4.96.fc36.x86_64

> which systemd version,

whatever was in F36 at the time, I don't have the VM around anymore.

> libvirt VM?

Yes. Legacy BIOS install, with EFI it does work correctly.

> any special config on the kernel command line?

Just "nomodeset"

Comment 16 George R. Goffe 2022-03-25 18:22:33 UTC
Javier,

Thanks for responding to this bug.

Message garbage still appears. Current kernel 5.17.0-128.fc37.x86_64, Fedora Core 37. 


The key to causing this problem is removing "QUIET" from the grub menu entry. As far as I know, "nomodeset" has NO effect.

Best regards,

George...

Comment 17 George R. Goffe 2022-04-12 14:25:44 UTC
Javier,

Can I get a "current" status of this bug report please?

To reiterate the way to make this "bug" appear: nomodeset is added to the kernel command line in grub AND quiet is removed.

The problem STILL exists with nomodeset added AND quiet removed... Adding quiet back causes the problem to disappear.

Best regards,

George...

"current" status of this system:
fc37-bash 5.1 ~# uname -r
5.18.0-0.rc1.20220408git1831fed559732b1.20.fc37.x86_64

fc37-bash 5.1 ~# rpm -q libvirt
libvirt-8.2.0-1.fc37.x86_64

fc37-bash 5.1 ~# rpm -q systemd
systemd-250.4-2.fc37.x86_64

Comment 18 George R. Goffe 2022-07-29 19:14:59 UTC
Howdy,

It's been some time since there was a response to this bug. Could I get a status please?

The bug appears on "native" systems and "virtual" systems... All affected systems are FC37 and are generally FULLY upgraded from the repositories.

Best regards,

George...

Comment 19 Lawrence Lagerlof 2022-07-29 23:49:40 UTC
This bug is happening to me on a real machine. My PC has a legacy BIOS.

Comment 20 Ben Cotton 2022-08-09 13:12:27 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 37 development cycle.
Changing version to 37.

Comment 21 David Tardon 2022-08-25 12:31:39 UTC
I can easily reproduce this with a freshly-installed F-36 in virt-manager (that is, QEMU/KVM). All that's needed is to remove "quiet" from the kernel cmdline and add "rd.break=cmdline" (to stop the boot). Moreover, when I scroll back, output that had been fine before is garbled too now (e.g., the "Welcome to ..." line in the attached screenshot). My conclusion: this is something in the video driver.
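
The command line change in this recipe can be sketched as a small helper. This is only an illustration: the function name `repro_cmdline` and the sample command line in the usage comment are made up for this example.

```shell
# Build a reproduction command line from an existing one:
# drop the "quiet" word, then append "rd.break=cmdline".
repro_cmdline() {
    out=""
    # Intentionally unquoted so the command line is split into words.
    for word in $1; do
        [ "$word" = "quiet" ] && continue
        out="${out:+$out }$word"
    done
    printf '%s\n' "${out:+$out }rd.break=cmdline"
}

# Usage (hypothetical input):
#   repro_cmdline "root=/dev/vda2 ro rhgb quiet"
```

Because "rd.break=cmdline" stops the boot, this is best suited to a one-off edit of the boot entry rather than a permanent configuration change.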

Comment 22 David Tardon 2022-08-25 12:33:08 UTC
Created attachment 1907544 [details]
a screenshot of garbled text that's been right before

Comment 23 David Tardon 2022-08-25 12:34:54 UTC
Btw, I see this in RHEL-9 too (but not in RHEL-8).

Comment 24 Paulo Castro 2023-01-28 15:58:30 UTC
Hi all,

Same issue here with a F37 install on a new physical machine doing UEFI boot.

    PECastro

Comment 25 George R. Goffe 2023-02-03 00:26:27 UTC
Howdy,

I am continuing to see these messages while booting a "current (as of 2 Feb 2023)" Fedora Core 38 install DVD.

Does anyone know when this problem will be fixed?

Best regards,

George...

Comment 26 Milan Broz 2023-03-21 10:53:06 UTC
While the gibberish text apparently comes from kernel messages (at least in my case), I do not think it is a kernel bug (should be reassigned?)...

Anyway, I see the same problem in VMware machine and also with bare metal (non-UEFI), both booting in text mode.

I can try to run some debug, if it helps anything. Any idea?

Comment 27 Hans de Goede 2023-03-22 10:46:11 UTC
(In reply to Milan Broz from comment #26)
> While the gibberish text apparently comes from kernel messages (at least in
> my case), I do not think it is a kernel bug (should be reassigned?)...
> 
> Anyway, I see the same problem in VMware machine and also with bare metal
> (non-UEFI), both booting in text mode.
> 
> I can try to run some debug, if it helps anything. Any idea?

I have been seeing this on some real (non VM) hw when booting in BIOS mode too.

I believe this is related to console font loading interacting badly with the VGA console. The messages are good until something messes them up and then the messages become readable again when switching to the fbcon rendered text on top of the /dev/fb# registered by the drm/kms driver for your gfxcard.

To test this theory you could mask systemd-vconsole-setup.service and then regenerate the initrd (note this will also stop console keymap loading, so you're stuck with the US qwerty keymap then).

I believe chances are good that disabling systemd-vconsole-setup.service will work around this, which at least will pinpoint the issue as being one of font loading.

Another option, which is being discussed for other reasons in bug 2176782, is to make the kernel switch to a vesa-mode and then use simpledrm with fbcon on top of the vesa framebuffer. This bypasses vgacon altogether and makes the boot path / boot console usage / experience between UEFI and BIOS much more consistent.

You can test this by e.g. adding "vga=791" to the kernel commandline which will make the kernel switch to a vesa LFB 1024x768 @ 32bpp mode at boot and you should then get console output from the following stack of layers fbcon -> fb0 -> simpledrm -> vesa LFB.
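
The "vga=" mode number is written in decimal here and in hexadecimal in comment 6, so it can help to convert between the two when comparing test results. A minimal sketch follows; the helper names `vga_mode_hex` and `vga_mode_dec` are hypothetical.

```shell
# Convert a vga= mode number between decimal and hexadecimal notation.
vga_mode_hex() { printf '0x%x\n' "$1"; }
vga_mode_dec() { printf '%d\n' "$1"; }

# Usage:
#   vga_mode_hex 791
#   vga_mode_dec 0x318
```

Note that 791 is 0x317, so it is not the same mode number as the 0x318 (decimal 792) used in comment 6. Either should exercise the same non-vgacon path, but the list shown by vga=ask is what tells you which modes your firmware offers.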

Actually given the discussions in bug 2176782 I think that we should just work towards making simpledrm work on BIOS, because the more things move toward UEFI the less the other code paths get tested and consistently using simpledrm everywhere would be good.

Comment 28 Milan Broz 2023-03-23 15:18:58 UTC
Created attachment 1953195 [details]
another example of kmsg garbled screen

I do not think the font is the problem - in my case it is direct kmsg output that is being garbled, see the attachment - text mode, only output from the kernel is garbled. It looks like just some bits of the message are flipped (and no, this is not a serial console with the wrong baud rate ;-)...

Comment 29 Milan Broz 2023-03-23 15:25:11 UTC
And disabling systemd-vconsole-setup.service does not help (for my setup).

Comment 30 Milan Broz 2023-03-23 16:16:47 UTC
If I set LogColor=no in /etc/systemd/system.conf (and rebuild initramdisk to propagate it to boot) I can no longer reproduce the issue.

So it seems to me that some format messages from kernel/systemd were mixed in a way that confuses terminal formatting? ...
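
One way to apply the LogColor=no setting without editing system.conf itself is a drop-in file under /etc/systemd/system.conf.d/. The sketch below is an assumption-laden illustration: the function name `write_logcolor_dropin` and the file name `10-no-log-color.conf` are hypothetical, and whether the initramfs rebuild picks up the drop-in has not been verified here (the test in this comment edited system.conf directly).

```shell
# Write a drop-in that sets LogColor=no for the systemd manager.
# $1 is the root directory to write under (default "/"); the file name
# 10-no-log-color.conf is a hypothetical choice.
write_logcolor_dropin() {
    root="${1:-/}"
    dir="${root%/}/etc/systemd/system.conf.d"
    mkdir -p "$dir" || return 1
    printf '[Manager]\nLogColor=no\n' > "$dir/10-no-log-color.conf"
}

# Usage (as root), followed by an initramfs rebuild as described above:
#   write_logcolor_dropin / && dracut -f
```

If the garbled output disappears after this change, that matches the observation above that coloured log output is involved.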

Comment 31 Hans de Goede 2023-04-04 17:18:52 UTC
(In reply to Milan Broz from comment #30)
> If I set LogColor=no in /etc/systemd/system.conf (and rebuild initramdisk to
> propagate it to boot) I can no longer reproduce the issue.

That is an interesting find, thank you for figuring this out.

Unfortunately I don't have time to look further into this issue.

Comment 32 Aoife Moloney 2023-11-23 00:07:56 UTC
This message is a reminder that Fedora Linux 37 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 37 on 2023-12-05.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '37'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 37 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 33 Aoife Moloney 2023-12-05 21:03:17 UTC
Fedora Linux 37 entered end-of-life (EOL) status on 2023-12-05.

Fedora Linux 37 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

