Bug 2037703 - Strange messages on boot console
Summary: Strange messages on boot console
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 37
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-06 11:25 UTC by George R. Goffe
Modified: 2022-10-24 20:19 UTC
CC List: 37 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments
tar.gz file with dmesg.txt and 2 screen shots of error (183.99 KB, application/gzip)
2022-01-06 11:25 UTC, George R. Goffe
a screenshot of garbled text that's been right before (18.19 KB, image/png)
2022-08-25 12:33 UTC, David Tardon

Description George R. Goffe 2022-01-06 11:25:24 UTC
Created attachment 1849239
tar.gz file with dmesg.txt and 2 screen shots of error

1. Please describe the problem:
Booting a VM under VirtualBox shows some strange character strings with no content. See attached jpg.

2. What is the Version-Release number of the kernel:
5.16.0-0.rc8.55.fc36.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
 
Problem has appeared on several earlier kernels

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes. Problem appears on console during boot... Requires no other action from me to cause this.

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

Yes, I think... Kernel modules related to virtualization via VirtualBox.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 George R. Goffe 2022-02-17 10:22:33 UTC
This problem existed only in a VirtualBox VM. I have upgraded the host system to the "latest" Fedora Core (Now 37). The problem now appears on the host.

The current kernel is now 5.17.0-0.rc4.96.fc37.x86_64. I DO NOT believe it is kernel related. It is NOT at all clear just which application causes this error to appear. Perhaps an examination of the legible messages surrounding the gibberish messages would prove useful?

Again, the gibberish messages do NOT appear in the dmesg output.

A hint or tip would be MOST helpful to me to identify the specific application(?) that causes this error to appear.

The host AND the guest both seem to be running correctly.

Best regards,

George...

Comment 2 George R. Goffe 2022-02-21 23:25:40 UTC
I think I found the cause of this problem.

In the grub.cfg file there's a line that has "quiet" in it. I remove this so I can see all the boot messages. When I do this, the gibberish messages appear.

Does anyone know where this bug should be placed? Grub?

George...

Comment 3 Gordon Messmer 2022-02-21 23:59:27 UTC
Looking at the screenshots, this is not merely garbled output from a single application; every line of text has some visible corruption.

This is almost certainly not a bug in the kernel (or any other Fedora component), but a bug in the VirtualBox emulated VGA console, similar to the one discussed here:

https://forums.virtualbox.org/viewtopic.php?f=11&t=96739

Comment 4 George R. Goffe 2022-02-22 09:10:35 UTC
Gordon,

Thanks for your response.

The problem exists in the FC37 host I'm running. When I boot "natively" with "quiet" removed, I get these gibberish messages output to the console.

This started happening with the 5.17 kernels.

It is NOT VirtualBox that is causing this problem.

Best regards,

George...

Comment 5 Hans de Goede 2022-02-22 09:54:21 UTC
(In reply to George R. Goffe from comment #4)
> Gordon,
> 
> Thanks for your response.
> 
> The problem exists in the FC37 host I'm running. When I boot "natively" with
> "quiet" removed, I get these gibberish messages output to the console.
> 
> This started happening with the 5.17 kernels.

Thank you for reporting this.

I guess you are booting your host in classic/legacy BIOS mode rather than through EFI?

You can run "ls /sys/firmware/efi/efivars/"; if that gives a "No such file or directory" error, then you are using classic BIOS boot.
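As a rough sketch of the check described above, the same test can be wrapped in a small shell function. The function name boot_mode and its optional sysfs-root argument are illustrative assumptions added here (the argument only exists so the check can be pointed at a scratch directory); they do not come from this report.

```shell
#!/bin/sh
# Report whether the system was booted via EFI or classic BIOS, using the
# presence of the efivars directory as suggested in the comment above.
# $1 (optional, hypothetical): sysfs root to inspect, defaults to /sys.
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi/efivars" ]; then
        echo "EFI"
    else
        echo "BIOS"
    fi
}

# Print the boot mode of the running system.
boot_mode
```

If this prints "BIOS", the machine (or VM) is in the configuration where the garbled console has been reported in this bug.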

VirtualBox guests also default to classic BIOS mode. So this likely means that the recent change to switch the pre native-drm driver boot console to simpledrm is causing issues with the boot messages shown before the native drm driver loads:

https://fedoraproject.org/wiki/Changes/ReplaceFbdevDrivers

I'm assigning this to Javier who is responsible for this change. Hopefully he can figure out a fix.

Javier, maybe we need to keep the vga-console setting for classic BIOS boot? That is not a framebuffer driver, but rather an alternative console implementation, which will get replaced with fbcon on drm when the native drm driver loads.

Comment 6 Javier Martinez Canillas 2022-02-22 15:37:28 UTC
Thanks Hans for pointing me to this issue and also providing the needed context.

I've installed both Fedora 35 and Fedora 36 as VMs booting with legacy
BIOS, and tested with different kernel command line parameters to figure out
where the problem could be.

tl;dr: please test with "nomodeset vga=0x318" in your kernel command line.

long version:

The simpledrm change ended up being a red herring and the actual bug is in the
VGA console driver (vgacon) AFAICT. What I found is the following:

1) On x86, when booting with legacy BIOS the vgacon driver is always used.

2) Then at some point either a fbdev or DRM driver with fbdev emulation will
   be registered and this will cause fbcon to take over the console and the
   vgacon is unregistered.

3) Using the "vga=" kernel command line parameter causes vgacon to not be
   enabled; the fbcon + vesafb fbdev driver is used instead.

I'm testing with a virtio video device and the boot looks like the following
for both Fedora 35 and 36 without any additional command line parameters:

$ cat /proc/fb 
0 virtio_gpudrmfb

$ dmesg | grep Console:
[    0.064609] Console: colour VGA+ 80x25
[    1.631950] Console: switching to colour dummy device 80x25
[    1.637304] Console: switching to colour frame buffer device 128x48

the first line is for vgacon, the second one for dummycon (that's set when
no console is registered) and then finally fbcon takes over.

Using nomodeset will cause the virtio_gpu DRM driver to not be probed and
only the vgacon driver will be used (e.g. in Fedora 35):

$ cat /proc/fb
$

$ dmesg | grep Console:
[    0.066102] Console: colour VGA+ 80x25

when using a "vga=" parameter, the vesafb driver will be probed and
the flow is dummycon -> fbcon + vesafb (e.g. in Fedora 35 with vga=0x318):

$ cat /proc/fb 
0 VESA VGA

$ dmesg | grep -i console:
[    0.054056] Console: colour dummy device 80x25
[    0.425638] Console: switching to colour frame buffer device 128x48

So there are ways to disable both the DRM driver (nomodeset) and the
vgacon driver (passing a mode with vga=).

The vesafb driver is replaced by the simpledrm driver in Fedora 36 and the
nomodeset param only affects real DRM drivers, not simpledrm. So in 36 the
way to use only the simpledrm driver is with "nomodeset vga=0x318":

$ cat /proc/fb 
0 simpledrmdrmfb

$ dmesg | grep Console:
[    0.047019] Console: colour dummy device 80x25
[    0.673643] Console: switching to colour frame buffer device 128x48

Please give it a try on your systems to check that the behavior is the
same as what I see with kvm and SeaBIOS. I just chose 0x318 arbitrarily, but
you can check the available modes with vga=ask and use whatever is suitable.
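To make the three cases above easier to compare across machines, here is a minimal sketch that turns the "Console:" lines of a kernel log into a one-line driver hand-off. The function names classify_console_line and console_sequence are made up for illustration, and the pattern matching only covers the message strings quoted in this comment; other kernels may word them differently.

```shell
#!/bin/sh
# Map one "Console:" kernel log line to the console driver that printed it.
# Only the three message forms quoted in this comment are recognised.
classify_console_line() {
    case "$1" in
        *"colour VGA+"*)         echo "vgacon" ;;
        *"colour dummy device"*) echo "dummycon" ;;
        *"frame buffer device"*) echo "fbcon" ;;
        *)                       echo "unknown" ;;
    esac
}

# Read kernel log lines on stdin and print the hand-off on one line,
# for example "vgacon -> dummycon -> fbcon".
console_sequence() {
    chain=""
    while IFS= read -r line; do
        case "$line" in
            *Console:*) ;;      # keep console lines
            *) continue ;;      # skip everything else
        esac
        driver=$(classify_console_line "$line")
        if [ -z "$chain" ]; then
            chain="$driver"
        else
            chain="$chain -> $driver"
        fi
    done
    echo "$chain"
}
```

It could be fed with something like "dmesg | console_sequence" (dmesg may require root on some setups). A result of just "vgacon" would correspond to the nomodeset-only case, where the garbled text was seen.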

Comment 7 Javier Martinez Canillas 2022-02-22 16:02:42 UTC
Hans suggested that I could test an older kernel build on F36 and see
how it behaves and, surprisingly, it exhibited the same issue as
with the F36 kernel package.

That is, with 5.16.9-200.fc35.x86_64 and the F36 user-space the same
wrong characters were shown when booting with "nomodeset" (only the
vgacon driver).

But it wasn't present when booting with "vga=0x318 nomodeset" (only
the fbcon + vesafb drivers).

Hans then said that I could try downgrading the kbd packages, so I
did the following (George, please also test this on your affected system):

$ wget https://kojipkgs.fedoraproject.org//packages/kbd/2.4.0/7.fc35/noarch/kbd-misc-2.4.0-7.fc35.noarch.rpm
$ wget https://kojipkgs.fedoraproject.org//packages/kbd/2.4.0/7.fc35/x86_64/kbd-2.4.0-7.fc35.x86_64.rpm
$ dnf downgrade kbd-*.rpm
$ dracut -f

and then modify your kernel cmdline to just have "nomodeset" without
"vga=" to make the kernel only use the VGA console driver.

Comment 8 Javier Martinez Canillas 2022-02-22 16:18:37 UTC
I was able to reproduce the issue again in the F36 VM even with the downgraded
kbd packages, so the issue doesn't seem to be in that package after all.

I also tested installing the 5.17.0-0.rc4.96.fc36.x86_64 kernel
in the F35 VM and there are no issues with that combination. The issue
has to be somewhere in user-space, but it isn't clear to me where...

Comment 9 Javier Martinez Canillas 2022-02-22 17:02:32 UTC
I have now tried updating the F35 VM to the latest systemd:

 $ dnf --disablerepo='*' --enablerepo=rawhide upgrade systemd

and after that the issue started to appear on the F35 VM, both with
the 5.16.9-200.fc35.x86_64 and 5.17.0-0.rc4.96.fc36.x86_64 kernels.

Comment 10 Javier Martinez Canillas 2022-02-22 17:19:55 UTC
Changing the component to systemd since the issue seems to be related and
not really something that's in the kernel according to my testing.

Comment 11 George R. Goffe 2022-02-27 02:37:53 UTC
Howdy,

Is anyone working on this bug?

Best regards and STAY SAFE!

George...

Comment 12 Zbigniew Jędrzejewski-Szmek 2022-02-27 09:50:09 UTC
Javier: so what is the reproduction recipe:
which kernel version,
which systemd version,
libvirt VM?
any special config on the kernel command line?

Comment 13 George R. Goffe 2022-03-01 21:15:13 UTC
Zbigniew,

Thanks for responding to this bug.

I first started seeing the "bug" with an upgrade to the 5.x kernels in VBox VMs, but now it's on the "native" host as well. I am currently at the "latest" upgrade to Fedora Core 37. I usually remove "rhgb" and "quiet" and add "net.ifnames=0 biosdevname=0" to the kernel command line, but I don't think that influences the appearance of this "bug"... REMOVING "quiet" seems to be the minimum change required to show the "bug".

The ifnames and biosdevname parameters are added to rename the interface to "eth0". Here's the udev "rules" addition that effects the rename, if you're interested.

Best regards and STAY SAFE!

George...


# /etc/udev/rules.d/99-rename-to-eth0.rules
# grubs: net.ifnames=0 biosdevname=0
# vi /etc/default/grub
# grub2-mkconfig -o /boot/grub2/grub.cfg
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="54:04:a6:10:61:87", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

Comment 14 George R. Goffe 2022-03-16 18:05:16 UTC
Howdy,

I hate to be a nag but, has there been any progress on solving this bug?

I DO understand the concept of "More pressing duties" but I just thought I'd ask.

Thanks,

George...

Comment 15 Javier Martinez Canillas 2022-03-23 13:18:17 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #12)
> Javier: so what is the reproduction recipe:

Sorry I missed your comment before. I explained in detail how
to reproduce in comment 6.

> which kernel version,

5.17.0-0.rc4.96.fc36.x86_64

> which systemd version,

whatever was in F36 at the time, I don't have the VM around anymore.

> libvirt VM?

Yes. Legacy BIOS install; with EFI it works correctly.

> any special config on the kernel command line?

Just "nomodeset"

Comment 16 George R. Goffe 2022-03-25 18:22:33 UTC
Javier,

Thanks for responding to this bug.

Message garbage still appears. Current kernel 5.17.0-128.fc37.x86_64, Fedora Core 37. 


The key to causing this problem is removing "quiet" from the grub menu entry. As far as I know, "nomodeset" has NO effect.

Best regards,

George...

Comment 17 George R. Goffe 2022-04-12 14:25:44 UTC
Javier,

Can I get a "current" status of this bug report please?

To reiterate the way to cause this "bug" to appear: nomodeset is added to the kernel command line in grub AND quiet is removed.

The problem STILL exists with nomodeset added AND quiet removed... Adding quiet back causes the problem to disappear.

Best regards,

George...

"Current" status of this system:
fc37-bash 5.1 ~# uname -r
5.18.0-0.rc1.20220408git1831fed559732b1.20.fc37.x86_64

fc37-bash 5.1 ~# rpm -q libvirt
libvirt-8.2.0-1.fc37.x86_64

fc37-bash 5.1 ~# rpm -q systemd
systemd-250.4-2.fc37.x86_64

Comment 18 George R. Goffe 2022-07-29 19:14:59 UTC
Howdy,

It's been some time since there was a response to this bug. Could I get a status please?

The bug appears on "native" systems and "virtual" systems... All affected systems are FC37 and are generally FULLY upgraded from the repositories.

Best regards,

George...

Comment 19 Lawrence Lagerlof 2022-07-29 23:49:40 UTC
This bug is happening to me on a real machine. My PC has a legacy BIOS.

Comment 20 Ben Cotton 2022-08-09 13:12:27 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 37 development cycle.
Changing version to 37.

Comment 21 David Tardon 2022-08-25 12:31:39 UTC
I can easily reproduce this with a freshly-installed F-36 in virt-manager (that is, QEMU/KVM). All that's needed is to remove "quiet" from the kernel cmdline and add "rd.break=cmdline" (to stop the boot). Moreover, when I scroll back, output that had been fine before is garbled too now (e.g., the "Welcome to ..." line in the attached screenshot). My conclusion: this is something in the video driver.
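For anyone wanting to script this reproduction, the command line change can be sketched as a small helper that drops some parameters and appends others. The function name edit_cmdline and the sample command line in the usage comment are hypothetical; the helper only rewrites a string and does not touch the bootloader.

```shell
#!/bin/sh
# Print a kernel command line with some parameters dropped and others added.
#   $1: current command line
#   $2: space-separated parameters to drop (exact word match)
#   $3: space-separated parameters to append
# Relies on plain word splitting, so it assumes the command line contains
# no shell glob characters and no quoted values with spaces.
edit_cmdline() {
    result=""
    for word in $1; do
        keep=yes
        for drop in $2; do
            if [ "$word" = "$drop" ]; then
                keep=no
            fi
        done
        if [ "$keep" = yes ]; then
            result="${result:+$result }$word"
        fi
    done
    for add in $3; do
        result="${result:+$result }$add"
    done
    echo "$result"
}

# Usage with a made-up command line:
#   edit_cmdline "root=/dev/vda1 ro rhgb quiet" "quiet" "rd.break=cmdline"
# To apply such a change persistently on Fedora, grubby is one option
# (run as root; not exercised here):
#   grubby --update-kernel=ALL --remove-args="quiet" --args="rd.break=cmdline"
```

The same helper covers the other recipe in this thread (drop "quiet", add "nomodeset") by changing the second and third arguments.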

Comment 22 David Tardon 2022-08-25 12:33:08 UTC
Created attachment 1907544
a screenshot of garbled text that's been right before

Comment 23 David Tardon 2022-08-25 12:34:54 UTC
Btw, I see this in RHEL-9 too (but not in RHEL-8).

