Bug 1881912 - Guest hits blue screen during boot phase with virtio-vga device
Summary: Guest hits blue screen during boot phase with virtio-vga device
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.3
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.4
Assignee: Gerd Hoffmann
QA Contact: bfu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-23 11:17 UTC by xianwang
Modified: 2021-05-26 07:40 UTC
CC List: 13 users

Fixed In Version: qemu 5.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-26 07:40:05 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
screen1 (39.17 KB, image/png), 2020-09-23 11:17 UTC, xianwang
screen2 (41.79 KB, image/png), 2020-09-23 11:21 UTC, xianwang
black screen after slof (15.40 KB, image/png), 2020-12-21 03:46 UTC, bfu

Description xianwang 2020-09-23 11:17:30 UTC
Created attachment 1715999 [details]
screen1

Description of problem:
Boot a guest with a virtio-vga device. While the kernel is loading, the guest shows a blue screen.

Version-Release number of selected component (if applicable):
Host:
4.18.0-238.el8.ppc64le
qemu-kvm-5.1.0-8.module+el8.3.0+8141+3cd9cd43.ppc64le
SLOF-20200717-1.gite18ddad8.module+el8.3.0+7638+07cf13d2.noarch

Guest:
4.18.0-240.el8.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with the following qemu command line:
[root@ibm-p9b-20 home]# cat test.sh 
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-machine pseries  \
-nodefaults \
-device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x1 \
-m 4096  \
-smp 80,maxcpus=80,cores=40,threads=1,sockets=2  \
-cpu 'host' \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 \
-chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
-device virtserialport,bus=virtio-serial0.0,chardev=qga0,name=org.qemu.guest_agent.0 \
-chardev socket,path=/var/tmp/monitor-qmpmonitor1,nowait,id=qmp_id_qmpmonitor1,server  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,path=/var/tmp/serial-serial0,nowait,id=chardev_serial0,server \
-device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \
-device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device usb-kbd,id=usb-kbd1,bus=usb1.0,port=2 \
-device usb-mouse,id=usb-mouse1,bus=usb1.0,port=3 \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/xianwang/rhel830-ppc64le-virtio-scsi_p9.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device virtio-net-pci,mac=9a:58:80:27:08:7c,id=idji2KPU,netdev=idYBxx2l,bus=pci.0,addr=0x5  \
-netdev tap,id=idYBxx2l,vhost=on \
-vnc :11  \
-rtc base=utc,clock=host  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-qmp tcp:0:8881,server,nowait \
-monitor stdio

Actual results:
During the guest boot phase, the guest shows a blue screen, but it boots successfully.
See the attachments.

Expected results:
The guest boots successfully without any blue screen.

Additional info:
The guest boots successfully and without a blue screen when "-device VGA,id=vga1,bus=pci.0,addr=0x1" is used instead.
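
To compare the two display devices side by side, the failing command line can be turned into a control run by swapping only the display device. The Python sketch below shows one way to do that; the helper name swap_display_device and the shortened argument list are hypothetical and only stand in for the full reproducer above.

```python
# Device strings taken from this report.
VIRTIO_VGA = "virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x1"
STD_VGA = "VGA,id=vga1,bus=pci.0,addr=0x1"


def swap_display_device(args, old=VIRTIO_VGA, new=STD_VGA):
    """Return a copy of a qemu argument list with the display device replaced."""
    swapped = list(args)
    for i in range(len(swapped) - 1):
        # Only touch the value that directly follows a "-device" flag.
        if swapped[i] == "-device" and swapped[i + 1] == old:
            swapped[i + 1] = new
    return swapped


# Shortened argument list (assumption: the remaining options stay unchanged).
reproducer = [
    "/usr/libexec/qemu-kvm", "-machine", "pseries", "-nodefaults",
    "-device", VIRTIO_VGA, "-m", "4096", "-vnc", ":11",
]
control = swap_display_device(reproducer)
```

Running both lists against the same guest image keeps everything but the display device constant, which is the comparison described above.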

Comment 1 xianwang 2020-09-23 11:21:21 UTC
Created attachment 1716001 [details]
screen2

Comment 2 xianwang 2020-09-23 11:23:39 UTC
Hi, zhguo,
Could you help to confirm, does x86_64 hit this issue?

Comment 3 xianwang 2020-09-23 11:45:48 UTC
1) I have tested this scenario on the following build. The issue also exists there, so this is not a regression.
Host:
[root@ibm-p9b-20 home]# uname -r
4.18.0-238.el8.ppc64le
qemu-kvm-4.2.0-29.module+el8.2.1+7990+27f1e480.4.ppc64le
SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd.noarch

Guest:
4.18.0-193.15.1.el8_2.ppc64le

2) When the blue screen appears, the console output is:
...........
[    2.212966] sd 2:0:0:0: [sdd] Attached SCSI disk
[    2.229651] usb 2-2: New USB device found, idVendor=0627, idProduct=0001, bcdDevice= 0.00
[    2.229791] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=5
[    2.229868] usb 2-2: Product: QEMU USB Mouse
[    2.229911] usb 2-2: Manufacturer: QEMU
[    2.229943] usb 2-2: SerialNumber: 42
[    2.231090] input: QEMU QEMU USB Mouse as /devices/pci0000:00/0000:00:05.0/usb2/2-2/2-2:1.0/input/input1
[    2.233158] hid-generic 0003:0627:0001.0002: input,hidraw1: USB HID v0.01 Mouse [QEMU QEMU USB Mouse] on usb-0000:00:05.0-2/input0

[    2.359098] usb 2-3: new high-speed USB device number 4 using xhci_hcd

[    2.509563] usb 2-3: New USB device found, idVendor=0627, idProduct=0001, bcdDevice= 0.00
[    2.509637] usb 2-3: New USB device strings: Mfr=1, Product=4, SerialNumber=5
[    2.509701] usb 2-3: Product: QEMU USB Keyboard
[    2.509745] usb 2-3: Manufacturer: QEMU
[    2.509783] usb 2-3: SerialNumber: 42
[    2.510837] input: QEMU QEMU USB Keyboard as /devices/pci0000:00/0000:00:05.0/usb2/2-3/2-3:1.0/input/input2
.............

Comment 4 xianwang 2020-09-23 11:48:21 UTC
Both remote-viewer vnc://10.16.214.112:5911 and vncviewer 10.16.214.112:5911 show the blue screen.

Comment 5 Ademar Reis 2020-09-24 18:45:17 UTC
To clarify: after this, the guest continues to boot normally, so this blue screen is literally just a blue background, it's not a crash like the famous "windows blue screen of death", correct? It seems to be a result of some harmless display initialization.

Gerd: what do you think?

Comment 6 Guo, Zhiyi 2020-09-24 23:02:25 UTC
(In reply to xianwang from comment #2)
> Hi, zhguo,
> Could you help to confirm, does x86_64 hit this issue?

Not able to reproduce this issue against qemu-kvm-5.1.0-9.module+el8.3.0+8182+ac9ced32.x86_64
VM used is rhel 8.3 VM with kernel 4.18.0-239.el8.x86_64

Comment 7 xianwang 2020-09-25 08:12:58 UTC
(In reply to Ademar Reis from comment #5)
> To clarify: after this, the guest continues to boot normally, so this blue
> screen is literally just a blue background, it's not a crash like the famous
> "windows blue screen of death", correct? It seems to be a result of some
> harmless display initialization.
> 
> Gerd: what do you think?

Yes, the guest continues to boot normally; it just displays a flickering blue background. I still think it is an issue, though: something must be wrong with the display.

Comment 8 Gerd Hoffmann 2020-09-25 10:26:57 UTC
Byteorder mismatch.

Most likely the vga runs the framebuffer in bigendian mode (b/c ppc is bigendian traditionally, and I think the SLOF firmware runs in big endian mode still), whereas linux (offb) assumes the boot framebuffer is in little endian.  As soon as the virtio-gpu drm driver loads and takes over the display from offb everything is fine, so this is more a cosmetic glitch, not something serious.

So, the interesting question is why this happens with virtio-vga but doesn't with stdvga.  The vga part of the virtio-vga is fully compatible with the stdvga, so there is no reason for this to happen.  Probably there is a quirk somewhere in place (SLOF? offb?) which is active for stdvga but isn't for virtio-vga ...

I guess this is one for the ppc experts to look at.

Comment 9 Gerd Hoffmann 2020-09-25 14:28:53 UTC
> Most likely the vga runs the framebuffer in bigendian mode (b/c ppc is
> bigendian traditionally, and I think the SLOF firmware runs in big endian
> mode still), whereas linux (offb) assumes the boot framebuffer is in little
> endian.

Uh, no, other way around.  virtio-vga runs in little endian b/c it doesn't
implement the big-endian-framebuffer qom property used by pseries machine
type to switch framebuffer endian-ness.  So something to fix on the qemu
side not inside the guest.
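
As a rough illustration of how a byte-order mismatch can turn a dark boot screen blue, the Python sketch below packs one 32-bit pixel in big-endian order (as a guest expecting a big-endian framebuffer would) and decodes the same bytes as little-endian (as the device would). The XRGB8888 layout and the 0xFF padding byte are assumptions for illustration; this thread does not state the actual framebuffer format.

```python
import struct


def guest_writes_big_endian(x, r, g, b):
    """Lay out an XRGB8888 pixel in memory the way a big-endian writer would."""
    return struct.pack(">I", (x << 24) | (r << 16) | (g << 8) | b)


def device_reads_little_endian(raw):
    """Decode the same four bytes as a little-endian XRGB8888 pixel."""
    (value,) = struct.unpack("<I", raw)
    return (
        (value >> 24) & 0xFF,  # padding
        (value >> 16) & 0xFF,  # red
        (value >> 8) & 0xFF,   # green
        value & 0xFF,          # blue
    )


# Assumed example: a black pixel whose padding byte is set to 0xFF.
raw = guest_writes_big_endian(0xFF, 0x00, 0x00, 0x00)
seen = device_reads_little_endian(raw)
```

Under these assumptions the black pixel decodes as (0, 0, 0, 255): the padding byte lands in the blue channel, so a black background would be displayed as solid blue.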

Comment 12 bfu 2020-11-30 08:16:00 UTC
Re-verify this bug with:
host kernel: 4.18.0-252.el8.ppc64le
guest kernel: 4.18.0-252.el8.ppc64le
qemu version: qemu-kvm-5.2.0-0.scrmod+el8.4.0+8862+2dc743cb.wrb201125.ppc64le

Result:
Guest boot successfully without any blue screen.

Comment 14 bfu 2020-12-21 03:34:49 UTC
Re-verify this bug with:
host kernel: 4.18.0-262.el8.ppc64le
guest kernel: 4.18.0-262.el8.ppc64le
qemu version: qemu-img-5.2.0-2.module+el8.4.0+9186+ec44380f.ppc64le

Step: same with description

Result:
Guest boot successfully with black screen after slof finished login
@kraxel

Comment 15 bfu 2020-12-21 03:46:01 UTC
Created attachment 1740836 [details]
black screen after slof

Comment 16 Gerd Hoffmann 2021-01-11 14:13:06 UTC
(In reply to bfu from comment #14)
> Re-verify this bug with:
> host kernel: 4.18.0-262.el8.ppc64le
> guest kernel: 4.18.0-262.el8.ppc64le
> qemu version: qemu-img-5.2.0-2.module+el8.4.0+9186+ec44380f.ppc64le
> 
> Step: same with description
> 
> Result:
> Guest boot successfully with black screen after slof finished login
> @kraxel

Hmm, unrelated guest issue maybe?
Can you retest with known-good RHEL-8.3 in the guest?
It's a host bug, so the guest version should not matter.

(the patches did land upstream in 5.2, so 5.2 builds should work).
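
A quick way to tell whether a given build is based on upstream 5.2 or later is to parse the version out of the package string. The Python sketch below assumes the naming pattern of the package strings quoted in this bug; the helper names are hypothetical, and the check says nothing about fixes backported into older downstream builds.

```python
import re


def qemu_version(package):
    """Extract (major, minor, patch) from a string such as
    'qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le'."""
    match = re.match(r"^qemu-(?:kvm|img)-(\d+)\.(\d+)\.(\d+)-", package)
    if match is None:
        raise ValueError(f"unrecognised package string: {package!r}")
    return tuple(int(part) for part in match.groups())


def based_on_5_2_or_later(package):
    """True if the upstream base version is at least 5.2.0."""
    return qemu_version(package) >= (5, 2, 0)
```

For example, based_on_5_2_or_later("qemu-kvm-5.1.0-8.module+el8.3.0+8141+3cd9cd43.ppc64le") is False, while the 5.2.0 builds tested in this bug return True.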

Comment 17 xianwang 2021-01-14 01:16:34 UTC
(In reply to Guo, Zhiyi from comment #6)
> (In reply to xianwang from comment #2)
> > Hi, zhguo,
> > Could you help to confirm, does x86_64 hit this issue?
> 
> Not able to reproduce this issue against
> qemu-kvm-5.1.0-9.module+el8.3.0+8182+ac9ced32.x86_64
> VM used is rhel 8.3 VM with kernel 4.18.0-239.el8.x86_64

Referring to this comment, update hardware to ppc64le.

Comment 18 bfu 2021-01-21 08:09:47 UTC
(In reply to Gerd Hoffmann from comment #16)
> (In reply to bfu from comment #14)
> > Re-verify this bug with:
> > host kernel: 4.18.0-262.el8.ppc64le
> > guest kernel: 4.18.0-262.el8.ppc64le
> > qemu version: qemu-img-5.2.0-2.module+el8.4.0+9186+ec44380f.ppc64le
> > 
> > Step: same with description
> > 
> > Result:
> > Guest boot successfully with black screen after slof finished login
> > @kraxel
> 
> Hmm, unrelated guest issue maybe?
> Can you retest with known-good RHEL-8.3 in the guest?
> It's a host bug, so the guest version should not matter.
> 
> (the patches did land upstream in 5.2, so 5.2 builds should work).

Hi Gerd, sorry for the late reply, but I'm curious about your reply, are you telling me to retest with the same host kernel and qemu version but known-good RHEL.8.3.0 guest?
Since this bug is a qemu bug I think it's not related to the kernel version and I've retested with the newest version and it seems ok to me
host kernel: 4.18.0-275.el8.ppc64le
guest kernel: 4.18.0-275.el8.ppc64le
qemu version: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le

Comment 19 Gerd Hoffmann 2021-01-21 09:29:22 UTC
> > Hmm, unrelated guest issue maybe?
> > Can you retest with known-good RHEL-8.3 in the guest?
> > It's a host bug, so the guest version should not matter.
> > 
> > (the patches did land upstream in 5.2, so 5.2 builds should work).
> 
> Hi Gerd, sorry for the late reply, but I'm curious about your reply, are you
> telling me to retest with the same host kernel and qemu version but
> known-good RHEL.8.3.0 guest?

Yes.

We don't have 8.4 composes tested & approved by rel-eng yet.
When picking a random nightly compose you might get one which is
broken for unrelated reasons.  So just using 8.3 is the easy way
to make sure you have a working compose.

If you have a nightly where you know it installs fine on ppc you
can use that too.

> Since this bug is a qemu bug I think it's not related to the kernel version

Correct.

> and I've retested with the newest version and it seems ok to me
> host kernel: 4.18.0-275.el8.ppc64le
> guest kernel: 4.18.0-275.el8.ppc64le
> qemu version: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le

Thanks.

Comment 20 bfu 2021-01-22 03:42:36 UTC
(In reply to Gerd Hoffmann from comment #19)
> > > Hmm, unrelated guest issue maybe?
> > > Can you retest with known-good RHEL-8.3 in the guest?
> > > It's a host bug, so the guest version should not matter.
> > > 
> > > (the patches did land upstream in 5.2, so 5.2 builds should work).
> > 
> > Hi Gerd, sorry for the late reply, but I'm curious about your reply, are you
> > telling me to retest with the same host kernel and qemu version but
> > known-good RHEL.8.3.0 guest?
> 
> Yes.
> 
> We don't have 8.4 composes tested & approved by rel-eng yet.
> When picking a random nightly compose you might get one which is
> broken for unrelated reasons.  So just using 8.3 is the easy way
> to make sure you have a working compose.
> 
> If you have a nightly where you know it installs fine on ppc you
> can use that too.
> 
> > Since this bug is a qemu bug I think it's not related to the kernel version
> 
> Correct.
> 
> > and I've retested with the newest version and it seems ok to me
> > host kernel: 4.18.0-275.el8.ppc64le
> > guest kernel: 4.18.0-275.el8.ppc64le
> > qemu version: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le
> 
> Thanks.

Hi Gerd,
Thanks for your reply. This bug has been in "ASSIGNED" status for a really long time. If you are sure the patch was merged, why not move the status to "MODIFIED" or "ON_QA"? If you are sure the bug is fixed in the newest qemu package, I can close it as "CURRENTRELEASE"; I just want to make sure there is no risk that users will hit it.

Comment 21 Gerd Hoffmann 2021-01-22 09:01:15 UTC
> Hi Gerd,
> Thanks for your reply. This bug has been in "ASSIGNED" status for a really
> long time. If you are sure the patch was merged, why not move the status to
> "MODIFIED" or "ON_QA"? If you are sure the bug is fixed in the newest qemu
> package, I can close it as "CURRENTRELEASE"; I just want to make sure there
> is no risk that users will hit it.

The usual process is to set the fixed-in-version field and the status to POST;
those bugs are then handled automatically on rebase.

Forgot to do that, did it now.
Not fully sure this actually works if the rebase did happen already, @areis?

Comment 25 bfu 2021-02-01 11:02:01 UTC
@kraxel, cause the internal target release was not set, I could not set the ITM

verified this bug with:
host kernel: 4.18.0-280.el8.ppc64le
guest kernel: 4.18.0-280.el8.ppc64le
qemu version: qemu-img-5.2.0-4.module+el8.4.0+9676+589043b9.ppc64le

Test result:
guest could boot successfully without black or green screen display

Comment 27 Gerd Hoffmann 2021-02-10 11:13:35 UTC
(In reply to bfu from comment #25)
> @kraxel, cause the internal target release was not set, I could
> not set the ITM

areis did that meanwhile (so clearing the needinfo).

Comment 28 CongLi 2021-05-26 06:13:28 UTC
Hi,

Since the issue described in this bug should be resolved (VERIFIED), could you please close this bug with resolution 'CURRENTRELEASE' if this bug got fixed ?

If the fix for this is not released yet, check if this will ever get fixed. In case of a negative answer then please change it as WONTFIX.

If there's anything else to be done on this BZ, if it's still active, not released yet and we actually intend to release it, then please ignore my message.

Please note: for those bugs which are not included in errata, please add 'TestOnly' keyword, and those bugs with 'TestOnly' keyword will be closed automatically after GA.
TestOnly: Use this when there is no code delivery involved, or for use when code is already upstream and will be incorporated automatically to the next release for testing purposes only.

Thank you.

Comment 29 Gerd Hoffmann 2021-05-26 07:40:05 UTC
> Since the issue described in this bug should be resolved (VERIFIED), could
> you please close this bug with resolution 'CURRENTRELEASE' if this bug got
> fixed ?

Done.

