Bug 1881912
Summary: | Guest hit blue screen during booting phrase with virtio-vga device | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | xianwang <xianwang>
Component: | qemu-kvm | Assignee: | Gerd Hoffmann <kraxel>
qemu-kvm sub component: | Graphics | QA Contact: | bfu <bfu>
Status: | CLOSED CURRENTRELEASE | Docs Contact: |
Severity: | medium | |
Priority: | medium | CC: | areis, bfu, dgibson, gerd.hoffmann, kraxel, mdeng, ngu, qzhang, virt-maint, xuma, yihyu, zhenyzha, zhguo
Version: | 8.3 | |
Target Milestone: | rc | |
Target Release: | 8.4 | |
Hardware: | ppc64le | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | qemu 5.2 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-05-26 07:40:05 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Created attachment 1716001 [details]
screen2
Hi, zhguo,
Could you help to confirm, does x86_64 hit this issue?

1) I have tested this scenario on the following build and the issue exists there as well, so this is not a regression.

Host:
[root@ibm-p9b-20 home]# uname -r
4.18.0-238.el8.ppc64le
qemu-kvm-4.2.0-29.module+el8.2.1+7990+27f1e480.4.ppc64le
SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd.noarch

Guest:
4.18.0-193.15.1.el8_2.ppc64le

2) When the blue screen appears, the console output is:
...........
[ 2.212966] sd 2:0:0:0: [sdd] Attached SCSI disk
[ 2.229651] usb 2-2: New USB device found, idVendor=0627, idProduct=0001, bcdDevice= 0.00
[ 2.229791] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=5
[ 2.229868] usb 2-2: Product: QEMU USB Mouse
[ 2.229911] usb 2-2: Manufacturer: QEMU
[ 2.229943] usb 2-2: SerialNumber: 42
[ 2.231090] input: QEMU QEMU USB Mouse as /devices/pci0000:00/0000:00:05.0/usb2/2-2/2-2:1.0/input/input1
[ 2.233158] hid-generic 0003:0627:0001.0002: input,hidraw1: USB HID v0.01 Mouse [QEMU QEMU USB Mouse] on usb-0000:00:05.0-2/input0
[ 2.359098] usb 2-3: new high-speed USB device number 4 using xhci_hcd
[ 2.509563] usb 2-3: New USB device found, idVendor=0627, idProduct=0001, bcdDevice= 0.00
[ 2.509637] usb 2-3: New USB device strings: Mfr=1, Product=4, SerialNumber=5
[ 2.509701] usb 2-3: Product: QEMU USB Keyboard
[ 2.509745] usb 2-3: Manufacturer: QEMU
[ 2.509783] usb 2-3: SerialNumber: 42
[ 2.510837] input: QEMU QEMU USB Keyboard as /devices/pci0000:00/0000:00:05.0/usb2/2-3/2-3:1.0/input/input2
.............

Both remote-viewer vnc://10.16.214.112:5911 and vncviewer 10.16.214.112:5911 show the blue screen.

To clarify: after this, the guest continues to boot normally, so this blue screen is literally just a blue background, it's not a crash like the famous "windows blue screen of death", correct? It seems to be a result of some harmless display initialization.

Gerd: what do you think?
(In reply to xianwang from comment #2)
> Hi, zhguo,
> Could you help to confirm, does x86_64 hit this issue?

Not able to reproduce this issue against qemu-kvm-5.1.0-9.module+el8.3.0+8182+ac9ced32.x86_64
VM used is rhel 8.3 VM with kernel 4.18.0-239.el8.x86_64

(In reply to Ademar Reis from comment #5)
> To clarify: after this, the guest continues to boot normally, so this blue
> screen is literally just a blue background, it's not a crash like the famous
> "windows blue screen of death", correct? It seems to be a result of some
> harmless display initialization.
>
> Gerd: what do you think?

Yes, the guest continues to boot normally; it just displays a flickering blue background. I still think it is an issue, though: something must be wrong with the display.

Byteorder mismatch. Most likely the vga runs the framebuffer in bigendian mode (b/c ppc is bigendian traditionally, and I think the SLOF firmware runs in big endian mode still), whereas linux (offb) assumes the boot framebuffer is in little endian.

As soon as the virtio-gpu drm driver loads and takes over the display from offb everything is fine, so this is more a cosmetic glitch, not something serious.

So, the interesting question is why this happens with virtio-vga but doesn't with stdvga. The vga part of the virtio-vga is fully compatible with the stdvga, so there is no reason for this to happen. Probably there is a quirk somewhere in place (SLOF? offb?) which is active for stdvga but isn't for virtio-vga ...

I guess this is one for the ppc experts to look at.

> Most likely the vga runs the framebuffer in bigendian mode (b/c ppc is
> bigendian traditionally, and I think the SLOF firmware runs in big endian
> mode still), whereas linux (offb) assumes the boot framebuffer is in little
> endian.
Uh, no, other way around. virtio-vga runs in little endian b/c it doesn't
implement the big-endian-framebuffer qom property used by pseries machine
type to switch framebuffer endian-ness. So something to fix on the qemu
side not inside the guest.
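To illustrate why a framebuffer byte-order mismatch shows up as wrong colours rather than a crash, here is a minimal sketch. It assumes 32-bit XRGB8888 pixels, with the guest side writing big-endian values while the device scans the same bytes out as little-endian. The function names are hypothetical, and this is not QEMU code; it does not try to reproduce the exact colour seen in the attachments.

```python
import struct


def scanned_out_pixel(pixel: int) -> int:
    """Write a 32-bit pixel big-endian, then read the same bytes little-endian."""
    raw = struct.pack(">I", pixel & 0xFFFFFFFF)   # bytes as the guest stores them (assumed)
    return struct.unpack("<I", raw)[0]            # value as the device interprets them (assumed)


def channels(pixel: int) -> tuple:
    """Split an XRGB8888 value into (red, green, blue)."""
    return ((pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF)


for name, value in [("red", 0x00FF0000), ("green", 0x0000FF00), ("blue", 0x000000FF)]:
    seen = scanned_out_pixel(value)
    print(f"{name:5} written 0x{value:08X} -> scanned 0x{seen:08X} -> RGB {channels(seen)}")
```

Every pixel keeps its bytes, but the channels land in the wrong positions, so the image is recoloured while the guest itself keeps running.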
https://patchwork.ozlabs.org/project/qemu-devel/patch/20200928085335.21961-2-kraxel@redhat.com/
https://patchwork.ozlabs.org/project/qemu-devel/patch/20200928085335.21961-3-kraxel@redhat.com/

Re-verify this bug with:
host kernel: 4.18.0-252.el8.ppc64le
guest kernel: 4.18.0-252.el8.ppc64le
qemu version: qemu-kvm-5.2.0-0.scrmod+el8.4.0+8862+2dc743cb.wrb201125.ppc64le

Result:
Guest boot successfully without any blue screen.

Re-verify this bug with:
host kernel: 4.18.0-262.el8.ppc64le
guest kernel: 4.18.0-262.el8.ppc64le
qemu version: qemu-img-5.2.0-2.module+el8.4.0+9186+ec44380f.ppc64le

Step: same with description

Result:
Guest boot successfully with black screen after slof finished login
@kraxel

Created attachment 1740836 [details]
black screen after slof
(In reply to bfu from comment #14)
> Re-verify this bug with:
> host kernel: 4.18.0-262.el8.ppc64le
> guest kernel: 4.18.0-262.el8.ppc64le
> qemu version: qemu-img-5.2.0-2.module+el8.4.0+9186+ec44380f.ppc64le
>
> Step: same with description
>
> Result:
> Guest boot successfully with black screen after slof finished login
> @kraxel

Hmm, unrelated guest issue maybe?
Can you retest with known-good RHEL-8.3 in the guest?
It's a host bug, so the guest version should not matter.

(the patches did land upstream in 5.2, so 5.2 builds should work).

(In reply to Guo, Zhiyi from comment #6)
> (In reply to xianwang from comment #2)
> > Hi, zhguo,
> > Could you help to confirm, does x86_64 hit this issue?
>
> Not able to reproduce this issue against
> qemu-kvm-5.1.0-9.module+el8.3.0+8182+ac9ced32.x86_64
> VM used is rhel 8.3 VM with kernel 4.18.0-239.el8.x86_64

Referring to this comment, update hardware to ppc64le.

(In reply to Gerd Hoffmann from comment #16)
> (In reply to bfu from comment #14)
> > Re-verify this bug with:
> > host kernel: 4.18.0-262.el8.ppc64le
> > guest kernel: 4.18.0-262.el8.ppc64le
> > qemu version: qemu-img-5.2.0-2.module+el8.4.0+9186+ec44380f.ppc64le
> >
> > Step: same with description
> >
> > Result:
> > Guest boot successfully with black screen after slof finished login
> > @kraxel
>
> Hmm, unrelated guest issue maybe?
> Can you retest with known-good RHEL-8.3 in the guest?
> It's a host bug, so the guest version should not matter.
>
> (the patches did land upstream in 5.2, so 5.2 builds should work).

Hi Gerd, sorry for the late reply, but I'm curious about your reply, are you telling me to retest with the same host kernel and qemu version but known-good RHEL.8.3.0 guest?
Since this bug is a qemu bug I think it's not related to the kernel version and I've retested with the newest version and it seems ok to me
host kernel: 4.18.0-275.el8.ppc64le
guest kernel: 4.18.0-275.el8.ppc64le
qemu version: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le

> > Hmm, unrelated guest issue maybe?
> > Can you retest with known-good RHEL-8.3 in the guest?
> > It's a host bug, so the guest version should not matter.
> >
> > (the patches did land upstream in 5.2, so 5.2 builds should work).
>
> Hi Gerd, sorry for the late reply, but I'm curious about your reply, are you
> telling me to retest with the same host kernel and qemu version but
> known-good RHEL.8.3.0 guest?

Yes.

We don't yet have 8.4 composes tested & approved by rel-eng yet.
When picking a random nightly compose you might get one which is
broken for unrelated reasons. So just using 8.3 is the easy way
to make sure you have a working compose.

If you have a nightly where you know it installs fine on ppc you
can use that too.

> Since this bug is a qemu bug I think it's not related to the kernel version

Correct.

> and I've retested with the newest version and it seems ok to me
> host kernel: 4.18.0-275.el8.ppc64le
> guest kernel: 4.18.0-275.el8.ppc64le
> qemu version: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le

Thanks.

(In reply to Gerd Hoffmann from comment #19)
> > > Hmm, unrelated guest issue maybe?
> > > Can you retest with known-good RHEL-8.3 in the guest?
> > > It's a host bug, so the guest version should not matter.
> > >
> > > (the patches did land upstream in 5.2, so 5.2 builds should work).
> >
> > Hi Gerd, sorry for the late reply, but I'm curious about your reply, are you
> > telling me to retest with the same host kernel and qemu version but
> > known-good RHEL.8.3.0 guest?
>
> Yes.
>
> We don't yet have 8.4 composes tested & approved by rel-eng yet.
> When picking a random nightly compose you might get one which is
> broken for unrelated reasons.
> So just using 8.3 is the easy way
> to make sure you have a working compose.
>
> If you have a nightly where you know it installs fine on ppc you
> can use that too.
>
> > Since this bug is a qemu bug I think it's not related to the kernel version
>
> Correct.
>
> > and I've retested with the newest version and it seems ok to me
> > host kernel: 4.18.0-275.el8.ppc64le
> > guest kernel: 4.18.0-275.el8.ppc64le
> > qemu version: qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le
>
> Thanks.

Hi Gerd,
Thanks for your reply, cause this bug was in "assigned" status for a really long time, If you sure about the patch was merged, I'm curious about why you don't turn the status into "modified" or "ON_QA"? If you do be sure about this bug was fixed in the newest qemu package, I could turn it to "close current release", just wanna make sure there's no risk that users would meet it

> Hi Gerd,
> Thanks for your reply, cause this bug was in "assigned" status for a really
> long time, If you sure about the patch was merged, I'm curious about why you
> don't turn the status into "modified" or "ON_QA"? If you do be sure about
> this bug was fixed in the newest qemu package, I could turn it to "close
> current release", just wanna make sure there's no risk that users would meet
> it
The usual process is to set the "Fixed In Version" field and the status to POST;
those bugs are then handled automatically on rebase.
I forgot to do that, did it now.
Not fully sure this actually works if the rebase already happened, @areis?
@kraxel, cause the internal target release was not set, I could not set the ITM

verified this bug with:
host kernel: 4.18.0-280.el8.ppc64le
guest kernel: 4.18.0-280.el8.ppc64le
qemu version: qemu-img-5.2.0-4.module+el8.4.0+9676+589043b9.ppc64le

Test result:
guest could boot successfully without black or green screen display

(In reply to bfu from comment #25)
> @kraxel, cause the internal target release was not set, I could
> not set the ITM

areis did that meanwhile (so clearing the needinfo).

Hi,

Since the issue described in this bug should be resolved (VERIFIED), could you please close this bug with resolution 'CURRENTRELEASE' if this bug got fixed ?

If the fix for this is not released yet, check if this will ever get fixed. In case of a negative answer then please change it as WONTFIX.

If there's anything else to be done on this BZ, if it's still active, not released yet and we actually intend to release it, then please ignore my message.

Please note: for those bugs which are not included in errata, please add 'TestOnly' keyword, and those bugs with 'TestOnly' keyword will be closed automatically after GA.

TestOnly: Use this when there is no code delivery involved, or for use when code is already upstream and will be incorporated automatically to the next release for testing purposes only.

Thank you.

> Since the issue described in this bug should be resolved (VERIFIED), could
> you please close this bug with resolution 'CURRENTRELEASE' if this bug got
> fixed ?
Done.
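The thread states that the patches landed upstream in qemu 5.2 and that 5.2-based builds verified clean. As a rough triage aid, the sketch below extracts the upstream version from a package string like the ones quoted above and compares it with 5.2. The helper names are hypothetical, and the check is only a heuristic: it assumes no downstream backport into older builds and says nothing about unrelated regressions.

```python
import re


def qemu_version(package: str) -> tuple:
    """Return (major, minor, patch) from e.g. 'qemu-kvm-5.2.0-3.module+el8.4.0+...'."""
    match = re.search(r"-(\d+)\.(\d+)\.(\d+)-", package)
    if match is None:
        raise ValueError(f"no version found in {package!r}")
    return tuple(int(part) for part in match.groups())


def likely_has_fix(package: str) -> bool:
    """Heuristic: builds based on upstream 5.2 or later should carry the fix."""
    return qemu_version(package) >= (5, 2, 0)


for pkg in (
    "qemu-kvm-5.1.0-8.module+el8.3.0+8141+3cd9cd43.ppc64le",
    "qemu-kvm-5.2.0-3.module+el8.4.0+9499+42e58f08.ppc64le",
):
    print(pkg, "->", "likely fixed" if likely_has_fix(pkg) else "likely affected")
```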
Created attachment 1715999 [details]
screen1

Description of problem:
Boot a guest with a virtio-vga device; while the kernel is loading, the guest shows a blue screen.

Version-Release number of selected component (if applicable):
Host:
4.18.0-238.el8.ppc64le
qemu-kvm-5.1.0-8.module+el8.3.0+8141+3cd9cd43.ppc64le
SLOF-20200717-1.gite18ddad8.module+el8.3.0+7638+07cf13d2.noarch
Guest:
4.18.0-240.el8.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with qemu cli:
[root@ibm-p9b-20 home]# cat test.sh
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine pseries \
-nodefaults \
-device virtio-vga,id=video0,max_outputs=1,bus=pci.0,addr=0x1 \
-m 4096 \
-smp 80,maxcpus=80,cores=40,threads=1,sockets=2 \
-cpu 'host' \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 \
-chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
-device virtserialport,bus=virtio-serial0.0,chardev=qga0,name=org.qemu.guest_agent.0 \
-chardev socket,path=/var/tmp/monitor-qmpmonitor1,nowait,id=qmp_id_qmpmonitor1,server \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,path=/var/tmp/serial-serial0,nowait,id=chardev_serial0,server \
-device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \
-device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device usb-kbd,id=usb-kbd1,bus=usb1.0,port=2 \
-device usb-mouse,id=usb-mouse1,bus=usb1.0,port=3 \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
-blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/xianwang/rhel830-ppc64le-virtio-scsi_p9.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device virtio-net-pci,mac=9a:58:80:27:08:7c,id=idji2KPU,netdev=idYBxx2l,bus=pci.0,addr=0x5 \
-netdev tap,id=idYBxx2l,vhost=on \
-vnc :11 \
-rtc base=utc,clock=host \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-qmp tcp:0:8881,server,nowait \
-monitor stdio
2.
3.

Actual results:
During the guest boot phase a blue screen is displayed, but the guest boots successfully. See the attachment.

Expected results:
Guest boots successfully without any blue screen.

Additional info:
Guest boots successfully and without a blue screen with "-device VGA,id=vga1,bus=pci.0,addr=0x1"
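The command line above exposes a QMP monitor on TCP port 8881 and gives the display device the id video0. A hedged sketch of asking a running guest whether that device exposes the big-endian-framebuffer property named in the thread follows. The helper names are hypothetical; the QOM path assumes the usual /machine/peripheral/<id> location for devices created with an id, and the sketch assumes a device without the property answers with an "error" reply. It has not been run against the builds listed here.

```python
import json
import socket


def qom_get_command(device_id: str, prop: str = "big-endian-framebuffer") -> dict:
    """Build a QMP qom-get request for a device created with the given id."""
    return {
        "execute": "qom-get",
        "arguments": {"path": f"/machine/peripheral/{device_id}", "property": prop},
    }


def query_property(host: str, port: int, device_id: str) -> dict:
    """Connect to a QMP TCP monitor, negotiate capabilities, return the qom-get reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # QMP greeting sent by the server
        reply = {}
        for command in ({"execute": "qmp_capabilities"}, qom_get_command(device_id)):
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            reply = json.loads(stream.readline())
            while "event" in reply:  # skip asynchronous events
                reply = json.loads(stream.readline())
        return reply


# Example (needs a running guest started with the command line above):
#   print(query_property("127.0.0.1", 8881, "video0"))
```

A reply containing "return" would carry the property value, while a reply containing "error" would suggest the property is missing from that build's device.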