Over the last few months I've seen openQA tests occasionally fail because the console display in the VM got messed up. What seems to happen is that when the test does a VT switch, the display gets corrupted: instead of the display being cleared to black and the login prompt of the new VT showing up, the current contents of the display either aren't cleared at all or are partially garbled, and the login prompt of the new VT is drawn over the existing screen content. I'll attach screenshots and videos of this happening. The tests run under qemu with qxl as the video adapter and VNC (not SPICE) as the server the controller process uses to 'see' the video from the VM. The openQA worker boxes currently run F28 and are periodically updated to latest stable. The earliest occurrence of this problem I've found so far was on 2018-07-31, though I haven't looked comprehensively, so there *may* be an earlier one. The openQA worker boxes were upgraded from F27 to F28 on 2018-07-12, which brought in an update to qemu 2.11.2 and a kernel update from 4.16.17-200.fc27 to 4.17.4-200.fc28.x86_64. I don't see any other relevant update between 2018-07-12 and 2018-07-31 (there was a kernel update, but the system wasn't rebooted, so it didn't take effect until much later).
Created attachment 1493100: screenshot after the bug happened (case where pre-existing content is corrupted but not fully cleared)
Created attachment 1493101: screenshot after the bug happened (case where new VT contents seem to be just drawn over previous VT contents without clear or corruption)
Videos are too large to attach, but can be found at the following URLs for a while, at least until they get garbage-collected.
Corruption case: https://openqa.fedoraproject.org/tests/292996/file/video.ogv (bug happens around 2:39)
No corruption, no clear case: https://openqa.fedoraproject.org/tests/279262/file/video.ogv (bug happens around 2:02)
CCing some spice+graphics folks. Gerd, does this type of graphical corruption narrow it down in any way?
Hmm, never seen this before. Host kernel should not matter. Bug might be in qemu, or spice (unlikely though), or guest kernel. Does it happen with all guests?
It's happening on F27, F28, F29 and Rawhide tests, yeah, and it doesn't seem like it appeared first for Rawhide, then 29, then 28, then 27 (as you'd sort of expect if it was a guest-side issue). I'm going to test switching staging to 'std' instead of 'qxl' graphics and see if it seems like this bug stops happening. As it's an intermittent bug, I'll need a few days' worth of data to be sure whether that changes things.
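For reference, the qxl vs. 'std' comparison described above comes down to changing one display option on the qemu command line. A minimal sketch, assuming a hypothetical helper `qemu_display_args` (not part of openQA) that builds only the display-related arguments; `-vga` and `-vnc` are standard qemu options, and the VNC display number is an arbitrary choice for illustration:

```python
from typing import List


def qemu_display_args(vga: str, vnc_display: int = 1) -> List[str]:
    """Build the display-related qemu arguments for one test run.

    Hypothetical helper for illustration; openQA assembles its own
    qemu command line and may do so differently.
    """
    if vga not in ("qxl", "std"):
        raise ValueError(f"unsupported adapter: {vga!r}")
    # -vga selects the emulated video adapter; -vnc :N exposes the
    # guest display on VNC display N (TCP port 5900 + N).
    return ["-vga", vga, "-vnc", f":{vnc_display}"]


# Compare the two configurations under test.
for adapter in ("qxl", "std"):
    print(adapter, qemu_display_args(adapter))
```

Everything else about the VM stays the same between the two runs, so any change in how often the corruption appears can be attributed to the adapter with reasonable (though, for an intermittent bug, not absolute) confidence.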
This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
(In reply to Adam Williamson from comment #6)
> It's happening on F27, F28, F29 and Rawhide tests, yeah, and it doesn't seem
> like it appeared first for Rawhide, then 29, then 28, then 27 (as you'd sort
> of expect if it was a guest-side issue).

Any change when running a 5.1 guest kernel?
Seems I did indeed switch openQA to 'std' graphics for almost all cases at some point (probably in response to this), and indeed haven't been having this problem since doing so. So...I don't know. Sorry :/ I've run into so many different graphics issues with openQA tests at this point I keep forgetting what I set to what to avoid what...I could try setting staging back to qxl for a bit to see if this is still happening, I guess.
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
FWIW, we just ran into a showstopper with std: https://bugzilla.redhat.com/show_bug.cgi?id=1732113 so I'm gonna switch back to qxl and see how that goes. Guess we'll find out if this bug is still happening too.
Huh, well, one immediate result of switching back to qxl was this: https://openqa.stg.fedoraproject.org/tests/574574#step/_console_wait_login/8 (note the way the bootsplash hasn't cleared properly). Looks a bit like a similar bug we ran into when we tried virtio, actually: https://bugzilla.redhat.com/show_bug.cgi?id=1403365
Hmm, doesn't reproduce on a quick try. How does openqa generate the screenshots?
Oh, I should mention, it doesn't happen all the time, in fact it seems pretty rare (haven't spotted another case since then). openQA gets the screenshots from the VNC stream, I think.
(In reply to Adam Williamson from comment #14)
> Oh, I should mention, it doesn't happen all the time, in fact it seems
> pretty rare (haven't spotted another case since then).
>
> openQA gets the screenshots from the VNC stream, I think.

The screenshot looks like it could be a vnc problem. A reliable reproducer would be very helpful to pin it down though.
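As a starting point for a reproducer, one option is to hammer VT switches inside the guest while watching the VNC stream. The sketch below is only an assumption about what might trigger the bug: it uses the `chvt` utility (from the kbd package, run as root in the guest) rather than whatever mechanism openQA itself uses to switch VTs, and the helper names, VT numbers, cycle count and delay are all made up for illustration.

```python
import subprocess
import time
from typing import Callable, Iterator, List, Sequence


def vt_switch_commands(vts: Sequence[int], cycles: int) -> Iterator[List[str]]:
    """Yield one `chvt N` command per switch, cycling through `vts`."""
    for _ in range(cycles):
        for vt in vts:
            yield ["chvt", str(vt)]


def run_switches(
    vts: Sequence[int] = (1, 2),
    cycles: int = 50,
    delay: float = 1.0,
    runner: Callable = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Switch VTs repeatedly, pausing between switches.

    `runner` and `sleep` are injectable so the loop can be exercised
    without touching a real console. Returns the number of switches.
    """
    count = 0
    for cmd in vt_switch_commands(vts, cycles):
        # check=True makes a failing chvt raise instead of passing silently.
        runner(cmd, check=True)
        # Leave time for the display to settle (and to be inspected)
        # before the next switch.
        sleep(delay)
        count += 1
    return count
```

Calling `run_switches()` as root in the guest would alternate between VT 1 and VT 2 fifty times. Since the bug is rare, it may need many more cycles, or a shorter delay, before anything shows up, and there is no guarantee this path hits the same code as the openQA tests do.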