Bug 1335830 - Kexecing RHEL7 into RHEL6 fails with CIRRUS video type (KVM/QEMU)
Summary: Kexecing RHEL7 into RHEL6 fails with CIRRUS video type (KVM/QEMU)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kexec-tools
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: pre-dev-freeze
Target Release: 7.4
Assignee: Pingfan Liu
QA Contact: Emma Wu
URL:
Whiteboard:
Depends On:
Blocks: 1334477 1394638 1473055
Reported: 2016-05-13 10:30 UTC by Lukas Zapletal
Modified: 2020-05-14 15:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-15 06:58:01 UTC
Target Upstream Version:
Embargoed:


Attachments
stdvga fix (1.28 KB, patch)
2017-03-14 16:08 UTC, Gerd Hoffmann
partial cirrus support (2.80 KB, patch)
2017-03-14 16:10 UTC, Gerd Hoffmann

Description Lukas Zapletal 2016-05-13 10:30:57 UTC
Hello,

1) In RHEV 3.6+, create a new VM with a VNC screen (do not use SPICE - it works)
2) Install or run RHEL 7.0 from an image
3) Download the Anaconda initrd and kernel from a RHEL 6.x kickstart repository onto the guest (it must be RHEL 6.x, not 7.x)
4) Install kexec-tools
5) Run: kexec -d --force --initrd=./initrd.img ./vmlinuz

The system freezes.

Expected behavior:

Anaconda initializes the network devices and then either tries to download the kickstart or shows the welcome screen (depending on the kernel command line options).

Satellite 6 uses kexec to provision systems on PXE/DHCP-less networks.

Comment 2 Michal Skrivanek 2016-05-14 07:30:13 UTC
So the only difference between the working and non-working setups is VNC vs SPICE?
It is worth trying VNC on QXL (in the UI), the SPICE+VNC setting, or plain VNC with vga instead of cirrus (vga becomes the default in 4.0; until then, a simple vdsm hook can replace cirrus with vga on VM start).

Comment 3 Lukas Zapletal 2016-05-17 09:16:57 UTC
Reproduced with new VM with defaults from RHEV:

OS: RHEL7
Video Type: CIRRUS
Graphics Protocol: VNC

Commands:

curl http://download.englab.brq.redhat.com/pub/rhel/released/RHEL-6/6.8/Server/x86_64/os/images/pxeboot/initrd.img -o initrd.img
curl http://download.englab.brq.redhat.com/pub/rhel/released/RHEL-6/6.8/Server/x86_64/os/images/pxeboot/vmlinuz -o vmlinuz
kexec --debug --initrd initrd.img vmlinuz

Switching Video Type to QXL solved the problem. It does not seem to matter whether the Graphics Protocol is VNC or SPICE; the problem is the Video Type.

I also tried the kexec --reset-vga option, without any luck.

Comment 6 Lukas Zapletal 2016-05-17 10:44:55 UTC
UPDATED REPRODUCER:

1) Create new VM with CIRRUS video type
2) Install or run RHEL 7.x from an image
3) Download the Anaconda initrd and kernel from a RHEL 6.x kickstart repository onto the guest (it must be RHEL 6.x, not 7.x)
4) Install kexec-tools
5) Run: kexec -d --force --initrd=./initrd.img ./vmlinuz

Here is the snippet to run:

curl http://download.englab.brq.redhat.com/pub/rhel/released/RHEL-6/6.8/Server/x86_64/os/images/pxeboot/initrd.img -o initrd.img
curl http://download.englab.brq.redhat.com/pub/rhel/released/RHEL-6/6.8/Server/x86_64/os/images/pxeboot/vmlinuz -o vmlinuz
kexec --debug --initrd initrd.img vmlinuz

So far we have only identified CIRRUS as problematic. QEMU offers more drivers, so please test them all during QA: "vga", "cirrus", "vmvga", "xen", "vbox", "qxl" or "virtio". Thanks.
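
One possible way to enumerate the video types listed above for a test matrix is sketched here. It assumes the guests are defined through libvirt domain XML, which this report does not state; how each fragment is applied to a VM (virsh edit, a vdsm hook, the RHEV UI) is left open.

```shell
# Video model names exactly as listed in the comment above.
video_models="vga cirrus vmvga xen vbox qxl virtio"

# Print one libvirt <video> element per model, one per line.
# Assumption: the guest definition is libvirt domain XML.
emit_video_xml() {
    for model in $video_models; do
        printf "<video><model type='%s'/></video>\n" "$model"
    done
}

emit_video_xml
```

Each printed line is a candidate replacement for the guest's existing video element; the kexec reproducer would then be repeated once per model.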

Comment 7 Michal Skrivanek 2016-05-17 10:54:36 UTC
We're switching to "vga" in 4.0. Can you confirm (e.g. on your own non-rhev setup) that "vga" works and "cirrus" doesn't?

Comment 9 Lukas Zapletal 2016-05-17 14:25:14 UTC
Additional testing on RHEL 7.1 (no updates applied), kexecing a RHEL 6.8 kernel:

QXL: PASS
Cirrus: FAIL (all black, console does not respond at all)
VGA: FAIL (console does not respond after "Starting new kernel" message)
VMVGA: PASS
XEN: N/A

Comment 10 Gerd Hoffmann 2016-05-17 18:59:13 UTC
My testing shows that only the vga console is not functional, and that holds for all three vga cards (qxl, stdvga, cirrus).  qxl shows something, but the text mode font is corrupted, so it is unreadable.  The system itself is doing fine: the serial console works, and I expect a fully automatic install works too, even though you can't watch it on the vga console.
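
To watch the kexec'ed kernel over the serial console mentioned above, the kernel command line needs a console= argument. The following is a minimal sketch that only prints the command instead of running it. The device ttyS0 and the speed 115200 are assumptions, and a real Anaconda boot would likely need further arguments (kickstart URL, network settings) that are not shown.

```shell
# Print (do not execute) a kexec command that appends a serial console
# to the new kernel's command line.
# Assumptions: ttyS0 at 115200 baud; the file names are placeholders.
build_kexec_cmd() {
    kernel=$1
    initrd=$2
    console=${3:-ttyS0,115200}
    printf 'kexec --force --initrd=%s --append="console=%s" %s\n' \
        "$initrd" "$console" "$kernel"
}

build_kexec_cmd ./vmlinuz ./initrd.img
# prints: kexec --force --initrd=./initrd.img --append="console=ttyS0,115200" ./vmlinuz
```

Printing the command first makes it easy to review before a reboot into the new kernel is actually triggered.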

The fundamental problem with the vga console is that RHEL-7 has kernel mode setting drivers for the qemu emulated cards (bochs-drm.ko for stdvga, cirrus.ko and qxl.ko), whereas RHEL-6 depends on the vgabios to handle the card.  kexec seems to be able to handle the vgabios handover from one kernel to the next, but apparently only in case both kernels are using the vgabios.

When forcing RHEL-7 into vgabios mode by blacklisting the kms driver module the vga console works fine in the RHEL-6 kernel after kexec.
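
Before trying the kexec, it may help to check whether one of the kms drivers named above is actually loaded in the RHEL-7 guest. This is a small sketch that filters lsmod-style output; the module names are the .ko names from the previous paragraph in the underscore form lsmod prints, and the helper name is made up for illustration.

```shell
# kms modules for the qemu emulated cards, as lsmod reports them
# (bochs-drm.ko shows up as bochs_drm).
kms_modules="bochs_drm cirrus qxl"

# Read lsmod-style text on stdin and print each listed module that is loaded.
# loaded_kms_modules is a hypothetical helper name.
loaded_kms_modules() {
    awk -v list="$kms_modules" '
        BEGIN { n = split(list, names, " "); for (i = 1; i <= n; i++) want[names[i]] = 1 }
        ($1 in want) { print $1 }
    '
}

# Typical use on a running guest:
#   lsmod | loaded_kms_modules
```

Empty output suggests the guest is still running on the vgabios, which is the case reported to work after kexec.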

There is nothing we can do in qemu to fix that.  The possible options I see are:

(1) remove the qemu kms drivers from whatever image satellite uses for kexec.

(2) maybe a quirk can be added to kexec to handle that case.

Reassigning to kexec-tools for comments on (2).

Comment 11 Dave Young 2016-05-18 01:58:57 UTC
Gerd, I think your analysis makes sense to me, but I do not have any clue how to add a quirk to kexec. If the first kernel uses kms then kexec will always fail unless the 2nd kernel also has a kms driver so that it can be reinitialized. I believe kexec can do nothing.

Could you clarify a bit about (2) in your mind?

Comment 12 Dave Young 2016-05-18 02:00:56 UTC
Here is a bug for the kms/kexec issue
https://bugzilla.redhat.com/show_bug.cgi?id=1279013

Comment 13 Dave Young 2016-05-18 02:02:16 UTC
Correcting myself in comment #11: "kexec will always fail" means that graphics in the kexec'ed kernel will not work...

Comment 14 Gerd Hoffmann 2016-05-18 07:10:58 UTC
(In reply to Dave Young from comment #11)
> Gerd, I think your analysis makes sense to me, but I do not have any clue how
> to add a quirk to kexec. If the first kernel uses kms then kexec will always
> fail unless the 2nd kernel also has a kms driver so that it can be
> reinitialized.

Yes, kexec rhel7 -> rhel7 works fine for that reason.

> I believe kexec can do nothing.

> Could you clarify a bit about (2) in your mind?

There is --reset-vga.  Maybe create a variant of that which does not only reset the vga, but also initializes the vga to 80x25 text mode?

After skimming over bug 1279013 I suspect (1) is the better way though, and *all* kms drivers not only the qemu ones should be removed.

Comment 15 Dave Young 2016-05-18 08:18:31 UTC
(In reply to Gerd Hoffmann from comment #14)
> There is --reset-vga.  Maybe create a variant of that which does not only
> reset the vga, but also initializes the vga to 80x25 text mode?

I'm not sure --reset-vga helps. I remember I tested it with an nvidia card before, and it just hung. I think it may help only in very limited cases, but I will do more testing.

> 
> After skimming over bug 1279013 I suspect (1) is the better way though, and
> *all* kms drivers not only the qemu ones should be removed.

Yes, for this bug, if kms drivers can be excluded or blacklisted it will be the best approach.

Thanks
Dave

Comment 16 Gerd Hoffmann 2016-05-18 12:55:40 UTC
(In reply to Dave Young from comment #15)
> (In reply to Gerd Hoffmann from comment #14)
> > There is --reset-vga.  Maybe create a variant of that which does not only
> > reset the vga, but also initializes the vga to 80x25 text mode?
> 
> I'm not sure --reset-vga helps. I remember I tested it with an nvidia card
> before, and it just hung. I think it may help only in very limited cases,
> but I will do more testing.

--reset-vga works on the qemu vga cards.  Reset state is *not* vga text mode though, so the vga console still doesn't work.  Switching the vga into text mode should be easy though, at least for the qemu vga cards which act like classic standard vga cards from the early 90s when it comes to text mode.

Physical hardware is a different story.  On a modern gpu a lot more than programming a bunch of registers with a hard-coded sequence must be done.  Scan outputs, figure out where a display is connected, configure scanouts accordingly, set up the laptop panel, ...

I suspect getting real hardware to work without running the vgabios is next to impossible, and re-initializing the gpu using vgabios as part of the kexec sequence sounds scary to me.

Comment 17 Dave Young 2016-05-20 01:47:03 UTC
> --reset-vga works on the qemu vga cards.  Reset state is *not* vga text mode
> though, so the vga console still doesn't work.  Switching the vga into text
> mode should be easy though, at least for the qemu vga cards which act like
> classic standard vga cards from the early 90s when it comes to text mode.
> 

I will try it if I find time, and search for how to do it. If you have some links I can refer to, that would also be appreciated.

> Physical hardware is a different story.  On a modern gpu a lot more than
> programming a bunch of registers with a hard-coded sequence must be done.
> Scan outputs, figure out where a display is connected, configure scanouts
> accordingly, set up the laptop panel, ...
> 
> I suspect getting real hardware to work without running the vgabios is next
> to impossible, and re-initializing the gpu using vgabios as part of the
> kexec sequence sounds scary to me.

Yes, totally agree, that is also the reason why we have not gotten it to work for a long time.

Thanks
Dave

Comment 18 Gerd Hoffmann 2016-05-20 06:13:51 UTC
(In reply to Dave Young from comment #17)
> > --reset-vga works on the qemu vga cards.  Reset state is *not* vga text mode
> > though, so the vga console still doesn't work.  Switching the vga into text
> > mode should be easy though, at least for the qemu vga cards which act like
> > classic standard vga cards from the early 90s when it comes to text mode.
> > 
> 
> I will try it if I find time, and search for how to do it. If you have
> some links I can refer to, that would also be appreciated.

vgabios source code used by qemu is here:

https://code.coreboot.org/p/seabios/source/tree/master/vgasrc/

Check stdvga_set_mode() in stdvgamodes.c

Comment 19 Dave Young 2016-05-20 06:43:38 UTC
Gerd, thank you, will have a look.

Comment 20 Lukas Zapletal 2016-05-23 14:50:33 UTC
For the record, I tried --reset-vga with Cirrus and it did not help.

I think blacklisting the driver on the discovery image is a task I can implement easily if that provides a better user experience. Can you tell me exactly what I should blacklist? I suppose we won't miss any important functionality; the console is only used for a simple TUI to show discovery status (no performance interest or similar).

I can also add --reset-vga to the command line just in case. I can also force the simple 80x25 text mode if that helps (when I was testing this, I remember I was not in this mode).

Or is there a way to force RHEL to boot into some kind of super-generic (framebuffer perhaps) driver that works everywhere? This could be a win for us, and we could perhaps also drop all the video hardware drivers from the image (size matters a lot here). Performance does not really matter here; we only need it to work on both bare metal and virtualization environments (all of them).

For the record, I filed the tickets upstream under:

http://projects.theforeman.org/issues/15144
http://projects.theforeman.org/issues/15145

Thanks for help!

Comment 21 Gerd Hoffmann 2016-05-24 05:50:56 UTC
> I think blacklisting the driver on the discovery image is a task I can
> implement easily if that provides a better user experience. Can you tell
> me exactly what I should blacklist?

qemu drm drivers are: bochs-drm.ko, cirrus.ko, qxl.ko, virtio-gpu.ko

Given this is a problem on real hardware too (see bug 1279013), I'd suggest blacklisting everything below drivers/gpu/drm.

> I can also add --reset-vga to the command line just in case. I can also
> force the simple 80x25 text mode if that helps (when I was testing this,
> I remember I was not in this mode).

Probably not helpful.

> Or is there a way to force RHEL to boot into some kind of super-generic
> (framebuffer perhaps) driver that works everywhere? This could be a win for
> us, and we could perhaps also drop all the video hardware drivers from the
> image (size matters a lot here).

Dropping all drm drivers should do the trick.  The system should continue to run in vga textmode then.  Or, when running on UEFI, continue to use the firmware framebuffer (efifb), which (as far as I know) kexec can hand over from one kernel to the next.
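
A rough sketch of the "blacklist everything below drivers/gpu/drm" suggestion: the function below walks a module directory and prints one modprobe.d blacklist line per module found. The default path and the output file name in the usage comment are assumptions about how the discovery image is laid out. Note that a blacklist entry only stops automatic loading by alias; actually removing the modules from the image is the stronger option.

```shell
# Print a "blacklist <module>" line for every kernel module under a directory.
# Assumption: the default path matches the running kernel's module tree.
# gen_drm_blacklist is a hypothetical helper name.
gen_drm_blacklist() {
    drm_dir=${1:-/lib/modules/$(uname -r)/kernel/drivers/gpu/drm}
    find "$drm_dir" -type f -name '*.ko*' 2>/dev/null |
        sed -e 's|.*/||' -e 's|\.ko.*$||' -e 's|-|_|g' |
        sort -u |
        sed -e 's|^|blacklist |'
}

# Example use when building the image (file name is an assumption):
#   gen_drm_blacklist > /etc/modprobe.d/no-drm.conf
```

The first sed strips the directory and the .ko (or .ko.xz) suffix and normalizes dashes to underscores, so bochs-drm.ko becomes "blacklist bochs_drm".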

Comment 22 Lukas Zapletal 2016-05-24 08:46:25 UTC
Thank you very much, that indeed fixes the issue on Cirrus when kexecing RHEL 6.

Comment 27 Gerd Hoffmann 2017-03-14 16:08:22 UTC
Created attachment 1263010 [details]
stdvga fix

Finally found some time to look at this.  The patch gets stdvga going.

The stdvga reset method works for qxl-vga too.

The stdvga reset works partly for virtio-vga.  It manages to successfully reset the vga emulation, but doesn't switch back from virtio mode to vga compat mode.  That'll happen when the linux kernel virtio-pci driver resets all virtio devices, at which point the kernel messages start to appear on the vga text console.

Doing the virtio reset in purgatory requires a pci mmio bar write, and it doesn't look like purgatory has the infrastructure to do that easily ...

Comment 28 Gerd Hoffmann 2017-03-14 16:10:49 UTC
Created attachment 1263011 [details]
partial cirrus support

The patch gets the cirrus back to text mode.  Memory access still seems to be in some weird mode though, and boot messages appear somewhat scrambled.  But I'm tired for today ...

Comment 29 Dave Young 2017-03-23 09:11:10 UTC
Gerd, thanks for the fix. std-vga works for me except when using vga=788: after the kexec reboot the window keeps the text mode size and the kernel does not switch to the 788 framebuffer. But I think that is acceptable, considering that very few users use this.

Rethinking this, the most important part is the real hardware case, so it may not be worth more effort on emulated cards. For cirrus, maybe we can just leave it as is; one can still use the workaround.

For real hardware, if it is not possible or very hard, then we have to give up. Bhupesh is taking this bug, and I hope Bhupesh can investigate whether we can do something for real hardware as well. Otherwise we can just fix the qemu std-vga only.
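
For readers unfamiliar with the vga=788 value above: the kernel's vga= parameter takes a VESA mode number with 0x200 added, so the underlying mode can be recovered with a little shell arithmetic. The helper name is made up for illustration.

```shell
# Convert a decimal vga= value to the VESA mode number it encodes
# (the kernel adds 0x200 to VESA modes). vga_to_vesa is a hypothetical name.
vga_to_vesa() {
    printf '0x%x\n' "$(( $1 - 0x200 ))"
}

vga_to_vesa 788
# prints: 0x114
```

VESA mode 0x114 is conventionally 800x600 at 16 bits per pixel, which is the framebuffer the kexec'ed kernel failed to switch to in the test above.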

Comment 32 Dave Young 2017-08-15 06:58:01 UTC
Having rethought this, we prefer not to fix it only for the cirrus reset. The workaround mentioned before can be used instead.

