Bug 2084568 - Disable 5-level page tables when using -cpu max
Summary: Disable 5-level page tables when using -cpu max
Keywords:
Status: CLOSED ERRATA
Alias: None
Deadline: 2022-08-15
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libguestfs
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Richard W.M. Jones
QA Contact: YongkuiGuo
URL:
Whiteboard:
Duplicates: 2084567
Depends On:
Blocks: 2082806 2085527
Reported: 2022-05-12 12:43 UTC by Richard W.M. Jones
Modified: 2022-11-15 10:14 UTC
CC List: 7 users

Fixed In Version: libguestfs-1.48.4-2.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2085527
Environment:
Last Closed: 2022-11-15 09:52:35 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHELPLAN-121851  0        None      None    None     2022-05-12 12:47:41 UTC
Red Hat Product Errata  RHSA-2022:7958   0        None      None    None     2022-11-15 09:53:18 UTC

Description Richard W.M. Jones 2022-05-12 12:43:08 UTC
Description of problem:

In https://bugzilla.redhat.com/show_bug.cgi?id=2082806 we are
tracking an insidious qemu bug which intermittently prevents the
libguestfs appliance from starting.  The symptoms are that SeaBIOS
starts and displays its messages, but the kernel isn't reached.  We
found that the kernel does in fact start, but when it tries to set up
page tables and jump to protected mode it gets a triple fault which
causes the emulated CPU in qemu to reset (qemu exits).

This seems to only affect TCG (not KVM).

This is caused by using -cpu max, which enables the "la57" feature
(5-level page tables[0]).  We can make the problem go away by
using -cpu max,la57=off.
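
A minimal sketch of the workaround described above.  The helper name
cpu_arg_with_workaround is hypothetical (it is not part of libguestfs
or qemu); it only builds the value passed to qemu's -cpu option and
appends la57=off when the requested model is "max":

```shell
# Hypothetical helper (illustration only, not libguestfs code):
# print the value for qemu's -cpu option, turning off the "la57"
# feature whenever the "max" model is requested.
cpu_arg_with_workaround () {
    case "$1" in
        max) printf '%s\n' "max,la57=off" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}

# Example: print the resulting -cpu option for the max model.
printf '%s\n' "-cpu $(cpu_arg_with_workaround max)"
```

Other models pass through unchanged, so the extra property is only
added where this report says the feature gets enabled.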

Note this is only a workaround for bug 2082806.  We may in future
fix that bug properly (in qemu).  However the proposed workaround
for libguestfs should not have any negative effects.

This bug affects all versions of libguestfs that run qemu with
-cpu max or the libvirt equivalent <cpu mode="maximum"/>, which
includes RHEL 8.7 (not 8.6), and RHEL 9.0 and 9.1.
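
For the libvirt path, the equivalent of la57=off could look like the
fragment printed below.  This is a sketch that assumes libvirt's
<feature policy="disable" .../> child element is used to turn the
feature off; the exact XML that libguestfs generates is not shown in
this report, and the helper name libvirt_cpu_xml is hypothetical:

```shell
# Hypothetical helper (illustration only): print a libvirt <cpu>
# element that keeps mode="maximum" but disables the la57 feature.
libvirt_cpu_xml () {
    cat <<'EOF'
<cpu mode="maximum">
  <feature policy="disable" name="la57"/>
</cpu>
EOF
}

# Print the fragment so it can be compared with the domain XML.
libvirt_cpu_xml
```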

Version-Release number of selected component (if applicable):

libguestfs-1.44.0-6.el8

How reproducible:

100%

Steps to Reproduce:

These two commands test the libvirt and direct paths (which are
fixed separately):

while LIBGUESTFS_BACKEND_SETTINGS=force_tcg ./run libguestfs-test-tool  >&/tmp/log ; do echo -n . ; done

while LIBGUESTFS_BACKEND=direct LIBGUESTFS_BACKEND_SETTINGS=force_tcg ./run libguestfs-test-tool  >&/tmp/log ; do echo -n . ; done
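
The loops above run until the first failure, so they never terminate
once the bug is fixed.  A bounded variant can be sketched as follows;
the helper name run_until_failure and its arguments are hypothetical
and not part of libguestfs:

```shell
# Hypothetical helper (illustration only): run a command up to
# $1 times, logging each run to $2 (overwritten every iteration).
# Returns 1 as soon as a run fails, leaving that run's output in
# the log; returns 0 if every run succeeded.
run_until_failure () {
    max_runs=$1
    log=$2
    shift 2
    i=0
    while [ "$i" -lt "$max_runs" ]; do
        if ! "$@" >"$log" 2>&1; then
            printf '\nfailed on run %d, see %s\n' "$((i + 1))" "$log"
            return 1
        fi
        printf .
        i=$((i + 1))
    done
    printf '\n'
    return 0
}
```

For example, the first test could be limited to 100 iterations with:
run_until_failure 100 /tmp/log env LIBGUESTFS_BACKEND_SETTINGS=force_tcg ./run libguestfs-test-tool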

Upstream bug fix:
https://listman.redhat.com/archives/libguestfs/2022-May/028853.html

Comment 1 Andrew Jones 2022-05-12 13:42:46 UTC
Does libguestfs need to use -cpu max at all? Or would always using the most basic (which hopefully means the most stable) cpu model work? I ask, because I have bug 2066824. My plan was to leave the cortex-a57 cpu model, as well as the max model, because the cortex-a57 should be the most basic/stable. I assumed libguestfs and other products that need TCG would prefer this stable model over 'max' as we don't have resources allocated to support TCG cpu models.

I do realize 'max' is nifty because it can be seamlessly used for both tcg and kvm. I'm just not sure how much of a requirement that seamlessness is. Also, if we start adding a list of features to disable from max, then I'm not 100% sure we'll maintain the seamless KVM support.

Comment 2 Richard W.M. Jones 2022-05-12 14:07:10 UTC
(In reply to Andrew Jones from comment #1)
> Does libguestfs need to use -cpu max at all? Or would always using the most
> basic (which hopefully means the most stable) cpu model work?

The surprising answer is yes, for two reasons, of which one is not obvious:

 - Better performance for RAID-y type stuff.  Probably makes no real difference
   with TCG, but it matters with KVM, and it is slightly simpler if we use the
   same path (in libguestfs) for both.

 - We need to run programs from the guest sometimes, and with RHEL 9 guest on
   RHEL 8 host that requires emulating at least x86_64-v2 (bug 2075424).
   -cpu max was one easy way to get this.

Prior to using max, we used -cpu host or libvirt's host-model (KVM), or
-cpu qemu64 (TCG).

Note the above all applies to x86-64.

Source: https://github.com/libguestfs/libguestfs/blob/master/lib/appliance-cpu.c

> I ask, because
> I have bug 2066824. My plan was to leave the cortex-a57 cpu model, as well
> as the max model, because the cortex-a57 should be the most basic/stable. I
> assumed libguestfs and other products that need TCG would prefer this stable
> model over 'max' as we don't have resources allocated to support TCG cpu
> models.

On aarch64 we currently use -cpu host (KVM) or -cpu cortex-a57 (TCG), so
yes please leave that CPU!  I'm not sure why we're not using -cpu max though,
maybe aarch64 didn't / doesn't support it?

But for aarch64 we're not really focussed at the moment on performance, just
making it work.
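
The current choices described above can be summarised in a small
sketch.  The helper name choose_cpu_model is hypothetical, and this is
only an illustration of the selection described in this comment, not
the code in appliance-cpu.c:

```shell
# Hypothetical helper (illustration only): choose a base -cpu model
# from the architecture and accelerator, following the models named
# in this comment.
choose_cpu_model () {
    arch=$1   # x86_64 or aarch64
    accel=$2  # kvm or tcg
    case "$arch/$accel" in
        x86_64/kvm|x86_64/tcg) printf '%s\n' "max" ;;
        aarch64/kvm)           printf '%s\n' "host" ;;
        aarch64/tcg)           printf '%s\n' "cortex-a57" ;;
        *)                     return 1 ;;
    esac
}

# Example: the models used under TCG on each architecture.
choose_cpu_model x86_64 tcg
choose_cpu_model aarch64 tcg
```

Unknown architecture/accelerator pairs return a non-zero status rather
than guessing a model.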

> I do realize 'max' is nifty because it can be seamlessly used for both tcg
> and kvm. I'm just not sure how much of a requirement that seamlessness is.
> Also, if we start adding a list of features to disable from max, then I'm
> not 100% sure we'll maintain the seamless KVM support.

Comment 3 YongkuiGuo 2022-05-13 04:17:38 UTC
Tested the following packages on RHEL9.1 host:

qemu-kvm-7.0.0-3.el9.x86_64
libvirt-8.3.0-1.el9.x86_64
kernel-5.14.0-87.el9.x86_64
seabios-bin-1.16.0-2.el9.noarch
libguestfs-1.48.2-2.el9.x86_64


# while LIBGUESTFS_BACKEND_SETTINGS=force_tcg libguestfs-test-tool  >&/tmp/log; do echo -n . ; done
............................................................................................
# while LIBGUESTFS_BACKEND=direct LIBGUESTFS_BACKEND_SETTINGS=force_tcg libguestfs-test-tool  >&/tmp/log; do echo -n . ; done
......................................................................................................

The above two commands work well. This issue has been fixed.

Comment 4 Richard W.M. Jones 2022-05-13 14:22:44 UTC
*** Bug 2084567 has been marked as a duplicate of this bug. ***

Comment 8 YongkuiGuo 2022-06-22 10:53:05 UTC
Verified this bug since the test case for this bug has been automated and passed in the latest compose test.

Comment 14 errata-xmlrpc 2022-11-15 09:52:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Low: libguestfs security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7958

