Bug 2174095

Summary: [leapp] IPU 8>9: Fatal glibc error: CPU does not support x86-64-v2
Product: Red Hat Enterprise Linux 8 Reporter: Christophe Besson <cbesson>
Component: leapp-repository Assignee: David Kubek <dkubek>
Status: CLOSED ERRATA QA Contact: Martin Klusoň <mkluson>
Severity: low Docs Contact:
Priority: medium    
Version: 8.6 CC: mmacura, pholica, pstodulk
Target Milestone: rc Keywords: Reproducer, Triaged, WorkAround
Target Release: --- Flags: pm-rhel: mirror+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: leapp-repository-0.18.0-5.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-11-14 15:35:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Christophe Besson 2023-02-28 17:28:21 UTC
Description of problem:
Customer is experiencing an ABI error during the preupgrade step:

    Fatal glibc error: CPU does not support x86-64-v2
    Error in PREIN scriptlet in rpm package libutempter
    Error in POSTIN scriptlet in rpm package p11-kit-trust
    Error in PREIN scriptlet in rpm package ca-certificates

It appears the VMware hypervisor presents a recent CPU (Cascade Lake) to the guest but masks some instructions (sse4*, avx*, among others).

Version-Release number of selected component (if applicable):
leapp-upgrade-el8toel9-0.17.0-3.el8.noarch

How reproducible:
Always

Steps to Reproduce:
1. Using libvirt/KVM, set up a standard VM

2. Use `virsh edit <vm-name>` to mask a CPU instruction that is required by the x86-64-v2 standard

  <cpu mode='host-model' check='partial'>
    <model fallback="allow"/>
    <feature policy='disable' name='popcnt'/>
  </cpu>

3. Install RHEL 8.6, leapp, and then try a preupgrade
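
To see why the reproducer above triggers the error, the `<cpu>` fragment can be checked for disabled features that x86-64-v2 needs. This is only a minimal sketch: the function name and the feature set are assumptions for illustration (the set follows the x86-64 psABI level definition as I understand it, spelled the way /proc/cpuinfo spells the flags), and the "." to "_" normalization assumes libvirt writes names such as `sse4.1`.

```python
import xml.etree.ElementTree as ET

# Assumed x86-64-v2 feature set, using /proc/cpuinfo spellings
# ("pni" is how Linux reports SSE3). Not taken from leapp.
V2_FEATURES = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}


def disabled_v2_features(cpu_xml):
    """Return the x86-64-v2 features that a libvirt <cpu> element disables."""
    root = ET.fromstring(cpu_xml)
    disabled = set()
    for feature in root.iter("feature"):
        if feature.get("policy") == "disable":
            # Assumption: libvirt spells some names with a dot (sse4.1),
            # while /proc/cpuinfo uses an underscore (sse4_1).
            disabled.add(feature.get("name", "").replace(".", "_"))
    return sorted(disabled & V2_FEATURES)


# The <cpu> element from step 2 of the reproducer.
CPU_XML = """
<cpu mode='host-model' check='partial'>
  <model fallback="allow"/>
  <feature policy='disable' name='popcnt'/>
</cpu>
"""

print(disabled_v2_features(CPU_XML))  # prints ['popcnt']
```

Any non-empty result means the guest is expected to hit the glibc error once RHEL 9 binaries run on it.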

Actual results:
    Details: Command ['systemd-nspawn', '--register=no', '--quiet', '--keep-unit', '--capability=all', '-D', '/var/lib/leapp/scratch/mounts/root_/system_overlay', '--setenv=LEAPP_HOSTNAME=localhost.localdomain', '--setenv=LEAPP_EXPERIMENTAL=0', '--setenv=LEAPP_UNSUPPORTED=0', '--setenv=LEAPP_NO_RHSM=0', '--setenv=LEAPP_UPGRADE_PATH_TARGET_RELEASE=9.0', '--setenv=LEAPP_UPGRADE_PATH_FLAVOUR=default', '--setenv=LEAPP_IPU_IN_PROGRESS=8to9', '--setenv=LEAPP_EXECUTION_ID=c64cb486-5031-4d66-9e61-e3e9356d5a29', '--setenv=LEAPP_COMMON_TOOLS=:/etc/leapp/repos.d/system_upgrade/el8toel9/tools', '--setenv=LEAPP_COMMON_FILES=:/etc/leapp/repos.d/system_upgrade/common/files:/etc/leapp/repos.d/system_upgrade/el8toel9/files', 'dnf', 'install', '-y', '--nogpgcheck', '--setopt=module_platform_id=platform:el9', '--setopt=keepcache=1', '--releasever', '9.0', '--installroot', '/el9target', '--disablerepo', '*', '--enablerepo', 'rhel-9-for-x86_64-baseos-rpms', '--enablerepo', 'rhel-9-for-x86_64-appstream-rpms', 'dnf-command(config-manager)', 'dnf', '-v'] failed with exit code 1.
    Stderr: Host and machine ids are equal (03b541e558e14fa9b3690b7fc8366e58): refusing to link journals
            warning: Generating 18 missing index(es), please wait...
            Fatal glibc error: CPU does not support x86-64-v2
            Error in PREIN scriptlet in rpm package libutempter
            Error in POSTIN scriptlet in rpm package p11-kit-trust
            Error in PREIN scriptlet in rpm package ca-certificates
            Error in POSTIN scriptlet in rpm package libblkid
            Error in POSTIN scriptlet in rpm package systemd-libs
            Error in POSTIN scriptlet in rpm package util-linux-core
            Error in PREIN scriptlet in rpm package systemd
            Error in POSTIN scriptlet in rpm package dbus-common
            Error in PREIN scriptlet in rpm package dbus-broker
            Error in POSTIN scriptlet in rpm package elfutils-default-yama-scope
            Error unpacking rpm package gnupg2-2.3.3-2.el9_0.x86_64
            Error in PREIN scriptlet in rpm package tpm2-tss
            Error in POSTIN scriptlet in rpm package dnf
            Error in POSTTRANS scriptlet in rpm package filesystem
            Error in POSTTRANS scriptlet in rpm package rpm
            Error in <unknown> scriptlet in rpm package glibc-common
            Error in <unknown> scriptlet in rpm package glib2
            Error: Transaction failed


Expected results:
An inhibitor explaining the issue would be easier to understand than a glibc error during the DNF transaction.

Additional info:
 * Workaround is to change the CPU model from the hypervisor side.

 * https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level#background_of_the_x86_64_microarchitecture_levels

 * as per https://gitlab.com/x86-psABIs/x86-64-ABI
   (to be checked)

    x86-64 baseline: cmov cx8 fpu fxsr mmx osfxsr sce sse sse2
    x86-64-v2:       lm pni cx16 lahf_lm popcnt sse3 sse4_1 sse4_2 ssse3

 * this KCS is related:

    RHEL 9 guest panic's during boot with following error "Fatal glibc error: CPU does not support x86-64-v2" 
    https://access.redhat.com/solutions/6833751

 * lscpu from customer

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           2
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      GenuineIntel
CPU family:          6
Model:               15
Model name:          Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz
BIOS Model name:     Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz
Stepping:            1
CPU MHz:             2394.374
BogoMIPS:            4788.74
Hypervisor vendor:   VMware
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc arch_perfmon nopl tsc_reliable nonstop_tsc cpuid pni ssse3 cx16 tsc_deadline_timer hypervisor lahf_lm pti tsc_adjust arat
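
The customer's flags can be compared against the x86-64-v2 requirements with a few lines of Python. This is a sketch under stated assumptions, not the check that was later added to leapp: the function name is hypothetical, and the required set is my reading of the x86-64 psABI level definition, written with /proc/cpuinfo flag names (where SSE3 appears as `pni`).

```python
# Assumed x86-64-v2 requirements on top of the baseline, using
# /proc/cpuinfo flag names. Not copied from leapp-repository.
X86_64_V2_FLAGS = ("cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3")


def missing_v2_flags(flags_line):
    """Return the x86-64-v2 flags absent from a space-separated flags string."""
    present = set(flags_line.split())
    return [flag for flag in X86_64_V2_FLAGS if flag not in present]


# Flags line from the customer's lscpu output above.
CUSTOMER_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc arch_perfmon nopl "
    "tsc_reliable nonstop_tsc cpuid pni ssse3 cx16 tsc_deadline_timer "
    "hypervisor lahf_lm pti tsc_adjust arat"
)

missing = missing_v2_flags(CUSTOMER_FLAGS)
if missing:
    print("CPU does not meet x86-64-v2, missing flags: " + " ".join(missing))
else:
    print("CPU meets x86-64-v2")
# prints: CPU does not meet x86-64-v2, missing flags: popcnt sse4_1 sse4_2
```

On a live system the same function could be fed the `flags` line of /proc/cpuinfo instead of a hard-coded string.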

Comment 1 Petr Stodulka 2023-03-01 12:13:05 UTC
Hi Chris, thanks for the report, the investigation, and the added sources! You are right, we should add a check for the x86-64 microarchitecture level supported by the CPU. Fortunately, the upgrade fails during the preupgrade phases, so the system is not negatively affected, but a proper inhibitor will save time for people investigating this problem in the future. I will discuss the prioritisation of the fix for the 8.9 release.

Comment 2 Christophe Besson 2023-03-01 14:48:04 UTC
Hi Petr, indeed, it would help us to have something self-explanatory.
And yes, I forgot to mention in the description that there is no risk of damage (severity could be lowered).
Note for users: RHEL 9 cannot boot on such a CPU anyway; the kernel panics because init/systemd crashes immediately when these CPU features are masked.

Comment 3 Petr Stodulka 2023-05-18 10:00:56 UTC
The fix has been merged upstream:
  https://github.com/oamg/leapp-repository/pull/1059

Comment 9 errata-xmlrpc 2023-11-14 15:35:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (leapp-repository bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7013