Bug 1441646 - Level-2 guest boot crashes libvirtd due to NULL vendor field in 'qemu64' CPU model
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Jing Qi
URL:
Whiteboard:
Duplicates: 1441655
Depends On:
Blocks: 1441655
Reported: 2017-04-12 12:04 UTC by Kashyap Chamarthy
Modified: 2017-08-02 01:30 UTC (History)

Fixed In Version: libvirt-3.2.0-2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-02 00:05:54 UTC
Target Upstream Version:
Embargoed:


Attachments
GDB traceback of libvirtd during guest boot (12.88 KB, text/plain)
2017-04-12 12:07 UTC, Kashyap Chamarthy


Links
System ID: Red Hat Product Errata RHEA-2017:1846
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2017-08-01 18:02:50 UTC

Description Kashyap Chamarthy 2017-04-12 12:04:22 UTC
Description of problem
----------------------

In a nested virtualization environment, when booting a level-2 guest
with CPU mode as 'host-model', libvirt daemon in level-1 guest
crashes (SIGSEGV).


Version
-------

L0:

$ uname -r; rpm -q libvirt qemu-kvm
3.10.0-514.el7.x86_64
libvirt-2.0.0-10.el7.x86_64
qemu-kvm-1.5.3-126.el7.x86_64

L1:

$ uname -r; rpm -q libvirt qemu-kvm-rhev
3.10.0-514.10.2.el7.x86_64
libvirt-2.0.0-10.el7_3.6.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64


How reproducible: Consistently.


Steps to Reproduce
------------------

In a nested KVM environment, boot a level-2 RHEL 7.3 guest with:

  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>

Actual results
--------------

Guest fails to boot, libvirtd crashes, with (from GDB analysis):

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fa3eeb53700 (LWP 177029)]
0x00007fa3ff4da823 in x86DataCpuid (cpuid=0x8, cpuid=0x8, data=data@entry=0x7fa3c4016c88) at cpu/cpu_x86.c:287
287     for (i = 0; i < data->len; i++) {
#0  0x00007fa3ff4da823 in x86DataCpuid (cpuid=0x8, cpuid=0x8, data=data@entry=0x7fa3c4016c88) at cpu/cpu_x86.c:287
#1  virCPUx86DataAddCPUID (data=data@entry=0x7fa3c4016c88, cpuid=0x8) at cpu/cpu_x86.c:355
#2  0x00007fa3ff4dd917 in x86Compute (host=<optimized out>, cpu=0x7fa3c400eea0, guest=0x7fa3eeb52360, message=<optimized out>) at cpu/cpu_x86.c:1604
#3  0x00007fa3d6a07843 in qemuBuildCpuModelArgStr (driver=driver@entry=0x7fa38c0f56c0, def=def@entry=0x7fa3dc00ddc0, buf=buf@entry=0x7fa3eeb524a0, qemuCaps=qemuCaps@entry=0x7fa3c4000bb0, 
[...]

Expected results
----------------

Guest boot succeeds, and libvirtd does not crash.

Comment 2 Kashyap Chamarthy 2017-04-12 12:07:49 UTC
Created attachment 1271121
GDB traceback of libvirtd during guest boot

Comment 3 Kashyap Chamarthy 2017-04-12 12:12:16 UTC
The root cause of the crash: on guest boot, when libvirt copies the host vendor CPUID to the guest CPU, libvirtd crashes if the host CPU model has a NULL vendor field. Indeed, in another GDB session we could see that 'vendor_id' was '0x0':

-----
[...]
(gdb) p host
$4 = (virCPUDef *) 0x7fa38c1d2930
(gdb) p* host
$5 = {type = 0, mode = 0, match = 0, arch = VIR_ARCH_X86_64, model = 0x7fa38c1d3510 "qemu64", vendor_id = 0x0, fallback = 0, vendor = 0x7fa38c1d34f0 "AMD", sockets = 4, cores = 1, 
  threads = 1, nfeatures = 25, nfeatures_max = 0, features = 0x7fa38c1d35d0}
(gdb) p *cpu
$6 = {type = 1, mode = 1, match = 1, arch = VIR_ARCH_NONE, model = 0x7fa3c4014ab0 "qemu64", vendor_id = 0x0, fallback = 0, vendor = 0x7fa3c40149f0 "AMD", sockets = 1, cores = 1, 
  threads = 1, nfeatures = 25, nfeatures_max = 25, features = 0x7fa3c400f5b0}
(gdb) down
#2  0x00007fa3ff4dd917 in x86Compute (host=<optimized out>, cpu=0x7fa3c400eea0, guest=0x7fa3eeb52360, message=<optimized out>) at cpu/cpu_x86.c:1604
1604                virCPUx86DataAddCPUID(&guest_model->data,
[...]
-----


After a GDB session with Jiri Denemark (thanks!), he identified the commit that fixed it in upstream libvirt:

$ git show 541e9ae6d4
commit 541e9ae6d4290b9004ed73648ea663563b329b3d
Author: Jim Fehlig <jfehlig>
Date:   Fri Aug 5 15:23:47 2016 -0600

    cpu_x86: fix libvirtd crash when host cpu vendor is not available
    
    When starting a guest and copying host vendor cpuid to the guest
    cpu, libvirtd would crash if the host cpu contained a NULL vendor
    field. Avoid the crash by checking for a valid vendor in the host
    cpu before copying the cpuid to the guest cpu.
    
    For completeness, here is a backtrace from the crash
    
    (gdb) bt
    f0  0x00007ffff739bf33 in x86DataCpuid (cpuid=0x8, cpuid=0x8,
        data=data@entry=0x7fffb800ee78) at cpu/cpu_x86.c:287
    f1  virCPUx86DataAddCPUID (data=data@entry=0x7fffb800ee78, cpuid=0x8)
        at cpu/cpu_x86.c:355
    f2  0x00007ffff739ef47 in x86Compute (host=<optimized out>, cpu=0x7fffb8000cc0,
        guest=0x7fffecca7348, message=<optimized out>) at cpu/cpu_x86.c:1580
    f3  0x00007fffd2b38e53 in qemuBuildCpuModelArgStr (migrating=false,
        hasHwVirt=<synthetic pointer>, qemuCaps=0x7fffb8001040, buf=0x7fffecca7360,
        def=0x7fffc400ce20, driver=0x1c) at qemu/qemu_command.c:6283
    f4  qemuBuildCpuCommandLine (cmd=cmd@entry=0x7fffb8002f60,
        driver=driver@entry=0x7fffc80882c0, def=def@entry=0x7fffc400ce20,
        qemuCaps=qemuCaps@entry=0x7fffb8001040, migrating=<optimized out>)
        at qemu/qemu_command.c:6445
    (gdb) f2
    (gdb) p *host_model
    $23 = {name = 0x7fffb800ec50 "qemu64", vendor = 0x0, signature = 0, data = {
        len = 2, data = 0x7fffb800e720}}

diff --git a/src/cpu/cpu_x86.c b/src/cpu/cpu_x86.c
index 670b02e..ee5b57d 100644
--- a/src/cpu/cpu_x86.c
+++ b/src/cpu/cpu_x86.c
@@ -1592,7 +1592,7 @@ x86Compute(virCPUDefPtr host,
         if (!(guest_model = x86ModelCopy(host_model)))
             goto error;
 
-        if (cpu->vendor &&
+        if (cpu->vendor && host_model->vendor &&
             virCPUx86DataAddCPUID(&guest_model->data,
                                   &host_model->vendor->cpuid) < 0)
             goto error;
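
The guard added by the patch can be sketched in isolation as follows. This is a minimal illustration, not libvirt code: every `sketch_*` type and function is a hypothetical, simplified stand-in for libvirt's internal x86 CPU structures, and the fixed-size leaf array is an assumption made only to keep the example short. It shows why the extra `host_model->vendor` check matters: without it, `&host_model->vendor->cpuid` is computed through a NULL pointer, which would be consistent with the small `cpuid=0x8` address seen in the backtraces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one CPUID leaf. */
struct sketch_cpuid {
    unsigned int eax_in, eax, ebx, ecx, edx;
};

/* Hypothetical stand-in for a vendor entry from the CPU map. */
struct sketch_vendor {
    const char *name;
    struct sketch_cpuid cpuid;
};

/* Hypothetical stand-in for a CPU model. */
struct sketch_model {
    const char *name;
    const struct sketch_vendor *vendor; /* may be NULL, e.g. for "qemu64" */
    struct sketch_cpuid data[4];        /* fixed size: assumption for brevity */
    size_t len;
};

/* Append one CPUID leaf to the model; returns 0 on success, -1 if full. */
static int sketch_add_cpuid(struct sketch_model *m,
                            const struct sketch_cpuid *id)
{
    if (m->len >= sizeof(m->data) / sizeof(m->data[0]))
        return -1;
    m->data[m->len++] = *id;
    return 0;
}

/*
 * Mirrors the shape of the patched condition: the vendor leaf is copied
 * only when the guest CPU definition names a vendor (cpu_vendor, standing
 * in for cpu->vendor) AND the host model actually has a vendor entry.
 * Returns 0 on success (including "nothing to copy"), -1 on failure.
 */
static int sketch_copy_vendor(struct sketch_model *guest,
                              const struct sketch_model *host,
                              const char *cpu_vendor)
{
    assert(guest != NULL && host != NULL);

    if (cpu_vendor && host->vendor &&
        sketch_add_cpuid(guest, &host->vendor->cpuid) < 0)
        return -1;
    return 0;
}
```

With a host model whose `vendor` is NULL, `sketch_copy_vendor()` returns 0 and leaves the guest model untouched, instead of dereferencing the NULL vendor; with a populated vendor it appends exactly one leaf.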

Comment 4 Peter Krempa 2017-04-12 12:22:22 UTC
*** Bug 1441655 has been marked as a duplicate of this bug. ***

Comment 7 Jiri Denemark 2017-04-12 15:30:35 UTC
Some more details about this bug... libvirt stores its CPU model definitions in cpu_map.xml (installed in /usr/share/libvirt), where some models (usually older or artificial) are not defined with a specific <vendor>...</vendor> element. If libvirt decides to use one of these models as the model which best describes the host CPU, it will crash every time it tries to start a domain.

So while this can easily be reproduced in a nested environment (it's trivial to change the host CPU that the nested libvirt will see), it is not impossible to hit this bug on real hardware, although the CPU would need to be either pretty old or very strange.

Comment 13 Jing Qi 2017-05-12 05:37:28 UTC
Verified with libvirt-3.2.0-4.el7.x86_64 and qemu-kvm-rhev-2.9.0-3.el7.x86_64 on the host.
L1 xml:
  <cpu mode='host-passthrough'>
    <model fallback='allow'/>
  </cpu>

L2 xml:

  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>

L2 vm can be started successfully.

Comment 15 errata-xmlrpc 2017-08-02 00:05:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

