Bug 1441646
Summary: | Level-2 guest boot crashes libvirtd due to NULL vendor field in 'qemu64' CPU model | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Kashyap Chamarthy <kchamart>
Component: | libvirt | Assignee: | Jiri Denemark <jdenemar>
Status: | CLOSED ERRATA | QA Contact: | Jing Qi <jinqi>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 7.4 | CC: | dyuan, jsuchane, kchamart, lhuang, mtessun, rbalakri, xuzhang, yalzhang
Target Milestone: | rc | Keywords: | Reopened
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | libvirt-3.2.0-2.el7 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1441655 | Environment: |
Last Closed: | 2017-08-02 00:05:54 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1441655 | |
Attachments: | | |
Description
Kashyap Chamarthy
2017-04-12 12:04:22 UTC
Created attachment 1271121
GDB traceback of libvirtd during guest boot
The cause for the crash is: upon guest boot, when libvirtd copies the host vendor CPUID to the guest CPU, it crashes if that host CPU has a NULL vendor field.

Indeed, from another GDB session, we could see the 'vendor_id' to be '0x0':

-----
[...]
(gdb) p host
$4 = (virCPUDef *) 0x7fa38c1d2930
(gdb) p* host
$5 = {type = 0, mode = 0, match = 0, arch = VIR_ARCH_X86_64, model = 0x7fa38c1d3510 "qemu64", vendor_id = 0x0, fallback = 0, vendor = 0x7fa38c1d34f0 "AMD", sockets = 4, cores = 1, threads = 1, nfeatures = 25, nfeatures_max = 0, features = 0x7fa38c1d35d0}
(gdb) p *cpu
$6 = {type = 1, mode = 1, match = 1, arch = VIR_ARCH_NONE, model = 0x7fa3c4014ab0 "qemu64", vendor_id = 0x0, fallback = 0, vendor = 0x7fa3c40149f0 "AMD", sockets = 1, cores = 1, threads = 1, nfeatures = 25, nfeatures_max = 25, features = 0x7fa3c400f5b0}
(gdb) down
#2  0x00007fa3ff4dd917 in x86Compute (host=<optimized out>, cpu=0x7fa3c400eea0, guest=0x7fa3eeb52360, message=<optimized out>) at cpu/cpu_x86.c:1604
1604            virCPUx86DataAddCPUID(&guest_model->data,
[...]
-----

After a GDB session with Jiri Denemark (thanks!), he identified the commit that fixed it in upstream libvirt:

$ git show 541e9ae6d4
commit 541e9ae6d4290b9004ed73648ea663563b329b3d
Author: Jim Fehlig <jfehlig>
Date:   Fri Aug 5 15:23:47 2016 -0600

    cpu_x86: fix libvirtd crash when host cpu vendor is not available

    When starting a guest and copying host vendor cpuid to the guest cpu,
    libvirtd would crash if the host cpu contained a NULL vendor field.
    Avoid the crash by checking for a valid vendor in the host cpu before
    copying the cpuid to the guest cpu.
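The guard described in that commit message can be sketched in isolation. This is only an illustration under stated assumptions: the types and the function name below (cpuid_leaf, x86_vendor, x86_model, copy_vendor_cpuid) are simplified stand-ins made up for this example, not libvirt's real definitions. The only part taken from the report is the idea of checking the host model's vendor pointer before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libvirt's internal x86 CPU types. */
struct cpuid_leaf {
    unsigned int eax_in, eax, ebx, ecx, edx;
};

struct x86_vendor {
    const char *name;
    struct cpuid_leaf cpuid;
};

struct x86_model {
    const char *name;
    const struct x86_vendor *vendor;   /* NULL for models with no vendor */
};

/*
 * Copy the host model's vendor CPUID leaf into *out.
 * Returns 1 if the leaf was copied, 0 if the copy was skipped because
 * either the guest did not request a vendor or the host model has none.
 */
static int copy_vendor_cpuid(const char *requested_vendor,
                             const struct x86_model *host_model,
                             struct cpuid_leaf *out)
{
    assert(host_model != NULL && out != NULL);

    /* Without the second check, a vendor-less host model would be
     * dereferenced through a NULL pointer on the next line. */
    if (requested_vendor == NULL || host_model->vendor == NULL)
        return 0;

    *out = host_model->vendor->cpuid;
    return 1;
}
```

With a host model such as 'qemu64' whose vendor pointer is NULL, the function returns 0 instead of reading through the NULL pointer.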
For completeness, here is a backtrace from the crash:

(gdb) bt
f0 0x00007ffff739bf33 in x86DataCpuid (cpuid=0x8, cpuid=0x8, data=data@entry=0x7fffb800ee78) at cpu/cpu_x86.c:287
f1 virCPUx86DataAddCPUID (data=data@entry=0x7fffb800ee78, cpuid=0x8) at cpu/cpu_x86.c:355
f2 0x00007ffff739ef47 in x86Compute (host=<optimized out>, cpu=0x7fffb8000cc0, guest=0x7fffecca7348, message=<optimized out>) at cpu/cpu_x86.c:1580
f3 0x00007fffd2b38e53 in qemuBuildCpuModelArgStr (migrating=false, hasHwVirt=<synthetic pointer>, qemuCaps=0x7fffb8001040, buf=0x7fffecca7360, def=0x7fffc400ce20, driver=0x1c) at qemu/qemu_command.c:6283
f4 qemuBuildCpuCommandLine (cmd=cmd@entry=0x7fffb8002f60, driver=driver@entry=0x7fffc80882c0, def=def@entry=0x7fffc400ce20, qemuCaps=qemuCaps@entry=0x7fffb8001040, migrating=<optimized out>) at qemu/qemu_command.c:6445
(gdb) f2
(gdb) p *host_model
$23 = {name = 0x7fffb800ec50 "qemu64", vendor = 0x0, signature = 0, data = {len = 2, data = 0x7fffb800e720}}

diff --git a/src/cpu/cpu_x86.c b/src/cpu/cpu_x86.c
index 670b02e..ee5b57d 100644
--- a/src/cpu/cpu_x86.c
+++ b/src/cpu/cpu_x86.c
@@ -1592,7 +1592,7 @@ x86Compute(virCPUDefPtr host,
         if (!(guest_model = x86ModelCopy(host_model)))
             goto error;
 
-        if (cpu->vendor &&
+        if (cpu->vendor && host_model->vendor &&
             virCPUx86DataAddCPUID(&guest_model->data,
                                   &host_model->vendor->cpuid) < 0)
             goto error;

*** Bug 1441655 has been marked as a duplicate of this bug. ***

Some more details about this bug...

libvirt stores its CPU model definitions in cpu_map.xml (installed in /usr/share/libvirt), where some models (usually older or artificial) are not defined with a specific <vendor>...</vendor> element. If libvirt decides to use one of these models as the model which best describes the host CPU, it will crash every time it tries to start a domain.
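To make the point about vendor-less models concrete, here is a rough sketch that counts <model> blocks lacking a <vendor> element. Everything in it is an assumption for illustration: the XML fragment is a hypothetical, heavily trimmed snippet in the style of cpu_map.xml (not copied from the real file), count_models_without_vendor is a made-up helper, and the scan is a naive substring search rather than a real XML parser, so it assumes model blocks are not nested.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment in the style of cpu_map.xml; not the real file. */
static const char *cpu_map_fragment =
    "<model name='qemu64'>"
    "  <feature name='apic'/>"
    "</model>"
    "<model name='Opteron_G1'>"
    "  <vendor name='AMD'/>"
    "  <feature name='apic'/>"
    "</model>";

/*
 * Count <model ...> blocks that contain no <vendor element.
 * Naive substring scan: assumes well-formed input and no nested models.
 */
static size_t count_models_without_vendor(const char *xml)
{
    size_t missing = 0;
    const char *p = xml;

    assert(xml != NULL);

    while ((p = strstr(p, "<model ")) != NULL) {
        const char *end = strstr(p, "</model>");
        const char *vendor = strstr(p, "<vendor");

        if (end == NULL)
            break;              /* unterminated block: stop scanning */

        /* A vendor tag found only after this block's end belongs to a
         * later model, so this model has none. */
        if (vendor == NULL || vendor > end)
            missing++;

        p = end + strlen("</model>");
    }
    return missing;
}
```

On the fragment above, only the 'qemu64' block has no <vendor> element, which is the kind of model that triggered the crash when chosen as the best match for the host CPU.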
So while this can easily be reproduced in a nested environment (it's trivial to change the host CPU that the nested libvirt will see), it is not completely impossible to hit this bug on real hardware, although the CPU would need to be either pretty old or very strange.

Verified with libvirt-3.2.0-4.el7.x86_64 and qemu-kvm-rhev-2.9.0-3.el7.x86_64 on the host.

L1 XML is as below:

<cpu mode='host-passthrough'>
  <model fallback='allow'/>
</cpu>

L2 XML:

<cpu mode='host-model'>
  <model fallback='allow'/>
</cpu>

The L2 VM can be started successfully.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846