Bug 1267256

Summary: do not crash if a machine config in /etc/libvirt is missing a machine type
Product: Red Hat Enterprise Linux 7
Reporter: Stefan Assmann <sassmann>
Component: libvirt
Assignee: Ján Tomko <jtomko>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 7.2
CC: bloch, cbuissar, dyuan, fjin, jsuchane, mzhan
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-1.3.2-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 18:25:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  abrt report (flags: none)

Description Stefan Assmann 2015-09-29 13:10:46 UTC
Description of problem:
After installing the latest RHEL 7.2 build (RHEL-7.2-20150928), I copied my KVM guest image over to the machine, imported it, and tried to start it. libvirtd then segfaulted.

# virt-install --import --print-xml --disk path=/var/lib/libvirt/images/rhel7-64-kvm.img --name rhel7-64-kvm --ram 1024 --vcpus=1 > /etc/libvirt/qemu/rhel7-64-kvm.xml ; service libvirtd restart
# virsh start rhel7-64-kvm
error: Failed to start domain rhel7-64-kvm
error: End of file while reading data: Input/output error
libvirtd[2050]: segfault at 0 ip 00007f7b046e2b07 sp 00007f7b0ce80d88 error 4 in libvirt_driver_qemu.so[7f7b046790]

# cat /var/log/libvirt/qemu/rhel7-64-kvm.log
(the log file is empty)

I copied over the guest image multiple times and also tried it on a 7.1 system. It worked fine there.

Version-Release number of selected component (if applicable):
libvirt-1.2.17-11.el7.x86_64

How reproducible:
always (dell-per720-03.klab.eng.bos.redhat.com)

Comment 2 Stefan Assmann 2015-09-29 13:49:03 UTC
Created attachment 1078349 [details]
abrt report

Comment 4 Ján Tomko 2015-09-29 14:21:15 UTC
Program terminated with signal 11, Segmentation fault.
#0  0x00007f04f5434b07 in qemuDomainMachineIsI440FX (def=def@entry=0x7f04ec241ef0) at qemu/qemu_domain.c:3325
3325        return (STREQ(def->os.machine, "pc") ||
(gdb) p def->os.machine
$1 = 0x0
It seems the machine type is missing (def->os.machine is NULL).

I see the problem now: this output redirection writes directly into libvirt's domain config directory:
> /etc/libvirt/qemu/rhel7-64-kvm.xml

Please do not mess with internal libvirt configuration; use libvirt's APIs to define new machines instead (virt-install should take care of that, as should virsh define).

The following commit skipped checking the machine type completely:
commit f1a89a8b6d1a1097e41a171a13b1984b06e8ab3e
Author:     Cole Robinson <crobinso>
CommitDate: 2015-04-20 16:36:35 -0400

    domain: conf: Don't validate VM ostype/arch at daemon startup
    
    When parsing XML, we validate the passed ostype + arch combo against
    the detected hypervisor capabilities. This has led to the following
    problem:
    
    - Define x86 qemu guest
    - qemu is inadvertently removed from the host
    - libvirtd is restarted. fails to parse VM config since arch is removed
    - 'virsh list --all' is now empty, user is wondering where their VMs went
    
    Add a new internal flag VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS. Use
    it when loading VM and snapshot configs from disk.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1043572

git describe: v1.2.14-230-gf1a89a8 contains: v1.2.15-rc1~109
However, we can still check for the presence of a machine string and refuse to parse such an invalid config instead of crashing.

Comment 5 Stefan Assmann 2015-09-29 14:26:04 UTC
Thanks Ján for the update.

This worked for me on 7.0 and 7.1. Can you make a suggestion for an alternate set of commands to import the image?

Comment 6 Ján Tomko 2015-09-29 14:33:48 UTC
Removing --print-xml (and the redirection) should do the trick. If you don't want virt-install to open the console and just import the machine to libvirt, add --noautoconsole.
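
Applied to the command from the description, that suggestion might look like the sketch below. The import_guest wrapper and its two positional parameters are hypothetical and exist only for illustration; the virt-install options are the ones from the original report, minus --print-xml and the redirection, plus --noautoconsole.

```shell
# Hypothetical helper: import an existing disk image as a libvirt guest.
# virt-install defines the domain through libvirt's API, so nothing is
# written into /etc/libvirt/qemu by hand.
import_guest() {
    name=$1    # guest name, e.g. rhel7-64-kvm
    image=$2   # path to the existing disk image
    virt-install --import \
        --disk path="$image" \
        --name "$name" \
        --ram 1024 --vcpus=1 \
        --noautoconsole
}

# Usage (values taken from the description):
# import_guest rhel7-64-kvm /var/lib/libvirt/images/rhel7-64-kvm.img
```

Because the domain is defined through libvirt rather than dropped into the config directory, libvirt gets the chance to fill in defaults such as the machine type.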

Comment 7 Stefan Assmann 2015-09-29 15:06:30 UTC
Confirming that this works. Thanks again!

Comment 8 Ján Tomko 2015-10-16 10:32:16 UTC
*** Bug 1272364 has been marked as a duplicate of this bug. ***

Comment 9 Ján Tomko 2016-02-11 12:08:20 UTC
Upstream patch:
https://www.redhat.com/archives/libvir-list/2016-February/msg00583.html

Comment 10 Fangge Jin 2016-02-17 08:48:16 UTC
Steps to reproduce:

0. Install libvirt
# rpm -q libvirt
libvirt-1.2.17-11.el7.x86_64

1. Prepare a guest xml in directory /etc/libvirt/qemu/ :
# cat /etc/libvirt/qemu/rhel7.2-released.xml 
..
<domain type='kvm'>
...
  <os>
    <type arch='x86_64'>hvm</type> ====> without machine type
...

2. Restart libvirtd

3. List domains:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel7.2-released               shut off

4. Try to start the domain; libvirtd crashes:
# virsh start rhel7.2-released
error: Failed to start domain rhel7.2-released
error: End of file while reading data: Input/output error
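
A config like the one in step 1 can be spotted before the daemon loads it. The sketch below is a rough text match rather than a real XML parse, and the function name has_machine_type is made up for this example; it only reports whether a file's <type> element carries a machine attribute.

```shell
# Hypothetical check: succeed if the domain XML's <type ...> element
# has a machine= attribute, fail otherwise. Assumes the opening <type>
# tag sits on a single line.
has_machine_type() {
    grep -Eq "<type[^>]*[[:space:]]machine=" "$1"
}

# Usage: report qemu configs that lack a machine type.
# for f in /etc/libvirt/qemu/*.xml; do
#     has_machine_type "$f" || echo "missing machine type: $f"
# done
```

The check only makes sense for qemu domain configs; as the LXC regression in this bug shows, other drivers legitimately have no machine type.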

Comment 11 Ján Tomko 2016-02-18 15:26:51 UTC
Fixed upstream by:
commit 55e6d8cd9eac7eb2aaa4d221585e9402cf7269d5
Author:     Ján Tomko <jtomko>
CommitDate: 2016-02-18 16:19:39 +0100

    Error out on missing machine type in machine configs
    
    Commit f1a89a8 allowed parsing configs from /etc/libvirt
    without validating the emulator capabilities.
    
    Check for the presence of os->type.machine even if the
    VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS flag is set,
    otherwise the daemon can crash on carelessly crafted input
    in the config directory.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1267256

git describe: v1.3.1-307-g55e6d8c

Comment 12 Ján Tomko 2016-02-25 12:31:09 UTC
The patch breaks persistent LXC domains:
https://www.redhat.com/archives/libvir-list/2016-February/msg01228.html

Comment 13 Ján Tomko 2016-02-26 09:45:20 UTC
New fix pushed:
commit 21b316f4d351859d9ccbf8a20199f7e8707fd51d
Author:     Ján Tomko <jtomko>
CommitDate: 2016-02-26 10:32:31 +0100

    qemu: error out on missing machine type in configs
    
    Commit f1a89a8 allowed parsing configs from /etc/libvirt
    without validating the emulator capabilities.
    
    Check for the presence of a machine type in the qemu driver's
    post parse function instead of crashing.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1267256

git describe: v1.3.2-rc1-17-g21b316f

Comment 15 Fangge Jin 2016-04-13 08:54:57 UTC
Verification passed on build libvirt-1.3.3-1.el7.x86_64


Scenario 1: qemu guest
1. Prepare a guest xml in directory /etc/libvirt/qemu/ :
# cat /etc/libvirt/qemu/rhel7.2-1030.xml 
..
<domain type='kvm'>
  <name>rhel7.2-1030</name>
...
  <os>
    <type arch='x86_64'>hvm</type> ====> without machine type
...
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>

...

2. Restart libvirtd
3. # virsh list --all
 Id    Name                           State
----------------------------------------------------
(domain rhel7.2-1030 is not listed)

4. Check libvirtd.log:
2016-04-13 08:41:23.039+0000: 9959: error : qemuDomainDefPostParse:1431 : internal error: missing machine type
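
Step 4 can also be scripted as a simple log search. The helper name below is illustrative only, and the log path is passed in as an argument because the location of libvirtd.log depends on the host's logging configuration.

```shell
# Hypothetical helper: succeed if the given libvirtd log contains the
# error that the fixed qemu driver reports for a missing machine type.
log_has_missing_machine_error() {
    grep -Fq "missing machine type" "$1"
}

# Usage (the path is an assumption; adjust it to the host's setup):
# log_has_missing_machine_error /var/log/libvirt/libvirtd.log && echo "config rejected"
```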


Scenario 2: regression test for an LXC guest
0. Prepare an LXC guest XML:
# cat lxc.xml
<domain type='lxc'>
<name>application</name>
...
<os>
<type arch='x86_64'>exe</type>  ===>lxc guest has no machine type attribute
<init>/bin/sh</init>
</os>
...
<devices>
<emulator>/usr/libexec/libvirt_lxc</emulator>

1. Define the guest in an interactive virsh session and list the domains:
# virsh -c lxc:///
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # define /tmp/lxc.xml 
Domain application defined from /tmp/lxc.xml

virsh # list --all
 Id    Name                           State
----------------------------------------------------
 -     application                    shut off

Comment 16 Fangge Jin 2016-04-13 09:13:41 UTC
Continuation of comment 15:

Scenario 2 (continued): regression test for an LXC guest
2. After defining the LXC guest, dump its XML:
virsh # dumpxml application
<domain type='lxc'>
<name>application</name>
...
<os>
<type arch='x86_64'>exe</type>  ===>lxc guest has no machine type attribute
<init>/bin/sh</init>
</os>

3. Restart the libvirtd service
4. List the domains and dump the XML again:
# virsh -c lxc:/// list --all
 Id    Name                           State
----------------------------------------------------
 -     application                    shut off

# virsh -c lxc:/// dumpxml application
<domain type='lxc'>
<name>application</name>
...
<os>
<type arch='x86_64'>exe</type>  ===>lxc guest has no machine type attribute
<init>/bin/sh</init>
</os>


Scenario 3: define a qemu guest from an XML without a machine type
# cat /tmp/rhel7.2-1030.xml
<domain type='kvm'>
  <name>rhel7.2-1030</name>
...
  <os>
    <type arch='x86_64'>hvm</type>
...
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>

# virsh define /tmp/rhel7.2-1030.xml

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel7.2-1030                   shut off

# virsh dumpxml rhel7.2-1030
 <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.2.0'>hvm</type> ===> machine type is added automatically

Comment 18 errata-xmlrpc 2016-11-03 18:25:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html