Bug 801160 - managedsave+restart of <cpu mode='host-model'> VM crashes libvirtd
Summary: managedsave+restart of <cpu mode='host-model'> VM crashes libvirtd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-03-07 19:43 UTC by Eduardo Habkost
Modified: 2012-06-20 06:49 UTC
CC List: 9 users

Fixed In Version: libvirt-0.9.10-5.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 06:49:46 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
  System: Red Hat Product Errata
  ID: RHSA-2012:0748
  Private: 0
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: Low: libvirt security, bug fix, and enhancement update
  Last Updated: 2012-06-19 19:31:38 UTC

Description Eduardo Habkost 2012-03-07 19:43:48 UTC
Description of problem:
When using 'virsh managedsave' followed by 'virsh start' of a <cpu mode='host-model'> virtual machine, libvirtd crashes inside x86ModelFind().


Version-Release number of selected component (if applicable):
libvirt-0.9.10-4.el6.x86_64

How reproducible:
Always.

Steps to Reproduce:

# cat crash-libvirt.sh
cat > /tmp/crashvm.xml <<EOF
<domain type='kvm'>
  <name>crashvm</name>
  <memory>1048576</memory>
  <currentMemory>1048576</currentMemory>
  <vcpu>16</vcpu>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-model' match='exact'>
    <model fallback='forbid'/>
    <topology sockets='2' cores='4' threads='2'/>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
</domain>
EOF
virsh define /tmp/crashvm.xml
virsh start crashvm
virsh managedsave crashvm
virsh start crashvm
# 
# sh -x crash-libvirt.sh
+ cat
+ virsh define /tmp/crashvm.xml
Domain crashvm defined from /tmp/crashvm.xml

+ virsh start crashvm
Domain crashvm started

+ virsh managedsave crashvm

Domain crashvm state saved by libvirt

+ virsh start crashvm
error: Failed to start domain crashvm
error: End of file while reading data: Input/output error

# 


Additional info:
I couldn't reproduce it on Fedora 16, using libvirt-0.9.6-4.fc16.x86_64.

gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.  
[Switching to Thread 0x7f956298d700 (LWP 8093)]
0x00000030437249ea in __strcmp_sse42 () from /lib64/libc.so.6
(gdb) up
#1  0x00007f95688e8871 in x86ModelFind (cpu=0x7f954c000fe0, map=0x7f954c000ae0, policy=1) at cpu/cpu_x86.c:791
791             if (STREQ(model->name, name))
(gdb) p name
$1 = 0x0
(gdb) up
#2  x86ModelFromCPU (cpu=0x7f954c000fe0, map=0x7f954c000ae0, policy=1) at cpu/cpu_x86.c:810
810             if ((model = x86ModelFind(map, cpu->model)) == NULL) {
(gdb) p cpu->model
$2 = 0x0

(gdb) bt
#0  0x00000030437249ea in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00007f95688e8871 in x86ModelFind (cpu=0x7f954c000fe0, map=0x7f954c000ae0, policy=1) at cpu/cpu_x86.c:791
#2  x86ModelFromCPU (cpu=0x7f954c000fe0, map=0x7f954c000ae0, policy=1) at cpu/cpu_x86.c:810
#3  0x00007f95688e9215 in x86Compute (host=<value optimized out>, cpu=0x7f954c000fe0, guest=0x7f956298beb0) at cpu/cpu_x86.c:1164
#4  0x0000000000475800 in qemuBuildCpuArgStr (conn=0x7f952c000bd0, driver=0x7f9554007280, def=0x7f954c009e10, monitor_chr=0x7f954c005ec0, monitor_json=true, qemuCaps=0x7f954c005f10, 
    migrateFrom=0x4f3343 "stdio", migrateFd=20, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_RESTORE) at qemu/qemu_command.c:3677
#5  qemuBuildCommandLine (conn=0x7f952c000bd0, driver=0x7f9554007280, def=0x7f954c009e10, monitor_chr=0x7f954c005ec0, monitor_json=true, qemuCaps=0x7f954c005f10, migrateFrom=0x4f3343 "stdio", 
    migrateFd=20, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_RESTORE) at qemu/qemu_command.c:4009
#6  0x000000000048ca06 in qemuProcessStart (conn=0x7f952c000bd0, driver=0x7f9554007280, vm=0x7f9544001570, migrateFrom=0x4f3343 "stdio", start_paused=true, autodestroy=false, stdin_fd=20, 
    stdin_path=0x7f954c000b30 "/var/lib/libvirt/qemu/save/crashvm.save", snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_RESTORE) at qemu/qemu_process.c:3294
#7  0x000000000045dc1e in qemuDomainSaveImageStartVM (conn=0x7f952c000bd0, driver=0x7f9554007280, vm=0x7f9544001570, fd=0x7f956298c93c, header=0x7f956298c940, 
    path=0x7f954c000b30 "/var/lib/libvirt/qemu/save/crashvm.save", start_paused=false) at qemu/qemu_driver.c:4118
#8  0x000000000045e1d0 in qemuDomainObjRestore (conn=0x7f952c000bd0, driver=0x7f9554007280, vm=0x7f9544001570, flags=0) at qemu/qemu_driver.c:4406
#9  qemuDomainObjStart (conn=0x7f952c000bd0, driver=0x7f9554007280, vm=0x7f9544001570, flags=0) at qemu/qemu_driver.c:4708
#10 0x000000000045e862 in qemuDomainStartWithFlags (dom=0x7f954c0008c0, flags=0) at qemu/qemu_driver.c:4777
#11 0x00007f956890feb6 in virDomainCreate (domain=0x7f954c0008c0) at libvirt.c:8062
#12 0x0000000000439292 in remoteDispatchDomainCreate (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, rerr=0x7f956298cbc0, args=<value optimized out>, 
    ret=<value optimized out>) at remote_dispatch.h:852
#13 remoteDispatchDomainCreateHelper (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, rerr=0x7f956298cbc0, args=<value optimized out>, 
    ret=<value optimized out>) at remote_dispatch.h:830
#14 0x00007f956894f1f5 in virNetServerProgramDispatchCall (prog=0x14cfe00, server=0x14c4ba0, client=0x14ce280, msg=0x15100e0) at rpc/virnetserverprogram.c:416
#15 virNetServerProgramDispatch (prog=0x14cfe00, server=0x14c4ba0, client=0x14ce280, msg=0x15100e0) at rpc/virnetserverprogram.c:289
#16 0x00007f956894dff1 in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=<value optimized out>) at rpc/virnetserver.c:164
#17 0x00007f956889366c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#18 0x00007f9568892f89 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#19 0x0000003043a077f1 in start_thread () from /lib64/libpthread.so.0
#20 0x00000030436e592d in clone () from /lib64/libc.so.6
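
The frames above show the restore path reusing the CPU definition read back from the managed-save image, with cpu->model still NULL by the time x86ModelFind() compares model names. As an illustrative check only (assuming this virsh build provides the save-image-dumpxml subcommand), the domain XML embedded in the save file from frame #6 can be dumped to confirm that the <cpu mode='host-model'> element carries no expanded model name:

# dump the domain XML stored inside the managed-save image and show its CPU element
virsh save-image-dumpxml /var/lib/libvirt/qemu/save/crashvm.save | grep -A 4 '<cpu'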

Comment 1 Eduardo Habkost 2012-03-07 19:47:32 UTC
Oops, fixing component.

Comment 2 Jiri Denemark 2012-03-07 21:26:08 UTC
Let's face it... it was me who messed this up and I'll take care of fixing it :-)

Comment 3 Eric Blake 2012-03-09 16:04:10 UTC
This bug will also affect live snapshot, followed by stopping the VM, followed by reverting to the snapshot, if the VM had a host cpu passthrough.  Basically, any time we restart a qemu process based on state from a previously running qemu, we have to ensure that we are restarting the new qemu with the same cpu setup.
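
A rough sketch of the snapshot variant described above, reusing the crashvm guest from the reproducer (the virsh subcommands are standard, but this exact sequence is an assumption and has not been verified in this report):

# snapshot the running host-model guest, stop it, then revert to the memory snapshot
virsh snapshot-create-as crashvm snap1
virsh destroy crashvm
virsh snapshot-revert crashvm snap1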

Comment 4 Jiri Denemark 2012-03-13 07:14:27 UTC
Fixed upstream by v0.9.10-187-g041109a and sent for internal review: http://post-office.corp.redhat.com/archives/rhvirt-patches/2012-March/msg01174.html

Comment 8 dyuan 2012-03-14 02:43:32 UTC
Verified PASS with libvirt-0.9.10-5.el6, reproduced with libvirt-0.9.10-4.el6.

# sh -x crash-libvirt.sh 
+ cat
+ virsh define /tmp/crashvm.xml
Domain crashvm defined from /tmp/crashvm.xml

+ virsh start crashvm
Domain crashvm started

+ virsh managedsave crashvm

Domain crashvm state saved by libvirt

+ virsh start crashvm
Domain crashvm started

Comment 10 errata-xmlrpc 2012-06-20 06:49:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html

