Bug 690805

Summary: libvirtd crashes when resuming a VM
Product: Red Hat Enterprise Linux 6
Component: libvirt
Version: 6.1
Hardware: x86_64
OS: All
Status: CLOSED DUPLICATE
Severity: urgent
Priority: urgent
Reporter: IBM Bug Proxy <bugproxy>
Assignee: Daniel Veillard <veillard>
QA Contact: Virtualization Bugs <virt-bugs>
CC: balkov, eblake, jkachuck, jyang, yoyzhang
Target Milestone: rc
Doc Type: Bug Fix
Last Closed: 2011-03-25 13:11:30 UTC

Attachments: proposed patch to fix the crash (flags: none)

Description IBM Bug Proxy 2011-03-25 12:51:33 UTC
---Problem Description---
libvirtd from RHEL 6.1 Beta crashes when trying to resume a previously suspended VM (e.g. 
during a reboot). 
  
---uname output---
Linux c7b2 2.6.32-122.el6.x86_64 #1 SMP Wed Mar 9 23:54:34 EST 2011 x86_64 x86_64 
x86_64 GNU/Linux
 
Machine Type = IBM HS22 blade server 
  
---Steps to Reproduce---
1. Suspend a VM, e.g. as is done automatically during a reboot.
2. Try to resume the VM, e.g. with virt-manager (clicking "Restore").

virt-manager loses its connection to libvirtd, and a message such as "connection lost" is
displayed. A message about the crash, similar to the one below, is logged in
/var/log/messages:

Mar 24 13:50:16 c7b2 kernel: libvirtd[27474]: segfault at 0 ip 00000032f4f25d1f sp 
00007f3b3e717c18 error 4 in libc-2.12.so[32f4e00000+187000]
Mar 24 13:50:16 c7b2 abrt[28104]: saved core dump of pid 27470 (/usr/sbin/libvirtd) to 
/var/spool/abrt/ccpp-1300971016-27470.new/coredump (76816384 bytes)
Mar 24 13:50:16 c7b2 abrtd: Directory 'ccpp-1300971016-27470' creation detected
Mar 24 13:50:16 c7b2 abrtd: Crash is in database already (dup of /var/spool/abrt/ccpp-
1300967188-2345)
Mar 24 13:50:16 c7b2 abrtd: Deleting crash ccpp-1300971016-27470 (dup of ccpp-1300967188-
2345), sending dbus signal

Running gdb -c /var/spool/abrt/ccpp-1300967188-2345/coredump and issuing "thread apply
all backtrace" yields the following:

Core was generated by `libvirtd --daemon'.
Program terminated with signal 11, Segmentation fault.
#0  __strlen_sse42 () at ../sysdeps/x86_64/multiarch/strlen.S:54
54              pcmpeqb (%rdi), %xmm1
Missing separate debuginfos, use: debuginfo-install avahi-libs-0.6.25-9.el6.x86_64 device-
mapper-libs-1.02.62-2.el6.x86_64 libcom_err-1.41.12-7.el6.x86_64 libcurl-7.19.7-22.el6.x86_64 
libudev-147-2.34.el6.x86_64
(gdb) thread apply all backtrace

Thread 7 (Thread 0x7f8e79aeb700 (LWP 2347)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x0000003b3aa40ae6 in virCondWait (c=<value optimized out>, m=<value optimized 
out>) at util/threads-pthread.c:108
#2  0x000000000041c545 in qemudWorker (data=0x7f8e740008c0) at libvirtd.c:1561
#3  0x00000032f56077e1 in start_thread (arg=0x7f8e79aeb700) at pthread_create.c:301
#4  0x00000032f4ee5dcd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 6 (Thread 0x7f8e790ea700 (LWP 2348)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x0000003b3aa40ae6 in virCondWait (c=<value optimized out>, m=<value optimized 
out>) at util/threads-pthread.c:108
#2  0x000000000041c545 in qemudWorker (data=0x7f8e740008d8) at libvirtd.c:1561
#3  0x00000032f56077e1 in start_thread (arg=0x7f8e790ea700) at pthread_create.c:301
#4  0x00000032f4ee5dcd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7f8e7a4ec700 (LWP 2346)):
#0  0x00000032f4edc6c3 in __poll (fds=<value optimized out>, nfds=<value optimized out>, 
timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x0000000000418a5d in virEventRunOnce () at event.c:595
#2  0x000000000041b3d9 in qemudOneLoop () at libvirtd.c:2238
#3  0x000000000041b897 in qemudRunLoop (opaque=0x7af640) at libvirtd.c:2348
#4  0x00000032f56077e1 in start_thread (arg=0x7f8e7a4ec700) at pthread_create.c:301
#5  0x00000032f4ee5dcd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 4 (Thread 0x7f8e735fe700 (LWP 2350)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x0000003b3aa40ae6 in virCondWait (c=<value optimized out>, m=<value optimized 
out>) at util/threads-pthread.c:108
#2  0x000000000041c545 in qemudWorker (data=0x7f8e74000908) at libvirtd.c:1561
#3  0x00000032f56077e1 in start_thread (arg=0x7f8e735fe700) at pthread_create.c:301
#4  0x00000032f4ee5dcd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7f8e73fff700 (LWP 2349)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x0000003b3aa40ae6 in virCondWait (c=<value optimized out>, m=<value optimized 
out>) at util/threads-pthread.c:108
#2  0x000000000041c545 in qemudWorker (data=0x7f8e740008f0) at libvirtd.c:1561
#3  0x00000032f56077e1 in start_thread (arg=0x7f8e73fff700) at pthread_create.c:301
#4  0x00000032f4ee5dcd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7f8e815ca800 (LWP 2345)):
#0  0x00000032f560803d in pthread_join (threadid=140249914066688, thread_return=0x0) at 
pthread_join.c:89
#1  0x000000000041fa68 in main (argc=<value optimized out>, argv=<value optimized 
out>) at libvirtd.c:3333

Thread 1 (Thread 0x7f8e72bfd700 (LWP 2351)):
#0  __strlen_sse42 () at ../sysdeps/x86_64/multiarch/strlen.S:54
#1  0x00000032f72062fd in audit_encode_nv_string (name=0x4ada0d "path", value=0x0, 
vlen=0) at audit_logging.c:119
#2  0x0000000000462214 in qemuAuditNetDevice (vmDef=0x7f8e6408edd0, 
netDef=0x7f8e6408f140, device=0x0, success=false) at qemu/qemu_audit.c:157
#3  0x0000000000454080 in qemuPhysIfaceConnect (def=0x7f8e6408edd0, 
conn=0x7f8e5c0009a0, driver=0x7d66d0, net=0x7f8e6408f140, 
    qemuCmdFlags=<value optimized out>, vmop=<value optimized out>) at 
qemu/qemu_command.c:131
#4  0x000000000045d957 in qemuBuildCommandLine (conn=0x7f8e5c0009a0, driver=<value 
optimized out>, def=0x7f8e6408edd0, monitor_chr=0x7f8e00000000, 
    monitor_json=255, qemuCmdFlags=<value optimized out>, migrateFrom=0x4ae383 
"stdio", migrateFd=20, current_snapshot=0x0, vmop=VIR_VM_OP_RESTORE)
    at qemu/qemu_command.c:3447
#5  0x0000000000448c35 in qemudStartVMDaemon (conn=0x7f8e5c0009a0, 
driver=0x7d66d0, vm=0x9560d0, migrateFrom=0x4ae383 "stdio", start_paused=true, 
    stdin_fd=20, stdin_path=0x7f8e640a0f00 "/var/lib/libvirt/qemu/save/testvm1.save", 
vmop=VIR_VM_OP_RESTORE) at qemu/qemu_driver.c:3230
#6  0x000000000044bd01 in qemudDomainSaveImageStartVM (conn=0x7f8e5c0009a0, 
driver=0x7d66d0, vm=0x9560d0, fd=0x7f8e72bfc8fc, 
    read_pid=0x7f8e72bfc8f8, header=0x7f8e72bfc900, path=0x7f8e640a0f00 
"/var/lib/libvirt/qemu/save/testvm1.save") at qemu/qemu_driver.c:6218
#7  0x000000000044c5d1 in qemudDomainObjRestore (conn=0x7f8e5c0009a0, 
driver=0x7d66d0, vm=0x9560d0, start_paused=false) at qemu/qemu_driver.c:6384
#8  qemudDomainObjStart (conn=0x7f8e5c0009a0, driver=0x7d66d0, vm=0x9560d0, 
start_paused=false) at qemu/qemu_driver.c:6637
#9  0x000000000044ca15 in qemudDomainStartWithFlags (dom=0x7f8e64003e60, flags=0) at 
qemu/qemu_driver.c:6693
#10 0x0000003b3aa95066 in virDomainCreate (domain=0x7f8e64003e60) at libvirt.c:5130
#11 0x0000000000429e28 in remoteDispatchDomainCreate (server=<value optimized out>, 
client=<value optimized out>, conn=0x7f8e5c0009a0, 
    hdr=<value optimized out>, rerr=0x7f8e72bfcb90, args=<value optimized out>, 
ret=0x7f8e72bfcc70) at remote.c:1225
#12 0x000000000042c8ca in remoteDispatchClientCall (server=0x7af640, 
client=0x7f8e74001330, msg=0x7f8e740014b0) at dispatch.c:530
#13 remoteDispatchClientRequest (server=0x7af640, client=0x7f8e74001330, 
msg=0x7f8e740014b0) at dispatch.c:408
#14 0x000000000041c5d8 in qemudWorker (data=0x7f8e74000920) at libvirtd.c:1582
#15 0x00000032f56077e1 in start_thread (arg=0x7f8e72bfd700) at pthread_create.c:301
#16 0x00000032f4ee5dcd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) 

 
---System Management Component Data--- 
Userspace tool common name: libvirtd 
 
System management type: kvm 
 
The userspace tool has the following bit modes: 64bit 

Userspace rpm: libvirt-0.8.7-11.el6.x86_64 

The problem occurs when using the direct interface and macvtap.

Comment 1 IBM Bug Proxy 2011-03-25 12:51:40 UTC
Created attachment 487557 [details]
proposed patch to fix the crash

Comment 2 IBM Bug Proxy 2011-03-25 13:02:04 UTC
------- Comment From jens.com 2011-03-25 08:56 EDT-------
This has been fixed upstream with

commit 7cc101ce0e8e1929f6573c3bee3ec2e287304513
Author: Laine Stump <laine>
Date:   Mon Mar 14 11:15:19 2011 -0400

audit: eliminate potential null pointer deref when auditing macvtap devices

The newly added call to qemuAuditNetDevice in qemuPhysIfaceConnect was
assuming that res_ifname (the name of the macvtap device) was always
valid, but this isn't the case. If openMacvtapTap fails, it always
returns NULL, which would result in a segv.

Since the audit log only needs a record of devices that are actually
sent to qemu, and a failure to open the macvtap device means that no
device will be sent to qemu, we can solve this problem by only doing
the audit if openMacvtapTap is successful (in which case res_ifname is
guaranteed valid).

Please include this fix in RHEL 6.1. Thanks!

Comment 4 Osier Yang 2011-03-25 13:11:30 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=688860#c6

This has already been fixed in libvirt-0.8.7-12.

*** This bug has been marked as a duplicate of bug 688860 ***

Comment 5 Osier Yang 2011-03-25 13:17:50 UTC
(In reply to comment #1)
> Created attachment 487557 [details]
> proposed patch to fix the crash

Hi, Jens,

The patch looks sensible; could you propose it to libvir-list? We can carry it if it's not convenient for you.

Regards
Osier

Comment 6 Eric Blake 2011-03-25 14:25:54 UTC
(In reply to comment #5)
> (In reply to comment #1)
> > Created attachment 487557 [details]
> > proposed patch to fix the crash
> 
> Hi, Jens,
> 
> The patch looks sensible; could you propose it to
> libvir-list? We can carry it if it's not convenient for you.
> 
> Regards
> Osier

Actually, I'm wondering if a better patch would be to mark virAuditEnclose as ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2), since we should never be passing it NULL arguments in the first place.