Bug 995312
Summary: | Libvirtd crashed after attaching a qemu-kvm process using qemu-attach | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Hu Jianwei <jiahu> | |
Component: | libvirt | Assignee: | Eric Blake <eblake> | |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 6.5 | CC: | acathrow, dallan, dyuan, eblake, hannsj_uhl, hliu, jdenemar, laine, mzhan, tlavigne | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | libvirt-0.10.2-25.el6 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1010617 | Environment: | |
Last Closed: | 2013-11-21 09:07:54 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1010617 |
Description
Hu Jianwei
2013-08-09 03:43:59 UTC
Posted core dump log.

part one:
Program terminated with signal 11, Segmentation fault.
#0 0x0000003c9480b43c in ?? ()
(gdb) t a a bt

Thread 11 (Thread 28024):
#0 0x0000003c944df253 in ?? ()
#1 0x00000140617b15ad in ?? ()
#2 0x0000003c00000b8e in ?? ()
#3 0x0000000000000009 in ?? ()
#4 0x00000000007a3660 in ?? ()
#5 0x00000140617b15ad in ?? ()
#6 0x0000003cabc513cc in virEventPollRunOnce () at util/event_poll.c:615
#7 0x0000003cabc50607 in virEventRunDefaultImpl () at util/event.c:247
#8 0x0000003cabd40d0d in virNetServerRun (srv=0x79e0d0) at rpc/virnetserver.c:748
#9 0x0000000000423bc7 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1228

Thread 10 (Thread 28026):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 9 (Thread 28027):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 8 (Thread 28028):
#0 0x0000003c94527a9a in ?? ()
#1 0x000000000047b651 in qemuCapsGetCanonicalMachine (caps=0x7fffe4128a10, name=0x0) at qemu/qemu_capabilities.c:1890
#2 0x0000000000456330 in qemuCanonicalizeMachine (def=0x7fffd0000af0, caps=<value optimized out>) at qemu/qemu_driver.c:1537
#3 0x0000000000456566 in qemuDomainAttach (conn=0x7fffe4197250, pid_value=20837, flags=<value optimized out>) at qemu/qemu_driver.c:13198
#4 0x0000003c96000d27 in virDomainQemuAttach (conn=0x7fffe4197250, pid_value=<value optimized out>, flags=0) at libvirt-qemu.c:176
#5 0x00000000004265f0 in qemuDispatchDomainAttach (server=<value optimized out>, client=0x79a780, msg=<value optimized out>, rerr=0x7ffff0d3fb80, args=0x7fffd0000aa0, ret=0x7fffd0000ac0) at qemu_dispatch.h:111
#6 qemuDispatchDomainAttachHelper (server=<value optimized out>, client=0x79a780, msg=<value optimized out>, rerr=0x7ffff0d3fb80, args=0x7fffd0000aa0, ret=0x7fffd0000ac0) at qemu_dispatch.h:91
#7 0x0000003cabd401e2 in virNetServerProgramDispatchCall (prog=0x79f4b0, server=0x79e0d0, client=0x79a780, msg=0x7a31a0) at rpc/virnetserverprogram.c:431
#8 virNetServerProgramDispatch (prog=0x79f4b0, server=0x79e0d0, client=0x79a780, msg=0x7a31a0) at rpc/virnetserverprogram.c:304
#9 0x0000003cabd414ce in virNetServerProcessMsg (srv=<value optimized out>, client=0x79a780, prog=<value optimized out>, msg=0x7a31a0) at rpc/virnetserver.c:170
#10 0x0000003cabd41b6c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x79e0d0) at rpc/virnetserver.c:191
#11 0x0000003cabc63e9c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#12 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#13 0x0000003c94807851 in ?? ()
#14 0x00007ffff0d40700 in ?? ()
#15 0x0000000000000000 in ?? ()

Thread 7 (Thread 28029):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()
---Type <return> to continue, or q <return> to quit---

part two:
Program terminated with signal 11, Segmentation fault.
#0 0x0000003c9480b43c in ?? ()
(gdb) t a a bt

Thread 11 (Thread 28024):
#0 0x0000003c9480e4ed in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 10 (Thread 28026):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 9 (Thread 28027):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 8 (Thread 28028):
#0 0x0000003c94527a9a in ?? ()
#1 0x000000000047b651 in qemuCapsGetCanonicalMachine (caps=0x7fffe4128a10, name=0x0) at qemu/qemu_capabilities.c:1890
#2 0x0000000000456330 in qemuCanonicalizeMachine (def=0x7fffd0000af0, caps=<value optimized out>) at qemu/qemu_driver.c:1537
#3 0x0000000000456566 in qemuDomainAttach (conn=0x7fffe4197250, pid_value=20837, flags=<value optimized out>) at qemu/qemu_driver.c:13198
#4 0x0000003c96000d27 in virDomainQemuAttach (conn=0x7fffe4197250, pid_value=<value optimized out>, flags=0) at libvirt-qemu.c:176
#5 0x00000000004265f0 in qemuDispatchDomainAttach (server=<value optimized out>, client=0x79a780, msg=<value optimized out>, rerr=0x7ffff0d3fb80, args=0x7fffd0000aa0, ret=0x7fffd0000ac0) at qemu_dispatch.h:111
#6 qemuDispatchDomainAttachHelper (server=<value optimized out>, client=0x79a780, msg=<value optimized out>, rerr=0x7ffff0d3fb80, args=0x7fffd0000aa0, ret=0x7fffd0000ac0) at qemu_dispatch.h:91
#7 0x0000003cabd401e2 in virNetServerProgramDispatchCall (prog=0x79f4b0, server=0x79e0d0, client=0x79a780, msg=0x7a31a0) at rpc/virnetserverprogram.c:431
#8 virNetServerProgramDispatch (prog=0x79f4b0, server=0x79e0d0, client=0x79a780, msg=0x7a31a0) at rpc/virnetserverprogram.c:304
#9 0x0000003cabd414ce in virNetServerProcessMsg (srv=<value optimized out>, client=0x79a780, prog=<value optimized out>, msg=0x7a31a0) at rpc/virnetserver.c:170
#10 0x0000003cabd41b6c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x79e0d0) at rpc/virnetserver.c:191
#11 0x0000003cabc63e9c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#12 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#13 0x0000003c94807851 in ?? ()
#14 0x00007ffff0d40700 in ?? ()
#15 0x0000000000000000 in ?? ()

Thread 7 (Thread 28029):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 6 (Thread 28030):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()

Thread 5 (Thread 28031):
#0 0x0000003c9480b43c in ?? ()
#1 0x0000000000000000 in ?? ()
---Type <return> to continue, or q <return> to quit---

Updated core dump logs, please ignore comment 2.

Part one:
Core was generated by `/usr/sbin/libvirtd'.
Program terminated with signal 11, Segmentation fault.
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
183 62: movl (%rsp), %edi
(gdb) t a a bt

Thread 11 (Thread 0x7ffff7fd0860 (LWP 26304)):
#0 0x00007ffff64b6263 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x0000003cabc513cc in virEventPollRunOnce () at util/event_poll.c:615
#2 0x0000003cabc50607 in virEventRunDefaultImpl () at util/event.c:247
#3 0x0000003cabd40d0d in virNetServerRun (srv=0x79e0d0) at rpc/virnetserver.c:748
#4 0x0000000000423bc7 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1228

Thread 10 (Thread 0x7fffecf71700 (LWP 26305)):
#0 __strcmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp.S:260
#1 0x000000000047b651 in qemuCapsGetCanonicalMachine (caps=0x7fffe00f9440, name=0x0) at qemu/qemu_capabilities.c:1890
#2 0x0000000000456330 in qemuCanonicalizeMachine (def=0x7fffd4000a30, caps=<value optimized out>) at qemu/qemu_driver.c:1537
#3 0x0000000000456566 in qemuDomainAttach (conn=0x7fffdc000c20, pid_value=20837, flags=<value optimized out>) at qemu/qemu_driver.c:13198
#4 0x0000003c96000d27 in virDomainQemuAttach (conn=0x7fffdc000c20, pid_value=<value optimized out>, flags=0) at libvirt-qemu.c:176
#5 0x00000000004265f0 in qemuDispatchDomainAttach (server=<value optimized out>, client=0x7a2f60, msg=<value optimized out>, rerr=0x7fffecf70b80, args=0x7fffd40008c0, ret=0x7fffd40008e0) at qemu_dispatch.h:111
#6 qemuDispatchDomainAttachHelper (server=<value optimized out>, client=0x7a2f60, msg=<value optimized out>, rerr=0x7fffecf70b80, args=0x7fffd40008c0, ret=0x7fffd40008e0) at qemu_dispatch.h:91
#7 0x0000003cabd401e2 in virNetServerProgramDispatchCall (prog=0x79f4b0, server=0x79e0d0, client=0x7a2f60, msg=0x7a1c40) at rpc/virnetserverprogram.c:431
#8 virNetServerProgramDispatch (prog=0x79f4b0, server=0x79e0d0, client=0x7a2f60, msg=0x7a1c40) at rpc/virnetserverprogram.c:304
#9 0x0000003cabd414ce in virNetServerProcessMsg (srv=<value optimized out>, client=0x7a2f60, prog=<value optimized out>, msg=0x7a1c40) at rpc/virnetserver.c:170
#10 0x0000003cabd41b6c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x79e0d0) at rpc/virnetserver.c:191
#11 0x0000003cabc63e9c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#12 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#13 0x00007ffff6b799d1 in start_thread (arg=0x7fffecf71700) at pthread_create.c:301
#14 0x00007ffff64bfa8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 9 (Thread 0x7fffec570700 (LWP 26306)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#1 0x0000003cabc63966 in virCondWait (c=<value optimized out>, m=<value optimized out>) at util/threads-pthread.c:117
#2 0x0000003cabc63f33 in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:103
#3 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#4 0x00007ffff6b799d1 in start_thread (arg=0x7fffec570700) at pthread_create.c:301
#5 0x00007ffff64bfa8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 8 (Thread 0x7fffebb6f700 (LWP 26307)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#1 0x0000003cabc63966 in virCondWait (c=<value optimized out>, m=<value optimized out>) at util/threads-pthread.c:117
#2 0x0000003cabc63f33 in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:103
#3 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#4 0x00007ffff6b799d1 in start_thread (arg=0x7fffebb6f700) at pthread_create.c:301
#5 0x00007ffff64bfa8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 7 (Thread 0x7fffeb16e700 (LWP 26308)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
---Type <return> to continue, or q <return> to quit---

Part two:
Core was generated by `/usr/sbin/libvirtd'.
Program terminated with signal 11, Segmentation fault.
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
183 62: movl (%rsp), %edi
(gdb) t a a bt

Thread 11 (Thread 0x7ffff7fd0860 (LWP 26304)):
#0 0x00007ffff6b806fd in write () at ../sysdeps/unix/syscall-template.S:82
#1 0x0000003cabc6925e in safewrite (fd=8, buf=0x79a770, count=116) at util/util.c:130
#2 0x0000003cabc57750 in virLogOutputToFd (category=<value optimized out>, priority=<value optimized out>, funcname=<value optimized out>, linenr=<value optimized out>, timestamp=<value optimized out>, flags=0, str=0x7a2ef0 "26304: debug : virKeepAliveMessage:116 : Sending keepalive request to client 0x7a2f60\n", data=0x8) at util/logging.c:846
#3 0x0000003cabc580ef in virLogVMessage (category=0x3cabddda44 "file.rpc/virkeepalive.c", priority=<value optimized out>, funcname=0x3cabddde30 "virKeepAliveMessage", linenr=116, flags=0, fmt=<value optimized out>, vargs=0x7fffffffd5c0) at util/logging.c:781
#4 0x0000003cabc5829c in virLogMessage (category=<value optimized out>, priority=<value optimized out>, funcname=<value optimized out>, linenr=<value optimized out>, flags=<value optimized out>, fmt=<value optimized out>) at util/logging.c:688
#5 0x0000003cabd4c904 in virKeepAliveMessage (ka=0x7a36b0, proc=1) at rpc/virkeepalive.c:116
#6 0x0000003cabd4ca9a in virKeepAliveTimerInternal (ka=0x7a36b0, msg=0x7fffffffd768) at rpc/virkeepalive.c:161
#7 0x0000003cabd4d18c in virKeepAliveTimer (timer=<value optimized out>, opaque=0x7a36b0) at rpc/virkeepalive.c:179
#8 0x0000003cabc515f8 in virEventPollDispatchTimeouts () at util/event_poll.c:435
#9 virEventPollRunOnce () at util/event_poll.c:628
#10 0x0000003cabc50607 in virEventRunDefaultImpl () at util/event.c:247
#11 0x0000003cabd40d0d in virNetServerRun (srv=0x79e0d0) at rpc/virnetserver.c:748
#12 0x0000000000423bc7 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1228

Thread 10 (Thread 0x7fffecf71700 (LWP 26305)):
#0 __strcmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp.S:260
#1 0x000000000047b651 in qemuCapsGetCanonicalMachine (caps=0x7fffe00f9440, name=0x0) at qemu/qemu_capabilities.c:1890
#2 0x0000000000456330 in qemuCanonicalizeMachine (def=0x7fffd4000a30, caps=<value optimized out>) at qemu/qemu_driver.c:1537
#3 0x0000000000456566 in qemuDomainAttach (conn=0x7fffdc000c20, pid_value=20837, flags=<value optimized out>) at qemu/qemu_driver.c:13198
#4 0x0000003c96000d27 in virDomainQemuAttach (conn=0x7fffdc000c20, pid_value=<value optimized out>, flags=0) at libvirt-qemu.c:176
#5 0x00000000004265f0 in qemuDispatchDomainAttach (server=<value optimized out>, client=0x7a2f60, msg=<value optimized out>, rerr=0x7fffecf70b80, args=0x7fffd40008c0, ret=0x7fffd40008e0) at qemu_dispatch.h:111
#6 qemuDispatchDomainAttachHelper (server=<value optimized out>, client=0x7a2f60, msg=<value optimized out>, rerr=0x7fffecf70b80, args=0x7fffd40008c0, ret=0x7fffd40008e0) at qemu_dispatch.h:91
#7 0x0000003cabd401e2 in virNetServerProgramDispatchCall (prog=0x79f4b0, server=0x79e0d0, client=0x7a2f60, msg=0x7a1c40) at rpc/virnetserverprogram.c:431
#8 virNetServerProgramDispatch (prog=0x79f4b0, server=0x79e0d0, client=0x7a2f60, msg=0x7a1c40) at rpc/virnetserverprogram.c:304
#9 0x0000003cabd414ce in virNetServerProcessMsg (srv=<value optimized out>, client=0x7a2f60, prog=<value optimized out>, msg=0x7a1c40) at rpc/virnetserver.c:170
#10 0x0000003cabd41b6c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x79e0d0) at rpc/virnetserver.c:191
#11 0x0000003cabc63e9c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
#12 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#13 0x00007ffff6b799d1 in start_thread (arg=0x7fffecf71700) at pthread_create.c:301
#14 0x00007ffff64bfa8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 9 (Thread 0x7fffec570700 (LWP 26306)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#1 0x0000003cabc63966 in virCondWait (c=<value optimized out>, m=<value optimized out>) at util/threads-pthread.c:117
#2 0x0000003cabc63f33 in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:103
#3 0x0000003cabc63789 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
#4 0x00007ffff6b799d1 in start_thread (arg=0x7fffec570700) at pthread_create.c:301
---Type <return> to continue, or q <return> to quit---

Upstream libvirt fails in a different manner:
error: XML error: No PCI buses available
without crashing; but I can definitely reproduce the issue with the RHEL build, and am investigating how easy or hard it will be to patch.

(In reply to Eric Blake from comment #5)
> Upstream libvirt fails in a different manner:
> error: XML error: No PCI buses available

Without looking at the code that does the qemu-attach, my initial guess about the upstream behavior you've found is that when the virDomainDef is created, it doesn't get a proper machine type set, so no pci-root device is added, thus there are no PCI buses.
(the code that limits which machine types get a PCI bus doesn't exist in 0.10.2, which just assumes that *all* machine types have a PCI bus)

(In reply to Laine Stump from comment #6)
> (In reply to Eric Blake from comment #5)
> > Upstream libvirt fails in a different manner:
> > error: XML error: No PCI buses available
>
> Without looking at the code that does the qemu-attach, my initial guess
> about the upstream behavior you've found is that when the virDomainDef is
> created, it doesn't get a proper machine type set, so no pci-root device is
> added, thus there are no PCI buses. (the code that limits which machine
> types get a PCI bus doesn't exist in 0.10.2, which just assumes that *all*
> machine types have a PCI bus)

Thanks; that's matching my findings as well - upstream 'qemu-kvm' is now a wrapper that invokes 'qemu-system-x86_64 -machine accel=kvm', and we are horribly misparsing that (leaving it as a machine type of xen!). I also tracked the NULL deref on RHEL to a place where we assume def->os.machine will be set; it looks like our native-to-xml parser violates several assumptions of variables that will always be set.

Patches starting to take shape here
https://www.redhat.com/archives/libvir-list/2013-August/msg01424.html

But I still don't have upstream working, let alone an idea of how invasive backports will be to avoid downstream crashing; I may end up doing a RHEL-only patch that just avoids the strcmp(NULL) rather than backports from upstream.

Progress! After backporting this commit [1], I now see symptoms similar to what I just patched upstream (libvirtd no longer crashes, but is dying with "An error occurred, but the cause is unknown", and leaving an unremovable shutoff transient domain behind). As soon as this upstream v2 patch is approved [2], I can finish my backport; it _still_ won't allow virsh qemu-attach to WORK, but at least it won't be crashing libvirtd and won't be leaving unremovable domains behind.
And since virsh qemu-attach is officially unsupported (it requires the use of libvirt-qemu.so), that's at least good enough to guarantee that someone using unsupported code won't get libvirtd into a funky state. Looks like I can get things in POST before another week elapses.

[1] commit 1851a0c8640f0b42e20a2ccfd5cdb9a9409517ec
Author: Guannan Ren <gren>
Date: Thu Nov 1 19:43:03 2012 +0800

    qemu: use default machine type if missing it in qemu command line

    BZ:https://bugzilla.redhat.com/show_bug.cgi?id=871273

    when using virsh qemu-attach to attach an existing qemu process, if it misses the -M option in qemu command line, libvirtd crashed because the NULL value of def->os.machine in later use.

    Example:
    /usr/libexec/qemu-kvm -name foo \
    -cdrom /var/lib/libvirt/images/boot.img \
    -monitor unix:/tmp/demo,server,nowait \

    error: End of file while reading data: Input/output error
    error: Failed to reconnect to the hypervisor

    This patch tries to set default machine type if the value of def->os.machine is still NULL after qemu command line parsing.

[2] https://www.redhat.com/archives/libvir-list/2013-September/msg00364.html

Verified this fix: the libvirt daemon doesn't crash and no VM is leaked on qemu-attach.

Packages:
libvirt-client-0.10.2-26.el6.x86_64

Verify steps:
1. Prepare test environment same as [1]
2. Run qemu-attach.
# virsh qemu-attach 3721
error: An error occurred, but the cause is unknown
3. Check daemon status
# service libvirtd status
libvirtd (pid 3616) is running...
4. Verify no vm leak in [2]
# virsh list --all
 Id Name State
----------------------------------------------------

[1] https://bugzilla.redhat.com/show_bug.cgi?id=995312#c0
[2] http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-September/msg00229.html

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html