| Summary: | Segfault during peer2peer migration |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | libvirt |
| Version: | 6.2 |
| Hardware: | x86_64 |
| OS: | Linux |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | unspecified |
| Reporter: | Rami Vaknin <rvaknin> |
| Assignee: | Daniel Berrangé <berrange> |
| QA Contact: | Virtualization Bugs <virt-bugs> |
| CC: | ajia, dallan, dyuan, iheim, mgoldboi, mshao, mzhan, rwu, veillard, weizhan, yeylon |
| Target Milestone: | rc |
| Target Release: | --- |
| Fixed In Version: | libvirt-0.9.3-6.el6 |
| Doc Type: | Bug Fix |
| Last Closed: | 2011-12-06 11:16:46 UTC |
Okay, the last operations logged in the libvirtd.log dump are:

12:17:35.880: 15033: debug : virEventPollDispatchHandles:454 : i=22 w=103
12:17:35.880: 15033: debug : virEventPollDispatchHandles:467 : Dispatch n=22 f=24 w=103 e=1 0x7f57f80f2ee0
12:17:35.880: 15034: debug : virEventPollRemoveHandle:184 : mark delete 22 24
12:17:35.880: 15034: debug : virEventPollInterruptLocked:677 : Interrupting
12:17:35.880: 15034: debug : virNetSocketFree:627 : sock=0x7f57f80f2ee0 fd=24

virNetSocketFree performs a few free() calls, and a short while later a malloc() made from the debug logging in virEventPollRemoveTimeout() crashes. My guess is that the frees in virNetSocketFree either corrupted the heap directly, or released data that was still in use, leading to corruption of the heap's internal data structures.

Daniel

Created attachment 513720 [details]
libvirtd.log
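To make the suspected mechanism concrete, here is a minimal sketch in C (hypothetical code, not libvirt source; the struct handle, remove_handle and dispatch names are made up): an event loop keeps a pointer to per-handle state, the owner frees that state as soon as the handle is merely marked for removal, and a dispatch that is still using the pointer then works on freed memory, so a later, unrelated malloc() dies on corrupted heap metadata.

/* use_after_free_sketch.c -- hypothetical illustration only, not libvirt code */
#include <stdlib.h>

struct handle {
    void (*cb)(void *opaque);   /* callback registered with the event loop */
    void *opaque;               /* per-handle state owned by the caller    */
    int deleted;                /* removal is only *marked*; slot cleanup is deferred */
};

static struct handle handles[16];

/* Caller-side removal: the slot is only marked deleted ...            */
static void remove_handle(int i)
{
    handles[i].deleted = 1;
    free(handles[i].opaque);    /* ... but the state is freed right away */
}

/* Event-loop side: dispatch pending callbacks. */
static void dispatch(void)
{
    for (int i = 0; i < 16; i++) {
        if (!handles[i].cb || handles[i].deleted)
            continue;
        /* If another thread runs remove_handle(i) after the check above,
         * or while the callback is still executing, the callback operates
         * on memory that free() has already returned to the allocator.
         * Writes into it can corrupt malloc's chunk headers, so a later,
         * unrelated malloc() -- e.g. the one behind a debug log message --
         * crashes inside _int_malloc, as seen in this bug's core dump. */
        handles[i].cb(handles[i].opaque);
    }
}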
Tried to reproduce the bug with the following package versions and steps:
kernel-2.6.32-166.el6.x86_64
libvirt-0.9.3-5.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
1. Build a TLS environment from the source host to the target host:
server: target
client: source
2. Build a TLS environment from the source host back to itself, using the same cacert.pem (the expected certificate layout is sketched below):
server: source
client: source
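For reference, both hosts act as TLS server and client here, so each needs a server and a client certificate signed by the shared CA. A sketch of the standard libvirt certificate layout (the generation commands are omitted and may differ per setup):

/etc/pki/CA/cacert.pem                    shared CA certificate (same file on both hosts)
/etc/pki/libvirt/servercert.pem           this host's server certificate
/etc/pki/libvirt/private/serverkey.pem    this host's server private key
/etc/pki/libvirt/clientcert.pem           this host's client certificate
/etc/pki/libvirt/private/clientkey.pem    this host's client private key

libvirtd must also be started with --listen (as in the `libvirtd --daemon --listen` command line seen in the core dump) and with TLS listening enabled in /etc/libvirt/libvirtd.conf (listen_tls, which defaults to on).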
3. On the source host, test the TLS environment:
# virsh -c qemu+tls://{target ip}/system
Welcome to virsh, the virtualization interactive terminal.
Type: 'help' for help with commands
'quit' to quit
virsh # exit
# virsh -c qemu+tls://{source ip}/system
Welcome to virsh, the virtualization interactive terminal.
Type: 'help' for help with commands
'quit' to quit
virsh # exit
4. Start a guest on the target host whose image is on a shared NFS export mounted on both hosts:
# setsebool virt_use_nfs 1
# iptables -F
5. On the source host, perform the migration with the following command:
# virsh -c qemu+tls://{target ip}/system migrate --p2p guest_name qemu+tls://{source ip}/system
error: Cannot recv data: Input/output error
On the target host:
# service libvirtd status
libvirtd dead but pid file exists
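Whether libvirtd really segfaulted on the target (rather than merely exiting) can be confirmed from the kernel log; a sketch, assuming default RHEL 6 logging:

# dmesg | grep -i segfault
# grep -i 'libvirtd.*segfault' /var/log/messages

A hit should look like the `libvirtd[15034]: segfault at ...` line quoted in the original report.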
It seems that attaching gdb to libvirtd on the target before running the virsh migration would let us catch the crash and confirm it is the same bug. Could you try that and post the stack trace? Please also keep the current libvirtd.log on the target, as it should contain a lot of debug information.

Thanks,
Daniel

Created attachment 513750 [details]
part of libvirtd.log
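For reference, a trace like the one below can be captured roughly as follows (a sketch; the exact debuginfo package set is an assumption, and the SIGPIPE handling is optional):

# debuginfo-install libvirt gnutls glibc
# gdb -p $(pidof libvirtd)
(gdb) handle SIGPIPE nostop noprint pass
(gdb) continue
  ... run the virsh migrate command on the source host ...
(gdb) bt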
stack trace info:
#0 0x0000000000000000 in ?? ()
#1 0x0000003a36434150 in _gnutls_string_resize (dest=0x7f11440c3ca8, new_size=<value optimized out>) at gnutls_str.c:192
#2 0x0000003a3641a614 in _gnutls_io_read_buffered (session=0x7f11440c3000, iptr=0x7fffdb42e0a8, sizeOfPtr=5, recv_type=<value optimized out>) at gnutls_buffers.c:515
#3 0x0000003a36416031 in _gnutls_recv_int (session=0x7f11440c3000, type=GNUTLS_APPLICATION_DATA, htype=4294967295, data=0x7f11440c8498 "", sizeofdata=4) at gnutls_record.c:904
#4 0x00007f115f48e45d in virNetTLSSessionRead (sess=<value optimized out>, buf=<value optimized out>, len=<value optimized out>) at rpc/virnettlscontext.c:812
#5 0x00007f115f48a05d in virNetSocketReadWire (sock=0x7f11440c82b0, buf=0x7f11440c8498 "", len=4) at rpc/virnetsocket.c:801
#6 0x00007f115f48a2d0 in virNetSocketRead (sock=0x7f11440c82b0, buf=0x7f11440c8498 "", len=4) at rpc/virnetsocket.c:981
#7 0x00007f115f4865ed in virNetClientIOReadMessage (client=0x7f11440c8440) at rpc/virnetclient.c:717
#8 virNetClientIOHandleInput (client=0x7f11440c8440) at rpc/virnetclient.c:736
#9 0x00007f115f487dd0 in virNetClientIncomingEvent (sock=0x7f11440c82b0, events=<value optimized out>, opaque=0x7f11440c8440) at rpc/virnetclient.c:1127
#10 0x00007f115f3e56b2 in virEventPollDispatchHandles () at util/event_poll.c:469
#11 virEventPollRunOnce () at util/event_poll.c:610
#12 0x00007f115f3e4567 in virEventRunDefaultImpl () at util/event.c:247
#13 0x000000000043c9dd in virNetServerRun (srv=0x1c8e550) at rpc/virnetserver.c:662
#14 0x000000000041d828 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1552
Seems more like the stack trace of bug #722738:
https://bugzilla.redhat.com/show_bug.cgi?id=722738

The two may be related, but a priori you are hitting something different.

Daniel

This series should fix the problem:
https://www.redhat.com/archives/libvir-list/2011-July/msg01179.html

We should also add this to improve error reporting for migration:
https://www.redhat.com/archives/libvir-list/2011-July/msg01201.html

Verified on libvirt-0.9.3-7.el6.x86_64 using the automation test that had reproduced the crash several times.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html
Created attachment 513521 [details]
core file, libvirtd and vdsm logs

Environment:
RHEVM 3.0 on dev env, last commit 12b1f476f4f1a01bb86cb5687c19f4ef0f0784c6
libvirt-0.9.3-5.el6.x86_64, vdsm-4.9-81.el6.x86_64

libvirtd[15034]: segfault at 7f5700000012 ip 00000030276787de sp 00007f5803df1330 error 6 in libc-2.12.so[3027600000+187000]

Loaded symbols for /lib64/libnss_dns-2.12.so
Core was generated by `libvirtd --daemon --listen'.
Program terminated with signal 11, Segmentation fault.
#0  _int_malloc (av=0x7f57f8000020, bytes=<value optimized out>) at malloc.c:4439
4439        bck->fd = unsorted_chunks(av);
Missing separate debuginfos, use: debuginfo-install krb5-libs-1.9-9.el6.x86_64 libcurl-7.19.7-26.el6.x86_64 libgcrypt-1.4.5-5.el6.x86_64 openssl-1.0.0-10.el6.x86_64
(gdb) bt
#0  _int_malloc (av=0x7f57f8000020, bytes=<value optimized out>) at malloc.c:4439
#1  0x0000003027679add in __libc_malloc (bytes=100) at malloc.c:3660
#2  0x00000030276fff7b in __vasprintf_chk (result_ptr=0x7f5803df1628, flags=1, format=0x7f5809dd6dcf "Remove timer %d", args=0x7f5803df15e0) at vasprintf_chk.c:50
#3  0x00007f5809ceef64 in vasprintf (strp=<value optimized out>, fmt=<value optimized out>, list=<value optimized out>) at /usr/include/bits/stdio2.h:199
#4  virVasprintf (strp=<value optimized out>, fmt=<value optimized out>, list=<value optimized out>) at util/util.c:1623
#5  0x00007f5809cded75 in virLogMessage (category=0x7f5809dd6dab "file.util/event_poll.c", priority=1, funcname=0x7f5809dd7350 "virEventPollRemoveTimeout", linenr=276, flags=0, fmt=<value optimized out>) at util/logging.c:721
#6  0x00007f5809cd77a3 in virEventPollRemoveTimeout (timer=20) at util/event_poll.c:276
#7  0x00007f5809d0fd06 in virDomainEventStateFree (state=0x7f57f80d0420) at conf/domain_event.c:556
#8  0x00007f5809d637e3 in doRemoteClose (conn=<value optimized out>, priv=0x7f57f80f35e0) at remote/remote_driver.c:848
#9  0x00007f5809d6394b in remoteClose (conn=0x7f57f8013790) at remote/remote_driver.c:863
#10 0x00007f5809d2fadb in virReleaseConnect (conn=0x7f57f8013790) at datatypes.c:114
#11 0x00007f5809d30fe8 in virUnrefConnect (conn=0x7f57f8013790) at datatypes.c:149
#12 0x0000000000481488 in doPeer2PeerMigrate (driver=0x7f57f8020220, conn=0x7f57ec000a70, vm=0x7f57f00d7b20, xmlin=<value optimized out>, dconnuri=0x7f57f8117900 "qemu+tls://nott-vdsa.qa.lab.tlv.redhat.com/system", uri=<value optimized out>, cookiein=0x0, cookieinlen=0, cookieout=0x7f5803df1b80, cookieoutlen=0x7f5803df1b8c, flags=3, dname=0x7f57f0046320 "stress_new_pool-23", resource=0, v3proto=true) at qemu/qemu_migration.c:2253
#13 qemuMigrationPerform (driver=0x7f57f8020220, conn=0x7f57ec000a70, vm=0x7f57f00d7b20, xmlin=<value optimized out>, dconnuri=0x7f57f8117900 "qemu+tls://nott-vdsa.qa.lab.tlv.redhat.com/system", uri=<value optimized out>, cookiein=0x0, cookieinlen=0, cookieout=0x7f5803df1b80, cookieoutlen=0x7f5803df1b8c, flags=3, dname=0x7f57f0046320 "stress_new_pool-23", resource=0, v3proto=true) at qemu/qemu_migration.c:2315
#14 0x0000000000448a83 in qemuDomainMigratePerform3 (dom=0x7f57f8196300, xmlin=0x0, cookiein=<value optimized out>, cookieinlen=0, cookieout=0x7f5803df1b80, cookieoutlen=0x7f5803df1b8c, dconnuri=0x7f57f8117900 "qemu+tls://nott-vdsa.qa.lab.tlv.redhat.com/system", uri=0x0, flags=3, dname=0x0, resource=0) at qemu/qemu_driver.c:6999
#15 0x00007f5809d4c9c4 in virDomainMigratePerform3 (domain=0x7f57f8196300, xmlin=0x0, cookiein=0x0, cookieinlen=<value optimized out>, cookieout=<value optimized out>, cookieoutlen=0x7f5803df1b8c, dconnuri=0x7f57f8117900 "qemu+tls://nott-vdsa.qa.lab.tlv.redhat.com/system", uri=0x0, flags=3, dname=0x0, bandwidth=0) at libvirt.c:5162
#16 0x000000000041fbc2 in remoteDispatchDomainMigratePerform3 (server=<value optimized out>, client=<value optimized out>, hdr=<value optimized out>, rerr=0x7f5803df1c10, args=<value optimized out>, ret=0x7f57f810de40) at remote.c:2789
#17 remoteDispatchDomainMigratePerform3Helper (server=<value optimized out>, client=<value optimized out>, hdr=<value optimized out>, rerr=0x7f5803df1c10, args=<value optimized out>, ret=0x7f57f810de40) at remote_dispatch.h:2700
#18 0x000000000043ac2e in virNetServerProgramDispatchCall (prog=0xccc420, server=0xccbae0, client=0xcf2a00, msg=0xfc5f60) at rpc/virnetserverprogram.c:375
#19 virNetServerProgramDispatch (prog=0xccc420, server=0xccbae0, client=0xcf2a00, msg=0xfc5f60) at rpc/virnetserverprogram.c:252
#20 0x000000000043d401 in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0xccbae0) at rpc/virnetserver.c:150
#21 0x00007f5809cecdaa in virThreadPoolWorker (opaque=0xccbbd0) at util/threadpool.c:98
#22 0x00007f5809cec812 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:157
#23 0x0000003027e077e1 in start_thread (arg=0x7f5803df2700) at pthread_create.c:301
#24 0x00000030276e68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115