Bug 722738

Summary: Segmentation fault in _gnutls_string_resize
Product: Red Hat Enterprise Linux 6
Component: libvirt
Version: 6.2
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Keywords: Regression
Target Milestone: beta
Target Release: ---
Reporter: Rami Vaknin <rvaknin>
Assignee: Daniel Berrangé <berrange>
QA Contact: Virtualization Bugs <virt-bugs>
CC: ajia, dallan, dnaori, dyuan, iheim, mgoldboi, mshao, mzhan, rwu, syeghiay, veillard, weizhan, yeylon
Fixed In Version: libvirt-0.9.3-6.el6
Doc Type: Bug Fix
Last Closed: 2011-12-06 11:16:39 UTC

Attachments: core file, libvirtd and vdsm logs

Description Rami Vaknin 2011-07-17 08:49:16 UTC
Created attachment 513519 [details]
core file, libvirtd and vdsm logs

Environment:
RHEVM 3.0 on dev env, last commit 12b1f476f4f1a01bb86cb5687c19f4ef0f0784c6
libvirt-0.9.3-5.el6.x86_64, vdsm-4.9-81.el6.x86_64

Scenario:
Running an automation test that performs various VM operations.

from dmesg:
libvirtd[6685]: segfault at 0 ip (null) sp 00007fffc2f5eee8 error 14 in libvirtd[400000+107000]

from gdb:
Reading symbols from /lib64/libnss_dns-2.12.so...Reading symbols from /usr/lib/debug/lib64/libnss_dns-2.12.so.debug...done.
done.
Loaded symbols for /lib64/libnss_dns-2.12.so
Core was generated by `libvirtd --daemon --listen'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
Missing separate debuginfos, use: debuginfo-install krb5-libs-1.9-9.el6.x86_64 libcurl-7.19.7-26.el6.x86_64 libgcrypt-1.4.5-5.el6.x86_64 openssl-1.0.0-10.el6.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000003031a34150 in _gnutls_string_resize (dest=0x7ff500077788, new_size=<value optimized out>) at gnutls_str.c:192
#2  0x0000003031a1a614 in _gnutls_io_read_buffered (session=0x7ff500076ae0, iptr=0x7fffc2f5efe8, sizeOfPtr=5, recv_type=<value optimized out>) at gnutls_buffers.c:515
#3  0x0000003031a16031 in _gnutls_recv_int (session=0x7ff500076ae0, type=GNUTLS_APPLICATION_DATA, htype=4294967295, data=0x7ff50002f748 "", sizeofdata=4) at gnutls_record.c:904
#4  0x00007ff516a4145d in virNetTLSSessionRead (sess=<value optimized out>, buf=<value optimized out>, len=<value optimized out>) at rpc/virnettlscontext.c:812
#5  0x00007ff516a3d05d in virNetSocketReadWire (sock=0x7ff500012fe0, buf=0x7ff50002f748 "", len=4) at rpc/virnetsocket.c:801
#6  0x00007ff516a3d2d0 in virNetSocketRead (sock=0x7ff500012fe0, buf=0x7ff50002f748 "", len=4) at rpc/virnetsocket.c:981
#7  0x00007ff516a395ed in virNetClientIOReadMessage (client=0x7ff50002f6f0) at rpc/virnetclient.c:717
#8  virNetClientIOHandleInput (client=0x7ff50002f6f0) at rpc/virnetclient.c:736
#9  0x00007ff516a3add0 in virNetClientIncomingEvent (sock=0x7ff500012fe0, events=<value optimized out>, opaque=0x7ff50002f6f0) at rpc/virnetclient.c:1127
#10 0x00007ff5169986b2 in virEventPollDispatchHandles () at util/event_poll.c:469
#11 virEventPollRunOnce () at util/event_poll.c:610
#12 0x00007ff516997567 in virEventRunDefaultImpl () at util/event.c:247
#13 0x000000000043c9dd in virNetServerRun (srv=0x7aaae0) at rpc/virnetserver.c:662
#14 0x000000000041d828 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1552
(gdb)

Comment 2 David Naori 2011-07-17 12:59:51 UTC
I managed to reproduce this issue too. It happens when migrating the same domain back and forth between two hosts.

libvirtd may hang or crash.

An easy way to reproduce is to run the RUTH automated test:
migrateVmTests.MigrateVmTest.migrationSize:
    def migrationSize(self):
        initSize = self.agent1.getLVSize(self.sdUUID, self.volUUID)
        self.log.debug("initSize: %s", initSize)
        self.vm.migrate(self.client2)
        midSize = self.agent1.getLVSize(self.sdUUID, self.volUUID)
        self.log.debug("midSize: %s", midSize)
        self.assertEqual(initSize, midSize)
        self.vm.migrate(self.client)
        endSize = self.agent1.getLVSize(self.sdUUID, self.volUUID)
        self.log.debug("endSize: %s", endSize)
        self.assertEqual(initSize, endSize)

bt full of the core dump:

#1  0x0000003b4ac34150 in _gnutls_string_resize (dest=0x7f9f240cb0e8, new_size=<value optimized out>) at gnutls_str.c:192
        unused = 0
        alloc_len = 512
#2  0x0000003b4ac1a614 in _gnutls_io_read_buffered (session=0x7f9f240ca440, iptr=0x7fff66112478, sizeOfPtr=5, recv_type=<value optimized out>) at gnutls_buffers.c:515
        ret = 0
        ret2 = 0
        min = <value optimized out>
        buf_pos = <value optimized out>
        buf = <value optimized out>
        recvlowat = 0
        recvdata = <value optimized out>
#3  0x0000003b4ac16031 in _gnutls_recv_int (session=0x7f9f240ca440, type=GNUTLS_APPLICATION_DATA, htype=4294967295, data=0x7f9f240cfdd8 "", sizeofdata=4) at gnutls_record.c:904
        tmp = <value optimized out>
        decrypted_length = <value optimized out>
        version = <value optimized out>
        headers = 0x0
        recv_type = <value optimized out>
        length = <value optimized out>
        recv_data = 0x6f00000006 <Address 0x6f00000006 out of bounds>
        ret = <value optimized out>
        ret2 = <value optimized out>
        header_size = 5
        empty_packet = <value optimized out>
#4  0x00007f9f3f72045d in virNetTLSSessionRead (sess=<value optimized out>, buf=<value optimized out>, len=<value optimized out>) at rpc/virnettlscontext.c:812
        ret = <value optimized out>
#5  0x00007f9f3f71c05d in virNetSocketReadWire (sock=0x7f9f240c6300, buf=0x7f9f240cfdd8 "", len=4) at rpc/virnetsocket.c:801
        errout = 0x0
        ret = <value optimized out>
        __FUNCTION__ = "virNetSocketReadWire"
#6  0x00007f9f3f71c2d0 in virNetSocketRead (sock=0x7f9f240c6300, buf=0x7f9f240cfdd8 "", len=4) at rpc/virnetsocket.c:981
No locals.
#7  0x00007f9f3f7185ed in virNetClientIOReadMessage (client=0x7f9f240cfd80) at rpc/virnetclient.c:717
        wantData = <value optimized out>
        ret = <value optimized out>
#8  virNetClientIOHandleInput (client=0x7f9f240cfd80) at rpc/virnetclient.c:736
        ret = <value optimized out>
#9  0x00007f9f3f719dd0 in virNetClientIncomingEvent (sock=0x7f9f240c6300, events=<value optimized out>, opaque=0x7f9f240cfd80) at rpc/virnetclient.c:1127
        client = 0x7f9f240cfd80
        __func__ = "virNetClientIncomingEvent"
        __FUNCTION__ = "virNetClientIncomingEvent"
#10 0x00007f9f3f6776b2 in virEventPollDispatchHandles () at util/event_poll.c:469
        cb = 0x7f9f3f71bb80 <virNetSocketEventHandle>
        watch = 10
        opaque = 0x7f9f240c6300
        hEvents = 1
        i = 8
        n = <value optimized out>
#11 virEventPollRunOnce () at util/event_poll.c:610
        fds = 0x1d3b020
        ret = <value optimized out>
        timeout = <value optimized out>
        nfds = 9
        __func__ = "virEventPollRunOnce"
        __FUNCTION__ = "virEventPollRunOnce"
#12 0x00007f9f3f676567 in virEventRunDefaultImpl () at util/event.c:247
        __func__ = "virEventRunDefaultImpl"
#13 0x000000000043c9dd in virNetServerRun (srv=0x1ccaae0) at rpc/virnetserver.c:662
        timerid = -1
        timerActive = 0
        i = <value optimized out>
        __FUNCTION__ = "virNetServerRun"
        __func__ = "virNetServerRun"
#14 0x000000000041d828 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1552
        srv = 0x1ccaae0
        remote_config_file = 0x1cca1f0 "/etc/libvirt/libvirtd.conf"
        statuswrite = -1
        ret = 1
        pid_file = 0x1cd65f0 "/var/run/libvirtd.pid"
        sock_file = 0x1cd67e0 "/var/run/libvirt/libvirt-sock"
        sock_file_ro = 0x1cd67b0 "/var/run/libvirt/libvirt-sock-ro"
        timeout = -1
        verbose = 0
        godaemon = 1
        ipsock = 1
        config = 0x1cca890
        privileged = true
        opts = {{name = 0x4c6fca "verbose", has_arg = 0, flag = 0x7fff661128f4, val = 1}, {name = 0x4c6fd2 "daemon", has_arg = 0, flag = 0x7fff661128f0, val = 1}, {name = 0x4db21e "listen", has_arg = 0, flag = 0x7fff661128ec, val = 1}, 
          {name = 0x4d242a "config", has_arg = 1, flag = 0x0, val = 102}, {name = 0x4e47fc "timeout", has_arg = 1, flag = 0x0, val = 116}, {name = 0x4ea520 "pid-file", has_arg = 1, flag = 0x0, val = 112}, {name = 0x4d5725 "version", 
            has_arg = 0, flag = 0x0, val = 129}, {name = 0x4d5912 "help", has_arg = 0, flag = 0x0, val = 63}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
        __func__ = "main"

Comment 4 Daniel Veillard 2011-07-18 11:21:14 UTC
I knew why it sounded familiar; Wen Congyang reported this issue:

https://www.redhat.com/archives/libvir-list/2011-July/msg00547.html

There is a problem with virNetSocketFree being called while the structures are
still in use. It could very well be the cause of bug 722748 too.

Daniel

Comment 5 Daniel Veillard 2011-07-18 11:40:22 UTC
Hmm, there is a problem with the libvirtd.log. On a crash, libvirtd dumps the
debug data from its most recent operations (see for example the libvirtd.log
from bug 722748), but here there is none, so this is not the log from a crashed
libvirtd. Could you reproduce again, let the daemon crash outside of gdb, and
then collect the log?

  thanks,

Daniel

Comment 6 Daniel Berrangé 2011-07-19 13:26:28 UTC
This series should fix the problem

https://www.redhat.com/archives/libvir-list/2011-July/msg01179.html

Comment 7 Daniel Berrangé 2011-07-19 14:53:33 UTC
We should also add this to improve error reporting for migration:

https://www.redhat.com/archives/libvir-list/2011-July/msg01201.html

Comment 9 weizhang 2011-07-20 06:19:38 UTC
Verified with:
kernel-2.6.32-166.el6.x86_64
libvirt-0.9.3-6.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64

Reproduction steps:
1. Build a TLS environment from the source host to the target host:
server: target
client: source

2. Build a TLS environment from the source host to the source host, using the same cacert.pem:
server: source
client: source

On the source host, test the TLS environment:
3. # virsh -c qemu+tls://{target ip}/system
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # exit

# virsh -c qemu+tls://{source ip}/system
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # exit

4. Start a guest on the target host whose image is on shared NFS mounted on
both hosts:
# setsebool virt_use_nfs 1
# iptables -F

5. Perform the migration on the source host with:
# virsh -c qemu+tls://{target ip}/system migrate --p2p guest_name qemu+tls://{source ip}/system

No libvirtd segmentation fault occurs on the target host.

Comment 10 errata-xmlrpc 2011-12-06 11:16:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html