Bug 722282 - [libvirt] [Deadlock] Storage devices remain open by libvirtd
Summary: [libvirt] [Deadlock] Storage devices remain open by libvirtd
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 682015
 
Reported: 2011-07-14 20:07 UTC by Rami Vaknin
Modified: 2014-01-12 23:53 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-08-11 15:31:37 UTC
Target Upstream Version:


Attachments
Logs (796.73 KB, application/x-compressed-tar)
2011-07-14 20:07 UTC, Rami Vaknin
valgrind's log (55.93 KB, application/x-compressed-tar)
2011-07-18 16:35 UTC, Rami Vaknin

Description Rami Vaknin 2011-07-14 20:07:17 UTC
Created attachment 513257 [details]
Logs

Environment:
RHEVM 3.0 on dev env, libvirt-0.9.3-2.el6.x86_64, vdsm-4.9-81.el6.x86_64.

I used the QA automation framework, which performs operations on ~30 VMs: snapshot/template creation and deletion, disk/NIC addition and deletion, and stopping/starting/suspending VMs. It tries to perform the same operations on a few VMs almost in parallel.

As a result, libvirtd opened storage devices (for instance /dev/dm-23) and they remained open due to a deadlock.

      (gdb) info threads
      7 Thread 0x7f2c8b6b0700 (LWP 1305) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      6 Thread 0x7f2c8acaf700 (LWP 1306) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      5 Thread 0x7f2c8a2ae700 (LWP 1307) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      4 Thread 0x7f2c898ad700 (LWP 1308) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      3 Thread 0x7f2c88eac700 (LWP 1309) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      2 Thread 0x7f2c870f1700 (LWP 2775) pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      * 1 Thread 0x7f2c91542800 (LWP 1304) 0x0000003027644582 in _IO_vfprintf_internal (s=<value optimized out>, format=<value optimized out>, ap=<value optimized out>) at vfprintf.c:1977
      (gdb) where
      #0 0x0000003027644582 in _IO_vfprintf_internal (s=<value optimized out>, format=<value optimized out>, ap=<value optimized out>) at vfprintf.c:1977
      #1 0x000000302770000c in __vasprintf_chk (result_ptr=0x7fff1d886140, flags=1, format=0x3636939c68 "%02d:%02d:%02d.%03d: %d: %s : %s:%lld : %s\n", args=0x7fff1d885f00) at vasprintf_chk.c:68
      #2 0x0000003636850d54 in vasprintf (strp=<value optimized out>, fmt=<value optimized out>, list=<value optimized out>) at /usr/include/bits/stdio2.h:199
      #3 virVasprintf (strp=<value optimized out>, fmt=<value optimized out>, list=<value optimized out>) at util/util.c:1648
      #4 0x0000003636850df8 in virAsprintf (strp=<value optimized out>, fmt=<value optimized out>) at util/util.c:1666
      #5 0x0000003636840423 in virLogFormatString (msg=0x7fff1d886140, funcname=0x3636938fd0 "virEventPollDispatchHandles", linenr=454, time_info=0x7fff1d8860c0, cur_time=<value optimized out>,
      priority=<value optimized out>, str=0x880910 "i=25 w=55") at util/logging.c:634
      #6 0x0000003636840c06 in virLogMessage (category=0x3636938a8b "file.util/event_poll.c", priority=1, funcname=0x3636938fd0 "virEventPollDispatchHandles", linenr=454, flags=0, fmt=<value optimized out>)
      at util/logging.c:729
      #7 0x000000363683a356 in virEventPollDispatchHandles () at util/event_poll.c:454
      #8 virEventPollRunOnce () at util/event_poll.c:610
      #9 0x0000003636839397 in virEventRunDefaultImpl () at util/event.c:247
      #10 0x000000000043c86d in virNetServerRun (srv=0x814ae0) at rpc/virnetserver.c:662
      #11 0x000000000041d708 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1552
      (gdb) break virNetTLSSessionRead
      Breakpoint 1 at 0x442250: file rpc/virnettlscontext.c, line 809. (2 locations)
      (gdb) c
      Continuing.
       
      Breakpoint 1, virNetTLSSessionRead (sess=0x7f2c78024f40, buf=0x7f2c78073bc8 "", len=4) at rpc/virnettlscontext.c:809
      809 {
      (gdb) n
      812 ret = gnutls_record_recv(sess->session, buf, len);
      (gdb)
      814 if (ret >= 0)
      (gdb) p ret
      $1 = -10

[root@intg-vdsb ~]# lsof -p 6632
COMMAND   PID USER   FD   TYPE             DEVICE   SIZE/OFF       NODE
NAME
libvirtd 6632 root  cwd    DIR              253,0       4096          2 /
libvirtd 6632 root  rtd    DIR              253,0       4096          2 /
libvirtd 6632 root  txt    REG              253,0    1105328     403877
/usr/sbin/libvirtd (deleted)
libvirtd 6632 root  mem    REG              253,0     155696     783375
/lib64/ld-2.12.so
libvirtd 6632 root  mem    REG              253,0      31984     396738
/usr/lib64/libpciaccess.so.0.10.8
libvirtd 6632 root  mem    REG              253,0    1912928     783379
/lib64/libc-2.12.so
libvirtd 6632 root  mem    REG              253,0      22536     783450
/lib64/libdl-2.12.so
libvirtd 6632 root  mem    REG              253,0     145672     783381
/lib64/libpthread-2.12.so
libvirtd 6632 root  mem    REG              253,0     598816     783435
/lib64/libm-2.12.so
libvirtd 6632 root  mem    REG              253,0      88240     783399
/lib64/libz.so.1.2.3
libvirtd 6632 root  mem    REG              253,0     124624     783641
/lib64/libselinux.so.1
libvirtd 6632 root  mem    REG              253,0      47072     783383
/lib64/librt-2.12.so
libvirtd 6632 root  mem    REG              253,0     268200     783437
/lib64/libdbus-1.so.3.4.0
libvirtd 6632 root  mem    REG              253,0     113904     783687
/lib64/libresolv-2.12.so
libvirtd 6632 root  mem    REG              253,0     387880     783460
/lib64/libfreebl3.so
libvirtd 6632 root  mem    REG              253,0      43392     783475
/lib64/libcrypt-2.12.so
libvirtd 6632 root  mem    REG              253,0     248304     783647
/lib64/libsepol.so.1
libvirtd 6632 root  mem    REG              253,0      14800     784383
/lib64/libcom_err.so.2.1
libvirtd 6632 root  mem    REG              253,0     127480     395372
/usr/lib64/libnssutil3.so
libvirtd 6632 root  mem    REG              253,0     241536     783441
/lib64/libnspr4.so
libvirtd 6632 root  mem    REG              253,0      18864     784397
/lib64/libplc4.so
libvirtd 6632 root  mem    REG              253,0    1287424     395499
/usr/lib64/libnss3.so
libvirtd 6632 root  mem    REG              253,0      14704     784398
/lib64/libplds4.so
libvirtd 6632 root  mem    REG              253,0     215792     395501
/usr/lib64/libssl3.so
libvirtd 6632 root  mem    REG              253,0      95016     783442
/lib64/libaudit.so.1.0.0
libvirtd 6632 root  mem    REG              253,0    1651160     395817
/usr/lib64/libcrypto.so.1.0.0
libvirtd 6632 root  mem    REG              253,0     183440     395503
/usr/lib64/libsmime3.so
libvirtd 6632 root  mem    REG              253,0     181608     784382
/lib64/libk5crypto.so.3.1
libvirtd 6632 root  mem    REG              253,0      10328     783668
/lib64/libkeyutils.so.1.3
libvirtd 6632 root  mem    REG              253,0      35752     395505
/usr/lib64/libnuma.so.1
libvirtd 6632 root  mem    REG              253,0      43320     784381
/lib64/libkrb5support.so.0.1
libvirtd 6632 root  mem    REG              253,0     915104     784384
/lib64/libkrb5.so.3.3
libvirtd 6632 root  mem    REG              253,0      15616     783664
/lib64/libgpg-error.so.0.5.0
libvirtd 6632 root  mem    REG              253,0     225808     397181
/usr/lib64/libpcap.so.1.0.0
libvirtd 6632 root  mem    REG              253,0     109664     398139
/usr/lib64/libsasl2.so.2.0.23
libvirtd 6632 root  mem    REG              253,0     269104     784400
/lib64/libgssapi_krb5.so.2.2
libvirtd 6632 root  mem    REG              253,0    1379080     394916
/usr/lib64/libxml2.so.2.7.6
libvirtd 6632 root  mem    REG              253,0     479856     783665
/lib64/libgcrypt.so.11.5.3
libvirtd 6632 root  mem    REG              253,0      60816     785052
/lib64/liblber-2.4.so.2.5.6
libvirtd 6632 root  mem    REG              253,0     307224     785053
/lib64/libldap-2.4.so.2.5.6
libvirtd 6632 root  mem    REG              253,0      61504     783649
/lib64/libudev.so.0.5.1
libvirtd 6632 root  mem    REG              253,0     374128     395921
/usr/lib64/libssl.so.1.0.0
libvirtd 6632 root  mem    REG              253,0     209088     783446
/lib64/libidn.so.11.6.1
libvirtd 6632 root  mem    REG              253,0      71456     394775
/usr/lib64/libtasn1.so.3.1.6
libvirtd 6632 root  mem    REG              253,0     164024     397192
/usr/lib64/libssh2.so.1.0.1
libvirtd 6632 root  mem    REG              253,0     662656     394864
/usr/lib64/libgnutls.so.26.14.12
libvirtd 6632 root  mem    REG              253,0     343864     399107
/usr/lib64/libcurl.so.4.1.1
libvirtd 6632 root  mem    REG              253,0     340072     783495
/lib64/libnl.so.1.1
libvirtd 6632 root  mem    REG              253,0      21160     785075
/lib64/libcap-ng.so.0.0.0
libvirtd 6632 root  mem    REG              253,0      31352     394890
/usr/lib64/libyajl.so.1.0.7
libvirtd 6632 root  mem    REG              253,0      51056     395369
/usr/lib64/libavahi-common.so.3.5.1
libvirtd 6632 root  mem    REG              253,0      84296     399312
/usr/lib64/libexslt.so.0.8.15
libvirtd 6632 root  mem    REG              253,0      68768     393585
/usr/lib64/libavahi-client.so.3.2.5
libvirtd 6632 root  mem    REG              253,0     248720     399028
/usr/lib64/libxslt.so.1.1.26
libvirtd 6632 root  mem    REG              253,0     136744     394932
/usr/lib64/libfa.so.1.3.1
libvirtd 6632 root  mem    REG              253,0     304424     398772
/usr/lib64/libaugeas.so.0.11.0
libvirtd 6632 root  mem    REG              253,0      61480     404455
/usr/lib64/libnetcf.so.1.4.0
libvirtd 6632 root  mem    REG              253,0     158912     787495
/lib64/libdevmapper.so.1.02
libvirtd 6632 root  mem    REG              253,0      27424     783388
/lib64/libnss_dns-2.12.so
libvirtd 6632 root  mem    REG              253,0      17048     399244
/usr/lib64/sasl2/liblogin.so.2.0.23
libvirtd 6632 root  mem    REG              253,0      19352     403817
/usr/lib64/sasl2/libcrammd5.so.2.0.23
libvirtd 6632 root  mem    REG              253,0      49240     403820
/usr/lib64/sasl2/libdigestmd5.so.2.0.23
libvirtd 6632 root  mem    REG              253,0      16408     393090
/usr/lib64/sasl2/libanonymous.so.2.0.23
libvirtd 6632 root  mem    REG              253,0    1526984     783432
/lib64/libdb-4.7.so
libvirtd 6632 root  mem    REG              253,0      22272     393093
/usr/lib64/sasl2/libsasldb.so.2.0.23
libvirtd 6632 root  mem    REG              253,0      29560     393484
/usr/lib64/sasl2/libgssapiv2.so.2.0.23
libvirtd 6632 root  mem    REG              253,0      16952     399247
/usr/lib64/sasl2/libplain.so.2.0.23
libvirtd 6632 root  mem    REG              253,0      61624     783400
/lib64/libnss_files-2.12.so
libvirtd 6632 root  mem    REG              253,0      26050     394826
/usr/lib64/gconv/gconv-modules.cache
libvirtd 6632 root  mem    REG              253,0   99158752     404858
/usr/lib/locale/locale-archive
libvirtd 6632 root  DEL    REG              253,0                404151
/usr/lib64/libvirt.so.0.9.3.#prelink#.A1z8f8
libvirtd 6632 root  DEL    REG              253,0                404055
/usr/lib64/libvirt-qemu.so.0.9.3.#prelink#.BWRrqE
libvirtd 6632 root    0r   CHR                1,3        0t0       3779
/dev/null
libvirtd 6632 root    1w   CHR                1,3        0t0       3779
/dev/null
libvirtd 6632 root    2w   CHR                1,3        0t0       3779
/dev/null
libvirtd 6632 root    3w   REG              253,0 1238563079     915495
/var/log/libvirtd.log
libvirtd 6632 root    4r  FIFO                0,8        0t0      19732
pipe
libvirtd 6632 root    5u  unix 0xffff88040baf90c0        0t0      19744
socket
libvirtd 6632 root    6w  FIFO                0,8        0t0      19732
pipe
libvirtd 6632 root    7r  FIFO                0,8        0t0      19733
pipe
libvirtd 6632 root    8w  FIFO                0,8        0t0      19733
pipe
libvirtd 6632 root    9u  sock                0,6        0t0      19734
can't identify protocol
libvirtd 6632 root   10u  unix 0xffff88040cc56c80        0t0      19737
/var/run/libvirt/libvirt-sock
libvirtd 6632 root   11u  unix 0xffff88040ba63680        0t0      19739
/var/run/libvirt/libvirt-sock-ro
libvirtd 6632 root   12r   CHR                1,9        0t0       3784
/dev/urandom
libvirtd 6632 root   13u  IPv4              19742        0t0        TCP
*:16514 (LISTEN)
libvirtd 6632 root   14u  sock                0,6        0t0      19750
can't identify protocol
libvirtd 6632 root   15u  unix 0xffff88040ab66080        0t0      19743
socket
libvirtd 6632 root   16w   REG                0,3          0 4026531946
/proc/mtrr
libvirtd 6632 root   17u   REG                0,0       4096       9935
/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.1/config
libvirtd 6632 root   18u  sock                0,6        0t0      19793
can't identify protocol
libvirtd 6632 root   19u  unix 0xffff8807f56eb680        0t0    6321290
/var/run/libvirt/libvirt-sock
libvirtd 6632 root   20u  sock                0,6        0t0      21690
can't identify protocol
libvirtd 6632 root   21u  sock                0,6        0t0      21691
can't identify protocol
libvirtd 6632 root   22u  sock                0,6        0t0    6321358
can't identify protocol
libvirtd 6632 root   23u  sock                0,6        0t0    6321359
can't identify protocol
libvirtd 6632 root   24u  unix 0xffff8807c24e8380        0t0    6322577
/var/run/libvirt/libvirt-sock
libvirtd 6632 root   25u  IPv4             158404        0t0        TCP intg-vdsb.qa.lab.tlv.redhat.com:44223->nott-vdsa.qa.lab.tlv.redhat.com:16514 (CLOSE_WAIT)
libvirtd 6632 root   26u  sock                0,6        0t0     138909
can't identify protocol
libvirtd 6632 root   27r  FIFO                0,8        0t0     158405
pipe
libvirtd 6632 root   28u  IPv4             167029        0t0        TCP intg-vdsb.qa.lab.tlv.redhat.com:60217->nott-vdsa.qa.lab.tlv.redhat.com:16514 (CLOSE_WAIT)
libvirtd 6632 root   29r  FIFO                0,8        0t0     167030
pipe
libvirtd 6632 root   30w  FIFO                0,8        0t0     167030
pipe
libvirtd 6632 root   31u  sock                0,6        0t0    6322584
can't identify protocol
libvirtd 6632 root   32r   BLK             253,23 0x90000000     143969
/dev/dm-23
libvirtd 6632 root   33u  sock                0,6        0t0    6322585
can't identify protocol
libvirtd 6632 root   34u  IPv4             163644        0t0        TCP intg-vdsb.qa.lab.tlv.redhat.com:60213->nott-vdsa.qa.lab.tlv.redhat.com:16514 (CLOSE_WAIT)
libvirtd 6632 root   36r  FIFO                0,8        0t0     163645
pipe
libvirtd 6632 root   39w  FIFO                0,8        0t0     163645
pipe
libvirtd 6632 root   40r   BLK             253,21 0x40000000     141784
/dev/dm-21
libvirtd 6632 root   42w  FIFO                0,8        0t0     158405
pipe
libvirtd 6632 root   45r   BLK             253,22 0x90000000     142904
/dev/dm-22
libvirtd 6632 root   48u  IPv4             159265        0t0        TCP intg-vdsb.qa.lab.tlv.redhat.com:44225->nott-vdsa.qa.lab.tlv.redhat.com:16514 (CLOSE_WAIT)
libvirtd 6632 root   49r  FIFO                0,8        0t0     159266
pipe
libvirtd 6632 root   50w  FIFO                0,8        0t0     159266
pipe
libvirtd 6632 root   57u  IPv4             159832        0t0        TCP intg-vdsb.qa.lab.tlv.redhat.com:44227->nott-vdsa.qa.lab.tlv.redhat.com:16514 (CLOSE_WAIT)
libvirtd 6632 root   58r  FIFO                0,8        0t0     159833
pipe
libvirtd 6632 root   60w  FIFO                0,8        0t0     159833
pipe

Comment 2 Eric Blake 2011-07-14 20:56:24 UTC
Are you in a setup where you could use valgrind to track leaked fds?  valgrind --track-fds=yes might be able to pinpoint whether libvirt code is opening a block device and forgetting to close it.
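
For reference, a minimal sketch (illustrative only, not libvirt code) of the kind of leak --track-fds=yes reports at exit, along with the stack trace of the open() call that created the fd:

      /* fd-leak.c: compile with `gcc -g fd-leak.c`, then run
       * `valgrind --track-fds=yes ./a.out`. */
      #include <fcntl.h>
      #include <stdio.h>

      int main(void)
      {
          int fd = open("/dev/null", O_RDONLY);   /* never closed */
          if (fd < 0) {
              perror("open");
              return 1;
          }
          return 0;   /* valgrind reports this fd as still open at exit */
      }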

Comment 3 Eric Blake 2011-07-14 21:13:02 UTC
I'm not sure I see what you think is evidence that there is a deadlock.  Is this a case of libvirt being unresponsive, at a point where it happens to have some device fds open, although the deadlock has nothing in particular to do with device code?  Or is this a case of the file system itself going into deadlock at the kernel level and thus blocking libvirt as a result?  Are there any particular strings you want me to look at in the log files to back up your claim of deadlock?  How reproducible is the setup that caused this problem?

Comment 4 Michal Privoznik 2011-07-15 10:55:19 UTC
From the GDB output I see that gnutls_record_recv returns -10, which is GNUTLS_E_INVALID_SESSION. This happens when a session has been closed incorrectly. I helped rvaknin debug this, and the call kept returning the same error over and over, which results in a great many 'I/O error' messages in the logs. Therefore I think this might be a dup of BZ#721411, or at least related to it.
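
GNUTLS_E_INVALID_SESSION is a fatal error, so a caller that keeps retrying the read will spin forever. A minimal sketch of the distinction (illustrative only, not libvirt's actual rpc code):

      #include <gnutls/gnutls.h>

      static ssize_t tls_read(gnutls_session_t session, void *buf, size_t len)
      {
          ssize_t ret;

          /* Retry only the genuinely transient errors. */
          do {
              ret = gnutls_record_recv(session, buf, len);
          } while (ret == GNUTLS_E_AGAIN || ret == GNUTLS_E_INTERRUPTED);

          /* Fatal errors such as GNUTLS_E_INVALID_SESSION (-10) mean the
           * session is unusable: tear it down instead of calling again. */
          if (ret < 0 && gnutls_error_is_fatal(ret))
              return -1;

          return ret;
      }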

Comment 5 Rami Vaknin 2011-07-18 16:35:03 UTC
This bug reproduces on libvirt-0.9.3-5.el6.x86_64

I ran libvirtd under valgrind using the following command line:
valgrind --track-fds=yes --verbose --log-file=/root/valgrind_fd_nott /usr/sbin/libvirtd --daemon --listen

The output of "lsof -p" shows that /dev/dm-31 remains open:
memcheck- 15350 root   38r   BLK             253,31 0x90000000    1655163 /dev/dm-31

I'll attach valgrind's log.

Comment 6 Rami Vaknin 2011-07-18 16:35:28 UTC
Created attachment 513647 [details]
valgrind's log

Comment 7 Rami Vaknin 2011-07-18 19:41:46 UTC
According to valgrind's log, it seems that the fd leak is in a dup2 of stderr (search for the following VM name in valgrind's log: "stress_new_pool-02").
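
For illustration, the usual shape of such a leak (a hypothetical sketch, not the code valgrind actually pointed at): a copy of stderr made with dup2() stays open until it is explicitly closed:

      #include <unistd.h>

      void borrow_stderr(int slot)
      {
          /* Make `slot` a duplicate of stderr, e.g. to preserve it
           * while stderr itself is being redirected. */
          if (dup2(STDERR_FILENO, slot) < 0)
              return;

          /* ... use the copy while stderr points elsewhere ... */

          close(slot);   /* omitting this leaks a dup of stderr */
      }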

Comment 8 Rami Vaknin 2011-07-21 13:36:33 UTC
(In reply to comment #4)
> From the GDB output I see that gnutls_record_recv returns -10, which is
> GNUTLS_E_INVALID_SESSION. This happens on incorrect closed session. I've helped
> rvaknin to debug this and it was returning over and over again, which results
> in very much 'I/O errors' messages in logs. Therefore I think this might be dup
> of BZ#721411 or at least related to.

Note that this bug reproduces in libvirt-0.9.3-7.el6.x86_64 even though BZ#721411 has already been verified.

Comment 9 Laine Stump 2011-08-04 06:26:52 UTC
Rami: a leaked fd that is a dup of stderr is unrelated to the problem you've reported - stderr would never be a block device.

Also, the word "deadlock" is in the summary of the report, but I don't see anything in the report that suggests there is a deadlock anywhere. Can you explain where you think there is a deadlock?

To be sure I understand this problem correctly:

The symptoms are that after some number of domain life cycle operations, there are no domains left running on the system, but there are several block devices still open (in the example given in Comment 0, it's /dev/dm-21, /dev/dm-22, and /dev/dm-23). Other than these leaked fds that are open on block devices, there are no other operationally visible symptoms. Is that correct?

What exactly are these block devices used for? The image files of the guests?

Is this easy to reproduce? Any chance you can come up with a reproduction case simple enough to be done on a single machine? Alternately, can you install the latest libvirt build from brew (including the debuginfo), and attempt to reproduce the problem, then give me access to the machine to poke around?

Michal: was there a particular reason you thought the failure of gnutls_record_recv might be the cause of these leaked block device fds? Or did that just happen to be the only interesting error in the logs?

Comment 10 Laine Stump 2011-08-08 06:15:40 UTC
Regarding a run with valgrind - can you try that again, but letting it run until you see the extra block devices open in the output of lsof? At that point valgrind will be able to tell us what line of code opened the block device, which will help to narrow down who should be closing it.

Comment 11 Rami Vaknin 2011-08-08 12:46:03 UTC
(In reply to comment #9)
> Rami: a leaked fd that is a dup of stderr is unrelated to the problem you've
> reported - stderr would never be a block device.

I thought it might be related. BTW, is the leaked fd that is a dup of stderr a known issue? Is it covered by any BZ?

> Also, the word "deadlock" is in the summary of the report, but I don't see
> anything in the report that suggests there is a deadlock anywhere. Can you
> explain where you think there is a deadlock?

That's the impression mprivozn and I got while trying to debug the issue for the first time.

> 
> To be sure I understand this problem correctly:
> 
> The symptoms are that after some number of domain life cycle operations, there
> are no domains left running on the system, but there are several block devices
> still open (in the example given in Comment 0, it's /dev/dm-21, /dev/dm-22, and
> /dev/dm-23). Other than these leaked fds that are open on block devices, there
> are no other operationally visible symptoms. Is that correct?
>
You're right, except for the operationally visible symptoms: on the vdsm side, it can't perform any operation on those VMs because their disks remain open by another process.

> What exactly are these block devices used for? The image files of the guests?
> 
Correct.

> Is this easy to reproduce? Any chance you can come up with a reproduction case
> simple enough to be done on a single machine? Alternately, can you install the
> latest libvirt build from brew (including the debuginfo), and attempt to
> reproduce the problem, then give me access to the machine to poke around?
> 
It will take a lot of time to isolate the scenario since the automated test has a lot of steps, but I assume that reproducing it with the newest libvirt will be easy. BTW, I have reproduced it on every libvirt version I've tested (0.9.3-2 through 0.9.3-7); let me know exactly which libvirt to use.

> Michal: was there a particular reason you thought the failure of
> gnutls_record_recv might be the cause of these leaked block device fds? Or did
> that just happen to be the only interesting error in the logs?

Comment 12 Rami Vaknin 2011-08-08 12:48:13 UTC
(In reply to comment #10)
> Regarding a run with valgrind - can you try that again, but letting it run
> until you see the extra block devices open in the output of lsof? At that point
> valgrind will be able to tell us what line of code opened the block device,
> which will help to narrow down who should be closing it.

Please let me know which valgrind parameters will help to spot the issue.

Comment 13 Michal Privoznik 2011-08-08 13:34:56 UTC
Laine: yes, it was the first lead to the error. I was thinking that we got stuck in there (as I've noted above, we called this function over and over again) and therefore never reached the point where the fds are closed. But it was only a very first (and rough) idea of what might be happening.

Comment 14 Laine Stump 2011-08-08 15:11:59 UTC
Based on the other information you gave, it sounds likely that this is not a resource leak but just some thread not finishing up a task, so don't spend your time on a valgrind run yet; it may not show anything anyway.

However, we discussed this in the weekly libvirt meeting, and I learned that all the versions you've tested with had newly rewritten rpc code that still had a number of problems; it wasn't stabilized until sometime later. To get to a known "working" baseline, can you run again with "0.9.4-rc1.2" from the hybrid repo? This has several well-known fixes to the new rpc code. We should also try to arrange to do the testing at a time when I can be around to poke at it with gdb if/when it fails.

Comment 15 Rami Vaknin 2011-08-11 14:48:07 UTC
I ran the same automation test twice and it seems that the bug is fixed and does not reproduce anymore.

Tested with libvirt-0.9.4-0rc1.2.el6.x86_64

Comment 16 Laine Stump 2011-08-11 15:31:37 UTC
Since the problem no longer occurs with the fixed rpc code when the same test is run by the same person, I think we can safely assume it has been fixed, so I'm closing the bug.

