Bug 851423

Summary: virsh segmentation fault when using find-storage-pool-sources
Product: Red Hat Enterprise Linux 6 Reporter: zhe peng <zpeng>
Component: libvirt Assignee: Gunannan Ren <gren>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.4 CC: acathrow, dallan, dyasny, dyuan, gren, mzhan, rwu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.10.0-1.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 07:21:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description zhe peng 2012-08-24 07:44:28 UTC
Description of problem:
virsh crashes with a segmentation fault when running virsh find-storage-pool-sources.


Version-Release number of selected component (if applicable):
libvirt-0.10.0-0rc1.el6.x86_64


How reproducible:
10%

Steps to Reproduce:
1. Prepare an iSCSI source XML file:
# cat iscsi.xml
<source>
  <host name='10.66.90.100'/>
  <device path='iqn.2001-05.com.equallogic:0-8a0906-6eb1f7d03-30cf49b25f24f94d-libvirt-1-150313'/>
</source>

2. Run the command under gdb:
# gdb virsh
(gdb) run find-storage-pool-sources iscsi iscsi.xml
Starting program: /usr/bin/virsh find-storage-pool-sources iscsi source.xml
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff1b3b700 (LWP 20183)]
<sources>
  <source>
    <host name='10.66.90.100'/>
    <device path='iqn.2001-05.com.equallogic:0-8a0906-12a1f7d03-0daf49b25a84ee02-s3-kyla-131842'/>
  </source>
  <source>
    <host name='10.66.90.100'/>
    <device path='iqn.2001-05.com.equallogic:0-8a0906-9951f7d03-34cf49b25f04f94b-libvirt-2-150313'/>
  </source>
  <source>
    <host name='10.66.90.100'/>
    <device path='iqn.2001-05.com.equallogic:0-8a0906-6eb1f7d03-30cf49b25f24f94d-libvirt-1-150313'/>
  </source>
</sources>


Program received signal SIGSEGV, Segmentation fault.
0x000000355ee09220 in pthread_mutex_lock () from /lib64/libpthread.so.0
(gdb) bt
#0  0x000000355ee09220 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1  0x00007ffff7d191bd in virNetSocketRemoveIOCallback (sock=0x0) at rpc/virnetsocket.c:1392
#2  0x00007ffff7d0c65d in virNetClientMarkClose (client=0x67bbe0, reason=3) at rpc/virnetclient.c:514
#3  0x00007ffff7d0cb86 in virNetClientCloseInternal (client=0x67bbe0, reason=3) at rpc/virnetclient.c:575
#4  0x00007ffff7cf14de in doRemoteClose (conn=<value optimized out>, priv=0x67b530) at remote/remote_driver.c:948
#5  0x00007ffff7cf168b in remoteClose (conn=0x67b1c0) at remote/remote_driver.c:976
#6  0x00007ffff7cabc1b in virReleaseConnect (conn=0x67b1c0) at datatypes.c:114
#7  0x00007ffff7cad148 in virUnrefConnect (conn=0x67b1c0) at datatypes.c:152
#8  0x00007ffff7cc6478 in virConnectClose (conn=0x67b1c0) at libvirt.c:1456
#9  0x000000000042aaa9 in ?? ()
#10 0x000000000042d481 in ?? ()
#11 0x000000355ea1ecdd in __libc_start_main () from /lib64/libc.so.6
#12 0x000000000040a7c9 in ?? ()
#13 0x00007fffffffe518 in ?? ()
#14 0x000000000000001c in ?? ()
#15 0x0000000000000004 in ?? ()
#16 0x00007fffffffe7a4 in ?? ()
#17 0x00007fffffffe7b3 in ?? ()
#18 0x00007fffffffe7cd in ?? ()
#19 0x00007fffffffe7d3 in ?? ()
#20 0x0000000000000000 in ?? ()


Actual results:
Segmentation fault (core dumped)

Expected results:
No segmentation fault.

Additional info:
The crash does not reproduce on every run.

Comment 1 zhe peng 2012-08-24 07:54:48 UTC
Reproduced one more time:
(gdb) bt
#0  0x000000355ee09220 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1  0x00007ffff7cfc87d in virNetSocketRemoveIOCallback (sock=0x0) at rpc/virnetsocket.c:1577
#2  0x00007ffff7cef8dd in virNetClientMarkClose (client=0x67da40, reason=3) at rpc/virnetclient.c:647
#3  0x00007ffff7cefe46 in virNetClientCloseInternal (client=0x67da40, reason=3) at rpc/virnetclient.c:708
#4  0x00007ffff7cd3b6e in doRemoteClose (conn=<value optimized out>, priv=0x67d710) at remote/remote_driver.c:993
#5  0x00007ffff7cd3d1b in remoteClose (conn=0x67d370) at remote/remote_driver.c:1021
#6  0x00007ffff7c909ff in virConnectDispose (obj=0x67d370) at datatypes.c:144
#7  0x00007ffff7c2ca4b in virObjectUnref (anyobj=<value optimized out>) at util/virobject.c:139
#8  0x00007ffff7ca6048 in virConnectClose (conn=0x67d370) at libvirt.c:1455
#9  0x000000000040c109 in vshDeinit (ctl=0x7fffffffe370) at virsh.c:2507
#10 0x000000000040fd1f in main (argc=<value optimized out>, argv=<value optimized out>) at virsh.c:2942

Comment 2 Gunannan Ren 2012-08-27 09:15:40 UTC
The segmentation fault happens when an asynchronous event causes the client event loop thread to set client->sock to NULL. The working thread then dereferences that NULL pointer (sock=0x0 in frame #1 of the backtraces) before any NULL check is made.

Patch sent upstream:
https://www.redhat.com/archives/libvir-list/2012-August/msg01727.html
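
The race described in this comment comes down to checking a shared pointer before handing it to a function that locks through it. Below is a minimal sketch of that guard in C. The demo_* types and functions are hypothetical stand-ins invented for illustration, and taking the client lock inside demo_client_mark_close is an assumption of the sketch; this is not the upstream patch or the real libvirt code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt structures (assumption, not real definitions). */
typedef struct {
    pthread_mutex_t lock;
    int watch;              /* event-loop watch id, -1 when none is registered */
} demo_socket;

typedef struct {
    pthread_mutex_t lock;
    demo_socket *sock;      /* may be set to NULL by the event loop thread */
} demo_client;

/* Like the crashing frame, this locks sock->lock, so sock must not be NULL. */
static void demo_socket_remove_io_callback(demo_socket *sock)
{
    assert(sock != NULL);   /* precondition: the caller has checked the pointer */
    pthread_mutex_lock(&sock->lock);
    sock->watch = -1;
    pthread_mutex_unlock(&sock->lock);
}

/* Returns 1 if a callback was removed, 0 if the socket was already gone. */
static int demo_client_mark_close(demo_client *client)
{
    int removed = 0;

    pthread_mutex_lock(&client->lock);
    if (client->sock != NULL) {          /* the guard: check before dereferencing */
        demo_socket_remove_io_callback(client->sock);
        removed = 1;
    }
    pthread_mutex_unlock(&client->lock);
    return removed;
}
```

With client->sock pointing at a live socket, demo_client_mark_close clears the watch and returns 1. Once another thread has set client->sock to NULL, it returns 0 without touching the socket.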

Comment 3 Gunannan Ren 2012-08-27 09:30:47 UTC
commit 2b8624dd33023bd706b55b5a956d242d53928ec5
Author: Guannan Ren <gren>
Date:   Mon Aug 27 16:59:25 2012 +0800

    rpc: fix segmentation fault caused by null client-sock
    
    The client-sock could have been set to NULL by eventloop thread
    after async event fired.

Comment 5 zhe peng 2012-08-30 05:31:58 UTC
Verified with libvirt-0.10.0-1.el6.x86_64.

Ran virsh find-storage-pool-sources more than 300 times; no segmentation fault occurred.
gdb always shows the following output:
Starting program: /usr/bin/virsh find-storage-pool-sources iscsi iscsi-pool.xml
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff1b17700 (LWP 24267)]
<sources>
  <source>
    <host name='10.66.90.100'/>
    <device path='iqn.2001-05.com.equallogic:0-8a0906-12a1f7d03-0daf49b25a84ee02-s3-kyla-131842'/>
  </source>
  <source>
    <host name='10.66.90.100'/>
    <device path='iqn.2001-05.com.equallogic:0-8a0906-9951f7d03-34cf49b25f04f94b-libvirt-2-150313'/>
  </source>
  <source>
    <host name='10.66.90.100'/>
    <device path='iqn.2001-05.com.equallogic:0-8a0906-6eb1f7d03-30cf49b25f24f94d-libvirt-1-150313'/>
  </source>
</sources>

[Thread 0x7ffff1b17700 (LWP 24267) exited]

Program exited normally.

Verification passed.
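
Because the crash is intermittent, verification of this kind relies on repeating the command many times. The following is a rough sketch of a C helper that runs a command repeatedly and counts runs terminated by a signal. The function name count_signal_deaths is hypothetical, and this is not the procedure actually used for this verification.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs argv[0] with the given arguments `runs` times.
 * Returns the number of runs killed by a signal (e.g. SIGSEGV),
 * or -1 if fork() or waitpid() fails.
 */
static int count_signal_deaths(char *const argv[], int runs)
{
    int crashes = 0;

    assert(argv != NULL && argv[0] != NULL && runs >= 0);

    for (int i = 0; i < runs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);                 /* reached only if exec failed */
        }

        int status = 0;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFSIGNALED(status)) {
            fprintf(stderr, "run %d: killed by signal %d\n",
                    i + 1, WTERMSIG(status));
            crashes++;
        }
    }
    return crashes;
}
```

A caller could pass an argument vector such as {"virsh", "find-storage-pool-sources", "iscsi", "iscsi.xml", NULL} with runs set to 300 and treat a non-zero result as a reproduction of the crash.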

Comment 6 errata-xmlrpc 2013-02-21 07:21:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html