Bug 1296802

Summary: libvirtd is killed while getting the name of an RBD volume that has already been removed from Ceph through rbd
Product: [Community] Virtualization Tools
Reporter: Yang Yang <yanyang>
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: berrange, crobinso, dyuan, hhan, jferlan, mzhan, wido
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-18 09:54:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Yang Yang 2016-01-08 07:05:45 UTC
Description of problem:
RBD images are removed from the Ceph server through the rbd CLI while the pool
is being refreshed. The pool refresh completes without error; however, libvirtd
is killed when it then looks up the name of an RBD volume that has already been
removed.

Version-Release number of selected component (if applicable):
libvirt-1.3.1-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a running RBD pool:
# virsh pool-dumpxml rbd
<pool type='rbd'>
  <name>rbd</name>
  <uuid>ebda974a-4fb7-4af2-b1a0-7a94e5cdda98</uuid>
  <capacity unit='bytes'>0</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>0</available>
  <source>
    <host name='10.66.110.191'/>
    <name>yy</name>
  </source>
</pool>

2. Create volumes:
# for i in {1..100}; do virsh vol-create-as rbd vol$i 100M; done

3. Refresh the rbd pool; meanwhile, remove two RBD volumes directly on the Ceph server:
# virsh pool-refresh rbd

[root@osd1 ~]# rbd rm yy/vol97
Removing image: 100% complete...done.
[root@osd1 ~]# rbd rm yy/vol96
Removing image: 100% complete...done.

4. List the volumes:
# virsh vol-list rbd
error: Failed to list volumes
error: key in virGetStorageVol must not be NULL

5. Get the name of a volume that was removed in step 3:
# virsh vol-name yy/vol97
error: Disconnected from qemu:///system due to I/O error
error: failed to get vol 'yy/vol97'
error: internal error: client socket is closed

error: One or more references were leaked after disconnect from the hypervisor

Actual results:
libvirtd crashes with SIGSEGV (backtrace below) and the virsh client is
disconnected.

Expected results:
Pool refresh should properly handle volumes that were deleted through a route
other than libvirt.

Additional info:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f2c47cd7700 (LWP 17954)]
0x00007f2c556c7fc6 in __strcmp_sse42 () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f2c556c7fc6 in __strcmp_sse42 () from /lib64/libc.so.6
#1  0x00007f2c58362c99 in virStorageVolDefFindByKey (pool=<optimized out>,
    key=key@entry=0x7f2c3001c250 "yy/vol97") at conf/storage_conf.c:1734
#2  0x00007f2c3eb05698 in storageVolLookupByKey (conn=0x7f2c34003d40,
    key=0x7f2c3001c250 "yy/vol97") at storage/storage_driver.c:1501
#3  0x00007f2c583bad57 in virStorageVolLookupByKey (conn=0x7f2c34003d40,
    key=0x7f2c3001c250 "yy/vol97") at libvirt-storage.c:1342
#4  0x00007f2c58fe64d8 in remoteDispatchStorageVolLookupByKey (
    server=0x7f2c5ae1afb0, msg=0x7f2c5ae3cda0, ret=0x7f2c3001c3a0,
    args=0x7f2c3001c4e0, rerr=0x7f2c47cd6c30, client=0x7f2c5ae3c290)
    at remote_dispatch.h:15967
#5  remoteDispatchStorageVolLookupByKeyHelper (server=0x7f2c5ae1afb0,
    client=0x7f2c5ae3c290, msg=0x7f2c5ae3cda0, rerr=0x7f2c47cd6c30,
    args=0x7f2c3001c4e0, ret=0x7f2c3001c3a0) at remote_dispatch.h:15945
#6  0x00007f2c584033a2 in virNetServerProgramDispatchCall (msg=0x7f2c5ae3cda0,
    client=0x7f2c5ae3c290, server=0x7f2c5ae1afb0, prog=0x7f2c5ae37fa0)
    at rpc/virnetserverprogram.c:437
#7  virNetServerProgramDispatch (prog=0x7f2c5ae37fa0,
    server=server@entry=0x7f2c5ae1afb0, client=0x7f2c5ae3c290,
    msg=0x7f2c5ae3cda0) at rpc/virnetserverprogram.c:307
#8  0x00007f2c583fe61d in virNetServerProcessMsg (msg=<optimized out>,
    prog=<optimized out>, client=<optimized out>, srv=0x7f2c5ae1afb0)
    at rpc/virnetserver.c:135
#9  virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f2c5ae1afb0)
    at rpc/virnetserver.c:156
#10 0x00007f2c582f78e5 in virThreadPoolWorker (
    opaque=opaque@entry=0x7f2c5ae0ff60) at util/virthreadpool.c:145
#11 0x00007f2c582f6e08 in virThreadHelper (data=<optimized out>)
    at util/virthread.c:206
#12 0x00007f2c5595edc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f2c5568c1cd in clone () from /lib64/libc.so.6
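The backtrace is consistent with virStorageVolDefFindByKey() calling strcmp() on a volume whose key is NULL — plausibly an entry left half-populated when the pool refresh raced with the rbd removals (the step-4 error "key in virGetStorageVol must not be NULL" points the same way). A minimal sketch of the lookup loop with a defensive NULL-key check, using simplified stand-in types rather than libvirt's real structures:

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for libvirt's volume definition; only the key
 * field matters for this crash. */
typedef struct {
    const char *key;   /* may be NULL if refresh was interrupted */
} vol_def;

/* Sketch of a lookup like virStorageVolDefFindByKey(). The original
 * calls strcmp(vols[i]->key, key) directly; if one entry's key is NULL,
 * strcmp dereferences NULL and libvirtd segfaults. Skipping such
 * half-initialized entries keeps the lookup safe. */
static vol_def *find_by_key(vol_def **vols, size_t nvols, const char *key)
{
    for (size_t i = 0; i < nvols; i++) {
        if (vols[i]->key == NULL)   /* half-initialized entry: skip */
            continue;
        if (strcmp(vols[i]->key, key) == 0)
            return vols[i];
    }
    return NULL;
}
```

With a pool containing one NULL-key entry, a lookup for a present key still succeeds and a lookup for the removed key returns NULL instead of crashing.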

Comment 1 Yang Yang 2016-01-12 09:04:28 UTC
libvirt info

# pwd
/root/libvirt
# git describe 
v1.3.1-rc1

Comment 2 Cole Robinson 2016-04-10 22:56:36 UTC
Wido, does this sound familiar? any idea if this is fixed upstream?

Comment 3 Wido den Hollander 2016-04-11 07:06:51 UTC
(In reply to Cole Robinson from comment #2)
> Wido, does this sound familiar? any idea if this is fixed upstream?

I think it does. I thought I fixed it here: http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=f46d137e33a348c0f96eaacc58e29794170757cb

Commit: f46d137e33a348c0f96eaacc58e29794170757cb

Not sure where this is coming from. I think it goes wrong inside virStorageBackendRBDRefreshPool(), where it iterates over the 'names' variable.

Not sure though. But that's my best guess.

Could take a while before I can dig into this.
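For reference, the pattern the commit above aims at — continuing the refresh when a single image disappears between listing and opening, rather than aborting or recording a half-populated volume — can be sketched like this. The helper names are hypothetical stand-ins for librbd calls; the real code lives in src/storage/storage_backend_rbd.c:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for rbd_open() against the Ceph cluster: a name
 * starting with "gone" simulates an image removed between rbd_list()
 * and rbd_open(). */
static bool fake_rbd_open(const char *name)
{
    return strncmp(name, "gone", 4) != 0;
}

/* Sketch of the refresh loop: an image that fails to open is skipped
 * and the refresh continues, so no volume with a NULL key is ever
 * added to the pool's volume list. */
static size_t refresh_pool(const char *const *names, size_t n,
                           const char **out, size_t outcap)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!fake_rbd_open(names[i]))
            continue;               /* image vanished: skip, don't abort */
        if (kept < outcap)
            out[kept++] = names[i]; /* only fully opened images are kept */
    }
    return kept;
}
```

Under this scheme the step-3 race leaves the pool with two fewer volumes after refresh, and the later vol-list/vol-name calls cannot hit a NULL key.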

Comment 4 Cole Robinson 2016-04-11 11:21:07 UTC
(In reply to Wido den Hollander from comment #3)
> 
> Could take a while before I can dig into this.

no worries, I was just looking to see if it could be easily closed

Comment 5 Daniel Berrangé 2021-08-18 09:54:23 UTC
Comment #3 suggests a likely fix, and given there has been no feedback in the five years since, I'll assume it was correct.