Bug 1436999

Summary: Repeatable double free when undefining storage pool
Product: Red Hat Enterprise Linux 7
Reporter: Ján Tomko <jtomko>
Component: libvirt
Assignee: Ján Tomko <jtomko>
Status: CLOSED ERRATA
QA Contact: yisun
Severity: high
Docs Contact:
Priority: medium
Version: 7.3
CC: dyuan, jtomko, lmen, rbalakri, xuzhang
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: libvirt-3.2.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1436400
Environment:
Last Closed: 2017-08-01 17:24:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1436400
Bug Blocks:

Description Ján Tomko 2017-03-29 08:52:43 UTC
Also reproducible with libvirt-daemon-3.1.0-2.el7.x86_64

+++ This bug was initially created as a clone of Bug #1436400 +++

While experimenting with Ansible to configure a fresh virtualization host, I managed to create a storage pool that crashes libvirtd whenever I try to undefine it. I will attach the full journal entry, but here is the error and backtrace:

libvirtd[21979]: *** Error in `/usr/sbin/libvirtd': double free or corruption (fasttop): 0x00007efff80016e0 ***
libvirtd[21979]: ======= Backtrace: =========
libvirtd[21979]: /lib64/libc.so.6(+0x791fb)[0x7f003a5ee1fb]
libvirtd[21979]: /lib64/libc.so.6(+0x8288a)[0x7f003a5f788a]
libvirtd[21979]: /lib64/libc.so.6(cfree+0x4c)[0x7f003a5fb2bc]
libvirtd[21979]: /lib64/libvirt.so.0(virFree+0x1b)[0x7f003e23071b]
libvirtd[21979]: /lib64/libvirt.so.0(virStoragePoolSourceClear+0x85)[0x7f003e30a025]
libvirtd[21979]: /lib64/libvirt.so.0(virStoragePoolDefFree+0x21)[0x7f003e30a0d1]
libvirtd[21979]: /lib64/libvirt.so.0(+0x14bad1)[0x7f003e30bad1]
libvirtd[21979]: /lib64/libvirt.so.0(virStoragePoolObjRemove+0x6e)[0x7f003e30c64e]
libvirtd[21979]: /usr/lib64/libvirt/connection-driver/libvirt_driver_storage.so(+0xf325)[0x7f002ced5325]
libvirtd[21979]: /lib64/libvirt.so.0(virStoragePoolUndefine+0x8f)[0x7f003e36a07f]
libvirtd[21979]: /usr/sbin/libvirtd(+0x39a4f)[0x55e4da3eda4f]
libvirtd[21979]: /lib64/libvirt.so.0(virNetServerProgramDispatch+0x3ce)[0x7f003e3b26fe]
libvirtd[21979]: /usr/sbin/libvirtd(+0x50718)[0x55e4da404718]
libvirtd[21979]: /lib64/libvirt.so.0(+0xdb50b)[0x7f003e29b50b]
libvirtd[21979]: /lib64/libvirt.so.0(+0xda898)[0x7f003e29a898]
libvirtd[21979]: /lib64/libpthread.so.0(+0x76ca)[0x7f003a9426ca]
libvirtd[21979]: /lib64/libc.so.6(clone+0x5f)[0x7f003a67cf7f]
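
The backtrace shows virStoragePoolSourceClear() freeing memory that virStoragePoolDefFree() had already released: two fields of the pool definition alias the same allocation. Below is a minimal C sketch of that pattern; the names here (PoolDef, PoolSource, fillName, poolDefFree) are hypothetical stand-ins, not the actual libvirt source. The assumed mechanism, consistent with the fix referenced later in this bug, is that when the pool XML has no top-level <name>, the parser reuses the source-name pointer as the pool name, so teardown frees the same string twice.

#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for libvirt's pool definition. */
typedef struct {
    char *name;                 /* <source><name> */
} PoolSource;

typedef struct {
    char *name;                 /* top-level <name> */
    PoolSource source;
} PoolDef;

/* Buggy parse step: if the pool has no <name>, "steal" the source
 * name pointer instead of duplicating it.  After this, def->name and
 * def->source.name point at the same allocation. */
static void fillName(PoolDef *def)
{
    if (def->name == NULL)
        def->name = def->source.name;   /* BUG: pointer aliased */
}

static void poolDefFree(PoolDef *def)
{
    free(def->name);            /* first free */
    free(def->source.name);     /* second free of the same pointer */
}

int main(void)
{
    PoolDef def = { .name = NULL, .source = { .name = strdup("virt") } };
    fillName(&def);
    poolDefFree(&def);          /* glibc aborts: double free or corruption */
    return 0;
}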

I create the pool with this Ansible task (which is probably wrong; I'm just experimenting):

- name: Define pool
  tags: virtualization
  virt_pool:
    command: define
    name: virts
    xml: '{{ lookup("file", "pool.xml") }}'
    autostart: yes
    state: active
  when: not "virts" in ansible_libvirt_pools

pool.xml is:

<pool type='logical'>
  <source>
    <name>virt</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/virt</path>
  </target>
</pool>

The pool that ends up defined looks like this:

virsh # pool-dumpxml virt
<pool type='logical'>
  <name>virt</name>
  <uuid>ef610b5d-9b53-4d62-a5d2-5ad601602202</uuid>
  <capacity unit='bytes'>0</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>0</available>
  <source>
    <name>virt</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/virt</path>
  </target>
</pool>

which is odd because the name is wrong: the pool is called "virt" rather than the "virts" my task asked for. Presumably libvirt fell back to the <source><name> value since my XML has no top-level <name> element; in any case I'm sure my XML isn't correct or complete.

Running pool-destroy or deleting the pool from virt-manager immediately crashes libvirtd with the above backtrace. This is all on an up-to-date Fedora 25 machine:

[root@vs02 ~]# rpm -qa|grep virt|sort
libgovirt-0.3.4-1.fc25.x86_64
libvirt-client-2.2.0-2.fc25.x86_64
libvirt-daemon-2.2.0-2.fc25.x86_64
libvirt-daemon-config-network-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-interface-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-network-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-nodedev-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-nwfilter-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-qemu-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-secret-2.2.0-2.fc25.x86_64
libvirt-daemon-driver-storage-2.2.0-2.fc25.x86_64
libvirt-daemon-kvm-2.2.0-2.fc25.x86_64
libvirt-glib-1.0.0-1.fc25.x86_64
libvirt-libs-2.2.0-2.fc25.x86_64
libvirt-python-2.2.0-1.fc25.x86_64
libvirt-python3-2.2.0-1.fc25.x86_64
virt-install-1.4.1-2.fc25.noarch
virt-manager-1.4.1-2.fc25.noarch
virt-manager-common-1.4.1-2.fc25.noarch
virt-viewer-5.0-1.fc25.x86_64

--- Additional comment from Jason Tibbitts on 2017-03-27 22:55:16 CEST ---

Note that if I just add "<name>virts</name>" to the XML, everything is fine.  Obviously I was doing something wrong but I figured you'd want to know about a complete libvirtd crash in any case.

--- Additional comment from Ján Tomko on 2017-03-28 15:22:51 CEST ---

Upstream patch:
https://www.redhat.com/archives/libvir-list/2017-March/msg01440.html

--- Additional comment from Ján Tomko on 2017-03-29 10:43:46 CEST ---

Pushed as:
commit e9f9690958b7fc86c4002c16cd2bdccba0dd07d1
Author:     Ján Tomko <jtomko>
CommitDate: 2017-03-29 10:36:55 +0200

    conf: do not steal pointers from the pool source

git describe: v3.2.0-rc1-16-ge9f9690
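
Per the commit subject, the fix is to stop stealing the pointer and give the pool definition its own copy of the name. Continuing the hypothetical sketch from the description above (it reuses that sketch's PoolDef type; this is an illustration, not the actual patch), one plausible shape of the fix:

/* Fixed parse step: duplicate the source name so the definition owns
 * its own copy and both free() calls in poolDefFree() stay safe. */
static int fillNameFixed(PoolDef *def)
{
    if (def->name == NULL) {
        def->name = strdup(def->source.name);
        if (def->name == NULL)
            return -1;              /* allocation failure */
    }
    return 0;
}

With separate allocations, virStoragePoolDefFree() and virStoragePoolSourceClear() can each free their own field without corrupting the heap.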

Comment 1 yisun 2017-03-30 03:21:46 UTC
Reproduction steps, on libvirt-3.1.0-2.el7.x86_64:
root@localhost ~  ## cat logical.pool 
<pool type='logical'>
  <source>
    <name>vg_luks</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/vg_luks</path>
  </target>
</pool>


root@localhost ~  ## virsh pool-create logical.pool 
Pool vg_luks created from logical.pool

root@localhost ~  ## virsh pool-destroy vg_luks
error: Disconnected from qemu:///system due to I/O error
error: Failed to destroy pool vg_luks
error: End of file while reading data: Input/output error

root@localhost ~  ## service libvirtd restart
Redirecting to /bin/systemctl restart  libvirtd.service
root@localhost ~  ## virsh pool-define logical.pool 
Pool vg_luks defined from logical.pool

root@localhost ~  ## virsh pool-start vg_luks
Pool vg_luks started

root@localhost ~  ## virsh pool-destroy vg_luks
Pool vg_luks destroyed

root@localhost ~  ## virsh pool-undefine vg_luks
error: Disconnected from qemu:///system due to I/O error
error: Failed to undefine pool vg_luks
error: End of file while reading data: Input/output error

Comment 3 yisun 2017-04-06 07:09:39 UTC
Verified with libvirt-3.2.0-1.el7.x86_64

Steps:
1. pool-create and pool-destroy
## cat logical.pool 
<pool type='logical'>
  <source>
    <name>vg_luks</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/vg_luks</path>
  </target>
</pool>


## virsh pool-create logical.pool 
Pool vg_luks created from logical.pool

## virsh pool-destroy vg_luks
Pool vg_luks destroyed

## virsh pool-list --all
 Name                 State      Autostart 
-------------------------------------------
 default              active     no

2. pool-define, pool-start, pool-destroy and pool-undefine
## virsh pool-define logical.pool
Pool vg_luks defined from logical.pool

## virsh pool-start vg_luks
Pool vg_luks started

## virsh pool-destroy vg_luks
Pool vg_luks destroyed

## virsh pool-undefine vg_luks
Pool vg_luks has been undefined

## virsh pool-list --all 
 Name                 State      Autostart 
-------------------------------------------
 default              active     no

Comment 4 errata-xmlrpc 2017-08-01 17:24:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
