Bug 1076719 - libvirtd crashes if VM crashes or is destroyed while hot-attaching disks
Summary: libvirtd crashes if VM crashes or is destroyed while hot-attaching disks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 573946 1026966 1075973
Blocks: 1080471
 
Reported: 2014-03-14 21:44 UTC by Eric Blake
Modified: 2014-10-14 04:20 UTC (History)
CC List: 9 users

Fixed In Version: libvirt-0.10.2-30.el6
Doc Type: Bug Fix
Doc Text:
Prior to this update, there was a typographical error in a condition that checks whether QEMU successfully attached a new disk to a guest. Due to the error, the libvirtd daemon terminated unexpectedly if the monitor command was unsuccessful, for instance, in case of a virtual machine failure or when attaching a guest disk drive was interrupted. With this update, the error has been corrected, and libvirtd no longer crashes in the described circumstances.
Clone Of: 1075973
Environment:
Last Closed: 2014-10-14 04:20:36 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System: Red Hat Product Errata
ID: RHBA-2014:1374
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2014-10-14 08:11:54 UTC

Description Eric Blake 2014-03-14 21:44:05 UTC
Cloning to RHEL 6; this is a downstream-only crasher, introduced in RHEL 6.2 when rewriting the patch for bug 573946 to apply to libvirt 0.9.4.

+++ This bug was initially created as a clone of Bug #1075973 +++

Version-Release number of selected component (if applicable):
libvirt-1.1.1-26.el7.x86_64
qemu-kvm-1.5.3-52.el7.x86_64
And:
libvirt-1.1.1-27.el7.x86_64
qemu-kvm-rhev-1.5.3-53.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Create a guest with a gluster volume.
# virsh create r7g-qcow2-gluster.xml
Domain r7g-qcow2 created from r7g-qcow2-gluster.xml

# virsh dumpxml r7g-qcow2| grep disk -A 7
    <disk type='network' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source protocol='gluster' name='gluster-vol1/r7g-qcow2.img'>
        <host name='10.66.84.12'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>

# virsh list --all| grep r7g-qcow2
 10    r7g-qcow2                      running

2. Try to attach a volume to the guest, and press Ctrl+C to interrupt the "virsh attach-device" command.

# more disk-gluster-vol.xml
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/rhel7.0-qcow2.img'>
<host name='10.66.106.22'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>

# virsh attach-device r7g-qcow2 disk-gluster-vol.xml
^C

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 58    r7g-qcow2                      running

3. Try to destroy the guest; libvirtd dumps core.
# virsh destroy r7g-qcow2
error: Failed to destroy domain r7g-qcow2
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

# virsh list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused

# service libvirtd status
Redirecting to /bin/systemctl status  libvirtd.service
libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; disabled)
   Active: failed (Result: core-dump) since Thu 2014-03-13 17:28:42 CST; 20s ago
  Process: 10969 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=dumped, signal=SEGV)
 Main PID: 10969 (code=dumped, signal=SEGV)
   CGroup: /system.slice/libvirtd.service
           └─4476 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf

Mar 13 17:24:40 intel-i5-8-1 systemd[1]: Started Virtualization daemon.
Mar 13 17:24:40 intel-i5-8-1 dnsmasq[4476]: read /etc/hosts - 4 addresses
Mar 13 17:24:40 intel-i5-8-1 dnsmasq[4476]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Mar 13 17:24:40 intel-i5-8-1 dnsmasq-dhcp[4476]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPDISCOVER(virbr0) 52:54:00:7f:62:54
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPOFFER(virbr0) 192.168.122.74 52:54:00:7f:62:54
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPREQUEST(virbr0) 192.168.122.74 52:54:00:7f:62:54
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPACK(virbr0) 192.168.122.74 52:54:00:7f:62:54 rhel75
Mar 13 17:28:42 intel-i5-8-1 systemd[1]: libvirtd.service: main process exited, code=dumped, status=11/SEGV
Mar 13 17:28:42 intel-i5-8-1 systemd[1]: Unit libvirtd.service entered failed state.

Actual results:
In step 3, libvirtd dumps core.

Expected results:
In step 3, virsh destroys the guest successfully and libvirtd does not dump core.

--- Additional comment from  on 2014-03-13 04:33:57 MDT ---



--- Additional comment from Peter Krempa on 2014-03-14 10:51:31 MDT ---

This is a downstream-only issue. Fixed by:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2014-March/msg00363.html

commit e6cbf1ffab1f98704bf5d3ce09c4ceba2b022b6f
Author: Peter Krempa <pkrempa>
Date:   Fri Mar 14 17:34:07 2014 +0100

    qemu: monitor: Fix invalid parentheses
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1075973
    
    RHEL-only: the code in question is handling a downstream command
    
    A typo in parentheses in a condition checking the success of a monitor
    command led to a crash of libvirtd if the monitor command isn't
    successful.
    
    The error path uses a combination of "ret == 0" and "ret < 0" error
    checks. Due to this, the disk definition parsed from the user input
    is added to the domain definition but at the same time it is freed at
    the end of the AttachDevice API.

    When the domain is destroyed afterwards, a use-after-free error leads
    to a crash in random places when the disk in question is freed.
    
    To reproduce use the attached reproducer with ANY disk definition
    supported (gluster as stated in the original report isn't required).
    
    Reproducer:
    
     diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
     index 502b977..afcf603 100644
     --- a/src/qemu/qemu_monitor.c
     +++ b/src/qemu/qemu_monitor.c
     @@ -28,6 +28,7 @@
      #include <sys/un.h>
      #include <unistd.h>
      #include <fcntl.h>
     +#include <signal.h>
    
      #include "qemu_monitor.h"
      #include "qemu_monitor_text.h"
     @@ -3003,6 +3004,8 @@ int qemuMonitorAddDrive(qemuMonitorPtr mon,
              return -1;
          }
    
     +    kill(mon->vm->pid, 9);
     +
          if (mon->json)
              ret = qemuMonitorJSONAddDrive(mon, drivestr);
          else
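
For illustration, here is a minimal, self-contained sketch of the kind of misplaced parenthesis the commit message describes. The names below are hypothetical (the actual downstream __com.redhat_drive_add handling is not quoted in this report); the sketch only shows how moving one parenthesis changes what "ret" holds, so later checks against "ret" stop reflecting what the monitor command actually returned:

    /* parenthesis_sketch.c - hypothetical illustration, not libvirt code */
    #include <stdio.h>

    /* Stand-in for a monitor command: returns -1 on failure, 0 on success. */
    static int monitor_add_drive(int should_fail)
    {
        return should_fail ? -1 : 0;
    }

    int main(void)
    {
        int ret;

        /* Buggy form: '<' binds tighter than '=', so on failure ret receives
         * the result of the comparison (1) instead of the -1 the function
         * returned.  Any later "ret == 0" or "ret < 0" check then no longer
         * describes the real outcome of the call. */
        if ((ret = monitor_add_drive(1) < 0))
            printf("buggy check: ret = %d\n", ret);   /* prints ret = 1 */

        /* Fixed form: assign first, then compare, so ret really is -1 and
         * the success and error paths agree on what happened. */
        if ((ret = monitor_add_drive(1)) < 0)
            printf("fixed check: ret = %d\n", ret);   /* prints ret = -1 */

        return 0;
    }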

Comment 4 chhu 2014-04-10 07:05:41 UTC
Verified with packages:
libvirt-0.10.2-31.el6.x86_64
qemu-kvm-0.12.1.2-2.423.el6.x86_64

Test steps:
1. Create a guest with a gluster volume
# virsh create r6-qcow2-gluster.xml 
Domain r6-qcow2 created from r6-qcow2-gluster.xml

# virsh dumpxml r6-qcow2| grep disk -A 7
    <disk type='network' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source protocol='gluster' name='gluster-vol1/rhel6-qcow2-disk.img'>
        <host name='10.66.106.25' port='24007'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>

2. Try to attach a gluster volume that is not available, and press Ctrl+C to interrupt the "virsh attach-device" process. The VM is still running; the guest is then destroyed successfully, with no libvirtd crash.

# more disk-gluster-vol.xml 
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/test.img'>
<host name='10.66.106.24' port='24007'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>

# virsh attach-device r6-qcow2 disk-gluster-vol.xml
^C

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 5     r6-qcow2                       running

# virsh destroy r6-qcow2
Domain r6-qcow2 destroyed

# virsh list --all
 Id    Name                           State
----------------------------------------------------

# service libvirtd status
libvirtd (pid  2445) is running...

3. Try to attach a gluster volume that is not available; a QEMU error is returned.
The VM is still running, and there is no libvirtd crash.

# more disk-gluster-vol.xml 
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/test.img'>
<host name='10.66.106.24' port='24007'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>

# virsh attach-device r6-qcow2 disk-gluster-vol.xml 
error: Failed to attach device from disk-gluster-vol.xml
error: internal error unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk1' could not be initialized

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 6     r6-qcow2                       running

# service libvirtd status
libvirtd (pid  2445) is running...

4. Attach/detach another available gluster volume successfully.
# more disk-gluster-vol.xml 
<disk type='network' device='disk'>
<driver name='qemu' type='raw'/>
<source protocol='gluster' name='gluster-vol1/exist.img'>
<host name='10.66.106.25' port='24007'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>

# virsh attach-device r6-qcow2 disk-gluster-vol.xml 
Device attached successfully

# virsh detach-disk r6-qcow2 vdb
Disk detached successfully

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 6     r6-qcow2                       running

# service libvirtd status
libvirtd (pid  2445) is running...

Test results:
The commands above work as expected, and libvirtd no longer crashes.

Comment 6 errata-xmlrpc 2014-10-14 04:20:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1374.html

