Bug 1075973
| Summary: | libvirtd crashes if VM crashes or is destroyed while hot-attaching disks | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | chhu | ||||
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 7.0 | CC: | acathrow, ajia, bili, dyuan, eblake, jdenemar, mzhan, shyu | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libvirt-1.1.1-28.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 1076719 (view as bug list) | Environment: | |||||
| Last Closed: | 2014-06-13 11:25:39 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 573946, 1026966 | ||||||
| Bug Blocks: | 1076719 | ||||||
| Attachments: |
Created attachment 873907 [details]
core dump file
This is a downstream-only issue.
Fixed by:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2014-March/msg00363.html
commit e6cbf1ffab1f98704bf5d3ce09c4ceba2b022b6f
Author: Peter Krempa <pkrempa>
Date: Fri Mar 14 17:34:07 2014 +0100
    qemu: monitor: Fix invalid parentheses
    https://bugzilla.redhat.com/show_bug.cgi?id=1075973
    RHEL-only: the code in question is handling a downstream command
    A typo in the parentheses of a condition checking the success of a
    monitor command led to a crash of libvirtd if the monitor command
    wasn't successful. The error path uses a combination of "ret == 0" and
    "ret < 0" error checks. Because of this, the disk definition parsed
    from the user input is added to the domain definition, but at the same
    time it is freed at the end of the AttachDevice API. When the domain is
    destroyed afterwards, a use-after-free error leads to a crash at random
    places when freeing the disk in question.
    To reproduce, use the attached reproducer with ANY supported disk
    definition (gluster, as stated in the original report, isn't required).
    Reproducer:
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 502b977..afcf603 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -28,6 +28,7 @@
 #include <sys/un.h>
 #include <unistd.h>
 #include <fcntl.h>
+#include <signal.h>
 
 #include "qemu_monitor.h"
 #include "qemu_monitor_text.h"
@@ -3003,6 +3004,8 @@ int qemuMonitorAddDrive(qemuMonitorPtr mon,
         return -1;
     }
 
+    kill(mon->vm->pid, 9);
+
     if (mon->json)
         ret = qemuMonitorJSONAddDrive(mon, drivestr);
     else
Verified with packages:
libvirt-1.1.1-28.el7.x86_64
qemu-kvm-rhev-1.5.3-53.el7.x86_64
Test steps:
1. create a guest with gluster volume.
# virsh create r7g-qcow2-gluster.xml
Domain r7g-qcow2 created from r7g-qcow2-gluster.xml
# virsh list --all
Id Name State
----------------------------------------------------
5 r7g-qcow2 running
# virsh dumpxml r7g-qcow2| grep disk -A 7
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source protocol='gluster' name='gluster-vol1/r7g-qcow2.img'>
<host name='10.66.84.12'/>
</source>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
2. try to attach a volume to the guest, press Ctrl+C to interrupt the "virsh attach-device".
# more disk-gluster-vol.xml
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/test.img'>
<host name='10.66.106.22'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>
# virsh attach-device r7g-qcow2 disk-gluster-vol.xml
^C
3. destroy the guest successfully, no libvirtd core dump.
# virsh destroy r7g-qcow2
Domain r7g-qcow2 destroyed
# virsh list --all
Id Name State
---------------------------------------------------
# service libvirtd status -l
Redirecting to /bin/systemctl status -l libvirtd.service
libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
Active: active (running) since Wed 2014-03-19 10:21:22 CST; 1min 20s ago
Main PID: 16281 (libvirtd)
CGroup: /system.slice/libvirtd.service
├─ 2546 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
└─16281 /usr/sbin/libvirtd
Mar 19 10:21:39 localhost.localdomain libvirtd[16281]: User record for user '107' was not found: No such file or directory
Mar 19 10:21:39 localhost.localdomain libvirtd[16281]: Group record for user '107' was not found: No such file or directory
Mar 19 10:21:39 localhost.localdomain libvirtd[16281]: User record for user '107' was not found: No such file or directory
Mar 19 10:21:39 localhost.localdomain libvirtd[16281]: Group record for user '107' was not found: No such file or directory
Mar 19 10:22:07 localhost.localdomain libvirtd[16281]: End of file while reading data: Input/output error
Mar 19 10:22:09 localhost.localdomain libvirtd[16281]: internal error: unable to execute QEMU command '__com.redhat_drive_add': could not open disk image gluster://10.66.106.22/gluster-vol1/test.img: Could not open 'gluster://10.66.106.22/gluster-vol1/test.img': Transport endpoint is not connected
Mar 19 10:22:11 localhost.localdomain dnsmasq-dhcp[2546]: DHCPDISCOVER(virbr0) 52:54:00:7f:62:54
Mar 19 10:22:11 localhost.localdomain dnsmasq-dhcp[2546]: DHCPOFFER(virbr0) 192.168.122.74 52:54:00:7f:62:54
Mar 19 10:22:11 localhost.localdomain dnsmasq-dhcp[2546]: DHCPREQUEST(virbr0) 192.168.122.74 52:54:00:7f:62:54
Mar 19 10:22:11 localhost.localdomain dnsmasq-dhcp[2546]: DHCPACK(virbr0) 192.168.122.74 52:54:00:7f:62:54 rhel75
Test results:
guest is destroyed successfully, without libvirtd core dump.
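Note on the root cause: the misplaced parentheses described in the fix's commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the libvirt code: fake_monitor_cmd, add_drive_buggy, add_drive_fixed, attach_disk and is_dangling are hypothetical names, and the two checks in attach_disk merely mimic the mix of "ret < 0" and "ret == 0" tests that the commit message mentions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a monitor command: 0 on success, -1 on
 * failure (for example because the QEMU process has gone away). */
static int fake_monitor_cmd(bool qemu_alive)
{
    return qemu_alive ? 0 : -1;
}

/* Buggy shape: the comparison sits inside the assignment's parentheses,
 * so ret receives the value of (cmd < 0), which is 1 on failure. */
int add_drive_buggy(bool qemu_alive)
{
    int ret;

    if ((ret = fake_monitor_cmd(qemu_alive) < 0)) {
        /* failure is noticed here, but ret now holds 1 instead of -1 */
    }
    return ret;
}

/* Fixed shape: assign first, then compare, so ret keeps the -1. */
int add_drive_fixed(bool qemu_alive)
{
    int ret;

    if ((ret = fake_monitor_cmd(qemu_alive)) < 0) {
        /* ret holds the real error code */
    }
    return ret;
}

/* Model of what happens to the parsed disk definition (assumed logic). */
struct attach_outcome {
    bool kept_in_domain; /* pointer stored in the domain definition */
    bool freed_by_api;   /* pointer freed when the API call returns */
};

struct attach_outcome attach_disk(int ret)
{
    struct attach_outcome out = { false, false };

    /* One check treats only negative values as failure... */
    if (!(ret < 0))
        out.kept_in_domain = true;
    /* ...while the cleanup treats only ret == 0 as success. */
    if (ret != 0)
        out.freed_by_api = true;
    return out;
}

/* A disk that is both referenced by the domain and already freed is the
 * use-after-free waiting to happen at destroy time. */
bool is_dangling(struct attach_outcome out)
{
    return out.kept_in_domain && out.freed_by_api;
}
```

In this model a failed command yields 1 from add_drive_buggy, which slips between the two checks: the disk stays in the domain definition and is freed as well. With add_drive_fixed the failure yields -1 and the disk is only freed.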
This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
Version-Release number of selected component (if applicable):
libvirt-1.1.1-26.el7.x86_64
qemu-kvm-1.5.3-52.el7.x86_64
And:
libvirt-1.1.1-27.el7.x86_64
qemu-kvm-rhev-1.5.3-53.el7.x86_64
How reproducible:
100%
Steps to Reproduce:
1. create a guest with gluster volume.
# virsh create r7g-qcow2-gluster.xml
Domain r7g-qcow2 created from r7g-qcow2-gluster.xml
# virsh dumpxml r7g-qcow2| grep disk -A 7
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source protocol='gluster' name='gluster-vol1/r7g-qcow2.img'>
<host name='10.66.84.12'/>
</source>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
# virsh list --all| grep r7g-qcow2
10 r7g-qcow2 running
2. try to attach a volume to the guest, press Ctrl+C to interrupt the "virsh attach-device".
# more disk-gluster-vol.xml
<disk type='network' device='disk'>
<driver name='qemu' type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/rhel7.0-qcow2.img'>
<host name='10.66.106.22'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>
# virsh attach-device r7g-qcow2 disk-gluster-vol.xml
^C
# virsh list --all
Id Name State
----------------------------------------------------
58 r7g-qcow2 running
3. try to destroy the guest; libvirtd dumps core.
# virsh destroy r7g-qcow2
error: Failed to destroy domain r7g-qcow2
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor
# virsh list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused
# service libvirtd status
Redirecting to /bin/systemctl status libvirtd.service
libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; disabled)
Active: failed (Result: core-dump) since Thu 2014-03-13 17:28:42 CST; 20s ago
Process: 10969 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=dumped, signal=SEGV)
Main PID: 10969 (code=dumped, signal=SEGV)
CGroup: /system.slice/libvirtd.service
└─4476 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
Mar 13 17:24:40 intel-i5-8-1 systemd[1]: Started Virtualization daemon.
Mar 13 17:24:40 intel-i5-8-1 dnsmasq[4476]: read /etc/hosts - 4 addresses
Mar 13 17:24:40 intel-i5-8-1 dnsmasq[4476]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Mar 13 17:24:40 intel-i5-8-1 dnsmasq-dhcp[4476]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPDISCOVER(virbr0) 52:54:00:7f:62:54
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPOFFER(virbr0) 192.168.122.74 52:54:00:7f:62:54
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPREQUEST(virbr0) 192.168.122.74 52:54:00:7f:62:54
Mar 13 17:25:31 intel-i5-8-1 dnsmasq-dhcp[4476]: DHCPACK(virbr0) 192.168.122.74 52:54:00:7f:62:54 rhel75
Mar 13 17:28:42 intel-i5-8-1 systemd[1]: libvirtd.service: main process exited, code=dumped, status=11/SEGV
Mar 13 17:28:42 intel-i5-8-1 systemd[1]: Unit libvirtd.service entered failed state.
Actual results:
In step 3, libvirtd dumps core.
Expected results:
In step 3, virsh destroys the guest successfully, with no libvirtd core dump.
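Note on deterministic reproduction: instead of timing Ctrl+C by hand, the reproducer patch in the fix's commit message sends signal 9 to the QEMU process right before the monitor command is issued, so the command is bound to fail. The effect of that kill() call can be sketched in isolation with a forked child standing in for QEMU. The helper name kill_child_and_report is hypothetical, and the sketch assumes a POSIX system where SIGKILL is signal 9, as it is on Linux.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that only waits for signals, kills it with SIGKILL, and
 * returns the number of the signal that terminated it, or -1 on error. */
int kill_child_and_report(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: stand-in for the guest process, blocks until killed. */
        for (;;)
            pause();
    }

    /* Parent: same call shape as in the reproducer, kill(pid, 9). */
    if (kill(pid, SIGKILL) < 0)
        return -1;

    /* Reap the child and inspect how it ended. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

SIGKILL cannot be caught or ignored, so the child is guaranteed to be gone once waitpid returns. That is why the reproducer makes the following monitor command fail every time.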