Bug 1728530
| Field | Value |
|---|---|
| Summary | libvirtd crashes non-deterministically when trying to destroy a guest |
| Product | Red Hat Enterprise Linux 8 |
| Component | libvirt |
| Version | 8.1 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Katerina Koukiou <kkoukiou> |
| Assignee | Ján Tomko <jtomko> |
| QA Contact | yafu <yafu> |
| CC | jdenemar, jtomko, rbalakri, yalzhang |
| Target Milestone | rc |
| Target Release | 8.0 |
| Fixed In Version | libvirt-4.5.0-31.el8 |
| Doc Type | If docs needed, set a value |
| Last Closed | 2019-11-05 20:51:02 UTC |
| Type | Bug |
Description Katerina Koukiou 2019-07-10 06:33:53 UTC
~30% chance of reproducing this with:
$ cat <<EOF >xml
> <domain type='qemu'>
> <name>test_domain</name>
> <memory unit='KiB'>2097152</memory>
> <uuid>e6e546d3-9642-4ecb-908e-f3cc84e66a39</uuid>
> <currentMemory unit='KiB'>2097152</currentMemory>
> <vcpu>1</vcpu>
> <os>
> <type arch='x86_64' machine='pc'>hvm</type>
> </os>
> <devices>
> <emulator>/usr/bin/qemu-system-x86_64</emulator>
> </devices>
> </domain>
> EOF
$ virsh create xml; virsh destroy test_domain& virsh define xml; sleep 1; virsh undefine test_domain
The define call crashes while trying to free def, which it should only do if adding it to the list has failed:
#2 0x00007ffff7cbb900 in virDomainDefFree (def=<optimized out>, def@entry=0x7fffd4001fe0) at conf/domain_conf.c:3004
#3 0x00007fffc31b9f2e in qemuDomainDefineXMLFlags (conn=0x7fffd80019e0, xml=<optimized out>, flags=<optimized out>) at qemu/qemu_driver.c:7427
Fixed upstream by:

commit 7e760f61577e6c4adbb0b015f8f7ac1796570cdd
Author: Marc Hartmayer <mhartmay.com>
CommitDate: 2018-08-29 10:02:03 +0200

    virDomainObjListAddLocked: fix double free

    If @vm has been flagged as "to be removed", virDomainObjListFindByNameLocked
    returns NULL (although the definition actually exists). Therefore, the
    possibility exists that virHashAddEntry will raise a "Duplicate key" error
    => virDomainObjListAddObjLocked fails => virDomainObjEndAPI(&vm) is called,
    and this leads to a freeing of @def, since @def is already assigned to
    vm->def. But actually this leads to a double free, since the common usage
    pattern is that the caller of virDomainObjListAdd(Locked) is responsible
    for freeing @def in case of an error.

    Let's fix this by setting vm->def to NULL in case of an error.

    Backtrace:
    ➤ bt
    #0 virFree (ptrptr=0x7575757575757575)
    #1 0x000003ffb5b25b3e in virDomainResourceDefFree
    #2 0x000003ffb5b37c34 in virDomainDefFree
    #3 0x000003ff9123f734 in qemuDomainDefineXMLFlags
    #4 0x000003ff9123f7f4 in qemuDomainDefineXML
    #5 0x000003ffb5cd2c84 in virDomainDefineXML
    #6 0x000000011745aa82 in remoteDispatchDomainDefineXML
    ...

    Reviewed-by: Bjoern Walk <bwalk.com>
    Signed-off-by: Marc Hartmayer <mhartmay.com>

git describe: v4.7.0-rc1-30-g7e760f6157
contains: v4.7.0-rc2~2

Hi Ján,

It reports the error "error: internal error: Duplicate key" when I try to verify the bug with libvirt-4.5.0-31.module+el8.1.0+3808+3325c1a3.x86_64. Would you help to check it, please?

# virsh create vm1.xml; virsh destroy vm1& virsh define vm1.xml; sleep 1; virsh undefine vm1
Domain vm1 created from vm1.xml
[1] 26608
Domain vm1 destroyed
error: Failed to define domain from vm1.xml
error: internal error: Duplicate key
[1]+  Done                    virsh destroy vm1
error: failed to get domain 'vm1'
error: Domain not found: no domain with matching name 'vm1'

Hi,

the error message is not as helpful as it could be. As long as libvirt does not crash, I consider this bug fixed.
The error message is already fixed upstream by:

commit a5c71129bf2c12a827f1bc00149acd1c572ffe9c
    virDomainObjListAddLocked: Produce better error message than 'Duplicate key'

However, I do consider that a separate bug.

(In reply to Ján Tomko from comment #6)
> Hi,
> the error message is not as helpful as it could be.
> As long as libvirt does not crash, I consider this bug fixed.
> The error message is already fixed upstream by:
> commit a5c71129bf2c12a827f1bc00149acd1c572ffe9c
>     virDomainObjListAddLocked: Produce better error message than 'Duplicate key'
> However I do consider that a separate bug.

Thanks. Filed a bug to track it: https://bugzilla.redhat.com/show_bug.cgi?id=1737790

Reproduced with libvirt-4.5.0-25.x86_64.
Verified with libvirt-4.5.0-31.module+el8.1.0+3808+3325c1a3.x86_64.

Test steps:
1. Check the libvirtd pid:
   # pidof libvirtd
   23812
2. # for i in {1..100}; do virsh create vm1.xml; virsh destroy vm1& virsh define vm1.xml; sleep 1; virsh undefine vm1; done
3. Check the libvirtd pid again; libvirtd did not crash during step 2:
   # pidof libvirtd
   23812

It's still crashing for me with the same test that caused the initial issue, just the stack trace is different now. That's on the libvirt version against which this bug was verified: libvirt-daemon-4.5.0-31. However, this stopped being reproducible for me after I installed the debuginfo packages, so I can't yet provide the coredump with debuginfo. This is the relevant section from the journal; does it give you a hint about what's going on?

Aug 09 09:05:39 localhost.localdomain systemd-coredump[8447]: Process 1914 (libvirtd) of user 1001 dumped core.
Stack trace of thread 8444:
#0  0x00007f13dd0f1aa4 __pthread_mutex_lock (libpthread.so.0)
#1  0x00007f13ac1100a2 virQEMUDriverGetConfig (libvirt_driver_qemu.so)
#2  0x00007f13ac154ac0 qemuStateStop (libvirt_driver_qemu.so)
#3  0x00007f13e041656f virStateStop (libvirt.so.0)
#4  0x000055833ea86d31 daemonStopWorker (libvirtd)
#5  0x00007f13e0294d9a virThreadHelper (libvirt.so.0)
#6  0x00007f13dd0ef2de start_thread (libpthread.so.0)
#7  0x00007f13dce20133 __clone (libc.so.6)

Stack trace of thread 1914:
#0  0x00007f139fa0c3c0 _ZN5boost15aligned_storageILm16ELm8EED1Ev (libceph-common.so.0)
#1  0x00007f13dcd5e06c __run_exit_handlers (libc.so.6)
#2  0x00007f13dcd5e1a0 exit (libc.so.6)
#3  0x00007f13dcd4787a __libc_start_main (libc.so.6)
#4  0x000055833ea86a4e _start (libvirtd)

Stack trace of threads 1915-1919 and 1925-1929 (identical idle workers):
#0  0x00007f13dd0f547c pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f13e029500a virCondWait (libvirt.so.0)
#2  0x00007f13e0295b43 virThreadPoolWorker (libvirt.so.0)
#3  0x00007f13e0294d6c virThreadHelper (libvirt.so.0)
#4  0x00007f13dd0ef2de start_thread (libpthread.so.0)
#5  0x00007f13dce20133 __clone (libc.so.6)

Stack trace of threads 1920-1924 (identical idle workers):
#0  0x00007f13dd0f547c pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f13e029500a virCondWait (libvirt.so.0)
#2  0x00007f13e0295af4 virThreadPoolWorker (libvirt.so.0)
#3  0x00007f13e0294d6c virThreadHelper (libvirt.so.0)
#4  0x00007f13dd0ef2de start_thread (libpthread.so.0)
#5  0x00007f13dce20133 __clone (libc.so.6)

The logs from the domain that caused the crash are here:

[root@localhost ~]# cat /var/log/libvirt/qemu/subVmTestCreate21.log
2019-08-09 13:04:38.763+0000: starting up libvirt version: 4.5.0, package: 31.module+el8.1.0+3808+3325c1a3 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2019-07-30-15:19:59, ), qemu version: 2.12.0qemu-kvm-2.12.0-83.module+el8.1.0+3852+0ba8aef0, kernel: 4.18.0-128.el8.x86_64, hostname: localhost.localdomain
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=subVmTestCreate21,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-subVmTestCreate21/master-key.aes -machine pc-i440fx-rhel7.6.0,accel=tcg,usb=off,vmport=off,dump-guest-core=off -m 256 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 8582a929-7854-4256-9656-da8ccf415580 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=30,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-reboot -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/var/lib/libvirt/pools/tmpPool/vmTmpDestination.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 -drive file=/var/lib/libvirt/novell.iso,format=raw,if=none,id=drive-ide0-0-0,readonly=on -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=32,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:fb:73:f0,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=33,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -vnc 127.0.0.1:1 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
2019-08-09T13:04:39.130770Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/1 (label charserial0)
2019-08-09T13:04:39.978852Z qemu-kvm: terminating on signal 15 from pid 1641 (<unknown process>)
2019-08-09 13:04:40.184+0000: shutting down, reason=destroyed

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2019:3345