Bug 1150505
Field | Value
---|---
Summary | Domain is out of control from libvirt when running some concurrent define/undefine/start/destroy jobs rapidly
Product | Red Hat Enterprise Linux 7
Component | libvirt
Version | 7.1
Hardware | x86_64
OS | Linux
Status | CLOSED ERRATA
Severity | medium
Priority | unspecified
Reporter | Hu Jianwei <jiahu>
Assignee | Martin Kletzander <mkletzan>
QA Contact | Virtualization Bugs <virt-bugs>
CC | berrange, dyuan, honzhang, jiahu, jmiao, lmiksik, mkletzan, mzhan, rbalakri, vivianzhang
Target Milestone | rc
Keywords | Upstream
Fixed In Version | libvirt-1.2.8-11.el7
Doc Type | Bug Fix
Type | Bug
Last Closed | 2015-03-05 07:46:15 UTC

Description
Hu Jianwei
2014-10-08 11:37:59 UTC
Probably need this upstream commit

commit 4882618ed13b469d92fa8b2b4a158fdb17dbe9f1
Author: Guido Günther <agx>
Date:   Thu Sep 25 13:32:58 2014 +0200

    qemu: use systemd's TerminateMachine to kill all processes

    If we don't properly clean up all processes in the
    machine-<vmname>.scope systemd won't remove the cgroup and subsequent
    vm starts fail with

      'CreateMachine: File exists'

    Additional processes can e.g. be added via

      echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks

    but there are other cases like

      http://bugs.debian.org/761521

    Invoke TerminateMachine to be on the safe side since systemd tracks the
    cgroup anyway. This is a noop if all processes have terminated already.

Please provide debug logs from libvirt while reproducing the issue? Thank you.

Created attachment 947129 [details]
Error log for scratch build
Please check the error log for scratch build
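
For anyone who wants to check a host by hand, here is a minimal sketch of the check and cleanup that the upstream commit quoted above talks about. It is not what libvirt does internally (libvirt talks to systemd over D-Bus). The helper names are made up for illustration. The scope path follows the pattern in the commit message literally, which is an assumption, since real hosts may name or escape the unit differently. `machinectl terminate` is used as a command-line stand-in that, as far as I know, asks systemd-machined to terminate the machine.

```python
import subprocess


def scope_tasks_path(vm_name):
    """Path of the scope's tasks file, per the pattern in the commit message.

    Assumption: the scope is named literally machine-<vmname>.scope; a real
    host may use a different or escaped unit name.
    """
    return ("/sys/fs/cgroup/systemd/machine.slice/"
            "machine-{0}.scope/tasks".format(vm_name))


def leftover_pids(tasks_text):
    """Parse the contents of a cgroup tasks file into a list of integer PIDs."""
    return [int(tok) for tok in tasks_text.split()]


def terminate_command(machine_name):
    """Build (but do not run) the machinectl call that terminates a machine."""
    return ["machinectl", "terminate", machine_name]


def terminate_machine(machine_name, runner=subprocess.call):
    """Run the terminate command and return its exit status.

    `runner` is injectable so the helper can be exercised without systemd.
    """
    return runner(terminate_command(machine_name))
```

The machine name to pass can be looked up with `machinectl list`; reading the file returned by `scope_tasks_path()` and feeding it to `leftover_pids()` shows which processes are still keeping the cgroup alive.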
Fixed upstream with v1.2.10-9-gb629c64:

commit b629c64e5e0a32ef439b8eeb3a697e2cd76f3248
Author:     Martin Kletzander <mkletzan>
AuthorDate: Thu Oct 30 14:38:35 2014 +0100

    qemu: avoid rare race when undefining domain

Still can reproduce it.

[root@ibm-x3850x5-06 ~]# rpm -q libvirt
libvirt-1.2.8-7.el7.x86_64

After do concurrent jobs rapidly.

[root@ibm-x3850x5-06 ~]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     test                           shut off

[root@ibm-x3850x5-06 ~]# virsh start test
error: Failed to start domain test
error: error from service: CreateMachine: File exists

[root@ibm-x3850x5-06 ~]# ps aux | grep qemu-kvm
qemu 377 7.1 0.8 1661472 290980 ? Sl 10:34 0:38 /usr/libexec/qemu-kvm -name test -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 2ce8d663-981e-416e-8760-a21216481992 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/test.img,if=none,id=drive-ide0-0-0,format=raw,cache=none -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=21 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9d:96:2a,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on
root 858 0.0 0.0 112644 972 pts/0 S+ 10:43 0:00 grep --color=auto qemu-kvm

Created attachment 960995 [details]
log for libvirtd on 1.2.8-7 build
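
The symptom shown in the transcript above (virsh reports the domain as "shut off" while a qemu-kvm process for it is still alive) can be checked mechanically. The following is only a sketch: the function names are hypothetical, and it assumes the plain `virsh list --all` table layout and the `-name <domain>` argument form seen in the `ps aux` output above.

```python
def parse_virsh_list(text):
    """Return {name: state} parsed from `virsh list --all` output."""
    domains = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the dashed separator and the header row.
        if len(parts) < 3 or parts[0] == "Id":
            continue
        # Columns are Id, Name, State; the state may contain a space
        # ("shut off"), so everything after the name is joined back.
        domains[parts[1]] = " ".join(parts[2:])
    return domains


def qemu_domain_names(ps_text):
    """Return the names given via -name to qemu-kvm processes in `ps aux` output."""
    names = []
    for line in ps_text.splitlines():
        parts = line.split()
        # Field 11 of `ps aux` (index 10) is the start of the command line.
        if len(parts) < 11 or not parts[10].endswith("qemu-kvm"):
            continue
        args = parts[11:]
        if "-name" in args and args.index("-name") + 1 < len(args):
            names.append(args[args.index("-name") + 1])
    return names


def orphaned_domains(virsh_text, ps_text):
    """Domains with a live qemu-kvm process that virsh lists as shut off or not at all."""
    states = parse_virsh_list(virsh_text)
    return [name for name in qemu_domain_names(ps_text)
            if states.get(name, "shut off") == "shut off"]
```

Fed with the two outputs from the transcript, `orphaned_domains()` would report `test`, which matches what the reporter observed by eye.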
I need to investigate more if this is still not fixed. Moving back to assigned.

I can produce this bug on build libvirt-1.2.8-10.el7.x86_64
verify it on build libvirt-1.2.8-11.el7.x86_64

verify steps:
1. prepare a guest xml in the host
In the first terminal:
# while true; do virsh undefine vm1;virsh define vm1.xml; done
In the second terminal:
# while true;do virsh destroy vm1;virsh start vm1;done

2. execute the stress scripts test more than 2 hours, guest still works normally, no qemu-kvm process exists always
# virsh start vm1
Domain vm1 started

[root@intel-e31225-16-2 ~]# virsh list
 Id    Name                           State
----------------------------------------------------
 12824 vm1                            running

move to verified

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html
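
The two-terminal verification loops from the steps above (undefine/define in one terminal, destroy/start in the other) can also be driven from one bounded script instead of two endless `while true` loops. This is a rough sketch, not the QA team's actual tooling: the function names are hypothetical, and the domain name and XML path are parameters the caller has to supply.

```python
import subprocess
import threading
import time


def command_pairs(domain, xml_path):
    """The two command cycles from the verification steps, as argv lists."""
    define_cycle = [["virsh", "undefine", domain], ["virsh", "define", xml_path]]
    start_cycle = [["virsh", "destroy", domain], ["virsh", "start", domain]]
    return define_cycle, start_cycle


def _loop(commands, deadline, runner):
    # Run the cycle at least once, then repeat until the deadline passes.
    # Individual commands are expected to fail while the two loops race
    # each other, so their exit codes are deliberately ignored.
    while True:
        for cmd in commands:
            runner(cmd)
        if time.monotonic() >= deadline:
            break


def run_stress(domain, xml_path, duration_s, runner=subprocess.call):
    """Run both cycles concurrently for roughly duration_s seconds."""
    deadline = time.monotonic() + duration_s
    threads = [threading.Thread(target=_loop, args=(cycle, deadline, runner))
               for cycle in command_pairs(domain, xml_path)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Calling `run_stress("vm1", "vm1.xml", 2 * 60 * 60)` would roughly mirror the two-hour run described in the steps; afterwards `virsh start vm1` and `virsh list` can be used as before to confirm the guest is still manageable.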