Bug 609575

Summary: KVM guests often fail to start up when the host is under load (monitor socket failed to show up)
Product: [Community] Virtualization Tools
Reporter: Guido Winkelmann <guido-rhbug>
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED DEFERRED
QA Contact:
Severity: medium
Docs Contact:
Priority: low
Version: unspecified
CC: berrange, crobinso, g.kisshope, xen-maint
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-21 00:13:47 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Guido Winkelmann 2010-06-30 16:07:40 UTC
Description of problem:

When the host system is under high CPU load from other KVM guest processes, libvirt sometimes fails to start new KVM guests. This happens even when most of the host's physical cores are still idle; a single KVM guest (with one core) using 100% CPU is enough to trigger the bug. It happens more often, though, when there are at least as many KVM guests using 100% CPU already running as the host has physical CPU cores.

When this bug occurs the error message usually looks something like this:

virsh # start testserver-a
error: Failed to start domain testserver-a
error: monitor socket did not show up.: No such file or directory

Version-Release number of selected component (if applicable):

qemu-kvm 0.12.4
libvirt 0.8.1
Also present in the latest libvirt git as of 2010-06-03.

How reproducible:

Happens most of the time, but not always.

Steps to Reproduce:
1. Start up a new KVM guest with libvirt
2. Start something in this KVM guest that uses 100% CPU (I use primes from djb's primegen package in an endless loop)
3. (Optionally repeat for as many physical cores as the host has)
4. Define and try to start a new KVM guest
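Assuming one load-generator guest per physical core, the steps above can be sketched as a shell loop. The guest names ("loadgen-N", "testserver-a") are hypothetical placeholders, and the virsh calls are left as comments so the sketch dry-runs safely:

```shell
#!/bin/sh
# Sketch of the reproduction steps; guest names are hypothetical.
NCORES=$(getconf _NPROCESSORS_ONLN)

# Steps 1-3: one CPU-burning guest per physical core.
for i in $(seq 1 "$NCORES"); do
    echo "start loadgen-$i"   # real call: virsh start "loadgen-$i"
    # inside each guest, run e.g.:
    #   while true; do primes 1 1000000000 > /dev/null; done
done

# Step 4: define and start one more guest; with the host under load,
# this is where "monitor socket did not show up" appears.
echo "start testserver-a"     # real calls: virsh define ... && virsh start testserver-a
```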
  
Actual results:

virsh # start testserver-a
error: Failed to start domain testserver-a
error: monitor socket did not show up.: No such file or directory

Expected results:

The new guest machine should start up without errors.

Additional info:

The host system is running Fedora 12 with kernel 2.6.32.12-115.fc12.x86_64.

The last lines of /usr/local/var/log/libvirt/qemu/testserver-a.log look like this:

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.12 -enable-kvm -m 256 -smp 1,sockets=1,cores=1,threads=1 -name testserver-a -uuid 10cbe19a-ebb8-8114-b929-abb50da85b4a -nodefaults -chardev socket,id=monitor,path=/usr/local/var/lib/libvirt/qemu/testserver-a.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -no-acpi -boot c -device lsi,id=scsi0,bus=pci.0,addr=0x7 -drive file=/data/testserver-a-system.img,if=none,id=drive-scsi0-0-1,boot=on -device scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1 -drive file=/data/testserver-a-data1.img,if=none,id=drive-virtio-disk1 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/data/testserver-a-data2.img,if=none,id=drive-virtio-disk2 -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk2,id=virtio-disk2 -drive file=/data/gentoo-install-amd64-minimal-20100408.iso,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=/data/testserver-a_configfloppy.img,if=none,id=drive-fdc0-0-0 -global isa-fdc.driveA=drive-fdc0-0-0 -device e1000,vlan=0,id=net0,mac=52:54:00:84:6d:69,bus=pci.0,addr=0x6 -net tap,fd=17,vlan=0,name=hostnet0 -usb -vnc 127.0.0.1:1,password -k de -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 
17:31:17.657: debug : virCgroupNew:542 : New group /libvirt/qemu/testserver-a
17:31:17.658: debug : virCgroupDetect:232 : Detected mount/mapping 0:cpu at /mnt/cgroups/cpu in /sysdefault
17:31:17.658: debug : virCgroupDetect:232 : Detected mount/mapping 1:cpuacct at /mnt/cgroups/cpuacct in /sysdefault
17:31:17.658: debug : virCgroupDetect:232 : Detected mount/mapping 3:memory at /mnt/cgroups/memory in /sysdefault
17:31:17.658: debug : virCgroupDetect:232 : Detected mount/mapping 4:devices at /mnt/cgroups/devices in /sysdefault
17:31:17.658: debug : virCgroupMakeGroup:484 : Make group /libvirt/qemu/testserver-a
17:31:17.658: debug : virCgroupMakeGroup:496 : Make controller /mnt/cgroups/cpu/sysdefault/libvirt/qemu/testserver-a/
17:31:17.658: debug : virCgroupMakeGroup:496 : Make controller /mnt/cgroups/cpuacct/sysdefault/libvirt/qemu/testserver-a/
17:31:17.658: debug : virCgroupMakeGroup:496 : Make controller /mnt/cgroups/memory/sysdefault/libvirt/qemu/testserver-a/
17:31:17.658: debug : virCgroupMakeGroup:496 : Make controller /mnt/cgroups/devices/sysdefault/libvirt/qemu/testserver-a/
17:31:17.658: debug : virCgroupSetValueStr:277 : Set value '/mnt/cgroups/cpu/sysdefault/libvirt/qemu/testserver-a/tasks' to '15200'
17:31:17.894: debug : virCgroupSetValueStr:277 : Set value '/mnt/cgroups/cpuacct/sysdefault/libvirt/qemu/testserver-a/tasks' to '15200'
17:31:18.648: debug : virCgroupSetValueStr:277 : Set value '/mnt/cgroups/memory/sysdefault/libvirt/qemu/testserver-a/tasks' to '15200'
17:31:20.648: debug : virCgroupSetValueStr:277 : Set value '/mnt/cgroups/devices/sysdefault/libvirt/qemu/testserver-a/tasks' to '15200'

Comment 1 Guido Winkelmann 2010-07-15 16:55:12 UTC
I have been experimenting with starting the VM manually using that command line, and I have found that if I leave out the -nodefaults parameter, the VM starts up reliably again.
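For anyone repeating this experiment: take the qemu-kvm invocation from the libvirt log and strip -nodefaults before rerunning it by hand. A minimal sketch, using a shortened hypothetical stand-in for the full command line logged above:

```shell
#!/bin/sh
# Hypothetical, abbreviated stand-in for the logged qemu-kvm command line.
LOGGED='/usr/bin/qemu-kvm -S -M pc-0.12 -enable-kvm -m 256 -nodefaults -name testserver-a'

# Strip -nodefaults to get the variant that starts up reliably.
MANUAL=$(printf '%s' "$LOGGED" | sed 's/ -nodefaults//')
echo "$MANUAL"
# prints: /usr/bin/qemu-kvm -S -M pc-0.12 -enable-kvm -m 256 -name testserver-a
# To actually test, run $MANUAL as root with the guest's disks in place.
```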

Comment 2 Cole Robinson 2016-03-21 00:13:47 UTC
Sorry this never received a response. Libvirt and QEMU have changed so much since this bug was filed that even if the issue still exists, it is likely of a totally different origin.

Closing as DEFERRED. If anyone can still reproduce this with a recent Fedora, I recommend filing a new bug.