Bug 1172522
| Summary: | Failed to start domains discontinuously during starting 514 guests continuously | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Hu Jianwei <jiahu> |
| Component: | libvirt | Assignee: | Libvirt Maintainers <libvirt-maint> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.1 | CC: | dyuan, honzhang, jdenemar, mzhan, rbalakri |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-12-10 10:26:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Well, you set max_processes to 1000 and then try to start 514 domains with 4 VCPUs each, which means that 2056 (514 * 4) CPU threads will need to be created. It's logical QEMU can't create threads and dies. And I don't even count non-CPU threads QEMU also needs to create. Each QEMU process will have at least one but possibly more threads in addition to CPU threads. Yes, when enlarging the max_processes in qemu.conf, the "Resource temporarily unavailable" is disappeared, but hit 3 times another error: [root@dhcp-66-83-39 ~]# cat /etc/libvirt/qemu.conf | grep ^[^#] max_processes = 1000000 max_files = 1000000 ... Domain guest497 started Domain guest498 started Domain guest499 started error: Failed to start domain guest500 error: internal error: early end of file from monitor: possible problem: ((null):25194): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied Domain guest501 started Domain guest502 started ... Generally, User don't know what resource temporarily unavailable from the error message, why not let libvirt or qemu-kvm report a clear error message? Thanks. Also, during doing 514 guests migration, I hit 14 times same issue in comment 3. [root@ibm-x3850x5-02 scalability]# rpm -q libvirt qemu-kvm-rhev libvirt-1.2.8-10.el7.x86_64 qemu-kvm-rhev-2.1.2-16.el7.x86_64 [root@ibm-x3850x5-02 scalability]# cat mig.sh #! /bin/sh - for i in {1..514} do echo "---------guest$i-------" virsh migrate --live --undefinesource guest$i qemu+tls://10.66.83.39/system --verbose done [root@ibm-x3850x5-02 scalability]# nohup sh mig.sh & After finishing the scripts, check the output. [root@ibm-x3850x5-02 scalability]# cat nohup.out | grep error -b2 Migration: [100 %] 48159----------guest207------- 48184:error: internal error: early end of file from monitor: possible problem: 48257-((null):24633): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 48362- -- Migration: [100 %] 48677----------guest212------- 48702:error: internal error: early end of file from monitor: possible problem: 48775-((null):24711): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 48880- -- Migration: [100 %] 49233----------guest217------- 49258:error: internal error: early end of file from monitor: possible problem: 49331-((null):24776): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 49436- -- Migration: [100 %] 58016----------guest279------- 58041:error: internal error: early end of file from monitor: possible problem: 58114-((null):25805): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 58219- -- Migration: [100 %] 64553----------guest316------- 64578:error: internal error: early end of file from monitor: possible problem: 64651-((null):26682): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 64756- -- Migration: [100 %] 65541----------guest323------- 65566:error: internal error: early end of file from monitor: possible problem: 65639-((null):26849): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 65744- -- Migration: [100 %] 66655----------guest329------- 66680:error: internal error: early end of file from monitor: possible problem: 66753-((null):27036): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 66858- -- Migration: [100 %] 67019----------guest331------- 67044:error: internal error: early end of file from monitor: possible problem: 67117-((null):27094): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 67222- -- Migration: [100 %] 67383----------guest333------- 67408:error: internal error: early end of file from monitor: possible problem: 67481-((null):27152): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 67586- -- Migration: [100 %] 68269----------guest339------- 68294:error: internal error: early end of file from monitor: possible problem: 68367-((null):27285): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 68472- -- Migration: [100 %] 73285----------guest365------- 73310:error: internal error: early end of file from monitor: possible problem: 73383-((null):28066): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 73488- -- Migration: [100 %] 77798----------guest387------- 77823:error: internal error: early end of file from monitor: possible problem: 77896-((null):28939): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 78001- -- Migration: [100 %] 84457----------guest417------- 84482:error: internal error: early end of file from monitor: possible problem: 84555-((null):30161): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 84660- -- Migration: [100 %] 89573----------guest441------- 89598:error: internal error: early end of file from monitor: possible problem: 89671-((null):31366): Spice-ERROR **: reds.c:3975:do_spice_init: statistics shm_open failed, Permission denied 89776- [root@ibm-x3850x5-02 scalability]# The "resource temporarily unavailable" is a system error message which libvirt or qemu gets when they try to create a new thread or process, we can't get anything better than this. And both libvirt and qemu reports when they got the error: "cannot execute binary /usr/libexec/qemu-kvm" or "qemu_thread_create" so I think it's pretty clear. I'm not sure what the spice error is about but is it a different issue. It comes from a spice server on the destination host. Perhaps log files there could give us more info about the error... OK, I'll try to capture the related logs, if possible, a separate bug will be reported. Thanks. |
Description Failed to start domains discontinuously during starting 514 guests continuously Version: libvirt-1.2.8-10.el7.x86_64 qemu-kvm-rhev-2.1.2-16.el7.x86_64 spice-server-0.12.4-8.el7.x86_64 selinux-policy-3.13.1-9.el7.noarch How reproducible: 100% Steps to Reproduce: [root@ibm-x3850x5-02 scalability]# getenforce Enforcing [root@ibm-x3850x5-02 scalability]# cat /etc/systemd/system/multi-user.target.wants/libvirtd.service| grep Limit LimitNOFILE=2048000 [root@dhcp-66-83-39 433894]# cat /etc/libvirt/libvirtd.conf | grep ^[^#] listen_tls = 1 auth_tls = "none" max_anonymous_clients = 20000 max_workers = 20000 max_requests = 20000 max_client_requests = 5000 log_level = 1 log_filters="1:qemu_monitor_json" log_outputs="1:file:/var/log/libvirt/libvirtd.log" [root@dhcp-66-83-39 433894]# cat /etc/libvirt/qemu.conf | grep ^[^#] max_processes = 1000 max_files = 10000 [root@dhcp-66-83-39 433894]# ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1934080 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1000000 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 1934080 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited [root@dhcp-66-83-39 433894]# [root@dhcp-66-83-39 433894]# for i in {1..514};do virsh start guest$i;sleep 3;done ...(clipped) Domain guest263 started Domain guest264 started error: Failed to start domain guest265 error: internal error: early end of file from monitor: possible problem: ((null):72465): Spice-ERROR **: red_dispatcher.c:1145:red_dispatcher_init: create thread failed 11 error: Failed to start domain guest266 error: internal error: early end of file from monitor: possible problem: qemu: qemu_thread_create: Resource temporarily unavailable error: Failed to start domain guest267 error: internal error: early end of file from monitor: possible problem: qemu: qemu_thread_create: Resource temporarily unavailable Domain guest268 started ...(clipped) error: Failed to start domain guest271 error: internal error: early end of file from monitor: possible problem: qemu: qemu_thread_create: Resource temporarily unavailable error: Failed to start domain guest272 error: internal error: early end of file from monitor: possible problem: ((null):72589): Spice-ERROR **: red_dispatcher.c:1145:red_dispatcher_init: create thread failed 11 Domain guest273 started Domain guest274 started error: Failed to start domain guest275 error: internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/libexec/qemu-kvm: Resource temporarily unavailable error: Failed to start domain guest276 error: internal error: early end of file from monitor: possible problem: qemu: qemu_thread_create: Resource temporarily unavailable error: Failed to start domain guest277 error: internal error: early end of file from monitor: possible problem: qemu: qemu_thread_create: Resource temporarily unavailable ...(clipped) Actual results: Met two errors: a. error: internal error: early end of file from monitor: possible problem: ((null):72746): Spice-ERROR **: red_dispatcher.c:1145:red_dispatcher_init: create thread failed 11 b. error: internal error: early end of file from monitor: possible problem: qemu: qemu_thread_create: Resource temporarily unavailable When enlarge the sleep time to 60 second, the issue still can reproduce. Disabled selinux(Permissive), the issue still can reproduce too. ++++++++++++++++++++++++++++++++++++++++++++++ Some logs from libvirtd.log: 2014-12-10 09:51:25.294+0000: 8568: error : qemuMonitorOpenUnix:309 : failed to connect to monitor socket: No such process 2014-12-10 09:51:25.294+0000: 8567: debug : virEventPollMakePollFDs:402 : Prepare n=268 w=269, f=281 e=25 d=0 2014-12-10 09:51:25.294+0000: 8568: debug : virFileClose:99 : Closed fd 359 ... 2014-12-10 09:51:25.297+0000: 8567: debug : virEventPollMakePollFDs:402 : Prepare n=150 w=151, f=162 e=25 d=0 2014-12-10 09:51:25.297+0000: 8568: error : qemuProcessWaitForMonitor:2205 : internal error: process exited while connecting to monitor: qemu: qemu_thread_create: Resource temporarily unavailable 2014-12-10 09:51:25.297+0000: 8567: debug : virEventPollMakePollFDs:402 : Prepare n=151 w=152, f=163 e=25 d=0 2014-12-10 09:51:25.297+0000: 8567: debug : virEventPollMakePollFDs:402 : Prepare n=152 w=153, f=164 e=25 d=0 2014-12-10 09:51:25.297+0000: 8567: debug : virEventPollMakePollFDs:402 : Prepare n=153 w=154, f=166 e=25 d=0 2014-12-10 09:51:25.297+0000: 8568: debug : virFileClose:99 : Closed fd 358 ++++++++++++++++++++++++++++++++++++++++++++++++ Expected results: Fix it Additional info: domain xml example: [root@dhcp-66-83-39 433894]# virsh dumpxml guest400 <domain type='kvm'> <name>guest400</name> <uuid>7a7df8f0-dedd-468d-be1d-19951a03f6c8</uuid> <memory unit='KiB'>524288</memory> <currentMemory unit='KiB'>524288</currentMemory> <vcpu placement='static'>4</vcpu> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> <pae/> </features> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none'/> <source file='/home/images/guest400.img'/> <target dev='hda' bus='ide'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <controller type='pci' index='0' model='pci-root'/> <controller type='virtio-serial' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </controller> <controller type='usb' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='ide' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller> <serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <input type='tablet' bus='usb'/> <input type='mouse' bus='ps2'/> <input type='keyboard' bus='ps2'/> <graphics type='spice' autoport='yes'/> <video> <model type='qxl' ram='65536' vram='65536' vgamem='8192' heads='1'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='none'/> </devices> </domain>