Bug 854552
Summary: Host machine gets stuck while rebooting after libvirt failed killing a qemu process
Product: Red Hat Enterprise Linux 6
Component: kernel
Kernel sub component: KVM
Version: 6.4
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: Luwen Su <lsu>
Assignee: Lauro Ramos Venancio <lvenanci>
QA Contact: Virtualization Bugs <virt-bugs>
CC: chayang, dyuan, itxx00, juzhang, mkletzan, mzhan, ppyy, rbalakri, virt-maint, wgomerin
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Regression: ---
Last Closed: 2016-08-11 19:29:55 UTC
Bug Blocks: 1270638, 1359574, 1366045
Hello, could you try to reproduce this with the following scenarios, please? Logs from the daemon and from the machine during these tries would be helpful as well.

1) Test with bigger numbers (bigger limits) and see if the machine gets killed:

# for i in {1..1000}; do
    echo $i
    virsh memtune test-1 --hard-limit 100000000 --soft-limit 100000000 --swap-hard-limit 100000021 --live
    virsh memtune test-1 --hard-limit 100000000 --soft-limit 100000000 --swap-hard-limit 100000092 --live
    virsh memtune test-1 --hard-limit 100000000 --soft-limit 100000000 --swap-hard-limit 100000021 --live
  done

2) Test on the host, bypassing virsh (modify according to your machine), optionally printing the current usage:

# cd /sys/fs/cgroup/memory/libvirt/qemu/test-1/
# for i in {1..1000}; do
    echo $i
    cat memory.usage_in_bytes memory.memsw.usage_in_bytes
    echo 100000 >memory.limit_in_bytes
    echo 100000 >memory.soft_limit_in_bytes
    echo 100021 >memory.memsw.limit_in_bytes
    echo 100000 >memory.limit_in_bytes
    echo 100000 >memory.soft_limit_in_bytes
    echo 100092 >memory.memsw.limit_in_bytes
    echo 100000 >memory.limit_in_bytes
    echo 100000 >memory.soft_limit_in_bytes
    echo 100021 >memory.memsw.limit_in_bytes
  done

Please post the last usage, and check whether the second machine has a problem starting and whether the reboot problem persists.

(In reply to comment #2)

With both your methods and mine, the logs below appear every time after the domain is destroyed automatically:

1. libvirtd.log: error : qemuMonitorIO:602 : internal error End of file from monitor
2. The kernel log is in the attachment.

Here is a debug log. I am not sure it covers all the error details, because the original is too long, so I only pasted part of it here; see libvirtd.log in the attachment. The libvirtd hang and the failed host reboot cannot be reproduced 100% of the time, but they occurred at least once.

For your method 1: 100000000 with 1000 iterations is OK; no errors occurred.
But I found that for a guest with 1G of memory, if I run

# for i in {1..1000}; do
    echo $i
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100021 --live
    ....snip....
  done

I get:

error: Unable to change memory parameters
error: unable to set swap_hard_limit tunable: Device or resource busy

But when I change swap-hard-limit to 600021, it works fine. For a 2G guest, setting 400021 is also fine.

# virsh memtune test-1
hard_limit     : 600000
soft_limit     : 600000
swap_hard_limit: 600024

For your method 2: the numbers are not the same every time, but the procedure and the error messages are.

337 962560 393326592
338 966656 393322496
339 1003520 393322496
340 1003520 393322496
341 397312 202153984
342 0 0
343 0 0
....
380
381
382
383

The error messages before iteration 341:

-bash: echo: write error: Device or resource busy
-bash: echo: write error: Invalid argument
-bash: echo: write error: Device or resource busy
-bash: echo: write error: Invalid argument
-bash: echo: write error: Device or resource busy
-bash: echo: write error: Invalid argument

Iterations 342-380 print "0 0". The others print:

cat: memory.usage_in_bytes: No such file or directory
cat: memory.memsw.usage_in_bytes: No such file or directory
-bash: memory.limit_in_bytes: No such file or directory
-bash: memory.soft_limit_in_bytes: No such file or directory
-bash: memory.memsw.limit_in_bytes: No such file or directory
-bash: memory.limit_in_bytes: No such file or directory
-bash: memory.soft_limit_in_bytes: No such file or directory
-bash: memory.memsw.limit_in_bytes: No such file or directory
-bash: memory.limit_in_bytes: No such file or directory
-bash: memory.soft_limit_in_bytes: No such file or directory
-bash: memory.memsw.limit_in_bytes: No such file or directory

But if I add a sleep 5, memory.usage_in_bytes does not shrink as quickly as above, and everything works fine. So I suspect the memory cannot be freed quickly enough, which leads to the memory running out.
I tested other numbers, big or not, like 100000000 or just 200000, and got the same result whenever the sleep was not added.

Created attachment 610684 [details]
libvirtd-debug
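The throttled variant of method 2 described above can be sketched as follows. This is a minimal sketch, not the exact reproducer: it factors the three limit writes into a helper and, so it can run without root, demonstrates the write pattern against a scratch directory created with mktemp. To use it for real you would point CGROUP_DIR at the actual cgroup, e.g. /sys/fs/cgroup/memory/libvirt/qemu/test-1; the helper name set_limits is made up here.

```shell
# Stand-in for the real memory cgroup directory (cgroup v1 layout as on
# RHEL 6); replace with the real path to exercise the actual kernel bug.
CGROUP_DIR=$(mktemp -d)
touch "$CGROUP_DIR"/memory.limit_in_bytes \
      "$CGROUP_DIR"/memory.soft_limit_in_bytes \
      "$CGROUP_DIR"/memory.memsw.limit_in_bytes

# set_limits <limit_kB> <memsw_limit_kB>: the same three writes the
# reproducer loop performs per step.
set_limits() {
    echo "$1" > "$CGROUP_DIR/memory.limit_in_bytes"
    echo "$1" > "$CGROUP_DIR/memory.soft_limit_in_bytes"
    echo "$2" > "$CGROUP_DIR/memory.memsw.limit_in_bytes"
}

for i in 1 2 3; do               # the report used {1..1000}
    set_limits 100000 100021
    set_limits 100000 100092
    set_limits 100000 100021
    sleep 1                      # the pause that avoided the usage collapse
done
cat "$CGROUP_DIR/memory.memsw.limit_in_bytes"
```

The only behavioral difference from the failing loop is the sleep between iterations, which matches the observation that giving the cgroup time to settle avoids the "write error: Device or resource busy" failures and the usage dropping to zero.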
Thanks for the help with the investigation, but I must disappoint you. The outputs are more or less what I thought was happening. The errors libvirt shows are the same ones the kernel shows, except for "End of file from monitor" and the ones that follow it, which are all caused by the same thing: the qemu process gets killed. That makes sense as well, because (as can also be seen from the kernel logs, I just missed them at first) the limit for the machine is too low for how much memory it wants to use. This is the default behavior of the kernel. It is also confirmed by the first method (raising the limit does not make the machine crash). The "No such file or directory" errors are expected as well, because when the machine crashes, libvirt clears the cgroups it created earlier.

If there is any other problem you would like taken care of (that I maybe missed), please let me know. Otherwise, for the reboot problem, I would change the component to kernel, as this has no connection to libvirt.

Hi, no more errors were found with gdb, and below is a way to reproduce the reboot problem, to help others with further research and debugging. I agree with your suggestion, and thank you for your help; it seems all the libvirt issues came from the kernel problem.

Here is a method to reproduce the reboot issue.
With a running guest, open three terminals.

The first executes:

# gdb libvirtd

The second executes:

# for i in {1..1000}; do virsh start test-1; sleep 50; virsh destroy test-1; done

The third executes:

# for i in {1..20000}; do
    echo $i
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100021 --live
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100092 --live
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100021 --live
  done

All the errors are expected except, in gdb:

warning : qemuProcessKill:3966 : Timed out waiting after SIGTERM to process 5133, sending SIGKILL
warning : qemuProcessKill:3998 : Timed out waiting after SIGKILL to process 5133

After that, just reboot the host. For the log, please refer to kernel-log in the attachment.

OK then, thanks for researching this. This time I was able to get to "Timed out waiting after SIGTERM to process 25512, sending SIGKILL", but the SIGKILL timeout still does not appear. I am wondering where that comes from, but in any case it would be valuable to know what the process (pid 5133 in your case) is doing; at least ps -f $pid. I am transferring this to kernel, as libvirt is doing everything properly to my knowledge.

As I see this BZ is still open, let me try to clean up the information here a little bit.

The host machine is unable to reboot (shut down?) after some manipulation. As the qemu process is still running after libvirt tried to kill it, I would guess it is because of that process that the machine gets stuck.

Could you try attaching to the qemu process with gdb and running "t a a bt full" in gdb after the "Timed out waiting after SIGKILL to process" message appears from libvirt? There may be a thread in D state or something similar, which would explain this.

Make sure you have all the required debuginfos installed.
> > Host machine is unable to reboot (shutdown?) after "some manipulation". As
> > qemu process is still running after libvirt tried to kill it, I'd guess it
> > is because of that process that the machine gets stuck.
> >
> > Could you try attaching to the qemu process with gdb and run a "t a a bt
> > full" in the gdb after the "Timed out waiting after SIGKILL to process"
> > message appears from libvirt? There may be a thread in D state or something
> > similar, which would explain this.
> >
> > Make sure you've got all required debuginfos installed.

Hi Martin,

I am reproducing it with the snap4 packages. I have not reproduced the issue from comment 6 yet through those steps. So far I have hit three kinds of errors, and I will keep testing:

1. Device or resource busy
2. Guest shuts down automatically
3. Invalid parameters

As for gdb `pidof qemu-kvm`: because the steps I use require destroying the guest, that is not easy to do. Anyway, I am focusing on this, so I will keep the needinfo tag for now and will comment here once I get more results.

Hi Martin,

I caught this issue with libvirt -18:

# rpm -q libvirt qemu-kvm-rhev kernel
libvirt-0.10.2-18.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.355.el6.x86_64
kernel-2.6.32-356.el6.x86_64

I found this one is similar to Bug 891653 - Cgroups memory limit are causing the virt to be terminated unexpectedly. What do you think about it? Is it a duplicate?
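The D-state check suggested in comment 8 can also be done from /proc before attaching gdb. A minimal sketch, assuming a Linux host with the standard procfs layout; the helper name proc_state is made up here. A thread stuck in uninterruptible sleep (state D) cannot be reaped even by SIGKILL, which is exactly what would keep the qemu-kvm process alive and hang the reboot at "Turning off swap".

```shell
# Print the one-letter scheduler state of a process from
# /proc/<pid>/status: R = running, S = interruptible sleep,
# D = uninterruptible sleep (survives SIGKILL until the kernel unblocks it).
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Where the process is blocked in the kernel, if anywhere
# (useful alongside the gdb "t a a bt full" output).
proc_wchan() {
    cat "/proc/$1/wchan"; echo
}

# Demo on the current shell itself; on the stuck host you would use the
# qemu-kvm pid from the libvirt warning instead, e.g. proc_state 5133.
proc_state $$
```

If proc_state reports D for the unkillable qemu-kvm pid, the backtrace of that thread (from gdb or /proc/<pid>/stack as root) identifies the kernel path it is stuck in.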
Same steps as comment 6.

libvirtd.log:

2013-01-29 08:58:48.444+0000: 9474: error : qemuDomainSetMemoryParameters:7452 : unable to set memory hard_limit tunable: Device or resource busy
2013-01-29 08:58:48.444+0000: 9474: error : qemuDomainSetMemoryParameters:7478 : unable to set swap_hard_limit tunable: Invalid argument
2013-01-29 08:58:49.369+0000: 9477: warning : virCgroupMoveTask:885 : no vm cgroup in controller 3
2013-01-29 08:58:49.369+0000: 9477: warning : virCgroupMoveTask:885 : no vm cgroup in controller 4
2013-01-29 08:58:49.369+0000: 9477: warning : virCgroupMoveTask:885 : no vm cgroup in controller 6
2013-01-29 08:59:09.092+0000: 9473: error : qemuMonitorIO:613 : internal error End of file from monitor

message log:

8 localhost kernel: qemu-kvm invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
Jan 29 17:43:08 localhost kernel: qemu-kvm cpuset=vcpu0 mems_allowed=0
Jan 29 17:43:08 localhost kernel: Pid: 16047, comm: qemu-kvm Not tainted 2.6.32-356.el6.x86_64 #1
Jan 29 17:43:08 localhost kernel: Call Trace:
Jan 29 17:43:08 localhost kernel: [<ffffffff810cb5d1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Jan 29 17:43:08 localhost kernel: [<ffffffff8111cd10>] ? dump_header+0x90/0x1b0
Jan 29 17:43:08 localhost kernel: [<ffffffff81172211>] ? task_in_mem_cgroup+0xe1/0x120
Jan 29 17:43:08 localhost kernel: [<ffffffff8111d192>] ? oom_kill_process+0x82/0x2a0
Jan 29 17:43:08 localhost kernel: [<ffffffff8111d08e>] ? select_bad_process+0x9e/0x120
Jan 29 17:43:08 localhost kernel: [<ffffffff8111d912>] ? mem_cgroup_out_of_memory+0x92/0xb0
Jan 29 17:43:08 localhost kernel: [<ffffffff81173454>] ? mem_cgroup_handle_oom+0x274/0x2a0
Jan 29 17:43:08 localhost kernel: [<ffffffff81170e90>] ? memcg_oom_wake_function+0x0/0xa0
Jan 29 17:43:08 localhost kernel: [<ffffffff81173a39>] ? __mem_cgroup_try_charge+0x5b9/0x5d0
Jan 29 17:43:08 localhost kernel: [<ffffffff81174db7>] ? mem_cgroup_charge_common+0x87/0xd0
Jan 29 17:43:08 localhost kernel: [<ffffffff81174e48>] ? mem_cgroup_newpage_charge+0x48/0x50
Jan 29 17:43:08 localhost kernel: [<ffffffff81143d2c>] ? handle_pte_fault+0x79c/0xb50
Jan 29 17:43:08 localhost kernel: [<ffffffff8104baa7>] ? pte_alloc_one+0x37/0x50
Jan 29 17:43:08 localhost kernel: [<ffffffff8117b469>] ? do_huge_pmd_anonymous_page+0xb9/0x380
Jan 29 17:43:08 localhost kernel: [<ffffffff8114431a>] ? handle_mm_fault+0x23a/0x310
Jan 29 17:43:08 localhost kernel: [<ffffffff8114451a>] ? __get_user_pages+0x12a/0x430
Jan 29 17:43:08 localhost kernel: [<ffffffff811448b9>] ? get_user_pages+0x49/0x50
Jan 29 17:43:08 localhost kernel: [<ffffffff8104c307>] ? get_user_pages_fast+0x157/0x1c0
Jan 29 17:43:08 localhost kernel: [<ffffffffa0387343>] ? hva_to_pfn+0x33/0x1a0 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffff8150f276>] ? down_read+0x16/0x30
Jan 29 17:43:08 localhost kernel: [<ffffffffa03a29cb>] ? mapping_level+0x17b/0x1d0 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffffa03a74bd>] ? paging64_page_fault+0xbd/0x4b0 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffffa03a5da8>] ? paging64_gva_to_gpa+0x48/0x90 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffffa03988d1>] ? emulator_read_emulated+0x101/0x240 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffffa03a3f08>] ? kvm_mmu_page_fault+0x28/0xc0 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffffa03ee558>] ? handle_exception+0x2c8/0x390 [kvm_intel]
Jan 29 17:43:08 localhost kernel: [<ffffffff814e7644>] ? wireless_nlevent_process+0x24/0x80
Jan 29 17:43:08 localhost kernel: [<ffffffff814e7644>] ? wireless_nlevent_process+0x24/0x80
Jan 29 17:43:08 localhost kernel: [<ffffffffa03edef3>] ? vmx_handle_exit+0xc3/0x280 [kvm_intel]
Jan 29 17:43:08 localhost kernel: [<ffffffffa039cfb6>] ? kvm_arch_vcpu_ioctl_run+0x486/0x1040 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffffa0385ff4>] ? kvm_vcpu_ioctl+0x434/0x580 [kvm]
Jan 29 17:43:08 localhost kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80
Jan 29 17:43:08 localhost kernel: [<ffffffff81194eb2>] ? vfs_ioctl+0x22/0xa0
Jan 29 17:43:08 localhost kernel: [<ffffffff8119537a>] ? do_vfs_ioctl+0x3aa/0x580
Jan 29 17:43:08 localhost kernel: [<ffffffff811955d1>] ? sys_ioctl+0x81/0xa0
Jan 29 17:43:08 localhost kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Jan 29 17:43:08 localhost kernel: Task in /libvirt/qemu/test-1 killed as a result of limit of /libvirt/qemu/test-1
Jan 29 17:43:08 localhost kernel: memory: usage 100000kB, limit 100000kB, failcnt 50
Jan 29 17:43:08 localhost kernel: memory+swap: usage 100000kB, limit 100092kB, failcnt 0
Jan 29 17:43:08 localhost kernel: Mem-Info:
Jan 29 17:43:08 localhost kernel: Node 0 DMA per-cpu:
Jan 29 17:43:08 localhost kernel: CPU 0: hi: 0, btch: 1 usd: 0
Jan 29 17:43:08 localhost kernel: CPU 1: hi: 0, btch: 1 usd: 0
Jan 29 17:43:08 localhost kernel: CPU 2: hi: 0, btch: 1 usd: 0
Jan 29 17:43:08 localhost kernel: CPU 3: hi: 0, btch: 1 usd: 0
Jan 29 17:43:08 localhost kernel: Node 0 DMA32 per-cpu:
Jan 29 17:43:08 localhost kernel: CPU 0: hi: 186, btch: 31 usd: 169
Jan 29 17:43:08 localhost kernel: CPU 1: hi: 186, btch: 31 usd: 171
Jan 29 17:43:08 localhost kernel: CPU 2: hi: 186, btch: 31 usd: 167
Jan 29 17:43:08 localhost kernel: CPU 3: hi: 186, btch: 31 usd: 151
Jan 29 17:43:08 localhost kernel: Node 0 Normal per-cpu:
Jan 29 17:43:08 localhost kernel: CPU 0: hi: 186, btch: 31 usd: 51
Jan 29 17:43:08 localhost kernel: CPU 1: hi: 186, btch: 31 usd: 156
Jan 29 17:43:08 localhost kernel: CPU 2: hi: 186, btch: 31 usd: 41
Jan 29 17:43:08 localhost kernel: CPU 3: hi: 186, btch: 31 usd: 21
Jan 29 17:43:08 localhost kernel: active_anon:48476 inactive_anon:5 isolated_anon:0
Jan 29 17:43:08 localhost kernel: active_file:52647 inactive_file:1571866 isolated_file:0
Jan 29 17:43:08 localhost kernel: unevictable:4722 dirty:6 writeback:0 unstable:0
Jan 29 17:43:08 localhost kernel: free:103124 slab_reclaimable:59743 slab_unreclaimable:16092
Jan 29 17:43:08 localhost kernel: mapped:7192 shmem:71 pagetables:1669 bounce:0
Jan 29 17:43:08 localhost kernel: Node 0 DMA free:15720kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15320kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 29 17:43:08 localhost kernel: lowmem_reserve[]: 0 3510 7519 7519
Jan 29 17:43:08 localhost kernel: Node 0 DMA32 free:167996kB min:31492kB low:39364kB high:47236kB active_anon:1196kB inactive_anon:4kB active_file:5128kB inactive_file:3049956kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3595040kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:4kB slab_reclaimable:104224kB slab_unreclaimable:476kB kernel_stack:16kB pagetables:268kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 29 17:43:08 localhost kernel: lowmem_reserve[]: 0 0 4008 4008
Jan 29 17:43:08 localhost kernel: Node 0 Normal free:228780kB min:35956kB low:44944kB high:53932kB active_anon:192708kB inactive_anon:16kB active_file:205460kB inactive_file:3237508kB unevictable:18888kB isolated(anon):0kB isolated(file):0kB present:4104640kB mlocked:6628kB dirty:24kB writeback:0kB mapped:28744kB shmem:280kB slab_reclaimable:134748kB slab_unreclaimable:63892kB kernel_stack:1976kB pagetables:6408kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 29 17:43:08 localhost kernel: lowmem_reserve[]: 0 0 0 0
Jan 29 17:43:08 localhost kernel: Node 0 DMA: 2*4kB 0*8kB 2*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15720kB
Jan 29 17:43:08 localhost kernel: Node 0 DMA32: 687*4kB 456*8kB 204*16kB 112*32kB 60*64kB 27*128kB 14*256kB 15*512kB 9*1024kB 12*2048kB 25*4096kB = 167996kB
Jan 29 17:43:08 localhost kernel: Node 0 Normal: 220*4kB 756*8kB 649*16kB 482*32kB 347*64kB 173*128kB 58*256kB 43*512kB 38*1024kB 1*2048kB 18*4096kB = 228640kB
Jan 29 17:43:08 localhost kernel: 1625136 total pagecache pages
Jan 29 17:43:08 localhost kernel: 0 pages in swap cache
Jan 29 17:43:08 localhost kernel: Swap cache stats: add 0, delete 0, find 0/0
Jan 29 17:43:08 localhost kernel: Free swap = 0kB
Jan 29 17:43:08 localhost kernel: Total swap = 0kB
Jan 29 17:43:08 localhost kernel: 1957887 pages RAM
Jan 29 17:43:08 localhost kernel: 80558 pages reserved
Jan 29 17:43:08 localhost kernel: 1607282 pages shared
Jan 29 17:43:08 localhost kernel: 178657 pages non-shared
Jan 29 17:43:08 localhost kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Jan 29 17:43:08 localhost kernel: [16027] 107 16027 345941 26166 2 0 0 qemu-kvm
Jan 29 17:43:08 localhost kernel: Memory cgroup out of memory: Kill process 16027 (qemu-kvm) score 1000 or sacrifice child
Jan 29 17:43:08 localhost kernel: Killed process 16027, UID 107, (qemu-kvm) total-vm:1383764kB, anon-rss:99984kB, file-rss:4680kB
Jan 29 17:43:08 localhost kernel: virbr0: port 2(vnet0) entering disabled state
Jan 29 17:43:08 localhost kernel: device vnet0 left promiscuous mode
Jan 29 17:43:08 localhost kernel: virbr0: port 2(vnet0) entering disabled state

This is different, because in this case qemu is killed by the kernel as it should be (qemu is using too much memory, so it gets killed). Definitely not a dup.

I got this error too. The VM cannot start.
libvirt-0.10.2-18.el6.4.x86_64

libvirtd.log:

2013-05-15 09:45:22.062+0000: 32373: error : qemuMonitorIO:613 : internal error End of file from monitor
2013-05-15 09:45:22.064+0000: 32373: error : virNWFilterDHCPSnoopEnd:2131 : internal error ifname "vnet51" not in key map
2013-05-15 09:45:22.587+0000: 32373: error : virNWFilterDHCPSnoopEnd:2131 : internal error ifname "vnet52" not in key map
2013-05-15 09:45:22.631+0000: 32373: error : virNetDevGetIndex:653 : Unable to get index for interface vnet52: No such device

This issue is already fixed. I am able to reproduce it using kernels 2.6.32-296.el6 and 2.6.32-358.el6, but it is not possible with the latest kernel (2.6.32-653.el6).
Created attachment 609978 [details]
kernel-log

Description of problem:
After using memtune to set limits more than 200 times, the cgroup runs out of memory. This leads to the guest being destroyed automatically, libvirtd hanging, and the host failing to reboot.

Version-Release number of selected component (if applicable):
libvirt-0.10.1-1.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.307.el6.x86_64
kernel-2.6.32-296.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a running guest with no memtune settings applied yet:

# virsh memtune test-1
hard_limit     : 1283748
soft_limit     : unlimited
swap_hard_limit: unlimited

2. Run:

# for i in {1..1000}; do
    echo $i
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100021 --live
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100092 --live
    virsh memtune test-1 --hard-limit 100000 --soft-limit 100000 --swap-hard-limit 100021 --live
  done

3.
3.1 Sometimes there is an error: "unable to set swap_hard_limit tunable: Device or resource busy". Just destroy the guest, start it again, and repeat step 2.
3.2 The guest is destroyed automatically after more than 200 iterations.

libvirtd.log:
error : qemuMonitorIO:602 : internal error End of file from monitor
error : virNetSocketReadWire:1176 : Cannot recv data: Connection reset by peer
warning : qemuProcessKill:3966 : Timed out waiting after SIGTERM to process 5133, sending SIGKILL
warning : qemuProcessKill:3998 : Timed out waiting after SIGKILL to process 5133
warning : qemuDomainObjBeginJobInternal:838 : Cannot start job (destroy, none) for domain test-2; current job is (query, none) owned by (2207, 0)
error : qemuDomainObjBeginJobInternal:842 : Timed out during operation: cannot acquire state change lock

3.3 See the kernel log in the attachment.

4. Reboot the host:

# reboot

It hangs at the "Turning off swap" step.

Actual results:
The cgroup runs out of memory, which leads to the guest being destroyed and the reboot failing.

Expected results:
The cgroup should not run out of memory, or the fallout should be contained.

Additional info:
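For context on why the memcg OOM kill in step 3.2 is expected kernel behavior: virsh memtune takes its limits in kibibytes, so the loop in step 2 pins the cgroup at roughly 98 MiB while the guest had been running with a hard_limit of 1283748 kB (about 1.25 GiB) in step 1. A quick arithmetic sketch with the values taken from this report:

```shell
# memtune limits are in kB (kibibytes); convert the values from the
# reproducer to MiB to see how far below the guest's footprint the
# looped limit sits.
hard_limit_kb=100000        # value forced repeatedly by the loop in step 2
initial_limit_kb=1283748    # hard_limit reported by virsh memtune in step 1

echo "loop limit:    $((hard_limit_kb / 1024)) MiB"
echo "initial limit: $((initial_limit_kb / 1024)) MiB"
```

A roughly 97 MiB cap on a guest sized for over a gigabyte guarantees the memory cgroup hits its limit, which is why the oom-killer fires inside /libvirt/qemu/test-1; the separate kernel bug is that the host then cannot finish rebooting.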