Bug 1660451 - Executor queue can get full if vm.destroy takes some time to complete
Summary: Executor queue can get full if vm.destroy takes some time to complete
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.2.7
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.3.6
Target Release: 4.3.6
Assignee: Milan Zamazal
QA Contact: Guilherme Santos
URL:
Whiteboard:
Duplicates: 1660452
Depends On:
Blocks:
 
Reported: 2018-12-18 10:57 UTC by nijin ashok
Modified: 2019-10-10 15:37 UTC (History)
14 users

Fixed In Version: vdsm-4.30.29
Doc Type: Bug Fix
Doc Text:
Previously, some virtual machine operations unnecessarily blocked other VM operations. This led to some problems with monitoring while shutting down large virtual machines. The current release fixes these issues: It relaxes some conditions for blocking virtual machine operations and makes blocking safer. This should reduce monitoring problems experienced in some scenarios.
Clone Of:
Environment:
Last Closed: 2019-10-10 15:36:50 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3776561 0 None None None 2018-12-27 06:12:45 UTC
Red Hat Product Errata RHBA-2019:3009 0 None None None 2019-10-10 15:37:06 UTC
oVirt gerrit 102370 0 'None' MERGED virt: Split vmContainerLock 2021-01-25 12:18:09 UTC
oVirt gerrit 102371 0 'None' MERGED common: Protect access to clientIF.vmContainer values 2021-01-25 12:17:27 UTC
oVirt gerrit 102514 0 'None' MERGED virt: Split vmContainerLock 2021-01-25 12:17:27 UTC
oVirt gerrit 102515 0 'None' MERGED common: Protect access to clientIF.vmContainer values 2021-01-25 12:17:27 UTC

Description nijin ashok 2018-12-18 10:57:29 UTC
Description of problem:

In vm.destroy, we send libvirt.VIR_DOMAIN_DESTROY_GRACEFUL, which sends SIGTERM to the qemu-kvm process. This can take some time if there is a lot of pending I/O to flush for the VM. During this time, the QEMU monitor is not accessible for this domain, so the periodic tasks that access this VM's QEMU monitor get stuck. With a single VM whose vm.destroy runs for 30 seconds, the executor can reach 20+ worker threads, all of which serve the stuck tasks.
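For reference, a minimal sketch (not VDSM's actual code) of the kind of graceful destroy call meant here, using the libvirt Python bindings; the connection URI is illustrative and the UUID is the example one from the log below:

===
import libvirt

VM_UUID = "a6d98c62-fd66-44bc-aa2c-2b52321a9f00"  # example UUID from the log below

conn = libvirt.open("qemu:///system")
dom = conn.lookupByUUIDString(VM_UUID)

# VIR_DOMAIN_DESTROY_GRACEFUL sends only SIGTERM (no SIGKILL fallback), so
# this call can block until QEMU has flushed pending I/O and exited.
dom.destroyFlags(libvirt.VIR_DOMAIN_DESTROY_GRACEFUL)

conn.close()
===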

The issue was observed in a customer environment when the customer tried to shut down a VM while it was migrating.

===
2018-12-10 16:49:02,370+0900 INFO  (jsonrpc/6) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:,49558, flow_id=2d076393-b68f-4504-9309-9a8b6004e947, vmId=a6d98c62-fd66-44bc-aa2c-2b52321a9f00 (api:46)
2018-12-10 16:49:02,370+0900 INFO  (jsonrpc/6) [virt.vm] (vmId='a6d98c62-fd66-44bc-aa2c-2b52321a9f00') Release VM resources (vm:5215)
2018-12-10 16:49:02,370+0900 INFO  (jsonrpc/6) [virt.vm] (vmId='a6d98c62-fd66-44bc-aa2c-2b52321a9f00') Stopping connection (guestagent:438)
2018-12-10 16:49:02,371+0900 INFO  (jsonrpc/6) [virt.vm] (vmId='a6d98c62-fd66-44bc-aa2c-2b52321a9f00') _destroyVmGraceful attempt #0 (vm:5252)
===

Then the executor worker count starts increasing and reaches the limit of 30, with all workers serving the periodic tasks of this VM. Once it reaches 30, all the VMs go into "non-responding" status since there are no more worker threads available.

===
2018-12-10 16:49:04,271+0900 WARN  (vdsm.Scheduler) [Executor] executor state: count=6
2018-12-10 16:49:14,896+0900 WARN  (vdsm.Scheduler) [Executor] executor state: count=10
2018-12-10 16:49:39,338+0900 WARN  (vdsm.Scheduler) [Executor] executor state: count=17
2018-12-10 16:49:48,325+0900 WARN  (vdsm.Scheduler) [Executor] executor state: count=22
2018-12-10 16:49:50,863+0900 WARN  (vdsm.Scheduler) [Executor] executor state: count=27 
2018-12-10 16:49:58,838+0900 WARN  (vdsm.Scheduler) [Executor] executor state: count=30
2018-12-10 16:49:59,842+0900 WARN  (vdsm.Scheduler) [Executor] Too many workers (limit=30), not adding more (executor:210)

2018-12-10 16:49:52,362+0900 WARN  (jsonrpc/3) [virt.vm] (vmId='35f7d64b-071f-4432-97aa-3b10c048e6be') monitor became unresponsive (command timeout, age=60.9699999988) (vm:6151)
2018-12-10 16:49:52,364+0900 WARN  (jsonrpc/3) [virt.vm] (vmId='1e8d5f9c-77f2-4659-a64c-fbf857081a6d') monitor became unresponsive (command timeout, age=60.9699999988) (vm:6151)
===

Once the VM is down, all the blocked workers are released and the executor returns to its normal state.

===
2018-12-10 16:50:02,305+0900 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call VM.destroy succeeded in 59.93 seconds (__init__:573)
2018-12-10 16:50:02,307+0900 INFO  (periodic/19625) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,307+0900 INFO  (periodic/19623) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,321+0900 INFO  (periodic/19624) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,388+0900 INFO  (periodic/19609) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,390+0900 INFO  (periodic/19632) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,403+0900 INFO  (periodic/19636) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,403+0900 INFO  (periodic/19638) [Executor] Worker was discarded (executor:305)
2018-12-10 16:50:02,405+0900 INFO  (periodic/19639) [Executor] Worker was discarded (executor:305)
===

I was able to reproduce this issue by setting some (unusual) values for disk caching and limiting the I/O in storage, so that a large amount of dirty pages would have to be flushed while destroying the VM and vm.destroy would be delayed. The destroy took 30 seconds and the worker count went to 23 with only a single VM running on the host.

From my test environment:

===
2018-12-18 09:56:37,726+0530 INFO  (jsonrpc/5) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:10.65.177.57,49360, flow_id=7d78f081, vmId=0a58e3f8-aead-4b1d-9d83-5ed4c93619ee (api:46)
2018-12-18 09:56:45,495+0530 WARN  (vdsm.Scheduler) [Executor] executor state: count=8
2018-12-18 09:56:59,499+0530 WARN  (vdsm.Scheduler) [Executor] executor state: count=17
2018-12-18 09:57:03,499+0530 WARN  (vdsm.Scheduler) [Executor] executor state: count=21
2018-12-18 09:57:07,494+0530 WARN  (vdsm.Scheduler) [Executor] executor state: count=23

2018-12-18 09:57:07,756+0530 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call VM.destroy failed (error 42) in 30.03 seconds (__init__:573)
2018-12-18 09:57:07,756+0530 INFO  (qgapoller/8) [Executor] Worker was discarded (executor:305)
2018-12-18 09:57:07,757+0530 INFO  (periodic/32) [Executor] Worker was discarded (executor:305)
2018-12-18 09:57:07,759+0530 INFO  (periodic/43) [Executor] Worker was discarded (executor:305)
2018-12-18 09:57:07,780+0530 INFO  (periodic/40) [Executor] Worker was discarded (executor:305)
===


Version-Release number of selected component (if applicable):

vdsm-4.20.39.1-1.el7ev.x86_64


How reproducible:

100%

Steps to Reproduce:

1. In my VM with 3 GB of RAM, I set the values below in sysctl to build up large amounts of dirty pages.

vm.dirty_background_ratio = 60
vm.dirty_ratio = 80
vm.dirty_expire_centisecs = 36000

This was done just to get a large cache, since I don't have a VM with huge RAM to work with.

2. Migrate the VM and, during the migration, shut down the VM.

3. Check the vdsm log; the executor queue can be seen getting full.

Actual results:

The executor queue gets full if vm.destroy takes some time to complete.

Expected results:

I think periodic monitoring should be disabled while the VM is being destroyed.

Additional info:

Comment 2 Ryan Barry 2018-12-19 00:13:17 UTC
*** Bug 1660452 has been marked as a duplicate of this bug. ***

Comment 3 Francesco Romani 2019-01-02 11:08:23 UTC
Thanks for the bug, Nijin. This behaviour can indeed be triggered by real large VMs, so we had better address it.
As you noticed, Vdsm is supposed to recover cleanly from those cases, but there is indeed room for improvement.

From the Vdsm perspective, suspending/stopping monitoring while the VM is shutting down makes sense. There are two caveats:
1. we need to make sure Engine is OK and can cope with that
2. monitoring is quite a performance-sensitive area for Vdsm (for various technical reasons I can elaborate on). We should be careful about adding checks there and try hard to assess their performance impact.

The real issue is #1. We can solve #2 if Engine can cope with this change of behaviour.

Comment 4 Francesco Romani 2019-01-02 11:33:56 UTC
Nijin, could you please file a bug against libvirt (/qemu?) to inquire whether the behaviour described in https://bugzilla.redhat.com/show_bug.cgi?id=1660451#c0 could be improved?
Maybe it is an unfixable design quirk, or maybe it could be improved in the future.

Comment 5 nijin ashok 2019-01-07 08:45:50 UTC
(In reply to Francesco Romani from comment #4)
> Nijin, could you please file a bug against libvirt(/qemu?) to inquiry if the
> behaviour described in
> https://bugzilla.redhat.com/show_bug.cgi?id=1660451#c0 could be improved?
> Maybe it is an unfixable design quirk, maybe it could be improved in the
> future.

I have opened bug 1663859.

Comment 6 Ryan Barry 2019-01-21 14:53:50 UTC
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both

Comment 8 Milan Zamazal 2019-07-26 16:13:41 UTC
I can't reproduce the bug using the scenario from the bug description or in any other way. There is also no vdsm.log to check what exactly blocks in the executor. The sosreport I looked into fails on something that was fixed before 4.20.39, so it's not relevant.

Nevertheless, I tried to emulate the problem artificially, and some periodic tasks indeed produce blocked tasks in the executor. Adding a check for self._monitorable to Vm.isDomainReadyForCommands should remedy the problem, unless a blocked QEMU monitor blocks all libvirt calls rather than just those for the given VM. I can't check that without a reproducer.
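To illustrate the idea (a minimal, hypothetical sketch, not VDSM's actual Vm class, and begin_destroy is an invented helper name): periodic tasks would consult a "monitorable" flag before touching the QEMU monitor of a VM that is being destroyed, so they skip it instead of blocking.

===
import threading

class Vm:
    def __init__(self):
        self._monitorable = True
        self._lock = threading.Lock()

    def begin_destroy(self):
        # Called when destroy starts: stop periodic monitoring of this VM.
        with self._lock:
            self._monitorable = False

    def isDomainReadyForCommands(self):
        # Periodic tasks check this before querying the QEMU monitor;
        # returning False makes them skip the VM instead of getting stuck.
        with self._lock:
            return self._monitorable
===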

Comment 9 Milan Zamazal 2019-07-29 13:09:26 UTC
Another issue we need to take care of is that vmContainerLock is held during the whole VM.destroy call.

Comment 10 Milan Zamazal 2019-08-13 12:50:43 UTC
vmContainer locking has been improved. Since I can't reproduce the original problem, I could only check that it fixes an artificially induced blocking. It should fix the reported issue as well, unless QEMU performs global, rather than VM-specific, locking operations. This must be checked during verification.
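As an illustration of the locking change (a hypothetical sketch only; the real changes are the "virt: Split vmContainerLock" and "common: Protect access to clientIF.vmContainer values" patches linked above): the container lock is held only around the dictionary lookup and removal, not across the slow graceful destroy.

===
import threading

class _Vm:
    # Stand-in for VDSM's Vm; only destroy() matters for this sketch.
    def destroy(self):
        pass

class ClientIF:
    def __init__(self):
        self.vmContainer = {}                      # vmId -> Vm object
        self.vmContainerLock = threading.Lock()

    def destroy_vm(self, vm_id):
        # Take the lock only for the lookup ...
        with self.vmContainerLock:
            vm = self.vmContainer.get(vm_id)
        if vm is None:
            return
        # ... not while the potentially slow graceful destroy runs, so other
        # VM operations and monitoring are not blocked behind it.
        vm.destroy()
        with self.vmContainerLock:
            self.vmContainer.pop(vm_id, None)
===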

Comment 11 Lucie Leistnerova 2019-08-26 11:39:26 UTC
So what are the proper verification steps please?

Comment 12 Milan Zamazal 2019-08-26 12:24:40 UTC
The steps described in Comment 0, if you can reproduce the problem; the expected result would be that the executor keeps working rather than getting full (this can be examined in vdsm.log). I couldn't reproduce the original problem; please ask Nijin for help if you also have trouble reproducing it.

Comment 13 RHV bug bot 2019-09-05 13:28:40 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.3.z': '?'}', ]

For more info please contact: rhv-devops

Comment 14 RHV bug bot 2019-09-05 13:34:25 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.3.z': '?'}', ]

For more info please contact: rhv-devops

Comment 15 Guilherme Santos 2019-09-12 13:24:59 UTC
Verified on:
vdsm-4.30.30-1.el7ev.x86_64

Steps:
1. Run "stress -m 12 -i 4" (stress is available as a yum package) and wait until almost 100% of memory is used
2. Migrate the VM and, during the migration, shutdown the VM.
3. Check the vdsm log.

Results:
No log messages about executor workers being consumed by the action

Comment 18 errata-xmlrpc 2019-10-10 15:36:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3009

