We saw several occurrences of TooManyTasks issue on VDSM. It can be useful to have a list of tasks in case of such an error.
it seems both useful and easy enough for 4.1.7, targeting accordingly.
To reproduce: First, let's make more likely that Vdsm will see TooManyTasks. We can achieve this artificially reducing the resources available to the Vdsm executor, for example using those crazy settings in vdsm configuration: [sampling] periodic_workers = 1 periodic_task_per_worker = 1 max_workers = 2 Again, those settings are legal but absolutely crazy out of testing/development environment; then restart Vdsm. Now Vdsm should see TooManyTasks error very frequently. One needs to run any workload, watch out for timeouts (much more likely). After some time just open the Vdsm logs and check for lines like 2017-09-14 15:03:46,225+0200 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x35b6f90>, executor queue full (periodic:241) 2017-09-14 15:03:46,225+0200 WARN (vdsm.Scheduler) [virt.periodic.Operation] executor state: <Executor periodic workers=1 max_workers=2 <TaskQueue periodic max_tasks=1 tasks(1)=deque([<Task discardable <Operation action=<function <lambda> at 0x7f801cded140> at 0x34b5210> timeout=7.5, duration=0 at 0x3b15290>]) at 0x371ac90> at 0x371add0> the "deque=..." shows the current task list. +++ [1] Usually the values are: periodic_workers = N*M -> N numbers of cores in the box, M = multiplication factor usually 1 or 2. Values M > 1 are overhead, but sometimes needed periodic_task_per_worker = 100 max_workers = P * K -> P = value of "periodic_workers", K = multiplication factor much like M above. The value of "M" and "K" is eurhystic, depends on the setup, there is no "good" value here.
Verify with: vdsm-4.19.34-1.el7ev.x86_64 Did the steps at comment 2: vdsm log: [root@puma43 vdsm]# grep virt.periodic.Operation vdsm.log 2017-10-17 10:43:13,304+0300 WARN (MainThread) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.NumaInfoMonitor'> at 0x3552c10>, executor queue full (periodic:239) 2017-10-17 10:43:13,304+0300 WARN (MainThread) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x3552c90>, executor queue full (periodic:239) 2017-10-17 10:43:13,305+0300 WARN (MainThread) [virt.periodic.Operation] could not run <vdsm.virt.sampling.VMBulkSampler object at 0x3552d10>, executor queue full (periodic:239) 2017-10-17 10:43:13,305+0300 WARN (MainThread) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x3552fd0>, executor queue full (periodic:239) 2017-10-17 10:43:13,306+0300 WARN (MainThread) [virt.periodic.Operation] could not run <vdsm.virt.sampling.HostMonitor object at 0x3558090>, executor queue full (periodic:239) 2017-10-17 10:43:13,306+0300 WARN (MainThread) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:43:28,308+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:43:28,309+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x3552c90>, executor queue full (periodic:239) 2017-10-17 10:43:43,302+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x3552fd0>, executor queue full (periodic:239) 2017-10-17 10:43:43,313+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x3552c90>, executor queue full (periodic:239) 2017-10-17 10:43:58,316+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:44:13,300+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x3552fd0>, executor queue full (periodic:239) 2017-10-17 10:44:13,300+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <vdsm.virt.sampling.VMBulkSampler object at 0x3552d10>, executor queue full (periodic:239) 2017-10-17 10:44:13,311+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:44:28,315+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239)