Bug 1490810
Summary: | print task list to log in TooManyTasks issue | ||
---|---|---|---|
Product: | [oVirt] vdsm | Reporter: | Oved Ourfali <oourfali> |
Component: | Core | Assignee: | Francesco Romani <fromani> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Israel Pinto <ipinto> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.20.4 | CC: | bugs, eheftman, ipinto, lveyde, tjelinek |
Target Milestone: | ovirt-4.1.7 | Flags: | tjelinek:
ovirt-4.1?
tjelinek: planning_ack? tjelinek: devel_ack+ rule-engine: testing_ack+ |
Target Release: | 4.19.33 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | v4.19.33 | Doc Type: | Enhancement |
Doc Text: |
Previously, while VDSM performed periodic monitoring and maintenance tasks, operations would occasionally become too slow or even become blocked. In this case, the internal queue became full and the periodic operations were no longer performed. A "TooManyTasks" warning appeared in the log files at a maximum rate of once every 10 seconds.
In this release, if VDSM cannot perform its periodic operations, in addition to issuing a warning in the log files, VDSM also dumps the contents of the queue into the logs.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-11-13 12:25:58 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Oved Ourfali
2017-09-12 09:35:56 UTC
it seems both useful and easy enough for 4.1.7, targeting accordingly. To reproduce: First, let's make more likely that Vdsm will see TooManyTasks. We can achieve this artificially reducing the resources available to the Vdsm executor, for example using those crazy settings in vdsm configuration: [sampling] periodic_workers = 1 periodic_task_per_worker = 1 max_workers = 2 Again, those settings are legal but absolutely crazy out of testing/development environment; then restart Vdsm. Now Vdsm should see TooManyTasks error very frequently. One needs to run any workload, watch out for timeouts (much more likely). After some time just open the Vdsm logs and check for lines like 2017-09-14 15:03:46,225+0200 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x35b6f90>, executor queue full (periodic:241) 2017-09-14 15:03:46,225+0200 WARN (vdsm.Scheduler) [virt.periodic.Operation] executor state: <Executor periodic workers=1 max_workers=2 <TaskQueue periodic max_tasks=1 tasks(1)=deque([<Task discardable <Operation action=<function <lambda> at 0x7f801cded140> at 0x34b5210> timeout=7.5, duration=0 at 0x3b15290>]) at 0x371ac90> at 0x371add0> the "deque=..." shows the current task list. +++ [1] Usually the values are: periodic_workers = N*M -> N numbers of cores in the box, M = multiplication factor usually 1 or 2. Values M > 1 are overhead, but sometimes needed periodic_task_per_worker = 100 max_workers = P * K -> P = value of "periodic_workers", K = multiplication factor much like M above. The value of "M" and "K" is eurhystic, depends on the setup, there is no "good" value here. Verify with: vdsm-4.19.34-1.el7ev.x86_64 Did the steps at comment 2: vdsm log: [root@puma43 vdsm]# grep virt.periodic.Operation vdsm.log 2017-10-17 10:43:13,304+0300 WARN (MainThread) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.NumaInfoMonitor'> at 0x3552c10>, executor queue full (periodic:239) 2017-10-17 10:43:13,304+0300 WARN (MainThread) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x3552c90>, executor queue full (periodic:239) 2017-10-17 10:43:13,305+0300 WARN (MainThread) [virt.periodic.Operation] could not run <vdsm.virt.sampling.VMBulkSampler object at 0x3552d10>, executor queue full (periodic:239) 2017-10-17 10:43:13,305+0300 WARN (MainThread) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x3552fd0>, executor queue full (periodic:239) 2017-10-17 10:43:13,306+0300 WARN (MainThread) [virt.periodic.Operation] could not run <vdsm.virt.sampling.HostMonitor object at 0x3558090>, executor queue full (periodic:239) 2017-10-17 10:43:13,306+0300 WARN (MainThread) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:43:28,308+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:43:28,309+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x3552c90>, executor queue full (periodic:239) 2017-10-17 10:43:43,302+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x3552fd0>, executor queue full (periodic:239) 2017-10-17 10:43:43,313+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.BlockjobMonitor'> at 0x3552c90>, executor queue full (periodic:239) 2017-10-17 10:43:58,316+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:44:13,300+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x3552fd0>, executor queue full (periodic:239) 2017-10-17 10:44:13,300+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <vdsm.virt.sampling.VMBulkSampler object at 0x3552d10>, executor queue full (periodic:239) 2017-10-17 10:44:13,311+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) 2017-10-17 10:44:28,315+0300 WARN (vdsm.Scheduler) [virt.periodic.Operation] could not run <function <lambda> at 0x20c6668>, executor queue full (periodic:239) |