Bug 1576752
Summary: | Number of "Prestarted VMs" is ignored and all VMs of the Pool start after editing an existing Pool. | ||
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | RHV bug bot <rhv-bugzilla-bot>
Component: | ovirt-engine | Assignee: | Shmuel Melamud <smelamud>
Status: | CLOSED ERRATA | QA Contact: | meital avital <mavital>
Severity: | high | Docs Contact: |
Priority: | medium | ||
Version: | 4.1.8 | CC: | achareka, atripath, bgraveno, lsurette, michal.skrivanek, mkalinin, rbalakri, Rhev-m-bugs, rhodain, smelamud, srevivo, vpagar, ykaul
Target Milestone: | ovirt-4.2.4 | Keywords: | Rebase, ZStream
Target Release: | --- | Flags: | lsvaty: testing_plan_complete-
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | ||
Fixed In Version: | ovirt-engine-4.2.4 | Doc Type: | No Doc Update
Doc Text: | | Story Points: | ---
Clone Of: | 1553425 | Environment: |
Last Closed: | 2018-06-27 10:02:26 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | ||
Bug Depends On: | 1553425 | ||
Bug Blocks: | ||
Description

RHV bug bot 2018-05-10 10:39:42 UTC
The default for VmPoolMonitorIntervalInMinutes is 5 minutes. It is assumed that all VMs can be started within that period. Apparently it was changed to 1 minute, which may not be enough. There is, perhaps, a bug in that this assumption does not hold: starting all the VMs can take longer when the number of VMs to prestart is high and the system is busy. But the problem may "go away" if you simply change the interval back to 5 minutes (or increase it further).

(Originally by michal.skrivanek)

Apparently the customer has set the following values, back on their 3.4 or 3.5 environment:

```
 option_name                    | option_value
--------------------------------+--------------
 VmPoolMonitorMaxAttempts       | 3
 VmPoolMonitorBatchSize         | 50
 VmPoolMonitorIntervalInMinutes | 1
```

Probably those values interfere with the required behavior. The idea back then was to prestart as many VMs as possible, which sounds different from their request today. I will check more and update the bug later.

(Originally by Marina Kalinin)

I have reviewed the data. The problem is related to the number of threads that are trying to start the VMs. Here is a grep from the engine logs related to pool 67e3ea67-811d-4703-919a-269af29c21a5 (MyPool) [1]. The number of VMs in the pool is 33 and the pool is set to prestart 20 VMs. The operation ended up with 28 prestarted VMs. Some of the VMs failed due to lack of memory on the hypervisors, so the entire VM pool was not started. One of the possible ways to reproduce is to edit the pool twice and click OK each time.

[1]: engine.log:

```
2018-04-19 09:28:55,441+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [15f44bcd] VmPool '67e3ea67-811d-4703-919a-269af29c21a5' is missing 20 prestarted VMs, attempting to prestart 20 VMs
2018-04-19 09:28:58,497+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [15f44bcd] Running VM 'MyPool-21' as stateless
2018-04-19 09:29:07,785+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [b956bed] Running VM 'MyPool-21' as stateless succeeded
2018-04-19 09:29:09,362+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [b956bed] Running VM 'MyPool-27' as stateless
2018-04-19 09:29:24,966+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [1f5b258a] Running VM 'MyPool-27' as stateless succeeded
2018-04-19 09:29:26,062+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [1f5b258a] Running VM 'MyPool-28' as stateless
2018-04-19 09:29:44,524+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [713d03c] Running VM 'MyPool-28' as stateless succeeded
2018-04-19 09:29:47,034+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [713d03c] Running VM 'MyPool-29' as stateless
2018-04-19 09:30:06,566+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [1a1c3ceb] Running VM 'MyPool-29' as stateless succeeded
2018-04-19 09:30:10,235+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [1a1c3ceb] Running VM 'MyPool-32' as stateless
2018-04-19 09:30:24,748+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [530d5796] Running VM 'MyPool-32' as stateless succeeded
2018-04-19 09:30:27,143+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [530d5796] Running VM 'MyPool-33' as stateless
2018-04-19 09:30:27,450+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler8) [51fb1f71] VmPool '67e3ea67-811d-4703-919a-269af29c21a5' is missing 20 prestarted VMs, attempting to prestart 20 VMs
2018-04-19 09:30:34,102+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler8) [51fb1f71] Running VM 'MyPool-1' as stateless
2018-04-19 09:30:40,739+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [1df3a518] Running VM 'MyPool-33' as stateless succeeded
2018-04-19 09:30:41,264+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler8) [d0ecbe1] Running VM 'MyPool-1' as stateless succeeded
2018-04-19 09:30:44,241+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler8) [d0ecbe1] Running VM 'MyPool-2' as stateless
2018-04-19 09:30:48,853+02 INFO [org.ovirt.engine.core.bll.VmPoolMonitor] (DefaultQuartzScheduler5) [1df3a518] Running VM 'MyPool-3' as stateless
```

(Originally by Roman Hodain)
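The log above shows two scheduler threads (DefaultQuartzScheduler5 and DefaultQuartzScheduler8) each concluding that the pool is "missing 20 prestarted VMs": with a 1-minute interval, a second monitor pass fires before the first has finished starting its batch, and the in-flight starts are not counted. The following is a minimal, self-contained Java sketch of that race, not the actual ovirt-engine code; the class, constants, and scaled-down timings (PrestartRaceDemo, START_DELAY_MS, the 100 ms offset between passes) are illustrative assumptions.

```java
// Illustrative sketch only -- NOT ovirt-engine code. Names and timings are
// hypothetical; delays are scaled down so the race is visible in seconds.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PrestartRaceDemo {
    static final int PRESTARTED_TARGET = 20; // configured "Prestarted VMs"
    static final long START_DELAY_MS = 200;  // time for one VM to come up

    static final AtomicInteger upVms = new AtomicInteger(0);         // finished starts
    static final AtomicInteger startAttempts = new AtomicInteger(0); // all attempts

    static void monitorPass(String name) {
        // Each pass recomputes "missing" from *completed* starts only; starts
        // that a concurrent pass has begun but not finished are invisible.
        int missing = PRESTARTED_TARGET - upVms.get();
        System.out.printf("%s: missing %d prestarted VMs, attempting to prestart %d VMs%n",
                name, missing, missing);
        for (int i = 0; i < missing; i++) {
            startAttempts.incrementAndGet();
            try {
                Thread.sleep(START_DELAY_MS); // simulate a slow VM start
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            upVms.incrementAndGet(); // only now does the VM count as prestarted
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        // Like DefaultQuartzScheduler5 and DefaultQuartzScheduler8 in the log:
        // the second pass fires before the first has finished its batch.
        scheduler.schedule(() -> monitorPass("pass-1"), 0, TimeUnit.MILLISECONDS);
        scheduler.schedule(() -> monitorPass("pass-2"), 100, TimeUnit.MILLISECONDS);
        scheduler.shutdown();
        scheduler.awaitTermination(30, TimeUnit.SECONDS);
        System.out.printf("total start attempts: %d (target was %d)%n",
                startAttempts.get(), PRESTARTED_TARGET);
    }
}
```

Run as-is, both passes report "missing 20" and together attempt 40 starts for a 20-VM target; in the real environment the pool size (33) and host memory capped this at 28 prestarted VMs. It also shows why increasing VmPoolMonitorIntervalInMinutes masks the bug: with a 5-minute interval the first pass normally completes before the next one recounts.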
Verified on 4.2.4.1-0.1.el7.

Scenario:
1. Set VmPoolMonitorIntervalInMinutes to 1 minute and VmPoolMonitorBatchSize to 10.
2. Create a VM pool with 16 VMs.
3. Update the created pool and set prestarted VMs to 12.
4. Update the pool once more without changing anything, and click OK.

Repeated 10 times; 10 times out of 10 there were 12 VMs prestarted.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2071
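The record does not quote the code change that shipped in ovirt-engine-4.2.4. As a general illustration only, one common way to prevent overlapping prestart passes is a per-pool "pass in progress" guard, sketched below in Java; all names (GuardedPoolMonitor, inProgress) are hypothetical, and the actual fix may differ.

```java
// Hypothetical guard sketch -- NOT the actual ovirt-engine fix. A pass that
// finds the previous one still running skips this cycle instead of recounting,
// so two overlapping passes can never both decide that 20 VMs are missing.
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class GuardedPoolMonitor {
    // One "pass in progress" flag per pool id.
    private final Map<UUID, AtomicBoolean> inProgress = new ConcurrentHashMap<>();

    public void monitorPool(UUID poolId, Runnable prestartMissingVms) {
        AtomicBoolean busy =
                inProgress.computeIfAbsent(poolId, id -> new AtomicBoolean(false));
        if (!busy.compareAndSet(false, true)) {
            return; // a previous pass for this pool is still prestarting VMs
        }
        try {
            prestartMissingVms.run(); // count missing VMs and start them
        } finally {
            busy.set(false); // allow the next scheduled pass to run
        }
    }
}
```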