Description of problem:
When you reboot a host, or the host crashes for some reason, all VMs are in the Down state after the reboot. It would be useful to add a flag, similar to setting symlinks in /etc/libvirt/autostart, which would make a VM boot when the host boots. With shared storage you do not need this, as you can flag a VM as HA and it simply starts on another host. But there are many use cases where you have local storage and want to start the VMs as soon as possible after a downtime occurred.

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1. Run VMs on a host with local storage.
2. Reboot the host (or let it crash).

Actual results:
VMs are down after the host reboot.

Expected results:
VMs can be individually marked to autostart on boot of the host.

Additional info:
See this thread on the users mailing list for the demand:
http://lists.ovirt.org/pipermail/users/2014-November/029349.html
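The flag the reporter is alluding to is libvirt's domain autostart, normally set with `virsh autostart`, which on disk amounts to a symlink from the driver's autostart directory to the persistent domain XML. A minimal sketch of that on-disk effect, assuming the qemu driver's configuration layout (the paths and function name here are illustrative, not any libvirt API):

```python
import os

def mark_autostart(name, conf_dir="/etc/libvirt/qemu"):
    """Mimic what `virsh autostart` does on disk: symlink the domain's
    persistent XML into the driver's autostart/ directory, so libvirtd
    starts the domain when the host boots."""
    autostart_dir = os.path.join(conf_dir, "autostart")
    os.makedirs(autostart_dir, exist_ok=True)
    src = os.path.join(conf_dir, name + ".xml")
    dst = os.path.join(autostart_dir, name + ".xml")
    if not os.path.islink(dst):
        os.symlink(src, dst)
    return dst
```

This only works for persistent libvirt domains; as noted later in this thread, the engine runs VMs as transient domains, which is why a plain libvirt-level autostart does not carry over.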
Some thoughts around this request:

The ability to start VMs requires a working engine, so in setups like all-in-one this will not be relevant, as the engine is also expected to be down.

Generally speaking, local storage is not the common use case. Such a change means we would have to extend the VM properties with a general flag that would only be used for a specific (uncommon) case of local storage. Instead of making such a change, here are two lighter approaches:

1. Using VDSM hooks.
You can save a VM ID every time it starts and remove it once it goes down. So if the host crashes, you have a list of all VMs which were running previously. Now all you need to do is use another hook, or even a cron job, to resume these VMs.

2. Using a load balancing module of the scheduler.
In a similar way to the above, you can record all running VMs and resume them once the host moves to status 'up'.

Please let me know if you have any questions.
Doron
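The bookkeeping half of approach 1 can be sketched as follows. This assumes a pair of VDSM hook scripts (e.g. under after_vm_start/ and after_vm_destroy/) calling `record_vm()`/`forget_vm()` with the VM's UUID; the state directory and function names are illustrative, not part of any VDSM API:

```python
import os

# Illustrative state directory; a real hook would need a path that
# survives reboots and is writable by the vdsm user.
STATE_DIR = "/var/lib/autostart-vms"

def record_vm(vm_id, state_dir=STATE_DIR):
    """Remember that vm_id is running (call from an after_vm_start hook)."""
    os.makedirs(state_dir, exist_ok=True)
    open(os.path.join(state_dir, vm_id), "w").close()

def forget_vm(vm_id, state_dir=STATE_DIR):
    """Drop vm_id on a clean shutdown (call from an after_vm_destroy hook)."""
    try:
        os.remove(os.path.join(state_dir, vm_id))
    except FileNotFoundError:
        pass

def previously_running(state_dir=STATE_DIR):
    """After a crash, the leftover marker files are the VMs to resume."""
    if not os.path.isdir(state_dir):
        return []
    return sorted(os.listdir(state_dir))
```

A cron job or boot-time unit on the host could then feed `previously_running()` to the engine's REST API to start each VM again.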
(In reply to Doron Fediuck from comment #1)
> The ability to start VMs requires a working engine, so in setups like
> all-in-one this will not be relevant, as the engine is also expected to be
> down.

Correct. But this is not about all-in-one.

> Generally speaking, local storage is not the common use case.

Did you do a survey, or what are the metrics on which you base this assumption? There were several requests for this feature on the users mailing list.

> Instead of making such a change, here are two lighter approaches:
> 1. Using VDSM hooks.
> [...]

This approach is "lighter" in just one way: it offloads the coding to the user.

> 2. Using a load balancing module of the scheduler.
> [...]

I don't know about this one. Does the scheduler even work with local storage setups? Since which version is this supported?

> Please let me know if you have any questions.
> Doron

Thanks for your reply.
(In reply to Sven Kieske from comment #2)
> > Generally speaking, local storage is not the common use case.
> Did you do a survey, or what are the metrics on which you base this
> assumption? There were several requests for this feature on the users
> mailing list.

From a scheduling perspective, the lion's share of the functionality requires shared storage. Scheduling and placement deal with finding the right host, and with local storage there are no options, which makes this an uncommon case.

> This approach is "lighter" in just one way: it offloads the coding to the
> user.

It is lighter on the data model of the system. I urge you to read the code and see the impact of such a change on the rest of the system. As explained above, this is not the common case for scheduling, and as such it should not be introduced if there is a simpler alternative that gets the same results. If you have a lighter approach using engine code, I'll be glad to review your patches and merge them once ready.
> I don't know about this one. Does the scheduler even work with local
> storage setups? Since which version is this supported?

The scheduler is a part of the engine, and starting with 3.3 we support a scheduler proxy which allows you to run your own Python code. You can read about it here:
http://www.ovirt.org/Features/oVirt_External_Scheduler
and some relevant samples are available here:
https://github.com/oVirt/ovirt-scheduler-proxy/tree/master/doc/plugin_samples

> Thanks for your reply.

Sure. Some people are already using the scheduler extensions. You can ask about it on the list if you want to give it a try.
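As an illustration of approach 2, a scheduler-proxy balancing module might take roughly the following shape. The `do_balance` method and the host-dict fields are modeled loosely on the plugin samples linked above and should be treated as assumptions; the recovery logic is only sketched:

```python
class AutostartBalancer:
    """Sketch of a balancing module: when a host comes back up, propose
    restarting the VMs that were recorded as running on it before."""

    def __init__(self):
        # Illustrative in-memory record; a real module would persist this
        # so it survives a proxy restart.
        self.was_running = {}  # host_id -> set of vm_ids

    def record(self, host_id, vm_id):
        """Remember that vm_id was last seen running on host_id."""
        self.was_running.setdefault(host_id, set()).add(vm_id)

    def do_balance(self, hosts, args):
        """Return (host_id, vm_ids_to_start) for the first recovered host,
        or None if no host has pending VMs to resume."""
        for host in hosts:
            pending = self.was_running.get(host["id"], set())
            if host["status"] == "up" and pending:
                return host["id"], sorted(pending)
        return None
```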
Target release should be set once a package build is known to fix an issue. Since this bug has not been modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
I am adding another use case I just saw on the mailing list.

There was a request to autostart some VMs (even in a defined order) in case the data center is fully rebooted (power outage or maintenance).

People seem to be used to this feature from VMware and use it, for example, to start a DNS provider VM or an Active Directory VM before other services that need that information (one example was an MS SQL server needing the AD domain to be able to start).

But Sven, this is really not an easy thing to do with our current model (each RunVM in a separate thread, VMs treated separately). We might, however, be able to add an autostart flag plus a VM dependency mechanism to accomplish that for normal (non-infrastructure) VMs. On the other hand, Hosted Engine is also a VM, and it won't be able to do anything (start the DNS VM, for example) when hypervisors are not available because the DNS or vSwitch VMs are still down.
(In reply to Martin Sivák from comment #5)
> [...] we might be able to add an autostart flag and add a VM dependency
> mechanism to accomplish that for normal (non-infrastructure) VMs.

Thanks for your comment.

I'd suggest splitting this up into two parts:

First: be able to autostart any given VM at all.
Second: ordering the autostarts (this will be the complex part).

What do you think?
(In reply to Sven Kieske from comment #6)
> I'd suggest splitting this up into two parts:
>
> First: be able to autostart any given VM at all.
> Second: ordering the autostarts (this will be the complex part).
>
> What do you think?

As a user, I agree.

VMware has three states for a VM:
* AutoStart in Order
* AutoStart
* Do Not AutoStart

In the UI you basically move a VM between the lists. The first list (ordered start) is, obviously, ordered: those VMs are started in the recorded order, and you can specify the time to wait between starts, or probably a "wait for VM to come online" state. Then the second group is started, individually, in arbitrary order. The third group is not started.

I agree that in many cases it's more important to make sure that VMs autostart, period. In my VMware deployment the vast majority of VMs are in the second category (or the third). However, I do have a small number of VMs (e.g. DNS, Kerberos) that I want to start before all the rest.

The Self-Hosted Engine (remember, AllInOne has been deprecated and removed now) already has this kind of logic, I'm assuming, to ensure the engine is running.
I presume this is the case even if there is only a single host/node. So why couldn't similar logic be applied to additional VMs once the engine is up? Or, similarly, if a host needs a virtual router running locally, why can't the host HA service be configured similarly?

(Sorry, oVirt n00b here, so I apologize for coming in with wacko ideas.)
Users requesting an "autostart" feature on the users mailing list:

http://lists.ovirt.org/pipermail/users/2014-November/029349.html
http://lists.ovirt.org/pipermail/users/2015-June/033481.html
http://lists.ovirt.org/pipermail/users/2014-November/029424.html

And this was just a quick look.

HTH
Sven
I wonder if we could do it via libvirt, with the engine then learning about these VMs as external VMs?
I did not understand this as a problem of existing VMs (that would still require some effort, as the current "discovery" of external VMs is not good enough, and the recent enhancements were done for Hosted Engine with only partial reusability, which is a pity).

For just storing an autostart flag, that is still best done in the engine as a regular property. It's not a complicated feature.

If we want to restart without engine intervention, we can't really do that, since we use transient domains in libvirt, so nothing is kept across a host boot.
I think there are different RFEs between this and the other autostart bugs. Is there an RFE to track this feature for non-local storage (bug #1325468)?

Personally, I think there are two cases (regardless of storage):

1) Autostart a VM with the engine's help.
2) Autostart a VM WITHOUT the engine's help.

I suspect there should be separate RFEs to track these two cases, since it sounds like case #1 is easy and case #2 is harder. Perhaps that's what 1325468 is about?

Then there is the follow-on case of applying an order to the startup (with or without the engine).

I feel all of this is irrespective of whether the storage is "local" or "shared".
(In reply to Derek Atkins from comment #11)
> I suspect there should be separate RFEs to track these two cases, since it
> sounds like case #1 is easy and case #2 is harder.

Agreed, it makes sense to split this into two RFEs. The first should be easy to implement with an Ansible script running when the host goes up, I reckon.

> I feel all of this is irrespective of whether the storage is "local" or
> "shared".

There is a huge difference here, though. #2 is far easier with local storage, for two reasons:
1. You don't need to connect to remote storage.
2. You are not concerned with locking to prevent data corruption.
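The engine-assisted case (#1) can indeed be scripted against the engine's REST API once the host is up, whether from Ansible or plain Python. A standard-library-only sketch of preparing the start action for one VM; the engine URL, VM ID, and authorization value are placeholders you would fill in, and the request is built but deliberately not sent:

```python
import urllib.request

def start_action_url(engine_url, vm_id):
    """Build the REST endpoint that starts a VM (POST /api/vms/{id}/start)."""
    return "%s/api/vms/%s/start" % (engine_url.rstrip("/"), vm_id)

def build_start_request(engine_url, vm_id, auth_header):
    """Prepare (but do not send) the POST request that starts one VM."""
    return urllib.request.Request(
        start_action_url(engine_url, vm_id),
        data=b"<action/>",  # empty action body, XML representation
        headers={
            "Authorization": auth_header,  # e.g. a Basic or Bearer value
            "Content-Type": "application/xml",
        },
        method="POST",
    )
```

`urllib.request.urlopen(req)` would then fire the action; an Ansible playbook or the ovirt-engine-sdk could replace this sketch just as well.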
Autostart after boot is a very important feature.
Not in scope of the 4.4 effort; that will only contain bug 1325468.
This is being worked on in rhbz#1325468, which will cover this use case.

*** This bug has been marked as a duplicate of bug 1325468 ***