Bug 1166657 (ovirt_auto_start_vm_local_dc) - [RFE] add an autostart flag to (local storage) vms
Summary: [RFE] add an autostart flag to (local storage) vms
Keywords:
Status: CLOSED DUPLICATE of bug 1325468
Alias: ovirt_auto_start_vm_local_dc
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: ---
Hardware: All
OS: Linux
Priority: high
Severity: high
Votes: 6
Target Milestone: ---
Target Release: ---
Assignee: bugs@ovirt.org
QA Contact: Artyom
URL:
Whiteboard:
Depends On:
Blocks: RHEV_auto_start_vms_local_dc
 
Reported: 2014-11-21 12:42 UTC by Sven Kieske
Modified: 2023-10-06 17:27 UTC
CC List: 16 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-13 12:54:54 UTC
oVirt Team: SLA
Embargoed:
michal.skrivanek: ovirt-future?
ylavi: planning_ack?
ylavi: devel_ack?
ylavi: testing_ack?




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 817363 1 None None None 2021-01-20 06:05:38 UTC
Red Hat Bugzilla 1108678 1 None None None 2021-09-09 11:37:17 UTC
Red Hat Bugzilla 1325468 0 high CLOSED [RFE] Autostart of VMs that are down (with Engine assistance - Engine has to be up) 2023-10-06 17:32:15 UTC

Internal Links: 817363 1108678

Description Sven Kieske 2014-11-21 12:42:45 UTC
Description of problem:
When you reboot a host, or the host crashes for some reason,
all of its VMs are in the Down state after the reboot.

It would be useful to add a flag, similar to setting
symlinks in /etc/libvirt/autostart, which would make a VM boot
when the host boots.
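
For reference, this is roughly what the equivalent looks like at the libvirt
level (a minimal sketch using the libvirt Python bindings; the domain name is
a placeholder, and libvirt autostart only applies to persistent domains):

import libvirt

# Connect to the local qemu driver and mark a domain for autostart;
# this is what creates the autostart symlink mentioned above.
conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('my-vm')  # placeholder domain name
dom.setAutostart(1)
conn.close()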

For shared storage you do not need this, as you can flag your VM
as HA and it simply starts on another host.

But there are many use cases where you have local storage and you
want to start the VMs as soon as possible after a downtime occurred.

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:
VMs are down after a host reboot.

Expected results:
VMs can be individually marked to autostart when the host boots.

Additional info:
See this mailing list thread on the users list for the demand:
http://lists.ovirt.org/pipermail/users/2014-November/029349.html

Comment 1 Doron Fediuck 2014-11-23 12:23:27 UTC
Some thoughts around this request:

The ability to start VMs requires a working engine, so in setups like
all-in-one this will not be relevant, as the engine is also expected to be down.
Generally speaking, local storage is not the common use case. Such a change
means we should extend the VM properties to add a general flag, which will
only be used for a specific (uncommon) case of local storage. 

Instead of making such a change, here are two lighter approaches:
1. Using VDSM hooks
You can record a VM's ID every time it starts and remove it once the VM goes down.
So if the host crashes, you have a list of all VMs that were running previously.
Now all you need is another hook, or even a cron job, to resume these VMs
(see the sketch after approach 2 below).

2. Using a load-balancing module of the scheduler.
In a similar way to the above, you can record all running VMs and resume them
once the host moves back to status 'up'.
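
A minimal sketch of approach 1, assuming the standard VDSM hooking module;
the hook directory and the record file path are illustrative:

#!/usr/bin/python
# Save as an executable script under /usr/libexec/vdsm/hooks/after_vm_start/;
# a mirror-image script under after_vm_destroy/ would remove the UUID again.
import hooking

RECORD = '/var/lib/vdsm/last-running-vms'  # illustrative path

# The hook receives the domain XML of the starting VM on stdin.
domxml = hooking.read_domxml()
vm_uuid = domxml.getElementsByTagName('uuid')[0].firstChild.nodeValue

with open(RECORD, 'a+') as f:
    f.seek(0)
    if vm_uuid not in f.read().split():
        f.write(vm_uuid + '\n')

A cron job can then read this file after boot and ask the engine to start
each listed VM.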

Please let me know if you have any questions.
Doron

Comment 2 Sven Kieske 2014-11-24 07:23:49 UTC
(In reply to Doron Fediuck from comment #1)
> Some thoughts around this request:
> 
> The ability to start VMs requires a working engine, so in setups like
> all-in-one this will not be relevant, as the engine is also expected to be
> down.
Correct. But this is not about all-in-one.
> Generally speaking, local storage is not the common use case.
Did you do a survey, or what metrics do you base this assumption on?
There were some requests for this feature on the users mailing list.
> Such a change
> means we should extend the VM properties to add a general flag, which will
> only be used for a specific (uncommon) case of local storage. 



> Instead of making such a change, here are two lighter approaches:
> 1. Using VDSM hooks
> You can record a VM's ID every time it starts and remove it once the VM goes down.
> So if the host crashes, you have a list of all VMs that were running
> previously.
> Now all you need is another hook, or even a cron job, to resume these
> VMs.

This approach is "lighter" in only one way: it offloads the coding
to the user.

> 2. Using a load-balancing module of the scheduler.
> In a similar way to the above, you can record all running VMs and resume them
> once the host moves back to status 'up'.

I don't know about this one. Does the scheduler even work with local storage
setups? Since which version has this been supported?

> Please let me know if you have any questions.
> Doron

Thanks for your reply.

Comment 3 Doron Fediuck 2014-11-24 08:27:55 UTC
(In reply to Sven Kieske from comment #2)
> (In reply to Doron Fediuck from comment #1)
> > Some thoughts around this request:
> > 
> > The ability to start VMs requires a working engine, so in setups like
> > all-in-one this will not be relevant, as the engine is also expected to be
> > down.
> Correct. But this is not about all-in-one.
> > Generally speaking, local storage is not the common use case.
> Did you do a survey, or what metrics do you base this
> assumption on?
> There were some requests for this feature on the users mailing list.
From a scheduling perspective, the lion's share of the functionality requires
shared storage. Scheduling and placement deal with finding the right
host, and with local storage there are no options, which makes this an uncommon case.

> > Such a change
> > means we should extend the VM properties to add a general flag, which will
> > only be used for a specific (uncommon) case of local storage. 
> 
> 
> 
> > Instead of making such a change, here are two lighter approaches:
> > 1. Using VDSM hooks
> > You can record a VM's ID every time it starts and remove it once the VM goes down.
> > So if the host crashes, you have a list of all VMs that were running
> > previously.
> > Now all you need is another hook, or even a cron job, to resume these
> > VMs.
> 
> This approach is "lighter" in only one way: it offloads the coding
> to the user.
> 
This is lighter on the data model of the system. I urge you to read the code
and see the impact of such a change on the rest of the system. As explained
above, this is not the common case for scheduling, and as such it should not be
introduced if there's a simpler alternative that gets the same results. If you
have a lighter approach using engine code, I'll be glad to review your patches
and merge them once they are ready.

> > 2. Using a load-balancing module of the scheduler.
> > In a similar way to the above, you can record all running VMs and resume them
> > once the host moves back to status 'up'.
> 
> I don't know about this one. Does the scheduler even work with local storage
> setups? Since which version has this been supported?
> 
The scheduler is part of the engine, and starting with 3.3 we support a scheduler
proxy that allows you to run your own Python code.
You can read about it here: http://www.ovirt.org/Features/oVirt_External_Scheduler
and some relevant samples are available here: https://github.com/oVirt/ovirt-scheduler-proxy/tree/master/doc/plugin_samples
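
For a rough idea, a plugin is a plain Python class. The skeleton below follows
the structure of the linked samples (the class/method layout, the
properties_validation attribute, and the print-based result reporting are per
those samples); the record-file logic is illustrative, and it assumes the list
recorded by the VDSM hook from comment 1 has been published somewhere the
proxy host can read:

class autostart_previously_running():
    '''Hypothetical balancing module: nominate previously-running VMs
    for restart once their host is back up.'''

    # No custom policy-unit properties in this sketch.
    properties_validation = ''

    def do_balance(self, hosts, args_map):
        # Illustrative: read the VM UUIDs recorded by the VDSM hook
        # (assumes the file is reachable from the proxy host).
        try:
            with open('/var/lib/vdsm/last-running-vms') as f:
                vm_ids = f.read().split()
        except IOError:
            return
        if not vm_ids:
            return
        # Per the samples, the result (the VM to act on plus the candidate
        # hosts) is reported back to the proxy via stdout.
        print(vm_ids[0], hosts)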

> > Please let me know if you have any questions.
> > Doron
> 
> Thanks for your reply.

Sure.
Some people are already using the scheduler extensions.
You can ask about them on the list if you want to give it a try.

Comment 4 Red Hat Bugzilla Rules Engine 2015-10-19 10:49:16 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 5 Martin Sivák 2016-04-13 13:48:41 UTC
I am adding another use-case I just saw on the mailing list.

There was a request to autostart some VMs (even in a defined order) in case the data center is fully rebooted (power outage or maintenance).

People seem to be used to this feature from VMware and use it, for example, to start a DNS VM or an Active Directory VM before other services that need that information (one example was an MS SQL server needing the AD domain to be able to start).


But Sven, this is really not an easy thing to do with our current model (each RunVm is executed in a separate thread, and VMs are treated separately), but we might be able to add an autostart flag and a VM dependency mechanism to accomplish that for normal (non-infrastructure) VMs.

On the other hand, the Hosted Engine is also a VM, and it won't be able to do anything (start the DNS VM, for example) when hypervisors are not available because the DNS or vSwitch VMs are still down.

Comment 6 Sven Kieske 2016-04-18 08:08:51 UTC
(In reply to Martin Sivák from comment #5)
> I am adding another use-case I just saw on the mailing list.
> 
> There was a request to autostart some VMs (even in a defined order) in case
> the data center is fully rebooted (power outage or maintenance).
> 
> People seem to be used to this feature from VMware and use it, for example,
> to start a DNS VM or an Active Directory VM before other services that
> need that information (one example was an MS SQL server needing the AD domain
> to be able to start).
> 
> 
> But Sven, this is really not an easy thing to do with our current model (each
> RunVm is executed in a separate thread, and VMs are treated separately), but
> we might be able to add an autostart flag and a VM dependency mechanism to
> accomplish that for normal (non-infrastructure) VMs.

Thanks for your comment.

Well, I'd suggest splitting this up into two parts.

First: being able to autostart any given VM at all.
Second: ordering autostarts (this will be the complex part).

What do you think?

Comment 7 Derek Atkins 2016-10-18 15:19:42 UTC
(In reply to Sven Kieske from comment #6)
> (In reply to Martin Sivák from comment #5)
> > I am adding another use-case I just saw on the mailing list.
> > 
> > There was a request to autostart some VMs (even in a defined order) in case
> > the data center is fully rebooted (power outage or maintenance).
> > 
> > People seem to be used to this feature from VMware and use it, for example,
> > to start a DNS VM or an Active Directory VM before other services that
> > need that information (one example was an MS SQL server needing the AD domain
> > to be able to start).
> > 
> > 
> > But Sven, this is really not an easy thing to do with our current model (each
> > RunVm is executed in a separate thread, and VMs are treated separately), but
> > we might be able to add an autostart flag and a VM dependency mechanism to
> > accomplish that for normal (non-infrastructure) VMs.
> 
> Thanks for your comment.
> 
> Well, I'd suggest splitting this up into two parts.
> 
> First: being able to autostart any given VM at all.
> Second: ordering autostarts (this will be the complex part).
> 
> What do you think?

As a user, I agree.  VMware has three states for a VM:
* AutoStart in Order
* AutoStart
* Do Not AutoStart

In the UI you basically move a VM between the lists.  The first list (ordered start) is, obviously, ordered: those VMs are started in the recorded order.  You can specify the time to wait between starts, or probably a "wait for VM to come online" state.  Then the second group is started, individually, in arbitrary order.  The third group is not started.
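
A toy sketch of that policy, just to make the semantics concrete (purely
illustrative, nothing oVirt-specific; names and delays are made up):

import time
from dataclasses import dataclass, field

@dataclass
class AutostartPolicy:
    ordered: list = field(default_factory=list)    # [(vm_name, delay_seconds)]
    unordered: list = field(default_factory=list)  # started in arbitrary order
    # VMs absent from both lists are not autostarted.

def autostart(policy, start_vm):
    # Ordered group first, honoring the per-VM delay between starts...
    for name, delay in policy.ordered:
        start_vm(name)
        time.sleep(delay)
    # ...then the unordered group, in whatever order it comes.
    for name in policy.unordered:
        start_vm(name)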

I agree that in many cases it's more important to make sure that VMs autostart, period.  In my VMware deployment the vast majority of VMs are in the second category (or the third).  However, I do have a small number of VMs (e.g. DNS, Kerberos) that I want to start before all the rest.

The Self-Hosted Engine (remember, AllInOne has been deprecated and removed now), I'm assuming, already has this logic to ensure the engine is running.  I presume this is the case even if there is only a single host/node.  So why couldn't similar logic be applied to additional VMs once the engine is up?  Or, similarly, if a host needs a virtual router running locally, why can't the host HA service be configured the same way?

(Sorry, ovirt n00b here so I apologize for coming in with wacko ideas)

Comment 8 Sven Kieske 2016-10-19 08:06:47 UTC
Users requesting "autostart" feature on the users mailing list:

http://lists.ovirt.org/pipermail/users/2014-November/029349.html
http://lists.ovirt.org/pipermail/users/2015-June/033481.html
http://lists.ovirt.org/pipermail/users/2014-November/029424.html

And this was just a quick look.

HTH

Sven

Comment 9 Yaniv Kaul 2016-11-13 11:48:35 UTC
I wonder if we could do it via libvirt, and the engine would then learn about them as external VMs?

Comment 10 Michal Skrivanek 2016-11-13 12:27:41 UTC
I did not understand this as a problem of existing VMs (that would still require some effort, as the current "discovery" of external VMs is not good enough, and the recent enhancements were done for HE with only partial reusability, which is a pity).

For just storing an autostart flag - that's still best done in the engine as a regular property. It's not a complicated feature.

If we want to restart without engine intervention - we can't really do that, since we use transient domains in libvirt, so nothing is kept across a host reboot.

Comment 11 Derek Atkins 2016-11-14 15:23:10 UTC
I think there are different RFEs between this and the other autostart bugs.  Is there an RFE to track this feature for non-local storage? (Bug #1325468)?

Personally I think there are two cases (regardless of storage):

1) Autostart a VM with the engine's help
2) Autostart a VM WITHOUT the engine's help.

I suspect there should be separate RFEs to track these two cases, since it sounds like case #1 is easy and case #2 is harder.  Perhaps that's what 1325468 is about?

Then there is the follow-on case of applying an order to the startup (with or without the engine).

I feel all of this is irrespective of whether the storage is "local" or "shared".

Comment 12 Yaniv Kaul 2016-11-30 11:23:53 UTC
(In reply to Derek Atkins from comment #11)
> I think there are different RFEs between this and the other autostart bugs.  Is
> there an RFE to track this feature for non-local storage? (Bug #1325468)?
> 
> Personally I think there are two cases (regardless of storage):
> 
> 1) Autostart a VM with the engine's help
> 2) Autostart a VM WITHOUT the engine's help.
> 
> I suspect there should be separate RFEs to track these two cases, since it
> sounds like case #1 is easy and case #2 is harder.  Perhaps that's what
> 1325468 is about?

Agreed, it makes sense to split this into two RFEs.
The first should be easy to implement with an Ansible script that runs when the host comes up, I reckon.
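
The same idea as a short Python sketch using ovirt-engine-sdk-python (the
engine URL, credentials, CA file, and the 'autostart' tag are all placeholders):

import ovirtsdk4 as sdk

# Connect to the engine API; all connection details are placeholders.
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    ca_file='/etc/pki/ovirt-engine/ca.pem',
)

# Start every VM that is down and carries the (hypothetical) autostart tag.
vms_service = connection.system_service().vms_service()
for vm in vms_service.list(search='tag=autostart and status=down'):
    vms_service.vm_service(vm.id).start()

connection.close()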

> 
> Then there is the follow-on case of applying an order to the startup (with
> or without the engine).
> 
> I feel all of this is irrespective of whether the storage is "local" or
> "shared".

There is a huge difference here, though. Case #2 is far easier with local storage, for two reasons:
1. You don't need to connect to remote storage.
2. You are not concerned with locking to prevent data corruption.

Comment 13 Mohamed Kasem 2017-01-08 00:04:05 UTC
Autostart after boot is a very important feature.

Comment 15 Michal Skrivanek 2019-07-25 11:40:51 UTC
Not in scope of the 4.4 effort - that will only contain bug 1325468.

Comment 16 Ryan Barry 2019-09-13 12:54:54 UTC
This is being worked on in rhbz#1325468, which will cover this use case.

*** This bug has been marked as a duplicate of bug 1325468 ***

