Bug 1466864 - [RFE] Support for extended power management, cooperation with network UPS master
Summary: [RFE] Support for extended power management, cooperation with network UPS master
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Nobody
QA Contact: Lukas Svaty
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-06-30 15:14 UTC by Marian Jankular
Modified: 2020-08-13 09:32 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Infra
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-



Description Marian Jankular 2017-06-30 15:14:50 UTC
Description of problem:
The customer would like the following in a future RHV release:
When the UPS detects a power failure, the Manager would start shutting down the VMs; once all VMs are down, it would start powering off the hypervisors.

I can imagine this being achievable when the Manager is a standalone hardware server. With Hosted Engine it is not that easy: NUT clients would have to run on the Manager and on the hosts deployed with hosted_engine, and the NUT master would also need additional logic.

Comment 1 Yaniv Kaul 2017-06-30 18:18:37 UTC
If that ups manager can run an Ansible script, this sounds like a straightforward integration, no?

Comment 2 Igor Netkachev 2017-07-05 09:34:08 UTC
Hello Yaniv,

I've asked the customer to clarify his current UPS setup and here's the feedback:

<quote>
We have used a solution for an oVirt setup, using the packages "nut" and "nut-client". But using this in a hosted-engine setup seems difficult and maybe not reliable.

The shutdown script is launched from a standalone oVirt Manager server and works fine. But in this RHEV 4.1 setup we use hosted-engine, so it gets difficult:

- The UPS devices have to be connected to the host using USB. How do we do that with HEVM?
- This combination of upsmon+upssched probably cannot be used with the hosted-engine VM, as it will not be available to communicate with the UPS when everything is powered off correctly (the hosted VM isn't running when there are no hypervisors online).

To make it clearer, the script on the Manager starts a 120-second timer when it gets an 'ONBATT' event from the UPS. The shutdown starts when the timer fires. The timer is canceled if the UPS returns to utility power in the meantime.
</quote>
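
For illustration, the timer behaviour described in the quote can be sketched in Python. This is a minimal sketch and not the customer's script, which is not shown in this bug; the class name `OnBatteryTimer` and the `on_expire` callback are hypothetical. `ONBATT` is the event named above, and `ONLINE` is assumed here to be the event that signals the return to utility power.

```python
import threading


class OnBatteryTimer:
    """Arm a delayed shutdown on ONBATT and cancel it on ONLINE (illustrative only)."""

    def __init__(self, delay_seconds, on_expire):
        self._delay = delay_seconds
        self._on_expire = on_expire  # hypothetical callback that starts the shutdown
        self._timer = None
        self._lock = threading.Lock()

    def handle_event(self, event):
        with self._lock:
            if event == "ONBATT" and self._timer is None:
                # Power lost: arm the timer once, ignore repeated ONBATT events.
                self._timer = threading.Timer(self._delay, self._on_expire)
                self._timer.daemon = True
                self._timer.start()
            elif event == "ONLINE" and self._timer is not None:
                # Utility power is back: cancel the pending shutdown.
                self._timer.cancel()
                self._timer = None


if __name__ == "__main__":
    timer = OnBatteryTimer(120, lambda: print("starting shutdown"))
    timer.handle_event("ONBATT")
    timer.handle_event("ONLINE")  # power returned in time, nothing is shut down
```

In the setup described by the customer, upssched would normally own this timer; the sketch only makes the start, fire, and cancel logic explicit.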

So the "UPS manager" in this case is a combination of 'upsmon' and 'upssched' installed on the RHEV Manager host and waiting for UPS events over USB, and it does not look like this solution would work in an HEVM setup. There are also HA VMs, which the Manager would probably restart on other nodes once the script starts shutting down hypervisors, and there is probably something else I'm missing. My point is that the Manager side apparently has to be involved in this.

Please let me know what you think.


--
Kind Regards,
Igor Netkachev
Technical Support Engineer
Red Hat Global Support Services

Comment 3 Igor Netkachev 2017-07-05 09:45:46 UTC
Hello,

Just for reference, there's also https://bugzilla.redhat.com/show_bug.cgi?id=1334982, which looks relevant to this RFE. Germano also mentions there that RHV currently lacks features in this area compared to VMware, and that our customers would love to see such functionality in RHV.

--
Kind Regards,
Igor Netkachev
Technical Support Engineer
Red Hat Global Support Services

Comment 4 Yaniv Kaul 2017-07-17 16:17:49 UTC
I'm unsure why it makes a difference whether HE is used or not: the (Ansible) script would run from a host that is physically connected to the UPS and would talk via the REST API to the Hosted Engine, which should power off every VM.

The extra effort would be:
1. The order of taking down hypervisors.
2. Taking down HE VM (done differently, via the host)
3. Taking down this last host.

Comment 5 Igor Netkachev 2017-07-28 13:22:30 UTC
Hello,

I got some feedback from the customer:

<quote>
When I try to add a USB-connected UPS as a Host Device to the HEVM, it fails with this error message in the GUI:

---
Error message:

Operation Canceled
Error while executing action:

HostedEngine: There was an attempt to change Hosted Engine VM values that are locked.
---


and with this message in engine.log:

2017-07-19 15:00:09,334+02 WARN  [org.ovirt.engine.core.bll.UpdateVmCommand] (default task-45) [c33548c5-dbdc-46b5-9266-d8ba49f5d9c0] Validation of action 'UpdateVm' failed for user xxxxx@yyyyy@yyyyy-authz. Reasons: VAR__ACTION__UPDATE,VAR__TYPE__VM,VM_CANNOT_UPDATE_HOSTED_ENGINE_FIELD

</quote>


I guess this is currently not possible with HEVM, and a USB-connected UPS can only be attached to a standalone Manager host, right? Or is there a workaround?

--
Kind Regards,
Igor Netkachev
Technical Support Engineer
Red Hat Global Support Services

Comment 6 Oved Ourfali 2017-08-02 09:20:40 UTC
Adding needinfo on Michal (who may be able to give some details on the USB device) and Martin Sivak (regarding the HE VM configuration).

Comment 7 Michal Skrivanek 2017-08-02 09:42:28 UTC
(In reply to Igor Netkachev from comment #5)
> When I try to add USB-connected UPS as Host Device to HEVM it fails with
...
> I guess it is not possible currently with HEVM, it's possible to attach
> USB-connected UPS only to standalone Manager host, right? Or is there any
> workaround?

No, that doesn't achieve what they need.
See comment #4 for the suggested solution.

It needs to run on a selected host connected to the UPS and communicate with the engine via the REST API: shut all the VMs down, then set HE global maintenance and shut down the HE VM guest, then power off the last host.

Once bug 1334982 is fixed it becomes a bit easier, as you can just shut down all the non-HE hosts via ssh instead of shutting down VMs through the REST API.
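
The ordering described in this comment can be written down as a small planning function. This is only a sketch under stated assumptions: the `Step` type, the action labels, and `plan_shutdown` are hypothetical names, and the function only computes the order of steps. How each step is actually carried out (REST API, hosted-engine tooling on the host, or ssh) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str  # hypothetical label, not an engine API call
    target: str


def plan_shutdown(vms, hosts, he_vm, last_host):
    """Return the shutdown steps in the order suggested in this bug.

    vms       -- names of all VMs, possibly including the hosted-engine VM
    hosts     -- names of all hypervisors
    he_vm     -- name of the hosted-engine VM
    last_host -- the UPS-connected host that runs the script and goes down last
    """
    # 1. Regular VMs first, while the engine is still reachable.
    steps = [Step("shutdown_vm", vm) for vm in vms if vm != he_vm]
    # 2. Global maintenance so the HE VM is not restarted, then the HE VM itself.
    steps.append(Step("set_global_maintenance", he_vm))
    steps.append(Step("shutdown_vm", he_vm))
    # 3. All other hosts, and finally the host running this script.
    steps += [Step("poweroff_host", h) for h in hosts if h != last_host]
    steps.append(Step("poweroff_host", last_host))
    return steps


if __name__ == "__main__":
    plan = plan_shutdown(["vm1", "HostedEngine", "vm2"],
                         ["h1", "h2", "h3"], "HostedEngine", "h2")
    for step in plan:
        print(step.action, step.target)
```

Keeping the plan separate from its execution makes it possible to review or dry-run the order before wiring in any real power-off calls.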

Comment 8 Martin Sivák 2017-08-08 09:54:21 UTC
I agree with Michal and Yaniv.

I would not connect the UPS to the manager VM. Keep the UPS connected to some reliable machine and install all the power failure handling scripts there. The scripts will then talk to the manager over REST and initiate the necessary shutdowns using the manager's API.

It makes no difference in this arrangement whether the manager is bare metal or hosted engine, except perhaps for the need to handle temporary REST connection failures while the manager is being restarted elsewhere.
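
Handling a temporarily unreachable manager could look roughly like the retry helper below. It is a generic sketch, not tied to any particular client library: `call_with_retry` and its parameters are hypothetical, and which exception types indicate a transient failure depends on the REST client actually used (the built-in `ConnectionError` and `TimeoutError` are assumed here).

```python
import time


def call_with_retry(operation, attempts=5, delay_seconds=2.0, backoff=2.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying transient failures with a growing delay."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller decide what to do
            sleep(delay_seconds)
            delay_seconds *= backoff


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_request():
        # Stand-in for a REST call that fails while the manager restarts.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("manager not reachable yet")
        return "ok"

    print(call_with_retry(flaky_request, delay_seconds=0.1))
```

On battery power the total retry time has to stay well below the remaining UPS runtime, so `attempts` and `delay_seconds` should be chosen with that budget in mind.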

Comment 10 Michal Skrivanek 2020-03-18 15:50:01 UTC
This bug hasn't received any attention for a while; we didn't have the capacity to make any progress. If you care deeply about it or want to work on it, please assign/target accordingly.

Comment 11 Michal Skrivanek 2020-03-18 15:52:36 UTC
This bug hasn't received any attention for a while; we didn't have the capacity to make any progress. If you care deeply about it or want to work on it, please assign/target accordingly.

Comment 12 Michal Skrivanek 2020-04-01 14:48:50 UTC
OK, closing. Please reopen if it is still relevant or you want to work on it.

Comment 13 Michal Skrivanek 2020-04-01 14:51:48 UTC
OK, closing. Please reopen if it is still relevant or you want to work on it.

