Bug 1317429 - [RFE] Improve HA failover, so that even when power fencing is not available, automatic HA will work without manual confirmation on host rebooted.
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: 3.6.0
Hardware: Unspecified Unspecified
Priority: urgent Severity: high
Target Milestone: ovirt-4.1.0-beta
Target Release: ---
Assigned To: Nir Soffer
QA Contact: Lilach Zitnitski
URL: http://www.ovirt.org/develop/release-...
Keywords: FutureFeature
Depends On: 1406765 1410320 1412230 1415488
Blocks: 804272 1421432
Reported: 2016-03-14 05:03 EDT by Yaniv Lavi (Dary)
Modified: 2017-03-01 20:30 EST
CC List: 30 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
This update adds the ability to acquire a lease per virtual machine on shared storage, without attaching the lease to a disk. This adds the capability to avoid split-brain, and avoid starting a virtual machine on another host if the original host becomes non-responsive, therefore improving virtual machine high availability.
Story Points: ---
Clone Of: 804272
Environment:
Last Closed: 2017-02-15 10:05:22 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt-4.1+
rule-engine: exception+
ylavi: priority_rfe_tracking+
gklein: testing_plan_complete+
ylavi: planning_ack+
amureini: devel_ack+
ratamir: testing_ack+


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 65444 master MERGED xleases: Introduce the xlease module 2016-12-01 12:23 EST
oVirt gerrit 65465 master MERGED vm: Support vm leases 2016-12-22 11:32 EST
oVirt gerrit 66603 master ABANDONED xleases: Use utils.closing instead of reinventing it 2016-12-01 18:10 EST
oVirt gerrit 66604 master ABANDONED xleases: Add logging 2016-12-01 18:10 EST
oVirt gerrit 67347 master MERGED xleases: Add and remove sanlock resource 2016-12-05 04:33 EST
oVirt gerrit 67349 master MERGED xleases: Use six.PY2 instead of reinventing it 2016-12-05 04:33 EST
oVirt gerrit 67380 master MERGED xleases: Add failing tests for storage operations 2016-12-05 04:35 EST
oVirt gerrit 67381 master MERGED xleases: Robust sanlock resources management 2016-12-05 08:14 EST
oVirt gerrit 67609 master MERGED xleases: Update leases volume format 2016-12-07 01:27 EST
oVirt gerrit 67610 master MERGED xleases: Make VolumeLeaseStatus more general 2016-12-14 07:32 EST
oVirt gerrit 67611 master MERGED xleases: Introduce the Lease API's 2016-12-22 11:32 EST
oVirt gerrit 67717 master MERGED xleases: Support SPM verbs without pool uuid 2016-12-08 17:09 EST
oVirt gerrit 68046 master MERGED xleases: Move LeasesVolume.format to format_index 2016-12-10 10:43 EST
oVirt gerrit 68047 master MERGED xleases: Read and write index metadata 2016-12-20 12:16 EST
oVirt gerrit 68067 master MERGED xleases: Cleanup free records writing and lookup 2016-12-20 15:32 EST
oVirt gerrit 68069 master MERGED xleases: Separate VolumeIndex loading from file 2016-12-20 15:53 EST
oVirt gerrit 68072 master MERGED xleases: Prevent use of index during update 2016-12-20 16:12 EST
oVirt gerrit 68075 master MERGED xleases: Create and activate the external leases volume 2016-12-22 11:16 EST
oVirt gerrit 68085 master MERGED xleases: Implement basic leases APIs 2016-12-22 11:32 EST
oVirt gerrit 68762 ovirt-engine-4.1 MERGED core: minor refactoring in handling of vds network exceptions 2016-12-20 15:01 EST
oVirt gerrit 69020 ovirt-4.1 MERGED xleases: Create and activate the external leases volume 2016-12-27 02:34 EST
oVirt gerrit 69021 ovirt-4.1 MERGED xleases: Introduce the Lease API's 2016-12-27 02:34 EST
oVirt gerrit 69022 ovirt-4.1 MERGED xleases: Implement basic leases APIs 2016-12-27 02:35 EST
oVirt gerrit 69023 ovirt-4.1 MERGED vm: Support vm leases 2016-12-27 02:35 EST
oVirt gerrit 69187 master MERGED xleases: Add DirectFile.size() method 2017-01-05 12:52 EST
oVirt gerrit 69188 master MERGED xleases: Add VolumeIndex.updating context manager 2017-01-05 12:52 EST
oVirt gerrit 69189 master ABANDONED xleases: Implement rebuild_index 2017-06-17 21:58 EDT
oVirt gerrit 69190 master ABANDONED xleases: Wire the rebuild_leases API 2017-06-17 21:58 EDT
oVirt gerrit 69251 ovirt-engine-4.1 MERGED core: remove unneeded query when getting vms to move to unknown 2016-12-29 04:41 EST
oVirt gerrit 69252 ovirt-engine-4.1 MERGED core: ability to run vms in unknown status 2016-12-29 04:41 EST
oVirt gerrit 69253 ovirt-engine-4.1 MERGED core: add vm leases 2016-12-29 04:41 EST
oVirt gerrit 69254 ovirt-engine-4.1 MERGED core: add and remove vds commands for vm leases 2016-12-29 04:41 EST
oVirt gerrit 69255 ovirt-engine-4.1 MERGED core: remove vm lease on remove vm 2016-12-29 04:41 EST
oVirt gerrit 69256 ovirt-engine-4.1 MERGED core: add vm lease on add/edit/import vm 2016-12-29 04:40 EST
oVirt gerrit 69257 ovirt-engine-4.1 MERGED core: auto start vms with lease 2016-12-29 04:40 EST
oVirt gerrit 69258 ovirt-engine-4.1 MERGED core: send vm lease on run vm 2016-12-29 04:41 EST
oVirt gerrit 69259 ovirt-engine-4.1 MERGED webadmin: ability to set vm leases 2017-01-01 08:08 EST
oVirt gerrit 69336 master MERGED pylint: storage/fileSD: fix typo 2017-01-01 02:48 EST
oVirt gerrit 69343 master MERGED pylint: fix storage.sd.SDManifest 2017-01-02 08:12 EST
oVirt gerrit 69349 master MERGED vm: Add the missing VmLeaseDevice type 2017-01-06 04:22 EST
oVirt gerrit 69356 ovirt-4.1 MERGED pylint: storage/fileSD: fix typo 2017-02-01 10:11 EST
oVirt gerrit 69394 master MERGED core: initial delay before automatic start of vm with lease 2017-01-05 03:53 EST
oVirt gerrit 69397 master MERGED core: automatically start vms with lease by their priority 2017-01-12 04:21 EST
oVirt gerrit 69414 master MERGED core: vm leases are supported since 4.1 2017-01-11 02:44 EST
oVirt gerrit 69428 ovirt-4.1 MERGED pylint: fix storage.sd.SDManifest 2017-02-01 10:09 EST
oVirt gerrit 69537 master MERGED core: refactoring in vm analyzer 2017-01-05 03:05 EST
oVirt gerrit 69538 master MERGED core: refactoring vm analyzer 2017-01-05 11:18 EST
oVirt gerrit 69539 master MERGED core: better support in the monitoring for auto start of vms with lease 2017-01-05 11:18 EST
oVirt gerrit 69540 master MERGED core: increase the interval between retries to auto start vm with a lease 2017-01-05 11:18 EST
oVirt gerrit 69678 ovirt-engine-4.1 MERGED core: refactoring in vm analyzer 2017-01-05 04:50 EST
oVirt gerrit 69679 ovirt-engine-4.1 MERGED core: initial delay before automatic start of vm with lease 2017-01-05 04:51 EST
oVirt gerrit 69680 ovirt-engine-4.1 MERGED core: vm leases are supported since 4.1 2017-01-15 09:07 EST
oVirt gerrit 69681 ovirt-engine-4.1 MERGED core: refactoring vm analyzer 2017-01-15 09:07 EST
oVirt gerrit 69682 ovirt-engine-4.1 MERGED core: automatically start vms with lease by their priority 2017-01-15 09:03 EST
oVirt gerrit 69683 ovirt-engine-4.1 MERGED core: better support in the monitoring for auto start of vms with lease 2017-01-15 09:03 EST
oVirt gerrit 69684 ovirt-engine-4.1 MERGED core: increase the interval between retries to auto start vm with a lease 2017-01-15 09:07 EST
oVirt gerrit 69808 ovirt-engine-4.1 MERGED core: do not copy lease_sd_id from vm to template 2017-01-09 03:56 EST
oVirt gerrit 69823 ovirt-engine-4.1 MERGED core: do not copy lease_sd_id from vm to template 2017-01-11 04:55 EST
oVirt gerrit 70975 master MERGED spec: Update libvirt requirement on EL7 2017-01-23 18:13 EST
oVirt gerrit 71062 ovirt-4.1 MERGED spec: Update libvirt requirement on EL7 2017-01-24 06:22 EST
oVirt gerrit 71647 ovirt-4.1 MERGED vm: Add the missing VmLeaseDevice type 2017-02-08 11:33 EST

Description Yaniv Lavi (Dary) 2016-03-14 05:03:40 EDT
Improve HA failover so that automatic HA works even when power fencing is not available, without requiring manual confirmation that the host was rebooted. We need to provide a way to restart VMs and move the SPM role to a running server when power fencing fails.

Power fencing can fail for various reasons:
1. A power outage leaves the iLO/DRAC (or equivalent) unreachable
2. A network outage leaves the power fencing device unreachable
3. Strange system failures that also affect the power fencing device
4. Misconfiguration, e.g. of firewalls

All of these should lead to the VMs running on other hypervisors afterwards, so that they are reachable again. Therefore we need to make sure that the host that was previously running the VM has no chance of reaching the storage anymore, and as such cannot do any harm to the data.
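
The requirement above (the previous host must be unable to touch the data before the VM is restarted elsewhere) can be illustrated with a lease that has an expiry time. The following is a minimal sketch in Python, not the oVirt or sanlock implementation: `LeaseStore`, `acquire`, `renew`, `try_failover`, and the TTL value are hypothetical names chosen for illustration, and an in-memory dict stands in for the shared storage.

```python
class LeaseStore:
    """Toy in-memory stand-in for a lease area on shared storage (hypothetical)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._leases = {}  # vm_id -> (owner_host, expiry_time)

    def acquire(self, vm_id, host, now):
        """Take the lease unless another host still holds a live one."""
        owner = self._leases.get(vm_id)
        if owner is not None and owner[0] != host and owner[1] > now:
            return False
        self._leases[vm_id] = (host, now + self.ttl)
        return True

    def renew(self, vm_id, host, now):
        """Extend the lease; fails if it expired or moved to another host."""
        owner = self._leases.get(vm_id)
        if owner is None or owner[0] != host or owner[1] <= now:
            return False
        self._leases[vm_id] = (host, now + self.ttl)
        return True


def try_failover(store, vm_id, new_host, now):
    """Start the VM on new_host only if its lease can be acquired."""
    return store.acquire(vm_id, new_host, now)


store = LeaseStore(ttl=10)
store.acquire("vm1", "hostA", now=0)           # hostA runs vm1 and holds the lease
print(try_failover(store, "vm1", "hostB", 5))  # False: hostA's lease is still live
print(try_failover(store, "vm1", "hostB", 11)) # True: lease expired without renewal
print(store.renew("vm1", "hostA", 12))         # False: hostA lost the lease
```

In this sketch a host that cannot renew its lease is expected to stop the VM, which is what lets another host start it safely without power fencing.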
Comment 1 Allon Mureinik 2016-03-14 08:16:18 EDT
We need to finalize the design; marking that we haven't completed it yet. Once the design is finalized, we can properly devel ack/nack accordingly.
Comment 2 Sandro Bonazzola 2016-05-02 06:09:37 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
Comment 4 Nir Soffer 2016-11-23 07:25:28 EST
Here is the storage-side feature page:
http://www.ovirt.org/develop/release-management/features/storage/vm-leases/

On top of this there is the virt-side feature page (in review):
https://github.com/oVirt/ovirt-site/pull/586
Comment 6 Nir Soffer 2016-12-01 12:29:50 EST
We are not finished yet, moving back to POST.
Comment 7 Yaniv Lavi (Dary) 2017-01-04 11:28:22 EST
Arik, can you please open a blocking bug on API for the feature?
Comment 8 Tal Nisan 2017-01-18 06:33:11 EST
The REST API bug was opened and has already been resolved.
Comment 9 Nir Soffer 2017-01-22 09:56:45 EST
Added a patch to require the libvirt version that allows working with VM leases.

Moving back to POST until this patch is merged (should be quick).
