Bug 1653389
Summary: | [RFE] more efficient HA logic | | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Olimp Bockowski <obockows>
Component: | ovirt-engine | Assignee: | Nobody <nobody>
Status: | CLOSED DUPLICATE | QA Contact: | meital avital <mavital>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 4.1.9 | CC: | gveitmic, klaas, rbarry, Rhev-m-bugs
Target Milestone: | ovirt-4.4.0 | Keywords: | FutureFeature
Target Release: | --- | Flags: | lsvaty: testing_plan_complete-
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Enhancement
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-03-09 22:07:24 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Olimp Bockowski 2018-11-26 17:18:13 UTC
Hey Olimp - there are a number of related bugs. The closest is probably https://bugzilla.redhat.com/show_bug.cgi?id=844083. The best case here would be a sliding window for restarts, since our options otherwise are essentially infinite restarts (the engine can't be aware of every possible failure case from libvirt/qemu -- only that the VM failed). In theory, we could also watch network/storage and restart, but it's difficult to make guarantees without unlimited restarts. Unlimited restarts may be acceptable as an interim solution, but risk flooding the logs by failing over and over again.

https://bugzilla.redhat.com/show_bug.cgi?id=844083 is not public :) The gist of that RFE is "if there's an environment failure, the engine should restart VMs which were powered on when the environment comes back up".

(In reply to Olimp Bockowski from comment #0)
> This leads to the situation where in some cases HA VMs are not restarted
> when the RHV environment is up and healthy after a longer outage.
>
> Instead, HA VMs could be restarted in a more sophisticated way, e.g. the
> attempt could be triggered when it makes sense to do so, e.g. when there are
> enough resources at the cluster level and storage dependencies are available.

Attaching another ticket. In this one, the SD with the VM leases had a long outage. The VMs failed, and HA tried to restart them but gave up after a few attempts. Later, the SD with the leases became valid again, but the VMs were not restarted; the user had to start them manually.

*** This bug has been marked as a duplicate of bug 912723 ***
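The "sliding window for restarts" idea discussed above can be sketched as follows. This is a minimal illustrative model, not ovirt-engine code: the class and method names (`SlidingWindowRestartPolicy`, `may_restart`, `record_attempt`) and the parameter values are all hypothetical. The policy allows at most `max_restarts` restart attempts within the trailing `window_seconds`, so a VM that keeps failing is throttled rather than retried forever (avoiding log flooding), while a VM whose failures age out of the window regains its restart budget once the environment recovers.

```python
import time
from collections import deque


class SlidingWindowRestartPolicy:
    """Permit at most `max_restarts` attempts within the trailing window.

    Hypothetical sketch of a sliding-window restart limiter; not part of
    any real ovirt-engine API. `clock` is injectable for testing.
    """

    def __init__(self, max_restarts, window_seconds, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self.clock = clock
        self.attempts = deque()  # timestamps of recent restart attempts

    def may_restart(self):
        now = self.clock()
        # Drop attempts that have aged out of the window.
        while self.attempts and now - self.attempts[0] > self.window_seconds:
            self.attempts.popleft()
        return len(self.attempts) < self.max_restarts

    def record_attempt(self):
        self.attempts.append(self.clock())


# Example with a fake clock: 3 attempts per 60-second window.
t = [0.0]
policy = SlidingWindowRestartPolicy(3, 60, clock=lambda: t[0])
for _ in range(3):
    assert policy.may_restart()
    policy.record_attempt()
assert not policy.may_restart()  # budget exhausted at t=0

t[0] = 61.0                      # old failures slide out of the window
assert policy.may_restart()      # restarts allowed again after recovery
```

In a real engine, `may_restart` would additionally be gated on the conditions the reporter asks for (enough cluster resources, storage domains back up), so the remaining budget is only spent when a restart has a realistic chance of succeeding.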