Bug 1275606
| Summary: | [hosted-engine-ha] ha-agent service is not restarted once connectivity to the storage is restored | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-hosted-engine-ha | Reporter: | Elad <ebenahar> |
| Component: | Agent | Assignee: | Martin Sivák <msivak> |
| Status: | CLOSED DUPLICATE | QA Contact: | Ilanit Stein <istein> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 1.3.1 | CC: | bugs, dfediuck, ebenahar, ylavi |
| Target Milestone: | ovirt-3.6.1 | Flags: | dfediuck: ovirt-3.6.z? gklein: blocker? ebenahar: planning_ack? dfediuck: devel_ack+ ebenahar: testing_ack? |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | sla | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-11-18 09:50:36 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | SLA | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Elad
2015-10-27 10:17:03 UTC
Please note that the current EL7.1 systemd (208) does not support Restart=on-abnormal. Can you test this using Fedora or EL7.2 (systemd 217 or higher iirc)?

(In reply to Martin Sivák from comment #1)
> Please note that the current EL7.1 systemd (208) does not support
> Restart=on-abnormal. Can you test this using Fedora or EL7.2 (systemd 217 or
> higher iirc)?

It was tested using el7.2

So did systemd try to restart the service? Can you check the full journal log? If no attempt was made, check the systemd version please.

Created attachment 1088371
journalctl

Martin, checked it again using the latest systemd:
systemd-python-219-19.el7.x86_64
systemd-libs-219-19.el7.x86_64
systemd-219-19.el7.x86_64
systemd-sysv-219-19.el7.x86_64
ovirt-hosted-engine-ha-1.3.1-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.0-1.el7ev.noarch
vdsm-4.17.10-5.el7ev.noarch
Blocked connectivity between all hosts in the DC and the hosted-engine storage domain. The VM moved to paused and the ha-agent service failed. Restored the connectivity to the storage server and waited for ~30 minutes. systemd made no attempt to restart the ha-agent.
Attached journalctl log.
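
For anyone repeating the version check requested earlier in this thread, here is a minimal Python sketch that extracts the systemd version from package strings like the ones listed above. The function names are hypothetical, and the threshold of 217 is simply the "iirc" figure quoted in the thread, not a verified systemd fact.

```python
import re
from typing import Optional

# Threshold quoted in this thread ("217 or higher iirc"); an assumption,
# not a verified statement about when Restart=on-abnormal was introduced.
ASSUMED_MIN_VERSION = 217


def systemd_version(package: str) -> Optional[int]:
    """Return the version from a 'systemd-<version>-<release>' package string.

    Sub-packages such as 'systemd-libs-...' do not match and yield None.
    """
    match = re.fullmatch(r"systemd-(\d+)-\S+", package)
    return int(match.group(1)) if match else None


def meets_assumed_minimum(package: str, minimum: int = ASSUMED_MIN_VERSION) -> bool:
    """True if the package is the main systemd package at or above `minimum`."""
    version = systemd_version(package)
    return version is not None and version >= minimum


if __name__ == "__main__":
    print(meets_assumed_minimum("systemd-219-19.el7.x86_64"))
```

With the packages listed in this comment, the main `systemd` package is at version 219, so the installed version is above the threshold mentioned in the thread and is unlikely to explain the missing restart.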
It seems we need the more traditional on-failure restart mode in this case. This should be resolved as part of a wider fix for the referenced bug #1030441.

*** This bug has been marked as a duplicate of bug 1030441 ***
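
To illustrate why the restart mode matters, the following Python sketch encodes the exit-cause table from the systemd.service(5) manual page for `Restart=`. The function and key names are hypothetical. It also assumes the ha-agent terminated with a non-zero exit code, which the thread does not state explicitly (it only says the service "failed").

```python
# Exit causes and the Restart= settings that react to them, following the
# table in the systemd.service(5) manual page. The key names are labels
# chosen for this sketch, not systemd identifiers.
RESTART_TABLE = {
    "clean-exit": {"always", "on-success"},
    "unclean-exit-code": {"always", "on-failure"},
    "unclean-signal": {"always", "on-failure", "on-abnormal", "on-abort"},
    "timeout": {"always", "on-failure", "on-abnormal"},
    "watchdog": {"always", "on-failure", "on-abnormal", "on-watchdog"},
}


def would_restart(restart_setting: str, exit_cause: str) -> bool:
    """Return True if systemd would restart a service under this setting."""
    if exit_cause not in RESTART_TABLE:
        raise ValueError(f"unknown exit cause: {exit_cause}")
    return restart_setting in RESTART_TABLE[exit_cause]


if __name__ == "__main__":
    # Assumption: the agent exited with a non-zero exit code.
    for setting in ("on-abnormal", "on-failure"):
        print(setting, would_restart(setting, "unclean-exit-code"))
```

Under that assumption, `on-abnormal` does not cover the failure while `on-failure` does, which is consistent with the conclusion reached in this bug.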