Bug 978737 - [Docs] [Tech Ref] add soft fencing over SSH (restart VDSM) as a preliminary step before fencing a Non-Responsive host
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 3.3.0
Hardware: Unspecified Unspecified
Priority: unspecified Severity: medium
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Zac Dover
QA Contact: ecs-bugs
Whiteboard: infra
Keywords: FutureFeature
Depends On: 967328 975301
Blocks:
Reported: 2013-06-27 02:29 EDT by Andrew Burden
Modified: 2016-02-10 14:08 EST

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 975301
Environment:
Last Closed: 2014-04-06 23:06:36 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 15797 None None None Never
oVirt gerrit 15798 None None None Never

Comment 1 Zac Dover 2013-07-14 22:33:39 EDT
This changes part of

RHEV 3.3 Tech Ref Chapter 5, "Power management and fencing".
Comment 2 Tim Hildred 2013-08-06 01:28:45 EDT
From BZ#975301, comment #1:
Looking at builds currently available, and based on Barak's comment from 2013-06-16 04:49:10 EDT, there will be no UI impact from this bug. 

If there is anything that changes in the Admin Guide, it would be the inclusion of a topic from the Technical Reference Guide called:

Soft Fencing Hosts in Red Hat Enterprise Virtualization

When that topic is written, we'll see if it is appropriate for inclusion in the Hosts Resilience section of the Administration Guide.
Comment 3 Zac Dover 2013-08-24 11:54:13 EDT
I have written the following:

Soft-fencing Hosts

Sometimes a host becomes non-responsive due to an unexpected problem, and though VDSM is unable to respond to requests, the virtual machines that depend upon VDSM remain alive and accessible. In these situations, simply restarting VDSM returns VDSM to a responsive state and resolves this issue.

Red Hat Enterprise Virtualization 3.3 introduces "soft-fencing over SSH". Prior to Red Hat Enterprise Virtualization 3.3, non-responsive hosts were fenced only by external fencing devices. In Red Hat Enterprise Virtualization 3.3, the fencing process has been expanded to include "SSH Soft Fencing", a process whereby the Manager (the engine) attempts to restart VDSM via SSH on non-responsive hosts; if the Manager fails to restart VDSM via SSH, the responsibility for fencing falls to the external fencing agent (if an external fencing agent has been configured).

SSH Soft Fencing works as follows. Fencing must be configured and enabled on the host, and a valid proxy host (a second host, in an UP state, in the data center) must exist. When the connection between the engine (the Manager) and the host times out, the following happens. On the first network failure, the status of the host changes to "connecting". The engine (the Manager) then does one of two things: it makes three attempts to ask VDSM for its status, or it waits for an interval determined by the host's load. The length of the interval is calculated from the following configuration values: TimeoutToResetVdsInSeconds (the default is 60 seconds) + [DelayResetPerVmInSeconds (the default is 0.5 seconds)] * (the count of running VMs on the host) + [DelayResetForSpmInSeconds (the default is 20 seconds)] * 1 (if the host runs as SPM) or 0 (if the host does not run as SPM). In order to give VDSM the maximum amount of time to respond, the engine (the Manager) chooses the longer of the two options mentioned above (three attempts to retrieve the status of VDSM or the interval determined by the above formula). If the host doesn't respond when that interval has elapsed, <command>vdsm restart</command> is executed via SSH. If <command>vdsm restart</command> does not succeed in re-establishing the connection between the host and the Manager, the status of the host changes to <literal>non responsive</literal> and, if power management is configured, fencing is handed off to the external fencing agent.

Note
SSH soft-fencing can be executed on hosts that have no power management configured. This is distinct from "fencing": fencing can be executed only on hosts that have power management configured.
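
For reviewers who want to check the arithmetic in the interval formula, here is a minimal Python sketch. The three configuration names and their defaults come from the draft text above; the function names and the retry_duration parameter (how long the three status attempts take, which the draft does not state) are hypothetical and are not engine code.

```python
# Defaults as quoted in the draft text; the constant names mirror the
# engine configuration values, but this module itself is illustrative only.
TIMEOUT_TO_RESET_VDS_IN_SECONDS = 60.0
DELAY_RESET_PER_VM_IN_SECONDS = 0.5
DELAY_RESET_FOR_SPM_IN_SECONDS = 20.0


def load_based_interval(running_vms: int, is_spm: bool) -> float:
    """Interval from the formula: base + per-VM delay + SPM delay."""
    spm_factor = 1 if is_spm else 0
    return (
        TIMEOUT_TO_RESET_VDS_IN_SECONDS
        + DELAY_RESET_PER_VM_IN_SECONDS * running_vms
        + DELAY_RESET_FOR_SPM_IN_SECONDS * spm_factor
    )


def wait_before_soft_fence(retry_duration: float, running_vms: int, is_spm: bool) -> float:
    """The longer of the two options described in the draft.

    retry_duration is an assumed input: the time taken by the three
    status attempts, which the draft does not quantify.
    """
    return max(retry_duration, load_based_interval(running_vms, is_spm))


if __name__ == "__main__":
    # A host running 10 VMs that is also the SPM: 60 + 0.5 * 10 + 20
    print(load_based_interval(10, True))
```

With the defaults, a host that runs 10 virtual machines and acts as SPM gets an interval of 85 seconds; an idle non-SPM host gets the base 60 seconds.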
Comment 5 Zac Dover 2013-09-24 21:20:10 EDT
Documentation Link
------------------
http://documentation-devel.engineering.redhat.com/docs/en-US/Red_Hat_Enterprise_Virtualization/3.3/html-single/Technical_Reference_Guide/index.html#Soft-Fencing_Hosts

What Changed
------------
I added the following content:

Sometimes a host becomes non-responsive due to an unexpected problem, and though VDSM is unable to respond to requests, the virtual machines that depend upon VDSM remain alive and accessible. In these situations, simply restarting VDSM returns VDSM to a responsive state and resolves this issue.
Red Hat Enterprise Virtualization 3.3 introduces "soft-fencing over SSH". Prior to Red Hat Enterprise Virtualization 3.3, non-responsive hosts were fenced only by external fencing devices. In Red Hat Enterprise Virtualization 3.3, the fencing process has been expanded to include "SSH Soft Fencing", a process whereby the Manager (the engine) attempts to restart VDSM via SSH on non-responsive hosts; if the Manager fails to restart VDSM via SSH, the responsibility for fencing falls to the external fencing agent (if an external fencing agent has been configured).
SSH soft-fencing works as follows. Fencing must be configured and enabled on the host, and a valid proxy host (a second host, in an UP state, in the data center) must exist. When the connection between the engine (the Manager) and the host times out, the following happens. On the first network failure, the status of the host changes to "connecting". The engine (the Manager) then does one of two things: it makes three attempts to ask VDSM for its status, or it waits for an interval determined by the host's load. The length of the interval is calculated from the following configuration values: TimeoutToResetVdsInSeconds (the default is 60 seconds) + [DelayResetPerVmInSeconds (the default is 0.5 seconds)] * (the count of running VMs on the host) + [DelayResetForSpmInSeconds (the default is 20 seconds)] * 1 (if the host runs as SPM) or 0 (if the host does not run as SPM). In order to give VDSM the maximum amount of time to respond, the engine (the Manager) chooses the longer of the two options mentioned above (three attempts to retrieve the status of VDSM or the interval determined by the above formula). If the host doesn't respond when that interval has elapsed, vdsm restart is executed via SSH. If vdsm restart does not succeed in re-establishing the connection between the host and the Manager, the status of the host changes to non responsive and, if power management is configured, fencing is handed off to the external fencing agent.
Note
SSH soft-fencing can be executed on hosts that have no power management configured. This is distinct from "fencing": fencing can be executed only on hosts that have power management configured.
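
The order of events described in the added content can be sketched as a small Python function, which may help when verifying the text on ON_QA. This is an illustration of the documented sequence only, not the engine's implementation: the function name, its parameters, and the event strings are hypothetical, and the outcome recorded after a successful restart is an assumption, because the text does not name the resulting host status.

```python
from typing import Callable, List


def soft_fence_flow(
    host_responded_in_time: bool,
    restart_vdsm_over_ssh: Callable[[], bool],
    power_management_configured: bool,
) -> List[str]:
    """Return the ordered events for one connection timeout.

    restart_vdsm_over_ssh is a stand-in callable that returns True when
    the restart re-establishes the connection to the Manager.
    """
    # On the first network failure the host moves to "connecting".
    events = ["status: connecting"]

    # If the host answers before the grace interval elapses, nothing more happens.
    if host_responded_in_time:
        events.append("host responded, no soft fencing needed")
        return events

    # Otherwise the Manager tries the soft fence: a VDSM restart over SSH.
    events.append("ssh: vdsm restart")
    if restart_vdsm_over_ssh():
        # Assumption: the text does not name the status after recovery.
        events.append("connection re-established")
        return events

    # The soft fence failed, so the host is marked non responsive.
    events.append("status: non responsive")

    # Hard fencing is attempted only when power management is configured.
    if power_management_configured:
        events.append("hand off to external fencing agent")
    return events


if __name__ == "__main__":
    for event in soft_fence_flow(False, lambda: False, True):
        print(event)
```

The last branch reflects the Note above: the SSH restart is attempted whether or not power management is configured, while the hand-off to the external fencing agent requires it.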

NVR
---
Red_Hat_Enterprise_Virtualization-Technical_Reference_Guide-3.3-en-US-3.3.0-004

Moving to ON_QA.
