Bug 1120817 - [RFE] Monitor the interface used to establish connectivity from Engine to enhance the fencing logic
Summary: [RFE] Monitor the interface used to establish connectivity from Engine to enh...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.6.0
Assignee: Nobody
QA Contact:
URL:
Whiteboard: infra
Depends On: 1117943
Blocks: 1110176
Reported: 2014-07-17 18:49 UTC by Scott Herold
Modified: 2016-02-10 19:31 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 1117943
Environment:
Last Closed: 2015-04-27 12:26:43 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:
sherold: Triaged+



Description Scott Herold 2014-07-17 18:49:54 UTC
+++ This bug was initially created as a clone of Bug #1117943 +++

Description of problem:

This RFE is part of the request to introduce logic in the fencing workflow that lets the engine determine whether an inability to communicate with external hosts is caused by its own network connectivity issues or by a legitimate problem with the remote host.

We should track the physical devices (i.e., interface, bond) used to establish connectivity from Engine to the hosts in the 'rhevm' logical network. In the event of a physical connectivity problem on an interface, we should be able to raise an alert so that the fencing flow takes it into consideration and allows the environment time to stabilize and reestablish connectivity with the hosts.

Comment 1 Scott Herold 2014-07-17 19:14:06 UTC
Infra/Engine portion of interface monitoring logic for fencing storms.

When Triggered
--------------
This process runs on an ongoing basis and provides a mechanism to let the engine know if one of its interfaces has changed state within a certain amount of time.  Since connectivity to, and status of, hosts is sensitive to network connectivity, it is important that we first understand whether the engine host itself is having networking issues, or whether the actual problem lies within the network or with the hypervisor hosts.
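
As an illustration only, the ongoing monitoring described above could be sketched in Python as a small tracker that remembers when any engine interface last changed state. The class name, method names, and the injectable clock are all hypothetical; nothing here reflects actual ovirt-engine code, and how link state is polled is left out entirely.

```python
import time
from typing import Callable, Dict, Optional


class InterfaceStateTracker:
    """Hypothetical helper: remembers when a monitored interface last changed state."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock                        # injectable for testing
        self._states: Dict[str, bool] = {}         # interface name -> link up?
        self._last_change: Optional[float] = None  # time of most recent change

    def report(self, name: str, is_up: bool) -> None:
        """Record a polled link state; note the time if it differs from the last one."""
        previous = self._states.get(name)
        self._states[name] = is_up
        # The first observation of an interface is a baseline, not a change.
        if previous is not None and previous != is_up:
            self._last_change = self._clock()

    def changed_within(self, seconds: float) -> bool:
        """True if any interface changed state in the last `seconds` seconds."""
        if self._last_change is None:
            return False
        return self._clock() - self._last_change <= seconds
```

A caller would feed `report()` from whatever polls the interface or bond state, and the fencing flow would then ask `changed_within()` before acting.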

In this flow, before initiating the fence command over the fencing network to the target host, the engine will first check whether its own network is suspect because of a state change.  If the networking state has changed within a configurable amount of time (default 10 minutes), the engine will not issue fencing commands, and will instead allow sufficient time for the environment to stabilize and for all hypervisor hosts to return to an operational state.

UX
--
There will be an option in the Fencing Policy sub menu (Defined by BZ 1118879) to enable or disable the following option:
"Disrupt fence request if engine has network connectivity failures"
Default: ENABLED
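
A minimal sketch of the pre-fence check from "When Triggered", combined with the policy option above, might look like the following Python. `FencingPolicy`, its field names, and `should_fence` are assumptions made for illustration; only the 10-minute default and the enabled-by-default option come from this comment.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_STABILIZATION_WINDOW_S = 10 * 60  # "default 10 minutes" from the comment


@dataclass
class FencingPolicy:
    # Hypothetical field mirroring the proposed option
    # "Disrupt fence request if engine has network connectivity failures".
    skip_fence_on_engine_network_change: bool = True  # Default: ENABLED
    stabilization_window_s: float = DEFAULT_STABILIZATION_WINDOW_S


def should_fence(policy: FencingPolicy,
                 last_state_change: Optional[float],
                 now: float) -> bool:
    """Return False when the fence command should be held back."""
    if not policy.skip_fence_on_engine_network_change:
        return True   # option disabled: fence as usual
    if last_state_change is None:
        return True   # no state change was ever observed
    # Hold back while the last change is still inside the window.
    return (now - last_state_change) > policy.stabilization_window_s
```

With the defaults, a state change 100 seconds ago blocks fencing, while one 900 seconds ago does not. Both timestamps are assumed to come from the same clock.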

Comment 2 Scott Herold 2014-07-17 19:14:52 UTC
Targeting 3.6 pending networking effort required to complete BZ 1117943.

Comment 3 Oved Ourfali 2015-04-27 12:26:43 UTC
Per past discussions with Scott, closing this RFE as won't fix.

