Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1120817

Summary: [RFE] Monitor the interface used to establish connectivity from Engine to enhance the fencing logic
Product: Red Hat Enterprise Virtualization Manager Reporter: Scott Herold <sherold>
Component: ovirt-engine Assignee: Nobody <nobody>
Status: CLOSED WONTFIX QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 3.4.0 CC: ecohen, gklein, howey.vernon, iheim, lpeer, lsurette, myakove, nyechiel, oourfali, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: --- Keywords: FutureFeature
Target Release: 3.6.0 Flags: sherold: Triaged+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 1117943 Environment:
Last Closed: 2015-04-27 12:26:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1117943    
Bug Blocks: 1110176    

Description Scott Herold 2014-07-17 18:49:54 UTC
+++ This bug was initially created as a clone of Bug #1117943 +++

Description of problem:

This RFE is part of the request to introduce logic in the fencing workflow so that the engine can determine whether an inability to communicate with external hosts is caused by its own network connectivity issues or by a legitimate problem with the remote host.

We should track the physical devices (i.e., interface, bond) used to establish connectivity from Engine to the hosts in the 'rhevm' logical network. In the event of a physical connectivity problem on an interface, we should be able to raise an alert so that the fencing flow takes it into consideration and allows the environment time to stabilize and reestablish connectivity with the hosts.

Comment 1 Scott Herold 2014-07-17 19:14:06 UTC
Infra/Engine portion of interface monitoring logic for fencing storms.

When Triggered
--------------
This process runs on an ongoing basis and provides a mechanism to let the engine know whether one of its interfaces has changed state within a certain amount of time.  Since connectivity to, and status of, hosts is sensitive to network connectivity, it is important that we first understand whether the engine host is having networking issues, or whether an actual problem lies within the network or with the hypervisor hosts.

In this flow, before initiating the fence command over the fencing network to the target host, the engine will first check whether its own network is suspect because of a recent state change.  If the networking state has changed within a configurable amount of time (default 10 minutes), the engine will not issue fencing commands, and will instead provide sufficient time for the environment to stabilize and for all hypervisor hosts to return to an operational state.
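
A minimal sketch of the check described above, for illustration only. It is not engine code: the names InterfaceStateTracker, observe, changed_within and may_issue_fence are hypothetical, and the sketch assumes something else reports each interface's state ("up"/"down") to the tracker. Only the 10 minute default comes from the description.

```python
import time

# Default stabilization window from the description: 10 minutes.
STABILIZATION_WINDOW_SECONDS = 10 * 60


class InterfaceStateTracker:
    """Hypothetical helper: remembers each engine interface's last
    observed state and when any interface last changed state."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the logic can be tested
        self._state = {}             # interface name -> last observed state
        self._last_change = None     # clock reading of the latest state change

    def observe(self, interface, state):
        """Record the current state of an interface (e.g. "up" or "down")."""
        previous = self._state.get(interface)
        self._state[interface] = state
        # The first sighting only sets a baseline; it is not a state change.
        if previous is not None and previous != state:
            self._last_change = self._clock()

    def changed_within(self, window_seconds):
        """True if any interface changed state less than window_seconds ago."""
        if self._last_change is None:
            return False
        return self._clock() - self._last_change < window_seconds


def may_issue_fence(tracker, window_seconds=STABILIZATION_WINDOW_SECONDS):
    """Allow fencing only if the engine's own network has been stable
    for the whole window."""
    return not tracker.changed_within(window_seconds)
```

In this sketch the fencing flow would call may_issue_fence() before sending the fence command and defer the request while it returns False.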

UX
--
There will be a setting in the Fencing Policy sub menu (defined by BZ 1118879) to enable or disable the following option:
"Disrupt fence request if engine has network connectivity failures"
Default: ENABLED
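
One way the option could gate the decision, as a rough sketch. The FencingPolicy class, its field name and the fence_decision function are assumptions made for illustration; the actual Fencing Policy structure is defined by BZ 1118879 and is not described here. The only detail taken from this comment is that the option defaults to enabled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FencingPolicy:
    # Hypothetical field for the option quoted above; default ENABLED.
    disrupt_fence_on_engine_network_failure: bool = True


def fence_decision(policy, engine_network_recently_changed):
    """Return "defer" when the option is enabled and the engine's own
    network changed state recently, otherwise "fence"."""
    if policy.disrupt_fence_on_engine_network_failure and engine_network_recently_changed:
        return "defer"
    return "fence"
```

With the option disabled, the sketch always returns "fence", which matches fencing behavior without the connectivity check.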

Comment 2 Scott Herold 2014-07-17 19:14:52 UTC
Targeting 3.6 pending networking effort required to complete BZ 1117943.

Comment 3 Oved Ourfali 2015-04-27 12:26:43 UTC
Per past discussions with Scott,

Closing this RFE as won't fix.