Bug 1361115

Summary: [RFE] Add fencing policies for gluster hosts
Product: [oVirt] ovirt-engine Reporter: Sahina Bose <sabose>
Component: RFEs Assignee: Ramesh N <rnachimu>
Status: CLOSED CURRENTRELEASE QA Contact: SATHEESARAN <sasundar>
Severity: high Docs Contact:
Priority: high    
Version: 3.6.7 CC: bgraveno, bugs
Target Milestone: ovirt-4.1.0-alpha Keywords: FutureFeature
Target Release: 4.1.0.2 Flags: sabose: ovirt-4.1?
sabose: planning_ack?
sabose: devel_ack+
sasundar: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
This update adds Gluster-related fencing policies for hyper-converged clusters. Previous fencing policies ignored Gluster processes. In hyper-converged mode, however, fencing policies are required to ensure that a host is not fenced while a brick process is running on it, and that shutting down a host with an active brick does not cause loss of quorum. The following fencing policies have been added for hyper-converged clusters:
- SkipFencingIfGlusterBricksUp: Fencing will be skipped if bricks are running and can be reached from other peers.
- SkipFencingIfGlusterQuorumNotMet: Fencing will be skipped if bricks are running and shutting down the host will cause loss of quorum.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-02-15 14:55:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Gluster RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1388824, 1415983    
Bug Blocks: 1277939, 1316692, 1422341    

Description Sahina Bose 2016-07-28 11:16:11 UTC
Description of problem:

In HC (hyper-converged) mode, we need fencing policies that ensure that a host is not fenced if
1. there's a brick process running and the brick is a source for healing (in a replica set)
2. shutting down the host with an active brick will cause loss of quorum


Version-Release number of selected component (if applicable):
NA

How reproducible:
NA

Steps to Reproduce:
NA

Additional info:

Comment 1 Ramesh N 2016-10-13 03:50:39 UTC
The 'gluster volume heal info' command takes a long time to complete (more than 3 minutes when there are many unhealed entries). We will not be able to meet the HA requirements if we run it in the host fencing flow, so we decided to skip the 'Self-Heal' related fencing policies for the time being, until Gluster provides a better way to identify the heal entries.


The following fencing policies have been added for hyper-converged clusters:

1. SkipFencingIfGlusterBricksUp
    Fencing will be skipped if bricks are running and can be reached from other peers.

2. SkipFencingIfGlusterQuorumNotMet
    Fencing will be skipped if bricks are running and shutting down the host will cause loss of quorum.
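
The two policies above could be evaluated roughly as follows. This is only an illustrative sketch, not the ovirt-engine implementation: the `Brick` and `FencingPolicy` types, the function names, and the simple-majority quorum rule per replica set are all assumptions made for the example. How the engine actually determines brick status and quorum is not described in this bug.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Brick:
    # Hypothetical model of a brick; field names are assumptions.
    volume: str
    replica_set: int  # index of the replica set within the volume
    host: str
    up: bool          # brick process running and reachable from other peers


@dataclass
class FencingPolicy:
    # Hypothetical flags mirroring the two policy names in this bug.
    skip_if_bricks_up: bool = False
    skip_if_quorum_not_met: bool = False


def quorum_lost_without(host: str, bricks: List[Brick]) -> bool:
    """Return True if taking `host` down would leave any replica set it
    serves without a majority of bricks up (assumed quorum rule)."""
    groups: Dict[Tuple[str, int], List[Brick]] = {}
    for b in bricks:
        groups.setdefault((b.volume, b.replica_set), []).append(b)
    for members in groups.values():
        # Only replica sets where this host has a running brick matter.
        if not any(b.host == host and b.up for b in members):
            continue
        remaining_up = sum(1 for b in members if b.up and b.host != host)
        if remaining_up * 2 <= len(members):
            return True
    return False


def should_skip_fencing(host: str, bricks: List[Brick],
                        policy: FencingPolicy) -> bool:
    """Apply the two policies; both require a running brick on the host."""
    host_has_brick_up = any(b.host == host and b.up for b in bricks)
    if not host_has_brick_up:
        return False
    if policy.skip_if_bricks_up:
        return True
    if policy.skip_if_quorum_not_met and quorum_lost_without(host, bricks):
        return True
    return False
```

For example, with a replica 3 volume spread over three hosts where one brick is already down, fencing a second host would leave only one of three bricks up, so the quorum policy in this sketch would skip fencing.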

Comment 2 Sandro Bonazzola 2016-12-12 13:57:16 UTC
The fix for this issue should be included in oVirt 4.1.0 beta 1, released on December 1st. If it is not included, please move the bug back to MODIFIED.

Comment 3 SATHEESARAN 2017-02-09 03:24:20 UTC
Tested with RHV 4.1 Beta1 (Red Hat Virtualization Manager Version: 4.1.0.3-0.1.el7)

Two fencing policies have been added; they can be seen under the 'Fencing Policies' tab while editing the cluster.