Bug 1257610

Summary: Automatic fencing doesn't work when network is killed on host
Product: Red Hat Enterprise Virtualization Manager Reporter: Petr Matyáš <pmatyas>
Component: ovirt-engine Assignee: Martin Perina <mperina>
Status: CLOSED ERRATA QA Contact: Petr Matyáš <pmatyas>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: lsurette, mgoldboi, pneedle, pstehlik, rbalakri, Rhev-m-bugs, yeylon, ykaul
Target Milestone: ovirt-3.6.0-rc Keywords: Regression, ZStream
Target Release: 3.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 3.6.0-12 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1260619 (view as bug list) Environment:
Last Closed: 2016-03-09 21:12:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1145099, 1158861, 1260619    
Attachments:
Description         Flags
engine log          none
engine log          none
engine log + debug  none

Description Petr Matyáš 2015-08-27 12:45:37 UTC
Created attachment 1067725
engine log

Description of problem:
When I stop the network service on the host machine, it stops responding in the engine and should be automatically fenced, but the stop command never succeeds.

Version-Release number of selected component (if applicable):
rhevm-3.6.0-0.12.master.el6.noarch

How reproducible:
always

Steps to Reproduce:
1. log into the host's console
2. stop network service
3. wait for host to be fenced

Actual results:
fencing fails

Expected results:
successful fencing

Additional info:
Host can be fenced manually.

Comment 1 Petr Matyáš 2015-09-02 09:30:09 UTC
Created attachment 1069314
engine log

the log is cut to relevant parts

Also, the second host is not used at all for the stop operation.

Comment 2 Petr Matyáš 2015-09-03 10:06:18 UTC
Created attachment 1069724
engine log + debug

Comment 3 Martin Perina 2015-09-03 13:08:14 UTC
I found the issue: the newest VDSM reports only cluster levels 3.4+ as supported, but the engine supports cluster levels 3.0+. This change, covered in BZ1229177, was merged later than the fencing refactoring patches (before it, VDSM also reported 3.0+), which is why we didn't find it sooner.

When we test whether a host can be used as a fencing proxy, we check whether it supports the fencing policy for the cluster using: "is the minimal supported version for the fencing policy contained in the version set reported by VDSM" (as noted above, the minimal supported version is 3.0 in the engine vs. 3.4 in VDSM). So we will have to change this condition to: "is there a version in the VDSM version set that is higher than or equal to the minimal version needed for the fencing policy".

Without this change the engine cannot fence any host in a cluster with version <= 3.4, nor hosts in a 3.5+ cluster if the options "skip fencing if connected to storage" and "skip fencing if connectivity issues" are disabled in the cluster fencing policy.
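
To illustrate the difference between the two conditions quoted above, here is a minimal sketch in Python. It is not the engine code: the function names are hypothetical, and the reported set 3.4, 3.5, 3.6 is an assumed example of what "3.4+" could look like.

```python
from typing import Iterable, Tuple

Version = Tuple[int, int]


def parse_version(text: str) -> Version:
    """Turn "3.4" into (3, 4) so versions compare numerically."""
    major, minor = text.split(".")
    return int(major), int(minor)


def supports_policy_exact(min_version: str, host_versions: Iterable[str]) -> bool:
    """Old condition (hypothetical name): the minimal version itself
    must be contained in the set reported by the host."""
    return min_version in set(host_versions)


def supports_policy_at_least(min_version: str, host_versions: Iterable[str]) -> bool:
    """Proposed condition (hypothetical name): any reported version that is
    higher than or equal to the minimal version is enough."""
    minimum = parse_version(min_version)
    return any(parse_version(v) >= minimum for v in host_versions)


# Assumed example: the host reports 3.4+ while the minimal version is 3.0.
vdsm_levels = ["3.4", "3.5", "3.6"]
print(supports_policy_exact("3.0", vdsm_levels))     # False: host rejected as proxy
print(supports_policy_at_least("3.0", vdsm_levels))  # True: host accepted as proxy
```

With the containment check, no host reporting only 3.4+ qualifies as a fencing proxy when the minimal version is 3.0, which matches the failed stop operation described in this bug.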

Comment 6 errata-xmlrpc 2016-03-09 21:12:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html