Bug 1044089

Summary: Allow manual fence in connecting state
Product: Red Hat Enterprise Virtualization Manager Reporter: Lee Yarwood <lyarwood>
Component: ovirt-engine Assignee: Eli Mesika <emesika>
Status: CLOSED ERRATA QA Contact: Tareq Alayan <talayan>
Severity: high Docs Contact:
Priority: high    
Version: 3.2.0 CC: aberezin, acathrow, amureini, bazulay, emesika, flo_bugzilla, iheim, jentrena, lpeer, lyarwood, pstehlik, Rhev-m-bugs, sputhenp, tpoitras, yeylon
Target Milestone: --- Keywords: Reopened, ZStream
Target Release: 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: infra
Fixed In Version: ovirt-3.4.0-beta3 Doc Type: Bug Fix
Doc Text:
Previously, a full host power outage resulted in a nine-minute reconnection time before manual SPM relocation could be performed. Now, a host in the connecting state can be manually fenced.
Story Points: ---
Clone Of:
: 1066400 (view as bug list) Environment:
Last Closed: 2014-06-09 15:07:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1044088, 1066400, 1078909, 1142926    

Description Lee Yarwood 2013-12-17 19:15:58 UTC
Description of problem:
In the event of a full host power outage (including fence devices), a user must wait 9 minutes (3 x 3-minute timeouts) before they can manually fence the host to relocate the SPM.
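
The arithmetic behind the 9-minute figure can be sketched as follows. This only illustrates the numbers quoted above; the names `TIMEOUT` and `ATTEMPTS` are hypothetical and are not engine configuration keys.

```python
from datetime import timedelta

# Values taken from the description above: 3 timeouts of 3 minutes each.
# The constant names are hypothetical, not real ovirt-engine settings.
TIMEOUT = timedelta(minutes=3)
ATTEMPTS = 3


def wait_before_manual_fence(timeout: timedelta = TIMEOUT,
                             attempts: int = ATTEMPTS) -> timedelta:
    """Total wait when every attempt must time out one after another."""
    return timeout * attempts


if __name__ == "__main__":
    total = wait_before_manual_fence()
    print(f"wait before manual fence: {total.total_seconds() / 60:.0f} minutes")
```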

Version-Release number of selected component (if applicable):
rhevm-3.2.3-0.43.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Remove all power to an active SPM, including any fence agents that are configured.
2. Attempt to manually fence the SPM to relocate the role.

Actual results:
The role can only be relocated once the host has left the 'connecting' state.

Expected results:
The role can be relocated while the host is still marked as 'connecting' if the user confirms the host is down.
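
The requested behaviour can be sketched as a small status check. This is a hypothetical model, not the ovirt-engine code: `HostStatus`, `can_manually_fence`, and both status sets are names invented for illustration, and the assumption that the action was previously limited to a non-responsive host is mine rather than something stated in this report.

```python
from enum import Enum


class HostStatus(Enum):
    """Hypothetical subset of host states, for illustration only."""
    UP = "up"
    CONNECTING = "connecting"
    NON_RESPONSIVE = "non_responsive"


# Assumption: before the change, manual fence was only offered once the
# host had left 'connecting' (modelled here as NON_RESPONSIVE).
ALLOWED_BEFORE = frozenset({HostStatus.NON_RESPONSIVE})
# Requested change: also allow it while the host is still 'connecting'.
ALLOWED_AFTER = ALLOWED_BEFORE | {HostStatus.CONNECTING}


def can_manually_fence(status: HostStatus,
                       user_confirmed_down: bool,
                       allowed: frozenset = ALLOWED_AFTER) -> bool:
    """Allow manual fence only if the user confirmed the host is down
    and the host is in one of the allowed states."""
    return user_confirmed_down and status in allowed


if __name__ == "__main__":
    print(can_manually_fence(HostStatus.CONNECTING, True, ALLOWED_BEFORE))
    print(can_manually_fence(HostStatus.CONNECTING, True, ALLOWED_AFTER))
```

In this model the user's confirmation stays mandatory in both cases; only the set of eligible host states grows.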

Additional info:

Comment 3 Itamar Heim 2013-12-17 21:44:19 UTC
lee - is this still true with bug 863211 fixed for 3.3?
ayal - why do we need to wait for fencing with spm being based on sanlock?

Comment 4 Ayal Baron 2013-12-17 21:48:35 UTC
(In reply to Itamar Heim from comment #3)
> lee - is this still true with bug 863211 fixed for 3.3?
> ayal - why do we need to wait for fencing with spm being based on sanlock?

I don't see what sanlock has to do with it.  It should be just as safe with the old locking mechanism.

I have no idea why we wait

Comment 5 Lee Yarwood 2013-12-17 23:13:04 UTC
(In reply to Itamar Heim from comment #3)
> lee - is this still true with bug 863211 fixed for 3.3?

Thanks Itamar, that looks promising but I'll need to verify. Setting needinfo as a reminder for the morning.

Comment 6 Lee Yarwood 2014-01-07 16:47:23 UTC
(In reply to Lee Yarwood from comment #5)
> (In reply to Itamar Heim from comment #3)
> > lee - is this still true with bug 863211 fixed for 3.3?
> 
> Thanks Itamar, that looks promising but I'll need to verify. Setting
> needinfo as a reminder for the morning.

Testing this shows a drastically reduced time for the SPM to fail over in the event of a complete power outage. I'm going to close this out as a dup of 863211.

Thanks,

Lee

*** This bug has been marked as a duplicate of bug 863211 ***

Comment 8 Eli Mesika 2014-02-09 13:00:58 UTC
Changing the BZ title according to comment 7 and re-assigning the BZ.
We will support manual fence in connecting state.

Comment 9 Eli Mesika 2014-02-09 13:01:48 UTC
Arthur, please approve.

Comment 10 Arthur Berezin 2014-02-09 13:46:54 UTC
If this doesn't break any existing flows - ACK

Comment 13 Sandro Bonazzola 2014-02-19 12:27:55 UTC
This bug is referenced in ovirt-engine-3.4.0-beta3 logs. Moving to ON_QA

Comment 15 Tareq Alayan 2014-02-20 12:23:27 UTC
tested on ovirt-engine-3.4.0-0.11.beta3.el6.noarch


1. Put the host in connecting state by running: iptables -D INPUT -p tcp --dport 54321 -j ACCEPT
2. Host has an unreachable PM
3. Host state is now connecting and there are attempts to check the host status
4. /etc/init.d/iptables restart
5. Right-click and confirm host is rebooted
Result: host came up immediately


Can I move this to verified?

Comment 16 Tareq Alayan 2014-02-20 12:24:14 UTC
see comment 15

Comment 17 Eli Mesika 2014-02-20 12:27:56 UTC
(In reply to Tareq Alayan from comment #15)
> tested on ovirt-engine-3.4.0-0.11.beta3.el6.noarch
> 
> 
> 1. Put the host in connecting state by running: iptables -D INPUT -p tcp
> --dport 54321 -j ACCEPT
> 2. Host has an unreachable PM
> 3. Host state is now connecting and there are attempts to check the host status
> 4. /etc/init.d/iptables restart
> 5. Right-click and confirm host is rebooted
> Result: host came up immediately
> 
> 
> Can I move this to verified?

Yes

Comment 18 Tareq Alayan 2014-02-20 12:29:00 UTC
verified per comment 17

Comment 19 errata-xmlrpc 2014-06-09 15:07:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0506.html