Bug 891085 (engine_as_fence_proxy) - [RFE] [engine]: Add the ability to the engine to serve as a fencing proxy
Summary: [RFE] [engine]: Add the ability to the engine to serve as a fencing proxy
Keywords:
Status: CLOSED DEFERRED
Alias: engine_as_fence_proxy
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Nobody's working on this, feel free to take it
QA Contact:
URL:
Whiteboard:
Duplicates: 1303111
Depends On:
Blocks: 1148638 1373957
 
Reported: 2013-01-01 16:21 UTC by Tareq Alayan
Modified: 2022-03-13 14:00 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Clones: 1373957
Environment:
Last Closed: 2020-04-01 14:46:31 UTC
oVirt Team: Infra
ylavi: ovirt-future?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHV-45163 | 0 | None | None | None | 2022-03-13 14:00:24 UTC

Description Tareq Alayan 2013-01-01 16:21:15 UTC
Description of problem:

The current implementation of PM proxy selection picks a host from the DC that is in 'UP' status.

This implementation is not robust enough: in some cases a host stays in non-responsive status because no proxy in 'UP' status is available in the DC.

After this is implemented, FenceProxyDefaultPreferences could look like this: RHEVM,CLUSTER,DC

In addition, add the ability for RHEVM to check whether vdsmd is running and the fence-agents package is installed on localhost (if not, we will ignore that option and continue to the next one).
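
A minimal sketch of how such a preference list could be walked, for illustration only. It is not the engine's implementation: the `Host` type, `select_fence_proxy`, and the `engine_can_fence` callback (standing in for the vdsmd / fence-agents check on localhost) are hypothetical names. Only the RHEVM,CLUSTER,DC ordering and the "skip and continue to the next option" behavior come from the request above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Host:
    name: str
    cluster: str
    dc: str
    status: str  # e.g. "UP" or "NON_RESPONSIVE"


def select_fence_proxy(
    target: Host,
    hosts: Sequence[Host],
    preferences: str = "RHEVM,CLUSTER,DC",
    engine_can_fence: Callable[[], bool] = lambda: False,
) -> Optional[str]:
    """Return the name of the first usable proxy, or None if there is none."""
    # Only hosts other than the target that are UP can act as a proxy.
    candidates = [h for h in hosts if h.name != target.name and h.status == "UP"]
    for source in (p.strip().upper() for p in preferences.split(",")):
        if source == "RHEVM":
            # Hypothetical local check: vdsmd running and fence-agents installed.
            if engine_can_fence():
                return "RHEVM"
            continue  # engine not usable: ignore it and try the next option
        if source == "CLUSTER":
            pool = [h for h in candidates
                    if h.dc == target.dc and h.cluster == target.cluster]
        elif source == "DC":
            pool = [h for h in candidates if h.dc == target.dc]
        else:
            pool = []  # unknown preference entries are skipped in this sketch
        if pool:
            return pool[0].name
    return None
```

With the engine listed first, a host can still be fenced when every other host in its cluster and DC is down, which is the case the current selection cannot handle.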


Additional info:
This issue is derived from https://bugzilla.redhat.com/show_bug.cgi?id=747305

Comment 1 Itamar Heim 2014-05-04 10:26:36 UTC
barak - iirc we did some analysis on this a version or two ago, worth documenting the findings

Comment 2 Barak 2014-09-01 13:00:49 UTC
there were several thoughts here:
- directly execute fence-agents on the engine host
  cons:
   * different code that handles the fencing entirely
   * was nacked, as this is bad behavior for an application that resides in
     an application server
- install a local vdsm on the engine host that will serve as a fencing proxy only
  cons:
    * collides with All-In-One
- create a new lightweight VDSM that will serve only fencing requests:
  cons:
    * still collides with All-In-One, or we'll have to listen on a different port
    * vdsm is still not ready in terms of modular builds


Post 3.5 this is actually possible: once you have an all-in-one-like deployment (i.e. the engine host is actually a hypervisor), you can set the proxy-selection policy to other_dc.

Comment 3 Yaniv Kaul 2015-11-25 11:29:57 UTC
(In reply to Barak from comment #2)

> - create a new light weight VDSM that will serve only fencing requests:
>   cons:
>     * still collides with All-In-One or we'll listen to a different port
>     * vdsm is not still ready in terms of modular builds

A micro service for fencing sounds like a good idea. Probably dockerized already.

Comment 4 Oved Ourfali 2015-11-25 11:35:35 UTC
(In reply to Yaniv Kaul from comment #3)
> (In reply to Barak from comment #2)
> 
> > - create a new light weight VDSM that will serve only fencing requests:
> >   cons:
> >     * still collides with All-In-One or we'll listen to a different port
> >     * vdsm is not still ready in terms of modular builds
> 
> A micro service for fencing sounds like a good idea. Probably dockerized
> already.

The entire fencing capabilities are based on VDSM and underneath on the fence agents package.

That's why lightweight VDSM was proposed here (dockerized would be nice).
Do we think/want to pursue it in 4.0? Sounds a bit premature to me.

What do you think?

Just adding that in large clusters the use-case is less important, as you'll probably find a host that is up. Unless you lost communication to all hosts in the cluster, and in that case the fencing policy prevents fencing as more than 50% of the hosts are non-responsive (configurable per-cluster).

Therefore, I'm reducing the severity to medium.
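
For illustration, the cluster-level guard mentioned above can be sketched as follows. This is an assumption-laden sketch and not the engine's code: `fencing_allowed` and its `max_non_responsive_pct` parameter are hypothetical names, the 50% default mirrors the figure quoted in this comment, and treating an empty cluster as "not allowed" is a choice made for the sketch.

```python
from typing import Sequence


def fencing_allowed(statuses: Sequence[str], max_non_responsive_pct: int = 50) -> bool:
    """Return False when more than the given percentage of hosts is non-responsive."""
    if not statuses:
        return False  # assumption: nothing to decide on in an empty cluster
    non_responsive = sum(1 for s in statuses if s == "NON_RESPONSIVE")
    # Integer comparison avoids floating-point rounding at the threshold.
    return non_responsive * 100 <= max_non_responsive_pct * len(statuses)
```

When all hosts in a cluster are unreachable, this check fails no matter which proxy is available, so an engine-side proxy would not help in that case.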

Comment 5 Yaniv Kaul 2015-11-25 11:45:32 UTC
(In reply to Oved Ourfali from comment #4)
> (In reply to Yaniv Kaul from comment #3)
> > (In reply to Barak from comment #2)
> > 
> > > - create a new light weight VDSM that will serve only fencing requests:
> > >   cons:
> > >     * still collides with All-In-One or we'll listen to a different port
> > >     * vdsm is not still ready in terms of modular builds
> > 
> > A micro service for fencing sounds like a good idea. Probably dockerized
> > already.
> 
> The entire fencing capabilities are based on VDSM and underneath on the
> fence agents package.
> 
> That's why lightweight VDSM was proposed here (dockerized would be nice).
> Do we think/want to pursue it in 4.0? Sounds a bit premature for me.
> 
> What do you think?

We do want to begin splitting VDSM into micro-services. This one seems like a good candidate, since it's pretty much isolated and should not require any privileges or changes to the docker image / selinux / ...

4.0 - may or may not be. It's certainly not a hot item for 4.0. We just need to begin the container story somewhere.
 
> 
> Just adding that in large clusters the use-case is less important, as you'll
> probably find a host that is up. Unless you lost communication to all hosts
> in the cluster, and in that case the fencing policy prevents fencing as more
> than 50% of the hosts are non-responsive (configurable per-cluster).
> 
> Therefore, I'm reducing the severity to medium.
Indeed.

Comment 6 Oved Ourfali 2016-02-04 13:28:50 UTC
*** Bug 1303111 has been marked as a duplicate of this bug. ***

Comment 9 Michal Skrivanek 2020-03-19 15:42:27 UTC
We didn't get to this bug for more than 2 years, and it's not being considered for the upcoming 4.4. It's unlikely that it will ever be addressed so I'm suggesting to close it.
If you feel this needs to be addressed and want to work on it please remove cond nack and target accordingly.

Comment 10 Michal Skrivanek 2020-04-01 14:46:31 UTC
ok, closing. Please reopen if still relevant/you want to work on it.

Comment 11 Michal Skrivanek 2020-04-01 14:50:26 UTC
ok, closing. Please reopen if still relevant/you want to work on it.

