Bug 1393902

Summary: [TRACKER] Clarify and improve the hosted engine related storage flows to avoid compromising high availability
Product: [oVirt] ovirt-distribution Reporter: Martin Sivák <msivak>
Component: Trackers    Assignee: Allon Mureinik <amureini>
Status: CLOSED CURRENTRELEASE QA Contact: Nikolai Sednev <nsednev>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.0.5    CC: bugs, ebenahar, gveitmic, mavital, mkalinin, nsoffer, ratamir, stirabos, ylavi
Target Milestone: ovirt-4.2.0    Keywords: Tracking, Triaged
Target Release: 4.2.2    Flags: rule-engine: ovirt-4.2+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-04 10:43:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1315074, 1351213, 1455169    
Bug Blocks: 1294348, 1299427, 1337914, 1373930, 1378310, 1387085, 1411133    

Description Martin Sivák 2016-11-10 15:11:56 UTC
I am opening this tracker to be able to properly prioritize the storage team involvement in the improvements for hosted engine storage flow.

The text of the original email follows:

We need only two basic operations from the storage subsystem for agent
operation (setup is more complicated, but I want to ignore that aspect
for now):

- connect and initialize of hosted engine storage domain + all
necessary symlinks
- get and read the OVF store

Now we have a couple of constraints:

- we need to make sure the disks are accessible and reconnect
immediately when they are not (or we lose synchronization)

- no network connection issues at all (except total loss of the hosted
engine domain) can cause the vdsm call to block indefinitely (or
longer than about a minute) or the agent is blocked and we lose
synchronization
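The "never block longer than about a minute" constraint amounts to putting a hard deadline around every storage call the agent makes; a minimal sketch of such a wrapper (a hypothetical helper, not existing agent code):

```python
import concurrent.futures

def call_with_deadline(fn, *args, timeout=60, **kwargs):
    """Run a potentially blocking storage call in a worker thread and
    give up after `timeout` seconds instead of blocking the agent.
    The worker thread may linger until the underlying call returns,
    but the monitoring loop itself stays responsive."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, *args, **kwargs).result(timeout=timeout)
    finally:
        # Do not wait for the (possibly stuck) worker thread.
        pool.shutdown(wait=False)
```

Note this only caps how long the agent waits; the stuck I/O itself is not cancelled, which is why the underlying vdsm verbs still matter.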


We are making sure everything is connected by making sure domain
monitor is up and calling the connectStorageServer and prepareImage
verbs before every operation cycle, but that is causing #1337914. Is
there any other (atomic) way? Can getVolumesList or prepareImage block
when the network goes down?

We call getStorageDomainStats as it has (according to our code comments)
the side effect of populating/refreshing the /rhev tree. Do we need it
to be able to call prepareImages? Can it block when some other domain
is stuck?

The OVF code calls getImagesList to find the OVF disk, is it possible
this will block when the storage goes down?

Is there an atomic way to make sure the domain monitor is up (that
does not poll repoStats)? Can repoStats block when some domain goes
down? (like when an unrelated NFS based ISO domain becomes
inaccessible)


Those are the reoccurring issues we are facing in bugs like:

https://bugzilla.redhat.com/1337914
https://bugzilla.redhat.com/1378310
https://bugzilla.redhat.com/1294348


We would appreciate it very much if the storage team took a more
active role here and either created a maintained API for us or assumed
the ownership of the hosted engine storage interactions. Otherwise we
are always chasing a moving target...

Comment 1 Yaniv Kaul 2016-11-23 13:08:50 UTC
Allon - please assign someone (or yourself) for this task, so we can make a noticeable improvement in this area in 4.2.

Comment 3 Allon Mureinik 2017-05-17 21:52:58 UTC
Echoing the email thread here too for better tracking

(In reply to Martin Sivák from comment #0)
> - connect and initialize of hosted engine storage domain + all
> necessary symlinks
The entire thing is backwards.
We need to create a domain (this does not, or at least should not, require an SPM), and then inject it into the engine's database (via cloudinit or something like that) instead of the convoluted registration thingamajig
Nir - please keep me honest here.

> - get and read the OVF store
The API *today* is in the engine, but in a somewhat twisted way - VDSM has all the information, but engine has all the logic. I wonder if this logic can be pushed down to VDSM, or even another micro-service.
Nir, your two cents?

> - we need to make sure the disks are accessible and reconnect
> immediately when they are not (or we lose synchronization)
I don't even understand what this means.
Care to rephrase?

> - no network connection issues at all (except total loss of the hosted
> engine domain) can cause the vdsm call to block indefinitely (or
> longer than about a minute) or the agent is blocked and we lose
> synchronization

NFS can hang. HE should use a lease device and fence the VM if the lease cannot be retained.

Seems like anything related to HE monitoring should just be scrapped and replaced with a VM lease. The HE agent can activate the lease volume and query its state (may need a couple of new APIs, not sure if we have these capabilities ATM, but it shouldn't be too hard to add)
Nir - please keep me honest here.

> 
> 
> We are making sure everything is connected by making sure domain
> monitor is up and calling the connectStorageServer and prepareImage
> verbs before every operation cycle, but that is causing #1337914. Is
> there any other (atomic) way? Can getVolumesList or prepareImage block
> when the network goes down?
What does this have to do with monitoring?
Just use the lease.

> We call getStorageDomainStats as it has (according our code comments)
> the side effect of populating/refreshing the /rhev tree. Do we need it
> to be able to call prepareImages? Can it block when some other domain
> is stuck?
Calling something to get a side effect is clearly a bad idea.
What is the *functional* requirement here?

> The OVF code calls getImagesList to find the OVF disk, is it possible
> this will block when the storage goes down?
Definitely - see above.
Why do you even need this call?

> Is there an atomic way to make sure the domain monitor is up (that
> does not poll repoStats)? Can repoStats block when some domain goes
> down? (like when an unrelated NFS based ISO domain becomes
> inaccessible)
Again, what is the functional requirement?
I understand leases weren't available when HE was first designed, but they are now, and we should migrate to use them.

> We would appreciate it very much if the storage team took a more
> active role here and either created a maintained API for us or assumed
> the ownership of the hosted engine storage interactions. Otherwise we
> are always chasing a moving target...
We aren't creating APIs or taking ownership of anything we did not participate in designing.
The design above seems [sort-of] sane. Assuming this stands up to the team's design review, you can go forth and start implementing it with the standard APIs.

Comment 4 Martin Sivák 2017-05-18 08:30:57 UTC
> > - connect and initialize of hosted engine storage domain + all
> > necessary symlinks
> The entire thing is backwards.
> We need to create a domain (this does not, or at least should not, require
> an SPM), and then inject it to the engine's database (via cloudinit or
> something like that) instead of the convoluted registration thing-of-a-jig
> Nir - please keep me honest here.

But the engine DB does not exist yet when we are creating the supporting domains. Or in case of total datacenter restart - the engine DB is not running.

> > - get and read the OVF store
> The API *today* is in the engine, but in a somewhat twisted way - VDSM has
> all the information, but engine has all the logic. I wonder if this logic
> can be pushed down to VDSM, or even another micro-service.
> Nir, your two cents?

Hosted engine tooling on the host needs access to the OVF or the new XML Francesco is working on (to start the engine VM) + to another volume we use for shared configuration (we have just a raw TAR there without a filesystem).

> > - we need to make sure the disks are accessible and reconnect
> > immediately when they are not (or we lose synchronization)
> I don't even understand what this means.
> Care to rephrase?

When the engine volume (or domain), hosted engine sanlock volume or hosted engine synchronization volume goes offline, we need to reconnect as soon as possible.

> > - no network connection issues at all (except total loss of the hosted
> > engine domain) can cause the vdsm call to block indefinitely (or
> > longer than about a minute) or the agent is blocked and we lose
> > synchronization
> 
> NFS can hang. HE should use a lease device and fence the VM if the lease
> cannot be retained.

This is not about the VM. There we can't do much (we are already using a lease). But we also use a volume to synchronize the hosted engine agents between themselves. And we are making sure that volume is up and running... but the current way to get the status is too fragile, and when some unrelated NFS domain stalls (like the ISO domain we do not use at all) the whole thing can freeze.

> 
> Seems like anything related to HE monitoring should just be scrapped and
> replaced with a VM lease. HE agent can activate the lease volume and query
> its state (may need a couple of new APIs, not sure if have these
> capabilities ATM, but it shouldn't be too hard to add)
> Nir - please keep me honest here.

Again, this is not just about the VM; we already have a lease for it (the old one - VM lease). But we also monitor the hosted engine agent's volumes (we have a sanlock lockspace for hosted engine node IDs, a synchronization whiteboard volume and a shared configuration volume).

> > We are making sure everything is connected by making sure domain
> > monitor is up and calling the connectStorageServer and prepareImage
> > verbs before every operation cycle, but that is causing #1337914. Is
> > there any other (atomic) way? Can getVolumesList or prepareImage block
> > when the network goes down?
> What does this have to do with monitoring?
> Just use the lease.

See above. We have more volumes than just the one engine VM uses.

> 
> > We call getStorageDomainStats as it has (according our code comments)
> > the side effect of populating/refreshing the /rhev tree. Do we need it
> > to be able to call prepareImages? Can it block when some other domain
> > is stuck?
> Calling something to get a side effect is clearly a bad idea.
> What is the *functional* requirement here?

Fully populated symlinks to all volumes we need.

> > The OVF code calls getImagesList to find the OVF disk, is it possible
> > this will block when the storage goes down?
> > Definitely - see above.
> Why do you even need this call?

Well, we need to find the OVF volume somehow... and I guess this was the only way we were able to do that.
 
> > Is there an atomic way to make sure the domain monitor is up (that
> > does not poll repoStats)? Can repoStats block when some domain goes
> > down? (like when an unrelated NFS based ISO domain becomes
> > inaccessible)
> Again, what is the functional requirement?
> I understand leases weren't available when HE was first designed, but they
> are now, and we should migrate to use them.

And again, we use more volumes than just the VM disk. And some of them are needed even when the VM is down so leases do not help us here.

> > We would appreciate it very much if the storage team took a more
> > active role here and either created a maintained API for us or assumed
> > the ownership of the hosted engine storage interactions. Otherwise we
> > are always chasing a moving target...
> We aren't creating APIs or taking ownership for anything we did not
> participate in designing.
> The design above seems [sort-of] sane. Assuming this will stand up to the
> team's design review, you can go forth and start implementing it with the
> standard APIs.

There are no standard APIs, nor is there proper documentation for them. "Read the engine code" is not an answer we can accept.

We can't fix ancient management mistakes, but hosted engine involves storage and has been with us since 3.3. It may be a good time for you to finally take a look at it.

Comment 5 Yaniv Kaul 2017-05-18 11:17:24 UTC
Nir, can you reply on questions on comment 3 ?

Comment 6 Allon Mureinik 2017-05-18 15:57:20 UTC
(In reply to Martin Sivák from comment #4)
> > > - connect and initialize of hosted engine storage domain + all
> > > necessary symlinks
> > The entire thing is backwards.
> > We need to create a domain (this does not, or at least should not, require
> > an SPM), and then inject it to the engine's database (via cloudinit or
> > something like that) instead of the convoluted registration thing-of-a-jig
> > Nir - please keep me honest here.
> 
> But the engine DB does not exist yet when we are creating the supporting
> domains. Or in case of total datacenter restart - the engine DB is not
> running.
You use VDSM's API to create a domain and a pool on top of it, and pass the info to the VM (via cloudinit, or whatever).
Once it brings the db up, you inject the aforementioned info into it.
What am I missing?

> 
> > > - get and read the OVF store
> > The API *today* is in the engine, but in a somewhat twisted way - VDSM has
> > all the information, but engine has all the logic. I wonder if this logic
> > can be pushed down to VDSM, or even another micro-service.
> > Nir, your two cents?
> 
> Hosted engine tooling on the host needs access to the OVF or the new XML
> Francesco is working on (to start the engine VM) + to another volume we use
> for shared configuration (we have just raw TAR there without filesystem.
We already have a reasonable API to get the contents of a volume (via prepare + imageio), which should cover this. I wonder if given the importance of the HE VM, it's worth creating a dedicated volume (similar to the OVF store) just for it.
It "wastes" a volume, but it's probably safer (as this VM is rarely updated), and faster (as it contains just this VM).

> 
> > > - we need to make sure the disks are accessible and reconnect
> > > immediately when they are not (or we lose synchronization)
> > I don't even understand what this means.
> > Care to rephrase?
> 
> When the engine volume (or domain), hosted engine sanlock volume or hosted
> engine synchronization volume goes offline, we need to reconnect as soon as
> possible.
I don't understand this sentence at all. Can you please elaborate?

> 
> > > - no network connection issues at all (except total loss of the hosted
> > > engine domain) can cause the vdsm call to block indefinitely (or
> > > longer than about a minute) or the agent is blocked and we lose
> > > synchronization
> > 
> > NFS can hang. HE should use a lease device and fence the VM if the lease
> > cannot be retained.
> 
> This is not about VM. There we can't do much (we are using lease). But we
> also use a volume to synchronize the hosted engine agents between
> themselves. And we are making sure the volume is up and running.. but the
> current way to get the status is too fragile and when some unrelated NFS
> domain stalls (like the ISO domain we do not use at all) the whole thing can
> freeze.
Again, I don't understand this sentence.
If the domain is active (per vdsm's domain monitoring), how can a volume suddenly go offline?

> 
> > 
> > Seems like anything related to HE monitoring should just be scrapped and
> > replaced with a VM lease. HE agent can activate the lease volume and query
> > its state (may need a couple of new APIs, not sure if have these
> > capabilities ATM, but it shouldn't be too hard to add)
> > Nir - please keep me honest here.
> 
> Again, this is not just about the VM, we already have lease for it (the old
> one - VM lease). But we also monitor the hosted engine agent's volumes (we
> have sanlock lockspace for hosted engine node IDs, synchronization
> whiteboard volume and shared configuration volume).
Same comment as above. One of us (probably me :-)) is missing something very basic here.

> 
> > > We are making sure everything is connected by making sure domain
> > > monitor is up and calling the connectStorageServer and prepareImage
> > > verbs before every operation cycle, but that is causing #1337914. Is
> > > there any other (atomic) way? Can getVolumesList or prepareImage block
> > > when the network goes down?
> > What does this have to do with monitoring?
> > Just use the lease.
> 
> See above. We have more volumes than just the one engine VM uses.
see above ;-)

> 
> > 
> > > We call getStorageDomainStats as it has (according our code comments)
> > > the side effect of populating/refreshing the /rhev tree. Do we need it
> > > to be able to call prepareImages? Can it block when some other domain
> > > is stuck?
> > Calling something to get a side effect is clearly a bad idea.
> > What is the *functional* requirement here?
> 
> Fully populated symlinks to all volumes we need.
see above ;-)

> 
> > > The OVF code calls getImagesList to find the OVF disk, is it possible
> > > this will block when the storage goes down?
> > Definitely - see above.
> > Why do you even need this call?
> 
> Well, we need to find the OVF volume somehow... and I guess this was the
> only way we were able to do that.
I think the way around this is just to have another dedicated volume[s] for HE's special stuff.
Having to find them with getImagesList is probably too fragile, as you described above. I think that if we preemptively use VDSM to create a domain and a pool, we can go ahead and create all of these too. Once they are created, we can get the IDs back and register them somewhere (in HE's conf somehow?), so whenever HE needs to access them, it targets them directly (prepare + download/upload), which reduces the risk of something breaking.
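Reading such registered IDs back could be as simple as parsing a flat key=value file; a sketch (the parsing rules, and the idea that hosted-engine.conf keeps this format, are assumptions, not the actual ovirt-ha-agent parser):

```python
def read_he_conf(path):
    """Parse a flat key=value config file in the style of
    /etc/ovirt-hosted-engine/hosted-engine.conf (format assumed:
    one key=value pair per line, '#' starts a comment)."""
    conf = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blanks, comments, and anything without a separator.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            conf[key.strip()] = value.strip()
    return conf
```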

>  
> > > Is there an atomic way to make sure the domain monitor is up (that
> > > does not poll repoStats)? Can repoStats block when some domain goes
> > > down? (like when an unrelated NFS based ISO domain becomes
> > > inaccessible)
> > Again, what is the functional requirement?
> > I understand leases weren't available when HE was first designed, but they
> > are now, and we should migrate to use them.
> 
> And again, we use more volumes than just the VM disk. And some of them are
> needed even when the VM is down so leases do not help us here.
> 
> > > We would appreciate it very much if the storage team took a more
> > > active role here and either created a maintained API for us or assumed
> > > the ownership of the hosted engine storage interactions. Otherwise we
> > > are always chasing a moving target...
> > We aren't creating APIs or taking ownership for anything we did not
> > participate in designing.
> > The design above seems [sort-of] sane. Assuming this will stand up to the
> > team's design review, you can go forth and start implementing it with the
> > standard APIs.
> 
> There are no standard APIs, nor there is proper documentation for those.
> "Read engine code" is not the answer we can accept.
> 
> We can't fix ancient management mistakes, but hosted engine involves storage
> and has been with us since 3.3. It may be a good time you finally took a
> look at it.
As noted, you will be given a design you should be able to implement. If APIs we decide we need are missing, they will be created.

Comment 7 Allon Mureinik 2017-05-18 16:12:34 UTC
(In reply to Allon Mureinik from comment #6)
> > Hosted engine tooling on the host needs access to the OVF or the new XML
> > Francesco is working on (to start the engine VM) + to another volume we use
> > for shared configuration (we have just raw TAR there without filesystem.
> We already have a reasonable API to get the contents of a volume (via
> prepare + imageio), which should cover this. I wonder if given the
> importance of the HE VM, it's worth creating a dedicated volume (similar to
> the OVF store) just for it.
> It "wastes" a volume, but it's probably safer (as this VM is rarely
> updated), and faster (as it contains just this VM).
Actually, I wonder if we should just add these special volumes to the domain format, and thus make sure they are part of any domain upon creation/upgrade.
Upgrading from v4 to v5 will be an issue here, but it may be worth thinking about.

Comment 8 Nir Soffer 2017-05-22 09:04:35 UTC
(In reply to Allon Mureinik from comment #3)
> Echoing email thread here too for better tracking
> 
> (In reply to Martin Sivák from comment #0)
> > - connect and initialize of hosted engine storage domain + all
> > necessary symlinks
> The entire thing is backwards.
> We need to create a domain (this does not, or at least should not, require
> an SPM), and then inject it to the engine's database (via cloudinit or
> something like that) instead of the convoluted registration thing-of-a-jig
> Nir - please keep me honest here.

I suggested something like this in 3.6 when we added the import storage domain
feature. However, the way hosted engine is bootstrapped is not the issue in this
bug.
 
> > - get and read the OVF store
> The API *today* is in the engine, but in a somewhat twisted way - VDSM has
> all the information, but engine has all the logic. I wonder if this logic
> can be pushed down to VDSM, or even another micro-service.
> Nir, your two cents?

We are abusing the OVF store for communicating VM configuration from engine to hosted
engine. The OVF store is a mechanism for disaster recovery, not for communication.

We should design a proper way to pass data between engine and hosted engine agents
instead.

This will eliminate the issue of looking up the OVF store images and using them
to extract the info. These operations can always block, and vdsm cannot help
with this.
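The lookup-and-extract step being discussed boils down to reading one member out of a tar payload once the OVF store volume is prepared; a sketch, under the assumption that the OVF_STORE payload is a plain tar archive with one `<vm_id>.ovf` member per VM:

```python
import io
import tarfile

def extract_vm_ovf(ovf_store_bytes, vm_id):
    """Return the OVF text for vm_id from an OVF_STORE payload,
    assumed to be a tar archive with one '<vm_id>.ovf' member per VM,
    or None if the VM is not present."""
    with tarfile.open(fileobj=io.BytesIO(ovf_store_bytes)) as tar:
        try:
            member = tar.extractfile(vm_id + ".ovf")
        except KeyError:
            # No member with that name in the archive.
            return None
        return member.read().decode("utf-8") if member else None
```

Note the blocking risk lives in reading the volume into `ovf_store_bytes`, not in this parsing step.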

> > - we need to make sure the disks are accessible and reconnect
> > immediately when they are not (or we lose synchronization)
> I don't even understand what this means.
> Care to rephrase?

I already answered that in
https://bugzilla.redhat.com/show_bug.cgi?id=1337914#c18

> > - no network connection issues at all (except total loss of the hosted
> > engine domain) can cause the vdsm call to block indefinitely (or
> > longer than about a minute) or the agent is blocked and we lose
> > synchronization
> 
> NFS can hang. HE should use a lease device and fence the VM if the lease
> cannot be retained.

Vdsm will typically time out when using NFS (ioprocess has a 60 second timeout).
On block storage vdsm will typically fail faster due to recent multipath and
iSCSI fixes in RHEL 7.2.

Some operations may require a SCSI scan, which can block for up to 30 seconds, and
LVM refreshes, which can take time.

Some storage domain operations can be blocked while system is doing scsi scan
or another thread holds a global lock.

The HE agent must be prepared for timeouts and blocking calls whenever storage is
accessed.

> Seems like anything related to HE monitoring should just be scrapped and
> replaced with a VM lease. HE agent can activate the lease volume and query
> its state (may need a couple of new APIs, not sure if have these
> capabilities ATM, but it shouldn't be too hard to add)
> Nir - please keep me honest here.

We should switch to a VM lease instead of a volume lease so we can enable all the
features which are blocked today (snapshots, live storage migration, etc.), but
I don't see how this is related to monitoring.

For monitoring storage state the HE agent should use Host.getStorageRepoStats.
This API was designed for monitoring. It returns data cached within the last 10
seconds and never blocks, even if all storage is inaccessible.
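A health check on top of that API then reduces to inspecting the cached payload; a minimal sketch (the field names `valid`, `code` and `lastCheck` follow typical repoStats output, but treat them as assumptions):

```python
def domain_ok(repo_stats, sd_uuid, max_last_check=30.0):
    """Decide whether a storage domain looks healthy from a
    repoStats-style payload. Field names are assumed (see above);
    the check itself never touches storage, so it cannot block."""
    stats = repo_stats.get(sd_uuid)
    if stats is None:
        return False  # no domain monitor running for this domain
    return (
        stats.get("valid", False)
        and stats.get("code", 1) == 0
        and float(stats.get("lastCheck", "inf")) <= max_last_check
    )
```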

> > We are making sure everything is connected by making sure domain
> > monitor is up and calling the connectStorageServer and prepareImage
> > verbs before every operation cycle, 

I already answered that in
https://bugzilla.redhat.com/show_bug.cgi?id=1337914#c18

This is a very bad idea: these APIs drop all caches, causing needless
LVM operations to refresh the dropped caches.

> > but that is causing #1337914. Is
> > there any other (atomic) way? Can getVolumesList or prepareImage block
> > when the network goes down?
> What does this have to do with monitoring?
> Just use the lease.

Anything accessing storage can and will block, this is why we have 
Host.getStorageRepoStats.

> > Is there an atomic way to make sure the domain monitor is up (that
> > does not poll repoStats)? 

Why do you want to avoid polling repoStats? This is the API designed for getting
the status of the domain monitor.

> > Can repoStats block when some domain goes
> > down? (like when an unrelated NFS based ISO domain becomes
> > inaccessible)

No.

If you can reproduce this, file a bug and we will fix it.

> > We would appreciate it very much if the storage team took a more
> > active role here and either created a maintained API for us or assumed
> > the ownership of the hosted engine storage interactions. Otherwise we
> > are always chasing a moving target...
> We aren't creating APIs or taking ownership for anything we did not
> participate in designing.
> The design above seems [sort-of] sane. Assuming this will stand up to the
> team's design review, you can go forth and start implementing it with the
> standard APIs.

The storage team is not aware of the hosted engine flows and design. If you want
help, we need to understand the design, the various flows, and how they are
implemented.

I suspect that most of the stuff the HE agent is doing was implemented in engine
years ago and has been battle tested for 8 years. The best way would be to learn
how engine does this stuff.

Comment 9 Simone Tiraboschi 2017-09-19 14:38:41 UTC
We should ensure that https://bugzilla.redhat.com/show_bug.cgi?id=1351213 is addressed as well.
Requiring a dedicated target for the hosted-engine storage domain is too severe a requirement.

Comment 10 Martin Sivák 2017-11-23 16:50:31 UTC
We can probably almost close this, as there is just a single major task ahead of us once Node 0 is finished:

Passing info from engine to hosted engine

- VM configuration incl. all disks
- storage domain connection info to prepare the host for running the VM

This method must be able to bootstrap the environment in case of total power failure of the data center. Meaning it must keep storage domain connection data on all HE hosts.

Well, and one other thing: we will still need to make sure the hosted engine storage domains are not disconnected from the engine.

Comment 11 Simone Tiraboschi 2017-11-23 17:27:33 UTC
(In reply to Martin Sivák from comment #10)
> Passing info from engine to hosted engine
> 
> - VM configuration incl. all disks
> - storage domain connection info to prepare the host for running the VM

The ansible playbook is already fetching disks and storage domain connection details from the engine and writing /etc/ovirt-hosted-engine/hosted-engine.conf with the right values.
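For context, that file is a flat list of key=value pairs; a purely illustrative fragment (key names and values are invented for this example, not taken from a real deployment):

```ini
# /etc/ovirt-hosted-engine/hosted-engine.conf (illustrative fragment)
sdUUID=11111111-2222-3333-4444-555555555555
domainType=nfs
storage=192.0.2.10:/exports/hosted_engine
vm_disk_id=66666666-7777-8888-9999-000000000000
```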

Comment 12 Martin Sivák 2017-11-24 10:07:48 UTC
But only during setup time. If somebody adds a disk to the HE VM we might need to update the configuration again.

Comment 13 Simone Tiraboschi 2017-11-24 13:15:48 UTC
(In reply to Martin Sivák from comment #12)
> But only during setup time. If somebody adds a disk to the HE VM we might
> need to update the configuration again.

I don't think so.
The definition of the new disk should also be in the OVF_STORE, and ovirt-ha-agent is going to take the VM definition from there.
AFAIK vdsm is already going to prepare all the required disks by itself, while it's completely up to ovirt-ha-agent/broker to prepare the internally used volumes like the metadata, configuration and lockspace ones.

Comment 14 Martin Sivák 2017-11-24 13:46:45 UTC
And if somebody extends the storage domain or adds a disk from a different storage domain? The agent only handles disks that reside on already connected storage domains.

Comment 15 Raz Tamir 2017-12-31 08:40:05 UTC
Yaniv,

You moved this bug to ON_QA from NEW.

Please explain what needs to be tested here?
The bug says that the storage-related flows should be improved; I'm not sure what that means.

Comment 16 Yaniv Lavi 2017-12-31 10:13:38 UTC
(In reply to Raz Tamir from comment #15)
> Yaniv,
> 
> You moved this bug to ON_QA from NEW.
> 
> Please explain what needs to be tested here?
> The bug says that the storage-related flows should be improved not sure what
> that means.

It's a tracker; once the dependent tickets are VERIFIED, move this one as well.
There is no specific action item for it.

Comment 17 Nikolai Sednev 2018-05-03 17:05:11 UTC
(In reply to Yaniv Lavi from comment #16)
> (In reply to Raz Tamir from comment #15)
> > Yaniv,
> > 
> > You moved this bug to ON_QA from NEW.
> > 
> > Please explain what needs to be tested here?
> > The bug says that the storage-related flows should be improved not sure what
> > that means.
> 
> It's a tracker, once the dependent ticket are VERIFIED, move this one as
> well.
> One specific action item to it.

Moving to verified per the previous comment.
Works for me on these components:
ovirt-hosted-engine-setup-2.2.20-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.11-1.el7ev.noarch
rhvm-appliance-4.2-20180427.0.el7.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Comment 18 Sandro Bonazzola 2018-05-04 10:43:37 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.