Bug 2079361 - Cluster upgrade fails with: Problem following 'gluster_volumes' link in Clusters entity.
Summary: Cluster upgrade fails with: Problem following 'gluster_volumes' link in Clust...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: RestAPI
Version: 4.5.0.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.5.2
Target Release: ---
Assignee: Ori Liel
QA Contact: Barbora Dolezalova
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-04-27 12:49 UTC by Jean-Louis Dupond
Modified: 2022-08-30 08:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-08 08:17:16 UTC
oVirt Team: Infra
Embargoed:
mperina: ovirt-4.5+


Attachments
Ansible Log (3.42 KB, text/plain)
2022-07-05 09:23 UTC, Jean-Louis Dupond
Engine log with stacktrace (21.07 KB, text/plain)
2022-07-05 09:23 UTC, Jean-Louis Dupond


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-45878 0 None None None 2022-04-27 13:01:19 UTC

Description Jean-Louis Dupond 2022-04-27 12:49:47 UTC
Description of problem:
When upgrading the cluster, the following error appears in ovirt-engine.log:

2022-04-27 14:27:55,156+02 ERROR [io.undertow.request] (default task-3) UT005023: Exception handling request to /ovirt-engine/api/v4/clusters: java.lang.RuntimeException: org.jboss.resteasy.spi.UnhandledException: java.lang.IllegalStateException: Problem following 'gluster_volumes' link in Clusters entity.

Seems to come from https://github.com/oVirt/ovirt-ansible-collection/commit/82b9bff3b580927d0a39f7262aede1aa89475e67

And the upgrade stops.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Try cluster upgrade in oVirt UI
2. Upgrade starts
3. And fails in the first stages due to the error above

Comment 1 Jean-Louis Dupond 2022-04-28 07:15:34 UTC
Did some more troubleshooting.

The root cause is 
https://github.com/oVirt/ovirt-ansible-collection/blob/82b9bff3b580927d0a39f7262aede1aa89475e67/roles/cluster_upgrade/tasks/main.yml#L43

Here the task requests that the gluster_volumes property of the clusters be followed.
But this property is NOT returned for virt-only clusters, which causes a NullPointerException.

See:
https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/restapi/jaxrs/src/main/java/org/ovirt/engine/api/restapi/resource/BackendClustersResource.java#L28

Comment 2 Martin Perina 2022-06-01 11:05:56 UTC
(In reply to Jean-Louis Dupond from comment #1)

Ori, should the NPE be handled within RESTAPI backend? Meaning ignoring follow parameters which are not relevant for the entity instance? Or does the client need to check what to add to follow?

Comment 3 Ori Liel 2022-06-13 12:56:31 UTC
Good question.

Currently if a *structurally* invalid string is provided for 'follow' - for example 'xyz' for a cluster - an IllegalStateException is thrown with a meaningful message.

In contrast, in the case of this bugzilla, the string 'gluster_volumes' is structurally valid, as there is such a sub-collection in a Cluster. The flow reaches the engine where an exception is thrown because virt-only clusters don't have gluster-volumes. 

Should we catch the exception, log it, and return the cluster without the gluster-volumes? I guess that could be nice, as long as we make sure to log it. I'm trying to think whether there's any scenario where this could hurt us. For example, what if a perfectly valid request hits some unexpected exception in the engine (something hangs, a DB problem)? Then we would return missing information where there should be information, purely because of some temporary circumstances. And yes, you'd be able to see it in the log, but you might not even know that there's something to look for there.
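
The trade-off described above can be illustrated with a minimal Python sketch. It is not engine code: `KNOWN_LINKS`, `follow_links`, and the `resolve` callback are hypothetical names, and a cluster is modeled as a plain dict. An unknown link is rejected up front, while any failure during resolution is caught and logged.

```python
import logging

logger = logging.getLogger("follow_sketch")

# Hypothetical set of sub-collections that exist on a cluster.
KNOWN_LINKS = {"gluster_volumes", "networks"}


def follow_links(cluster, links, resolve):
    """Return a copy of `cluster` with each requested link resolved.

    `resolve(cluster, link)` is a hypothetical stand-in for the engine query.
    """
    result = dict(cluster)
    for link in links:
        if link not in KNOWN_LINKS:
            # Structurally invalid link: fail loudly with a meaningful message.
            raise ValueError(f"unknown follow link: {link!r}")
        try:
            result[link] = resolve(cluster, link)
        except Exception:
            # Catch-and-log: the request succeeds, but the link is simply absent.
            logger.exception("Could not follow %r for cluster %s", link, cluster["name"])
    return result
```

With this approach, a resolver that fails because the cluster is virt-only and one that fails because of a transient database timeout produce the same response, so the caller cannot tell "not applicable" from "temporarily unavailable" without reading the log.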

Comment 4 Martin Perina 2022-06-13 13:33:24 UTC
(In reply to Ori Liel from comment #3)

Shouldn't a special case like that be handled individually wherever it appears? Meaning, can we add code to the cluster-related RESTAPI backend classes to ignore following gluster_volumes for virt-only clusters?

Comment 5 Ori Liel 2022-06-14 09:20:52 UTC
You mean handling it at the validation stage rather than in error handling, and not generically but by adding a specific check. Yes, this is possible. I have no objection to such a solution.

Comment 6 Martin Perina 2022-06-16 08:46:20 UTC
Let's fix that specifically for gluster_volumes in virt-only clusters
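
A specific check of the kind agreed on here might look like the following Python sketch. It is an illustration only and does not reproduce the actual engine change: `effective_follow` is a hypothetical helper, the cluster is modeled as a dict, and the `gluster_service` flag is assumed to indicate whether the cluster provides the gluster service.

```python
def effective_follow(cluster, requested):
    """Return the follow links that should actually be resolved for `cluster`.

    Assumption: `cluster` is a dict whose boolean `gluster_service` key tells
    whether gluster is enabled; virt-only clusters have it unset or False.
    """
    if cluster.get("gluster_service", False):
        return list(requested)
    # Virt-only cluster: skip gluster_volumes instead of failing later on.
    return [link for link in requested if link != "gluster_volumes"]
```

Because only one well-understood case is filtered, unexpected engine errors for other links would still surface as failures rather than being silently dropped.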

Comment 7 Ori Liel 2022-07-05 09:01:43 UTC
I have not been able to reproduce the problem. When I try to follow gluster_volumes for a virt-only cluster, there is no failure and there are no errors in the log. Could you please attach engine.log? I'm looking for more information about the exception thrown by the engine; I want to know exactly where it comes from. Also, it would be helpful if we could locate the HTTP request that was made in the Ansible logs. When I try to reproduce, I use:

  GET .../ovirt-engine/api/clusters?follow=gluster_volumes

Regardless, I've posted a PR which should resolve the problem: 

  https://github.com/oVirt/ovirt-engine/pull/503
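
For anyone trying to reproduce this, the GET request quoted above can be assembled as in the sketch below. The host `engine.example.com` and the helper name `build_follow_url` are placeholders, not taken from this report; joining several links with commas is an assumption about how the `follow` parameter accepts multiple values.

```python
from urllib.parse import urlencode


def build_follow_url(base_url, collection, follow):
    """Build an API URL for `collection` with a `follow` query parameter.

    `follow` is a list of link names; they are joined with commas (assumed
    multi-value syntax) and the comma is left unescaped.
    """
    query = urlencode({"follow": ",".join(follow)}, safe=",")
    return f"{base_url.rstrip('/')}/ovirt-engine/api/{collection}?{query}"


# Placeholder host; substitute the real engine address.
url = build_follow_url("https://engine.example.com", "clusters", ["gluster_volumes"])
```

The resulting URL can then be requested with any authenticated HTTP client against an engine that has at least one virt-only cluster.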

Comment 8 Jean-Louis Dupond 2022-07-05 09:23:38 UTC
Created attachment 1894635
Ansible Log

Comment 9 Jean-Louis Dupond 2022-07-05 09:23:56 UTC
Created attachment 1894636
Engine log with stacktrace

Comment 12 Sandro Bonazzola 2022-08-30 08:49:07 UTC
This bugzilla is included in oVirt 4.5.2 release, published on August 10th 2022.
Since the problem described in this bug report should be resolved in oVirt 4.5.2 release, it has been closed with a resolution of CURRENT RELEASE.
If the solution does not work for you, please open a new bug report.

