Description of problem:
When upgrading the cluster, the following error is given in the ovirt-engine.log.

2022-04-27 14:27:55,156+02 ERROR [io.undertow.request] (default task-3) UT005023: Exception handling request to /ovirt-engine/api/v4/clusters: java.lang.RuntimeException: org.jboss.resteasy.spi.UnhandledException: java.lang.IllegalStateException: Problem following 'gluster_volumes' link in Clusters entity.

Seems to come from https://github.com/oVirt/ovirt-ansible-collection/commit/82b9bff3b580927d0a39f7262aede1aa89475e67
And the upgrade stops.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Try cluster upgrade in oVirt UI
2. Upgrade starts
3. And fails in the first stages due to the error above
Did some more troubleshooting.

The root cause is https://github.com/oVirt/ovirt-ansible-collection/blob/82b9bff3b580927d0a39f7262aede1aa89475e67/roles/cluster_upgrade/tasks/main.yml#L43

Here the role asks the API to follow the gluster_volumes property of the clusters, but this property is not returned for virt-only clusters, which causes a NullPointerException.

See: https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/restapi/jaxrs/src/main/java/org/ovirt/engine/api/restapi/resource/BackendClustersResource.java#L28
(In reply to Jean-Louis Dupond from comment #1)
> Did some more troubleshooting.
>
> The root cause is
> https://github.com/oVirt/ovirt-ansible-collection/blob/82b9bff3b580927d0a39f7262aede1aa89475e67/roles/cluster_upgrade/tasks/main.yml#L43
>
> Here its requested to follow gluster_volumes property of the clusters.
> BUT this property is NOT returned on Virt-Only clusters, which causes
> nullpointer exception on this.
>
> See:
> https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/restapi/jaxrs/src/main/java/org/ovirt/engine/api/restapi/resource/BackendClustersResource.java#L28

Ori, should the NPE be handled within the REST API backend, meaning that follow parameters which are not relevant for the entity instance are ignored? Or does the client need to check what to add to follow?
Good question.

Currently, if a *structurally* invalid string is provided for 'follow' (for example 'xyz' for a cluster), an IllegalStateException is thrown with a meaningful message.

In contrast, in the case of this bugzilla, the string 'gluster_volumes' is structurally valid, as there is such a sub-collection in a Cluster. The flow reaches the engine, where an exception is thrown because virt-only clusters don't have gluster volumes.

Should we catch the exception, log it, and return the cluster without the gluster volumes? I guess that could be nice, as long as we make sure to log it. I'm trying to think whether there is any scenario where this can hurt us. For example, what if a perfectly valid request hits some unexpected exception in the engine (something hangs, a DB problem)? Then we might return a response with information missing where there should be information, purely because of some temporary circumstances. And yes, you would be able to see it in the log, but you might not even know that there is something to look for there.
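To make the concern about the generic catch-and-log approach concrete, here is a minimal sketch in Python. It is not engine code: `follow_lenient`, the two `fetch_*` callbacks and `TransientBackendError` are hypothetical names invented for illustration, and entities are plain dicts. It only shows that a "link does not apply" failure and a transient backend failure produce the same response shape once both are swallowed.

```python
import logging

log = logging.getLogger("follow-sketch")


class TransientBackendError(Exception):
    """Hypothetical stand-in for a temporary failure (hang, DB problem)."""


def follow_lenient(entity, link, fetch):
    """Try to resolve `link`; on any failure, log it and return the entity as-is."""
    try:
        entity[link] = fetch(entity, link)
    except Exception as exc:  # deliberately broad: this is the approach under discussion
        log.warning("Could not follow %r on %r: %s", link, entity["name"], exc)
    return entity


def fetch_not_applicable(entity, link):
    # Models a virt-only cluster: the link can never be resolved.
    raise LookupError("cluster has no gluster volumes")


def fetch_db_down(entity, link):
    # Models a Gluster cluster whose volumes could not be read right now.
    raise TransientBackendError("database unavailable")


a = follow_lenient({"name": "virt-only"}, "gluster_volumes", fetch_not_applicable)
b = follow_lenient({"name": "gluster"}, "gluster_volumes", fetch_db_down)

# Neither response carries the link, so a client cannot tell the cases apart.
print("gluster_volumes" in a, "gluster_volumes" in b)
```

In both cases the only trace of the failure is the log line, which is the drawback described above.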
(In reply to Ori Liel from comment #3)
> Good qusetion.
>
> Currently if a *structurally* invalid string is provided for 'follow' - for
> example 'xyz' for a cluster - an IllegalStateException is thrown with a
> meaningful message.
>
> In contrast, in the case of this bugzilla, the string 'gluster_volumes' is
> structurally valid, as there is such a sub-collection in a Cluster. The flow
> reaches the engine where an exception is thrown because virt-only clusters
> don't have gluster-volumes.
>
> Should we catch the exception, log it, and return the cluster without the
> gluster-volumes? I guess that could be nice, as long as we make sure to log
> it. I'm trying to think if there's any scenario where this can screw us. For
> example what if for a perfectly valid request there's some unexpected
> exception in the engine - I don't know, something hangs, DB problem - then
> we might get missing information where there should information - if not for
> some temporary funky circumstances. And yes you'd be able to see in the log,
> but you might not even know that there's something to look for there

Shouldn't a special case like that be handled wherever it appears? That is, can we add code to the cluster-related REST API backend classes so that they ignore following gluster_volumes for virt-only clusters?
You mean to handle it at the validation stage rather than in error handling, and not generically but by adding a specific check. Yes, this is possible. I have no objection to such a solution.
Let's fix that specifically for gluster_volumes in virt-only clusters
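As an illustration of what such a specific check could look like, here is a minimal sketch in Python. It is not the code that was merged into the engine. `effective_follow` is a hypothetical helper, clusters are modelled as plain dicts, the `gluster_service` key is assumed to reflect whether the Gluster service is enabled on the cluster, and the dotted form for nested links is also an assumption.

```python
GLUSTER_LINK = "gluster_volumes"


def effective_follow(cluster, follow):
    """Return `follow` with gluster_volumes removed for clusters without Gluster.

    `follow` is the raw comma-separated value of the 'follow' parameter.
    """
    kept = []
    for link in (part.strip() for part in follow.split(",")):
        if not link:
            continue
        # Assumed: nested links are dotted, so compare only the first segment.
        top_level = link.split(".", 1)[0]
        if top_level == GLUSTER_LINK and not cluster.get("gluster_service", False):
            continue  # specific check: silently skip, nothing to follow here
        kept.append(link)
    return ",".join(kept)


virt_only = {"name": "Default", "gluster_service": False}
gluster = {"name": "HC", "gluster_service": True}

print(effective_follow(virt_only, "gluster_volumes,networks"))
print(effective_follow(gluster, "gluster_volumes,networks"))
```

Because the check runs before any link is resolved, unexpected exceptions raised while following other links still surface instead of being swallowed.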
I have not been able to reproduce the problem. When I try to follow gluster_volumes for a virt-only cluster, there is no failure and there are no errors in the log.

Could you please attach engine.log? I'm looking for more information about the exception thrown by the Engine; I want to know exactly where it comes from. Also, if we can locate the HTTP request that was made in the Ansible logs, that would be helpful.

When I try to reproduce I use:
GET .../ovirt-engine/api/clusters?follow=gluster_volumes

Regardless, I've posted a PR which should resolve the problem: https://github.com/oVirt/ovirt-engine/pull/503
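For anyone who wants to script the reproduction request, the URL can be assembled as in the sketch below. It only builds the URL and sends nothing. `clusters_url` is a hypothetical helper and `engine.example.com` is a placeholder host. Note that the error in the description was logged for the path /ovirt-engine/api/v4/clusters, so the path is left as a parameter.

```python
from urllib.parse import urlencode, urlunsplit


def clusters_url(host, follow, path="/ovirt-engine/api/clusters"):
    """Build the clusters listing URL with a 'follow' query parameter."""
    # safe="," keeps the separator readable when several links are followed.
    query = urlencode({"follow": ",".join(follow)}, safe=",")
    return urlunsplit(("https", host, path, query, ""))


# Placeholder host; substitute the engine's FQDN.
print(clusters_url("engine.example.com", ["gluster_volumes"]))
print(clusters_url("engine.example.com", ["gluster_volumes"],
                   path="/ovirt-engine/api/v4/clusters"))
```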
Created attachment 1894635: Ansible Log
Created attachment 1894636: Engine log with stacktrace
Fix merged: https://github.com/oVirt/ovirt-engine/commit/1691f2380092489bd0ffc89c2762c469d63f8646
This bugzilla is included in oVirt 4.5.2 release, published on August 10th 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.