The following snippet

sds_service = connection.system_service().storage_domains_service()
sd = sds_service.list(search='name=aname')[0]

fails with:

ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16". HTTP response code is 404.

In the client log, I got:

GET /ovirt-engine/api/storagedomains?search=name%3Daname HTTP/1.1
...
HTTP/1.1 404 Not Found
...
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fault>
  <detail>Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16</detail>
  <reason>Operation Failed</reason>
</fault>

But when I list the available services, I get:

GET /ovirt-engine/api HTTP/1.1
...
<link href="/ovirt-engine/api/storagedomains?search={query}" rel="storagedomains/search"/>
...

And in engine.log:

2017-05-05 13:11:38,395+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-17) [] Operation Failed: Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16

Nothing else.

The oVirt version I use:

<product_info>
  <name>oVirt Engine</name>
  <vendor>ovirt.org</vendor>
  <version>
    <build>1</build>
    <full_version>4.1.1.8-1.el7.centos</full_version>
    <major>4</major>
    <minor>1</minor>
    <revision>0</revision>
  </version>
</product_info>
I think we shouldn't fail with 404 if a storage connection doesn't exist; we should ignore it and continue.
To work around this issue, just remove the nonexistent storage connection.
sds_service.list() fails too:

GET /ovirt-engine/api/storagedomains HTTP/1.1

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fault>
  <detail>Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16</detail>
  <reason>Operation Failed</reason>
</fault>
Hi Fabrice,
* Can you please attach full logs and the script?
* What's the status of the relevant storage domain? Did you manually remove the storage connection?
* Also, please attach a db dump or the content of the storage_server_connections table.
Created attachment 1277812 [details] storage_domain_static dump
Created attachment 1277813 [details] storage_server_connections dump
Using the content of storage_domain_static, I now have a list of my domains. So I'm running the following code, using my SDK:

for i in ('2a9fe2d7-ea38-4ced-a274-32734b7b571b',
          '072fbaa1-08f3-4a40-9f34-a5ca22dd1d74',
          'f38b1422-82f2-44ff-b081-d3183ac2c11e',
          '90765c23-f911-4ab0-ba9e-dfcc11c83acb',
          '42e01e9c-a5f5-441d-8e90-a36eddd8eb03',
          '2ea4a078-3a66-4d1c-9239-622fbd45dd3b',
          '3d086f11-03a2-4fe7-ae84-6efdcb9c950a',
          '74e1dc39-ee39-4385-988f-a3e8ced63d84',
          '5dd13cd0-2fe9-4bd4-9769-613a4f700c7b',
          'de424c0c-8c33-46d7-a08f-ebe769239a26',
          '814984aa-c521-4c4f-b066-05317b4d8daf',
          '7c5291d3-11e2-420f-99ad-47a376013671'):
    # Fetch each domain on its own so one failure does not hide the others.
    try:
        print(context.storagedomains.get(id=i).name)
    except Exception as e:
        print(e)

# Then try to list all domains in a single request.
try:
    print(context.storagedomains.list())
except Exception as e:
    print(e)

And getting the following results:

ISO_DOMAIN
ovirt-image-repository
vmsys01
3dpse-data01
vmslow01
vmdata01
ng318
3dpse-data02
Fault reason is "Operation Failed". Fault detail is "Entity not found: Storage server connection: id=739c5f7e-e09c-4b96-a8a7-576b7d56be21". HTTP response code is 404.
Fault reason is "Operation Failed". Fault detail is "Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16". HTTP response code is 404.
ng319
ng314
Fault reason is "Operation Failed". Fault detail is "Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16". HTTP response code is 404.

I have two broken domains, ng317 and ng316. If I try to manage them using the GUI, I get an uncaught exception when I click on "Manage domain".

So there are inconsistencies in my database; I don't know where they come from or how old they are. But I should not get a 404, I should get a 500 instead. My request is good; the error is coming from the engine.
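The per-ID loop above can be wrapped into a small helper that keeps the healthy domains and records the failures separately, so that one broken domain does not hide the others. This is only a sketch: `collect_domains` and the `get_by_id` callable are hypothetical names, with `get_by_id` standing in for whatever per-domain lookup the client offers (for example the `context.storagedomains.get(id=...)` call used above). Nothing here is part of the oVirt SDK.

```python
def collect_domains(domain_ids, get_by_id):
    """Look up each domain on its own and return (names, failures).

    get_by_id is a hypothetical callable standing in for the real
    per-domain lookup; it is assumed to return an object with a
    .name attribute, or to raise an exception on error.
    """
    names = []
    failures = {}
    for domain_id in domain_ids:
        try:
            names.append(get_by_id(domain_id).name)
        except Exception as exc:
            # Keep going, but remember why this particular lookup failed.
            failures[domain_id] = str(exc)
    return names, failures
```

With the twelve IDs from the comment above, `names` would hold the domains that could be fetched and `failures` would map each failing ID to its fault message.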
These are staging domains in a staging datacenter, so I'm not afraid of losing them and can run any tests you want on them.
The storage connections for domains ng316 and ng317 are indeed missing. This is an unexpected situation that could have been caused by a bug or by manual manipulation of the db. So the given error code is actually correct, as the entities are missing (i.e. the issue here is the missing entities). As we don't know what triggered the issue or how to reproduce it, I'm closing the bug for now. Please re-open if reproduced.
I don't agree with closing this bug. OK, I agree you can't help me with the missing storage connections, and so my storage domains are in a broken state. But there are still many problems. Look at this log:

DEBUG:root:GET /ovirt-engine/api/storagedomains HTTP/1.1
DEBUG:root:User-Agent: PythonSDK/4.1.3
DEBUG:root:Version: 4
DEBUG:root:Content-Type: application/xml
DEBUG:root:Accept: application/xml
DEBUG:root:Content-Length: 0
DEBUG:root:
DEBUG:root:HTTP/1.1 404 Not Found
DEBUG:root:Date: Mon, 15 May 2017 08:39:36 GMT
DEBUG:root:Server: Apache
DEBUG:root:Content-Type: application/xml
DEBUG:root:Content-Length: 217
DEBUG:root:HTTP error before end of send, stop sending
DEBUG:root:
DEBUG:root:?
DEBUG:root:<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
DEBUG:root:<fault>
DEBUG:root: <detail>Entity not found: Storage server connection: id=6860d96f-557e-4d82-a209-401d72bd6e16</detail>
DEBUG:root: <reason>Operation Failed</reason>
DEBUG:root:</fault>
DEBUG:root:Closing connection 1

I just wanted to enumerate the domains, and I'm getting a 404. I should still be able to see the good ones; in the current situation, I can see none.

And even if storage domains are broken, they still exist, so they should be returned when listing all domains. It's individual domain manipulation that should fail. I can see them in the web UI, so why not in the REST API?

I'm getting a 404. In REST, I expect the HTTP error code to be meaningful. There is an internal error in oVirt and the request is good, so I should get a 500, and only when doing a GET /ovirt-engine/api/storagedomains/<bad-domain-UUID>
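Since the status code alone is ambiguous here, a client can at least read the `<fault>` body that comes with it. The following is a minimal sketch using only the Python standard library; `parse_fault` and `missing_entity_id` are hypothetical helper names, and the sample body is copied from the log above. It assumes the response body has the `<fault>`/`<reason>`/`<detail>` shape shown in this report and nothing more.

```python
import re
import xml.etree.ElementTree as ET

# Sample body taken from the log in this report.
SAMPLE_FAULT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    "<fault>"
    "<detail>Entity not found: Storage server connection: "
    "id=6860d96f-557e-4d82-a209-401d72bd6e16</detail>"
    "<reason>Operation Failed</reason>"
    "</fault>"
)

def parse_fault(body):
    """Return (reason, detail) for a <fault> body, or None if it is not one."""
    if isinstance(body, str):
        body = body.encode("utf-8")  # the XML declaration in the log says UTF-8
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    if root.tag != "fault":
        return None
    return root.findtext("reason"), root.findtext("detail")

def missing_entity_id(detail):
    """Extract the id=<uuid> part of an 'Entity not found' detail, if any."""
    match = re.search(r"id=([0-9a-f-]{36})", detail or "")
    return match.group(1) if match else None
```

For example, `missing_entity_id(parse_fault(SAMPLE_FAULT)[1])` yields the ID of the storage server connection the engine could not find, which is the piece of information needed to tell this failure apart from a wrong URL.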
@Fabrice - I understand your point and frustration. In order to mitigate the issue, I suggest manually adding the missing storage connections (with mock values), e.g.:

ng316:
INSERT INTO storage_server_connections VALUES ('6860d96f-557e-4d82-a209-401d72bd6e16', '/data/ng316', null, null, null, null, null, 1, null, null, null, null, null, null);

ng317:
INSERT INTO storage_server_connections VALUES ('739c5f7e-e09c-4b96-a8a7-576b7d56be21', '/data/ng317', null, null, null, null, null, 1, null, null, null, null, null, null);

Since a storage connection is a crucial part of the storage domain entity, we can't support a situation of missing connection entities. I.e. the solution should be either finding the root cause of the issue or adding the connections manually.
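Before adding rows by hand, it may help to confirm which domains actually reference a connection that no longer exists. The sketch below assumes the domain-to-connection pairs and the set of existing connection IDs have already been extracted somehow (for instance from the two table dumps attached to this bug); it deliberately says nothing about the real column names. `domains_with_missing_connection` is a hypothetical helper, and the `healthy-example` entry with its ID is made up for illustration.

```python
def domains_with_missing_connection(domain_to_connection, existing_connection_ids):
    """Return the sorted names of domains whose connection ID is not present."""
    existing = set(existing_connection_ids)
    return sorted(
        name
        for name, connection_id in domain_to_connection.items()
        if connection_id not in existing
    )

# ng316 and ng317 use the IDs quoted in this report; the third entry is hypothetical.
example_domains = {
    "ng316": "6860d96f-557e-4d82-a209-401d72bd6e16",
    "ng317": "739c5f7e-e09c-4b96-a8a7-576b7d56be21",
    "healthy-example": "00000000-0000-0000-0000-000000000001",
}
example_connections = ["00000000-0000-0000-0000-000000000001"]

broken = domains_with_missing_connection(example_domains, example_connections)
```

Here `broken` contains only the two domains whose connection rows are absent, which is the set the INSERT statements above are meant to repair.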
Thanks for the tip. But what about the HTTP status? Do you see no problem with an internal db error being turned into a 404?

The actual problem with those particular domains is not a big deal for me: as they are only test domains, I can drop them and continue my work (I hope I can do that without making the database more inconsistent). And I understand that with such a problem I can't expect a 100% perfect solution for my use case, as you can't handle every failure mode.

But I think good and clear error management, even for end users, is an important aspect of good software. When I saw a 404 and no domains returned, I started to think that the URL /ovirt-engine/api/storagedomains was broken. The message was good, but I was confronted with inconsistent diagnostic messages: was the 404 for the URL or for the missing storage connection? It made the problem more difficult to resolve. I think that is the main point of this bug.
(In reply to Fabrice Bacchella from comment #12)
> Thanks for the tip. But about the the HTTP status ? You see no problem at
> having an internal error of the db transforming as a 404 ?

@Juan - What's your take on that? Is a 404 status code acceptable in such a scenario? I.e. listing storage domains (with the python sdk) when a storage connection entity is missing in the db (the storage connection entity is mandatory and should always be available on a proper env).

>
> The actual problem with those particular domains are not big deal for me, as
> they are only test domain, I can drop them and continue my work (I hop I can
> do that without having a more inconsistent database). And I understand that
> with such a problem I can't expect a 100% perfect solution for my use case,
> as you can't manage all the failure mode.
>
> But I think good and clear error management, even for end user, is an
> important aspect of a good software. When I saw 404, and no domain returned,
> I started to think that the URL /ovirt-engine/api/storagedomains was
> broken. The message was good, but I was confronted with inconsistent
> diagnostic messages: 404 for the URL or missing storage connection ? It
> made the problem more difficult to resolve. I think that the main point of
> this bug.
Returning 404 in this case isn't correct, it should be 500. This is quite similar to bug 1332881. I think we should investigate it deeper, find the root cause and fix it.
(In reply to Juan Hernández from comment #14)
> Returning 404 in this case isn't correct, it should be 500.
>
> This is quite similar to bug 1332881. I think we should investigate it
> deeper, find the root cause and fix it.

The root cause is the missing storage connection, but we didn't find a reproducing scenario for it (and, afaik, this issue occurred only once). In bug 1332881, it was a cleanup issue, i.e. a stale storage connection remained in the db.

Anyway, is there anything we can do in the rest-api in order to return code 500 in such a scenario?
Currently in the API we are using the 'getEntity' method to find the storage server connection. That method, by default, generates the 404 error response if the entity can't be found. We can use the 'runQuery' method instead, explicitly check the result and generate the 500 error response. Something like this:

private StorageServerConnections getStorageServerConnection(String id) {
    // Run the query directly instead of going through 'getEntity'.
    VdcQueryReturnValue result = runQuery(
        VdcQueryType.GetStorageServerConnectionById,
        new StorageServerConnectionQueryParametersBase(id)
    );
    if (result.getSucceeded() && result.getReturnValue() != null) {
        return (StorageServerConnections) result.getReturnValue();
    }
    // A missing connection is an internal inconsistency, so report 500.
    throw new WebFaultException(
        null,
        "Can't find storage server connection for id '" + id + "'.",
        Status.INTERNAL_SERVER_ERROR
    );
}

There are other places in the BackendStorageDomainsResource class that are using the 'getEntity' method in a similar way. For each of them please consider if it is correct to return 404.
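The distinction behind this proposal can be sketched independently of the engine code: a 404 is appropriate when the entity named in the request is missing, while a missing internal dependency of an existing entity is a server-side inconsistency and maps to 500. The Python below is only an illustration of that rule with in-memory dictionaries; `WebFault`, `get_domain` and `get_connection` are hypothetical names, the 404 message text is made up, and none of this is the engine's actual implementation.

```python
class WebFault(Exception):
    """Hypothetical stand-in for an HTTP fault carrying a status code."""

    def __init__(self, status, detail):
        super().__init__(detail)
        self.status = status
        self.detail = detail

def get_domain(domains, domain_id):
    """The client asked for this ID, so a miss is the client's 404."""
    domain = domains.get(domain_id)
    if domain is None:
        # Illustrative message, not the engine's wording.
        raise WebFault(404, "No storage domain with id '%s'." % domain_id)
    return domain

def get_connection(connections, connection_id):
    """The client never named this ID, so a miss is an internal error."""
    connection = connections.get(connection_id)
    if connection is None:
        raise WebFault(
            500,
            "Can't find storage server connection for id '%s'." % connection_id,
        )
    return connection
```

Under this rule, looking up a domain that exists but whose connection row is gone fails in `get_connection` with status 500, which matches the behaviour requested in this bug.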
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Please provide steps to reproduce this issue in order to verify the fix
(In reply to Kevin Alon Goldblatt from comment #18)
> Please provide steps to reproduce this issue in order to verify the fix

We haven't found a specific scenario for reproducing the issue. However, you can simulate it manually by removing the relevant storage server connection from the DB (or just temporarily changing its id in the 'storage_server_connections' table). Then, querying the storage domain should return error code 500.
Verified with the following code:
---------------------------------------
ovirt-engine-4.2.0-0.0.master.20170723141021.git463826a.el7.centos.noarch
vdsm-4.20.1-218.git1b7671f.el7.centos.x86_64

Verified with the following scenario:
---------------------------------------
1. Changed the id of a storage domain in the database
2. Ran a query via the browser as follows: https://xxx.xxx.xxx.xxx.redhat.com/ovirt-engine/api/storagedomains >>>> this returned error 500

Moving to VERIFIED!
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.