Created attachment 1079387 [details] Numbers of command executions for 1000 calls with 2 hosts to /api/hosts Description of problem: When calling the /api/hosts endpoint the command execution count grows lineary with the number of the hosts in the datacenter and so do the database queries. The attached screenshot shows that GetVdsCertificateSubjectByVdsId is called N times when we have N hosts in the cluster. Version-Release number of selected component (if applicable): How reproducible: To reproduce the numbers in the screenshot, create 2 hosts and query /api/hosts one thousand times with a maximum of 10 requests in parallel. Steps to Reproduce: 1. Create 2 hosts 2. run seq 1 1000 | parallel -j 10 curl -H "Accept: application/json" -H "Content-type: application/json" -X GET --user admin@internal:engine http://localhost:8080/ovirt-engine/api/hosts 3. Actual results: You will see that SearchVDS is called exactly 1000 times, like expected. Further you will see that GetVdsCertificateSubjectByVdsId is called 2000 times. This means 2 times per call which corresponds to the number of hosts. Expected results: Do a bulk query for GetVdsCertificateSubjectByVdsId and/or hit a cache after the first query Additional info: Note that the average execution time for GetVdsCertificateSubjectByVdsId was about 16 ms. So when querying a datacenter with 500 hosts a single call takes on my notebook at least 8 seconds longer than necessary.
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.
Eli - please check what can be improved here. We should at least reduce from 2000 to 1000 the number of times the query is called (see description).
(In reply to Oved Ourfali from comment #2) > I see no problem in the above We have lot of places when we get a list of entities and for each entity we call a specific query to get more details (Host, VM, SD and so on) By the way GetVdsCertificateSubjectByVdsIdQuery is not a real query and if you look in its implementation code you will see that it only formats AD string from the organization and host name. In that case GetVdsCertificateSubjectByVdsIdQuery must be called per host and I see no point in caching an inexpensive operation result If we want to add caching, we should do that in more general manner and cover first the places when caching will really improve performance Oved, please decide how we should proceed
(In reply to Eli Mesika from comment #3) > (In reply to Oved Ourfali from comment #2) > > I see no problem in the above > We have lot of places when we get a list of entities and for each entity we > call a specific query to get more details (Host, VM, SD and so on) > > By the way GetVdsCertificateSubjectByVdsIdQuery is not a real query and if > you look in its implementation code you will see that it only formats AD > string from the organization and host name. > > In that case GetVdsCertificateSubjectByVdsIdQuery must be called per host > and I see no point in caching an inexpensive operation result > > If we want to add caching, we should do that in more general manner and > cover first the places when caching will really improve performance > > Oved, please decide how we should proceed I wonder whether it is called once or twice per host, as I got confused by the description. However, I see in the code, in: backend/manager/modules/restapi/jaxrs/src/main/java/org/ovirt/engine/api/restapi/resource/BackendHostsResource.java That there is a comment to revisit that change when the certificate will be exposed in the VDS object. (//TODO: REVISIT when backend expose CertificateSubject in vds). Perhaps this can be optimized that way? Juan - thoughts?
(In reply to Oved Ourfali from comment #4) > I wonder whether it is called once or twice per host, as I got confused by > the description It is called once per host, I had debugged that ...
Restoring needinfo on Juan.
GetVdsCertificateSubjectByVdsIdQuery runs real a query to retrieve the host, and then it just computes a subject based on the host name and the OrganizationName configuration parameter. So if I understand correctly it is expensive, as it requires at least one network roundtrip to the database. Note that this computation is in itself problematic because it uses magic strings that are potentially different to the ones used to generate the actual certificates. If possible the subject, or the complete certificate, should be stored in the database, then the computation would be more reliable. Can we move this computation to the VDS backend entity? Adding a method like getCertificateSubject()? Then the API can call this method from the host mapper, and no additional query is required. Later, we can improve that to use the actual and correct certificate.
(In reply to Eli Mesika from comment #5) > (In reply to Oved Ourfali from comment #4) > > > I wonder whether it is called once or twice per host, as I got confused by > > the description > > It is called once per host, I had debugged that ... Yes, once per host. Sorry if my description was confusing. I always use a high number of calls to see if I see the request number multiplied by the number of hosts.That helps me to find suspicious commands because sometimes there is a lot going on in the engine.
(In reply to Juan Hernández from comment #7) > GetVdsCertificateSubjectByVdsIdQuery runs real a query to retrieve the host, > and then it just computes a subject based on the host name and the > OrganizationName configuration parameter. So if I understand correctly it is > expensive, as it requires at least one network roundtrip to the database. > Exactly. It does VDS vds = getDbFacade() .getVdsDao() .get(getParameters().getId(), getUserID(), getParameters().isFiltered()); for every host. And the request time of 16ms response time was on my notebook with a small cluster with a local database. > Note that this computation is in itself problematic because it uses magic > strings that are potentially different to the ones used to generate the > actual certificates. If possible the subject, or the complete certificate, > should be stored in the database, then the computation would be more > reliable. > > Can we move this computation to the VDS backend entity? Adding a method like > getCertificateSubject()? Then the API can call this method from the host > mapper, and no additional query is required. Later, we can improve that to > use the actual and correct certificate. That sounds reasonable for me.
Closed due to capacity, if still reproduce, please reopen.