Bug 1221238
Summary: | [RFE][performance] - generate large scale list running too slow. | ||
---|---|---|---|
Product: | [oVirt] ovirt-engine-sdk-python | Reporter: | Eldad Marciano <emarcian> |
Component: | RFEs | Assignee: | Juan Hernández <juan.hernandez> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Eldad Marciano <emarcian> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | --- | CC: | bugs, gklein, juan.hernandez, lsurette, mgoldboi, omachace, oourfali, pstehlik, rbalakri, Rhev-m-bugs, sbonazzo, s.kieske, srevivo, ykaul |
Target Milestone: | ovirt-4.0.0-alpha | Keywords: | FutureFeature, Improvement, Performance |
Target Release: | 4.0.0a | Flags: | rule-engine:
ovirt-4.0.0+
mgoldboi: planning_ack+ juan.hernandez: devel_ack+ pstehlik: testing_ack+ |
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | ovirt 4.0.0 alpha1 | Doc Type: | Enhancement |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-08-22 12:27:18 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eldad Marciano
2015-05-13 14:36:15 UTC
We know that the code generated by generateDS.py is slow when handling large XML documents. We are not sure what the root cause is, but it is unlikely that we can improve it.

After studying this I see that the part of the code that is consuming the time isn't generated by generateDS.py, but by our code generator. It is a "__setattr__" method that we use to delegate attribute access from brokers (classes defined in brokers.py) to containers (classes defined in params.py):

https://github.com/oVirt/ovirt-engine-sdk/blob/master/generator/src/main/java/org/ovirt/engine/sdk/generator/python/templates/SuperAttributesTemplate

This method makes heavy use of the Python "inspect" module, and the result is really slow. Changing this isn't trivial, but it is more feasible than changing generateDS.py.

This performance issue will be fixed with version 4 of the SDK, as it uses a completely different approach for parsing the XML documents. These are the results of parsing 500 hosts; note that it goes from approx 168 seconds down to less than one second:

Wed Feb 24 14:40:30 2016    profile.txt

         566633 function calls (566592 primitive calls) in 0.624 seconds

   Ordered by: internal time
   List reduced from 1289 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.295    0.148    0.295    0.148 {method 'perform' of 'pycurl.Curl' objects}
      500    0.047    0.000    0.293    0.001 readers.py:6328(read_one)
    27000    0.022    0.000    0.022    0.000 {method 'read_element' of 'ovirtsdk4.xml.XmlReader' objects}
    76001    0.014    0.000    0.014    0.000 {method 'get_attribute' of 'ovirtsdk4.xml.XmlReader' objects}
    19500    0.013    0.000    0.013    0.000 {method 'next_element' of 'ovirtsdk4.xml.XmlReader' objects}
     1500    0.013    0.000    0.039    0.000 readers.py:16605(read_one)
    11000    0.009    0.000    0.009    0.000 struct.py:28(__init__)
    45500    0.007    0.000    0.007    0.000 {method 'node_name' of 'ovirtsdk4.xml.XmlReader' objects}
      500    0.007    0.000    0.015    0.000 types.py:14119(__init__)
    22002    0.007    0.000    0.007    0.000 {method 'read' of 'ovirtsdk4.xml.XmlReader' objects}
    68002    0.007    0.000    0.007    0.000 {method 'forward' of 'ovirtsdk4.xml.XmlReader' objects}
     3500    0.006    0.000    0.012    0.000 types.py:1844(__init__)
     1500    0.006    0.000    0.014    0.000 readers.py:14822(read_one)
    10000    0.006    0.000    0.006    0.000 reader.py:111(parse_integer)
        1    0.006    0.006    0.010    0.010 services.py:20(<module>)
    10000    0.005    0.000    0.017    0.000 reader.py:125(read_integer)
      500    0.005    0.000    0.016    0.000 readers.py:11996(read_one)
      500    0.004    0.000    0.011    0.000 types.py:10896(__init__)
      500    0.004    0.000    0.012    0.000 readers.py:6058(read_one)
    31000    0.004    0.000    0.004    0.000 struct.py:115(_check_type)
      500    0.004    0.000    0.021    0.000 readers.py:1855(read_one)
        1    0.004    0.004    0.011    0.011 http.py:19(<module>)
        1    0.004    0.004    0.624    0.624 list_hosts.py:20(<module>)
        1    0.004    0.004    0.013    0.013 http.py:273(system_service)
     1500    0.004    0.000    0.010    0.000 types.py:9005(__init__)
      500    0.003    0.000    0.009    0.000 readers.py:1281(read_one)
      500    0.003    0.000    0.012    0.000 readers.py:8244(read_one)
      500    0.003    0.000    0.019    0.000 readers.py:11256(read_one)
      500    0.003    0.000    0.005    0.000 types.py:2504(__init__)
      500    0.003    0.000    0.010    0.000 readers.py:14426(read_one)

Version 4 of the SDK will be available when this patch is merged:

sdk: Add version 4
https://gerrit.ovirt.org/53720

Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Juan, can you elaborate on how to use the SDK?
I find it works differently compared to 3.6. How can I access an object, e.g. api.hosts.list?

For this particular bug you can use the following example:

https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/list_vms.py

In general there is a description of the SDK in its source code repository:

https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk#usage

And there is a collection of examples as well:

https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk/examples

Sorry, that first example is to list virtual machines. Listing hosts is very similar:

    # Get the reference to the "hosts" service:
    hosts_service = connection.system_service().hosts_service()

    # Use the "list" method of the "hosts" service to list all the
    # hosts of the system:
    hosts = hosts_service.list()

    # Print the hosts names and identifiers:
    for host in hosts:
        print("%s: %s" % (host.name, host.id))

Cool, 10x. I think we can verify it; see the results:

list size cap 100 response time 4.50143885612
list size cap 300 response time 11.0167760849
list size cap 500 response time 27.9535710812

Compare this to the 167 sec we faced before the patch for 500 entities. Later on I'll add some profiler results to see how we can speed it up further.

By using compress=True we get the same response time that Juan mentioned:

list size cap 100 response time 2.03919100761
list size cap 300 response time 1.44519710541
list size cap 500 response time 2.07873082161

I set up a new bug, https://bugzilla.redhat.com/show_bug.cgi?id=1367826, requesting that compress be True by default, and I am moving this one to verified.
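
For anyone who wants to repeat the "list size cap ... response time ..." measurements above, here is a minimal sketch of how such timings could be collected. It is not the script that was actually used for this bug. The names time_list and fake_list are hypothetical, and fake_list is only a stand-in so the sketch runs without an engine. Against a real setup one would presumably pass hosts_service.list instead, assuming it accepts a "max" argument to cap the list size.

```python
import time


def time_list(list_fn, caps=(100, 300, 500)):
    """Call list_fn(max=cap) for each cap; return {cap: (item_count, seconds)}."""
    results = {}
    for cap in caps:
        start = time.perf_counter()
        items = list_fn(max=cap)
        elapsed = time.perf_counter() - start
        results[cap] = (len(items), elapsed)
    return results


def fake_list(max=None):
    # Hypothetical stand-in for a real "list" call (an assumption, not SDK code):
    # it just builds `max` dummy host names so the sketch is self-contained.
    return ["host%d" % i for i in range(max)]


if __name__ == "__main__":
    for cap, (count, seconds) in time_list(fake_list).items():
        print("list size cap %d response time %s" % (cap, seconds))
```

Running the same helper once with compression enabled on the connection and once without should show a difference like the one reported above, although the absolute numbers depend on the environment.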