Bug 1221238

Summary:	[RFE][performance] - generate large scale list running too slow.
Product:	[oVirt] ovirt-engine-sdk-python	Reporter:	Eldad Marciano <emarcian>
Component:	RFEs	Assignee:	Juan Hernández <juan.hernandez>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Eldad Marciano <emarcian>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	---	CC:	bugs, gklein, juan.hernandez, lsurette, mgoldboi, omachace, oourfali, pstehlik, rbalakri, Rhev-m-bugs, sbonazzo, s.kieske, srevivo, ykaul
Target Milestone:	ovirt-4.0.0-alpha	Keywords:	FutureFeature, Improvement, Performance
Target Release:	4.0.0a	Flags:	rule-engine: ovirt-4.0.0+ mgoldboi: planning_ack+ juan.hernandez: devel_ack+ pstehlik: testing_ack+
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	ovirt 4.0.0 alpha1	Doc Type:	Enhancement
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-08-22 12:27:18 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	Infra	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Eldad Marciano 2015-05-13 14:36:15 UTC

Description of problem:
generate large scale list (500) very expensive action and runs too long.

example:
hosts = self.apiObj.hosts.list(query='cluster = *fake*', max=500)

by simple profiling looks like params module is very expensive specially Host.buildChildren

Clock type: CPU
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
..py:805 EngineSetup.action_executor  1      0.000024  167.7574  167.7574
...py:142 EngineSetup.update_entitys  1      0.000030  167.7572  167.7572
..ructure/brokers.py:9823 Hosts.list  1      0.000047  167.7568  167.7568
../params.py:272 Version.__setattr__  186..  0.799005  166.5537  0.000892
..lectionhelper.py:47 isModuleMember  256..  2.166224  165.4891  0.000646
..ython2.7/inspect.py:247 getmembers  256..  69.11561  162.8685  0.000636
..ructure/proxy.py:100 Proxy.request  3      0.000275  148.2319  49.41062
..ure/proxy.py:120 Proxy.__doRequest  3      0.000031  148.2315  49.41051
..ucture/proxy.py:147 Proxy.__xml2py  2      0.003301  148.1560  74.07802
..dk/xml/params.py:31385 parseString  2      0.000111  148.1527  74.07637
..frastructure/proxy.py:50 Proxy.get  1      0.000003  147.9025  147.9025
..sdk/xml/params.py:9325 Hosts.build  2      0.001923  147.8068  73.90342
..params.py:9333 Hosts.buildChildren  502    0.003668  147.8042  0.294431
..tsdk/xml/params.py:8581 Host.build  500    0.079909  136.1151  0.272230
../params.py:8589 Host.buildChildren  19484  0.107480  134.6855  0.006913
..64/python2.7/inspect.py:59 isclass  603..  36.93347  51.74893  0.000001
..l/params.py:5527 API.buildChildren  20123  0.068090  47.53944  0.002362
..rtsdk/xml/params.py:900 Link.build  10048  0.060966  26.21297  0.002609
..params.py:906 Link.buildAttributes  10048  0.071323  26.15200  0.002603
..l/params.py:5355 API.buildChildren  20631  0.010165  25.76497  0.001249
..k/xml/params.py:1690 Actions.build  500    0.023060  25.10372  0.050207
..rams.py:1698 Actions.buildChildren  5500   0.021121  25.06705  0.004558
..ils/parsehelper.py:78 toCollection  1      0.001498  19.85408  19.85408
..sdk/utils/parsehelper.py:73 toType  500    0.001826  19.85243  0.039705
..ture/brokers.py:7977 Host.__init__  1000   0.021979  19.84900  0.019849
..ovirtsdk/xml/params.py:836 factory  10048  0.020716  19.56181  0.001947
..dk/xml/params.py:832 Link.__init__  10048  0.055590  19.54110  0.001945
..ml/params.py:5366 Version.__init__  2996   0.041362  17.51248  0.005845
..l/params.py:28388 Version.__init__  1493   0.020207  14.53058  0.009732
..irtsdk/xml/params.py:28396 factory  1492   0.004459  14.52337  0.009734
..virtsdk/xml/params.py:8253 factory  500    0.001655  11.68154  0.023363
..k/xml/params.py:8224 Host.__init__  500    0.018691  11.67988  0.023360
..astructure/brokers.py:7988 __new__  500    0.001301  9.606648  0.019213

Version-Release number of selected component (if applicable):
ovirt-engine-sdk-python-3.4.3.0-1.fc20.noarch

How reproducible:
100%

Steps to Reproduce:
1. build engine with 500 hosts and 500.
2. run the following using pythn sdk: "api.hosts.list(query='cluster = *fake*', max=500)"


Actual results:
long parsing response time once engine receive data.

Expected results:
fast parsing response time for large scale results.

Additional info:

Comment 1 Juan Hernández 2015-05-13 14:44:11 UTC

We know that the code generated by generateDS.py is slow when handling large XML documents. Not sure what is the root cause, but it is unlikely that we can improve it.

Comment 2 Juan Hernández 2015-05-14 07:42:13 UTC

After studying this I see that the part of the code that is consuming the time isn't generated by generateDS.py, but by our code generator. It is a "__setattr__" method that we use to delegate attribute access from brokers (classes defined in brokers.py) to containers (classes defined in params.py):

  https://github.com/oVirt/ovirt-engine-sdk/blob/master/generator/src/main/java/org/ovirt/engine/sdk/generator/python/templates/SuperAttributesTemplate

This method makes heavy use of the Python "inspect" module, and the result is really slow.

Changing this isn't trivial, but more feasible than changing generateDS.py.

Comment 6 Juan Hernández 2016-02-24 13:51:35 UTC

This performance issue will be fixed with version 4 of the SDK, as it uses a completely different approach for parsing the XML documents. These are the results of parsing 500 hosts, note that it goes from approx 168 seconds down to less than one second:

Wed Feb 24 14:40:30 2016    profile.txt

         566633 function calls (566592 primitive calls) in 0.624 seconds

   Ordered by: internal time
   List reduced from 1289 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.295    0.148    0.295    0.148 {method 'perform' of 'pycurl.Curl' objects}
      500    0.047    0.000    0.293    0.001 readers.py:6328(read_one)
    27000    0.022    0.000    0.022    0.000 {method 'read_element' of 'ovirtsdk4.xml.XmlReader' objects}
    76001    0.014    0.000    0.014    0.000 {method 'get_attribute' of 'ovirtsdk4.xml.XmlReader' objects}
    19500    0.013    0.000    0.013    0.000 {method 'next_element' of 'ovirtsdk4.xml.XmlReader' objects}
     1500    0.013    0.000    0.039    0.000 readers.py:16605(read_one)
    11000    0.009    0.000    0.009    0.000 struct.py:28(__init__)
    45500    0.007    0.000    0.007    0.000 {method 'node_name' of 'ovirtsdk4.xml.XmlReader' objects}
      500    0.007    0.000    0.015    0.000 types.py:14119(__init__)
    22002    0.007    0.000    0.007    0.000 {method 'read' of 'ovirtsdk4.xml.XmlReader' objects}
    68002    0.007    0.000    0.007    0.000 {method 'forward' of 'ovirtsdk4.xml.XmlReader' objects}
     3500    0.006    0.000    0.012    0.000 types.py:1844(__init__)
     1500    0.006    0.000    0.014    0.000 readers.py:14822(read_one)
    10000    0.006    0.000    0.006    0.000 reader.py:111(parse_integer)
        1    0.006    0.006    0.010    0.010 services.py:20(<module>)
    10000    0.005    0.000    0.017    0.000 reader.py:125(read_integer)
      500    0.005    0.000    0.016    0.000 readers.py:11996(read_one)
      500    0.004    0.000    0.011    0.000 types.py:10896(__init__)
      500    0.004    0.000    0.012    0.000 readers.py:6058(read_one)
    31000    0.004    0.000    0.004    0.000 struct.py:115(_check_type)
      500    0.004    0.000    0.021    0.000 readers.py:1855(read_one)
        1    0.004    0.004    0.011    0.011 http.py:19(<module>)
        1    0.004    0.004    0.624    0.624 list_hosts.py:20(<module>)
        1    0.004    0.004    0.013    0.013 http.py:273(system_service)
     1500    0.004    0.000    0.010    0.000 types.py:9005(__init__)
      500    0.003    0.000    0.009    0.000 readers.py:1281(read_one)
      500    0.003    0.000    0.012    0.000 readers.py:8244(read_one)
      500    0.003    0.000    0.019    0.000 readers.py:11256(read_one)
      500    0.003    0.000    0.005    0.000 types.py:2504(__init__)
      500    0.003    0.000    0.010    0.000 readers.py:14426(read_one)

Version 4 of the SDK will be available when this patch is merged:

  sdk: Add version 4
  https://gerrit.ovirt.org/53720

Comment 7 Red Hat Bugzilla Rules Engine 2016-02-24 13:51:44 UTC

Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 8 Red Hat Bugzilla Rules Engine 2016-02-24 13:53:22 UTC

This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Comment 9 Eldad Marciano 2016-08-17 13:48:33 UTC

Juan,
can you elaborate how to use the SDK.

I find it working differently compare to 3.6

how can i access to an object etc:
api.hosts.list

Comment 10 Juan Hernández 2016-08-17 13:51:38 UTC

For this particular bug you can use the following example:

  https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/list_vms.py

In general there is a description of the SDK in its source code repository:

  https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk#usage

And there is a collection of examples as well:

  https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk/examples

Comment 11 Juan Hernández 2016-08-17 13:53:39 UTC

Sorry, that first example is to list virtual machines. Listing hosts is very similar:

  # Get the reference to the "hosts" service:
  hosts_service = connection.system_service().hosts_service()

  # Use the "list" method of the "vms" service to list all the
  # hosts of the system:
  hosts = hosts_service.list()

  # Print the hosts names and identifiers:
  for host in hosts:
    print("%s: %s" % (host.name, host.id))

Comment 12 Eldad Marciano 2016-08-17 14:08:16 UTC

Cool 10x,

well i think we can verify it see the results:
list size cap 100 response time 4.50143885612
list size cap 300 response time 11.0167760849
list size cap 500 response time 27.9535710812

comparing to 167sec we faced before the patch for 500 entities.

later on i'll add some profiler results to see how we can speed it up further more.

Comment 13 Eldad Marciano 2016-08-17 14:53:51 UTC

by using comperes=True we can find the same response time the Juan mention:


list size cap 100 response time 2.03919100761
list size cap 300 response time 1.44519710541
list size cap 500 response time 2.07873082161


I set up a new bug
https://bugzilla.redhat.com/show_bug.cgi?id=1367826, that compress should be True by default.

and moving this one to verified.