Description of problem:

The test environment is as follows:
3 OSP Controllers
3 Computes
3 ODLs clustered on the OpenStack Controllers

We ran the Browbeat+Rally neutron test suite with concurrencies of 8, 16 and 32 and times set to 500. Essentially, neutron resources such as ports, networks, routers etc. are created 500 times, with 8, 16 or 32 resources being created in parallel depending on the concurrency. After creation the resources are deleted and the cycle is repeated over and over again.

Most perf measurement tools like Rally capture the time taken for the resource entry to be created in the neutron database after the create request is sent; there is no guarantee that the resource actually exists and has been created in the ML2 backend. With Lucas Gomes' help, I was able to use [1] to profile the time taken for a resource entry in the opendaylightjournal table to be sent to the ODL API and for the call to return. This was done by looking at the difference between the 'last_retried' and 'created_at' timestamps in the opendaylightjournal table. My understanding is that when a journaling thread picks up an entry in the table for processing it updates the 'last_retried' field to the current timestamp, and when the entry is sent to the ODL API and the call returns successfully the entry is marked 'completed' and the 'created_at' timestamp is updated. So the difference between these two timestamps gives us the time taken for the HTTP call to ODL to return.

I reran the Browbeat+Rally neutron test suite about 4 times with "concurrency" set to 8 and 16 and "times" set to 500. Our python script [2] gives us the average time taken for the different operations in the database and also prints the top 15 most time-consuming entries and their resource UUIDs. It also reports any failed entries in the journal.

I also pulled in https://review.openstack.org/#/c/496125/ manually, set maintenance_interval to a large number so that the maintenance thread never runs during the test, and set completed_rows_retention to -1 so that I can average values over the entire test run without any entries being deleted during it (I realize this could mean more rows to scan for the maintenance thread).

Observations:

After the first run of the test suite, a maximum of 888 seconds was taken for an update port operation and the average for all operations was around 165s (https://gist.github.com/smalleni/fa05a627e29a50c68dc1d2368ba48061)

After the second run of the test suite, the maximum is now around 3400s for a create port and the average for all operations is around 950s (https://gist.github.com/smalleni/68102613516b6a3a387bc75ec5e85bbe)

After the third run of the test suite, the maximum is now around 8000s for an update port and the average for all operations is around 1500s (https://gist.github.com/smalleni/68102613516b6a3a387bc75ec5e85bbe)

After the fourth run of the test suite, the maximum is now 15700s for an update port and the average for all operations is around 2300s (https://gist.github.com/smalleni/28c4a855984cc23c5d58238639a48d1e)

mysqld CPU usage during the entire test duration: https://snapshot.raintank.io/dashboard/snapshot/KLVyeojHtfX46pYI1YGNdIrhyeXE57FX
We can see that it reaches a peak of around 43 cores on a 56-core machine and slowly goes back to normal once the test suite runs are finished.

MySQL slow query log: http://8.43.86.1:8088/smalleni/overcloud-controller-0-slow-profile.log

Since each test suite run did not take more than 1.5 hours, the maximum values for the 3rd and 4th iterations indicate that entries from the previous run of the test suite were still being processed while the current iteration was running. Also, looking at the MySQL slow query log, no query seems to take longer than 15s. We are curious why calls to ODL take so long to return when the query time itself isn't *that* large. Also, we use the default timeout of 10s in the ml2_odl configuration, which should mean that the HTTP call to ODL returns within 10s for a journal entry to be marked completed, unless I'm reading it wrong.

Version-Release number of selected component (if applicable):
OSP12+ODL Carbon
python-networking-odl-11.0.0-0.20170806093629.2e78dca.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud with ODL
2. Run Browbeat+Rally stress tests
3. Profile the opendaylightjournal table using [1]

Actual results:
Some resources take several hours to create under heavy load

Expected results:
Resources should be created fairly quickly

Additional info:
https://gist.github.com/smalleni/8da8c75f1765c23d7706706126dd4c2d
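For context, a sketch of the journaling-related overrides used for these runs, assuming the timeout, maintenance_interval and completed_rows_retention options all live in the [ml2_odl] section as the description implies (the maintenance_interval value below is illustrative; the report only says "a large number"):

    [ml2_odl]
    # default HTTP timeout towards ODL (seconds), as mentioned above
    timeout = 10
    # illustrative "large number" so the maintenance thread never runs during the test
    maintenance_interval = 1000000
    # keep completed rows for the whole run so the deltas can be averaged
    completed_rows_retention = -1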
[1]- https://gist.github.com/smalleni/8da8c75f1765c23d7706706126dd4c2d
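The profiling script referenced as [2] is not linked in this report; a minimal sketch of the kind of timestamp-delta query described above might look like the following. The pymysql driver, connection details, the column names operation, object_uuid and state, and the literal state value 'completed' are assumptions for illustration; only the created_at and last_retried columns are named in the description.

    # Hypothetical sketch of the timestamp-delta profiling described above;
    # this is NOT the script referenced as [2].
    from collections import defaultdict

    import pymysql

    # Per the description above, created_at is updated when the call to ODL
    # completes and last_retried when the entry is picked up, so the delta
    # approximates the time spent on the HTTP call.
    QUERY = """
    SELECT operation,
           object_uuid,
           TIMESTAMPDIFF(SECOND, last_retried, created_at) AS delta_sec
    FROM opendaylightjournal
    WHERE state = 'completed' AND last_retried IS NOT NULL
    ORDER BY delta_sec DESC
    """

    conn = pymysql.connect(host='localhost', user='neutron',
                           password='secret', db='neutron')

    with conn.cursor() as cur:
        cur.execute(QUERY)
        rows = cur.fetchall()

    # Top 15 most time-consuming entries and their resource UUIDs
    for op, uuid, delta in rows[:15]:
        print("%-15s %s %ss" % (op, uuid, delta))

    # Average delta per operation type
    per_op = defaultdict(list)
    for op, _uuid, delta in rows:
        per_op[op].append(delta)
    for op, deltas in per_op.items():
        print("%-15s avg %.1fs over %d entries"
              % (op, sum(deltas) / len(deltas), len(deltas)))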
> I also pulled in https://review.openstack.org/#/c/496125/ manually and set
> maintenance_interval to a large number so that the maintenance thread never runs
> during the test and set completed_rows_retention to -1 so that I can average out
> values during the entire test run and no entries are deleted during that (I realize
> this could mean more rows to scan for the maintenance thread)

There are several problems with this approach, as you mentioned:

1) Since rows are not deleted, the operations quickly pile up, and since there is no index on the status column there is a performance impact on the other processing that needs to happen for an entry to go to ODL.

2) You shouldn't set maintenance_interval to a high number, as cleanup is part of the process. One specific part of cleanup is to move PROCESSING rows that failed, but whose error was not properly handled, back to PENDING. You don't specify how big your maintenance_interval is, but what could be happening is that some of your operations are left in PROCESSING and never reset back to PENDING until your maintenance runs, at which point they are processed successfully. Under these conditions you'd see a long time delta. If you want rows not to be deleted, completed_rows_retention=-1 should be enough without the need to raise maintenance_interval.

In regard to these two problems, I propose the following:

- re (1)*: Create a DB hook or an external process that would "archive" or log the time deltas but would allow the rows to be maintained (including their deletion); a rough sketch follows below.
- re (2): At least allow maintenance to run, just don't allow deletion during maintenance.

* PREFERABLE
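A rough illustration of proposal (1), assuming an external polling process rather than a DB hook: copy the timestamps of completed rows into a separate table before maintenance deletes them, so maintenance_interval and completed_rows_retention can stay at their defaults. The archive table name (journal_profiling), the polling interval, and all column names other than created_at and last_retried are assumptions for illustration, not part of networking-odl.

    # Hypothetical external archiver: snapshot completed journal rows into a
    # side table so normal maintenance (including row deletion) can keep running.
    import time

    import pymysql

    ARCHIVE_DDL = """
    CREATE TABLE IF NOT EXISTS journal_profiling (
        seqnum BIGINT PRIMARY KEY,
        operation VARCHAR(36),
        object_uuid VARCHAR(36),
        created_at DATETIME,
        last_retried DATETIME
    )
    """

    # INSERT IGNORE skips rows that were already archived in a previous pass.
    ARCHIVE_SQL = """
    INSERT IGNORE INTO journal_profiling
        (seqnum, operation, object_uuid, created_at, last_retried)
    SELECT seqnum, operation, object_uuid, created_at, last_retried
    FROM opendaylightjournal
    WHERE state = 'completed'
    """

    conn = pymysql.connect(host='localhost', user='neutron',
                           password='secret', db='neutron', autocommit=True)

    with conn.cursor() as cur:
        cur.execute(ARCHIVE_DDL)

    while True:
        with conn.cursor() as cur:
            cur.execute(ARCHIVE_SQL)
        time.sleep(30)  # poll well below the maintenance interval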
Sai,

Can you please review comment #2 above?

Thanks,
Nir
Nir, I will get to this in the next round of testing, due to begin in the first week of November.
Closing for now since we don't have the details; let's re-open if we have logs or something more concrete.