Bug 1487788
| Summary: | High CPU usage of Mysqld process with opendaylight journaling | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Sai Sindhur Malleni <smalleni> |
| Component: | python-networking-odl | Assignee: | Mike Kolesnik <mkolesni> |
| Status: | CLOSED ERRATA | QA Contact: | Sai Sindhur Malleni <smalleni> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 12.0 (Pike) | CC: | sgaddam, smalleni, trozet, tvignaud |
| Target Milestone: | beta | Keywords: | Triaged |
| Target Release: | 13.0 (Queens) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | scale_lab | ||
| Fixed In Version: | python-networking-odl-11.0.0-3.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | | Environment: | N/A |
| Last Closed: | 2018-03-28 19:08:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Sai Sindhur Malleni
2017-09-01 22:24:28 UTC
Mysqld CPU usage with ML2/OVS: https://snapshot.raintank.io/dashboard/snapshot/9fjDBAbNZqt80fUi6ElAFJXy2R1ynoHm?orgId=2

MySQL logs during another test run where the primary changes (it can be seen that mysqld is shutting down): https://gist.githubusercontent.com/smalleni/5fb89d7826bcbcb5df0d204fe4bf49ae/raw/b0b1c1212dea5adc4263c34370e6cbc5c2b7a2b4/gistfile1.txt

Bottom line here is mysql CPU is spiking 360 x ML2/OVS average CPU for the same test of creating 500 routers, 8 at a time. It also results in a SQL crash.

This test seems more like a stress test than normal operation. Can we get a datapoint for the CPU usage during normal operation? Tim.

So, I did some longevity testing over the weekend which pretty much simulates normal cloud operation. I create 40 neutron resources 2 at a time and then delete all 40. The same test was run over and over again, for more than 48 hours. So at no point in time were there more than 40 neutron resources present, nor were resources being created to "stress" the cloud (only 2 resources were being created concurrently). After about 24 hours of operation, we see the CPU usage hovering at around 3000% consistently. Here is a link to Grafana: https://snapshot.raintank.io/dashboard/snapshot/kcjv6kl7tLlD2cTGuT5RRjy45Ui7ho5O

It seems to be related to the number of rows in the opendaylightjournal table, since rows keep piling up.
The number of rows after about 48 hours of operation is as follows:

MariaDB [(none)]> use ovs_neutron;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [ovs_neutron]> select count(*) from opendaylightjournal;
+----------+
| count(*) |
+----------+
|   182463 |
+----------+

Looks like all the entries are pending:

MariaDB [ovs_neutron]> select count(*) from opendaylightjournal where state='pending'
    -> ;
+----------+
| count(*) |
+----------+
|   182944 |

There are several errors in the karaf logs as well as the neutron server logs, and I am going to leave the system in its current state for some debugging. However, given that this high CPU usage is a possibility even during normal operation (most likely due to the large number of rows to scan), I feel this bug is high priority to fix. Happy to provide more information.

Sai, if I understand correctly, the following scenario should verify this. Over 24 hours, do the following:
1. Create 20 routers and add interfaces to an internal network
2. Delete the interfaces and the routers
Get the CPU data and check that Mysqld hasn't consumed too much CPU.

Yeah, what I used was rally with times set to 40 and concurrency set to 2. The scenario was to create routers. Keep running the same scenario mentioned above for a long time (wrap the rally command for the above scenario in a bash script, for example) and observe the mysqld usage. At no point should it peak. It is also worth inspecting the DB after the tests.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0617

I have rerun scale tests on OSP13 + ODL Oxygen and can confirm that I am no longer seeing this.
Mysqld CPU usage never goes above 1 core. https://snapshot.raintank.io/dashboard/snapshot/orgKEjMKRFqM5qYE9TW5YEVm5b5byc30
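
For anyone re-verifying this, the "inspect the DB after the tests" step could be scripted rather than typed into the MariaDB client. The following is only a minimal sketch, not part of networking-odl: the table name `opendaylightjournal` and the `state` column come from the queries quoted above, while the helper names, the demo schema, the `completed` state value, and the threshold are assumptions made for illustration. It uses an in-memory SQLite database as a stand-in so that it runs anywhere; against a real deployment the same SELECT would be issued through a MariaDB connection.

```python
import sqlite3


def journal_counts_by_state(conn):
    """Return {state: row_count} for the opendaylightjournal table."""
    cur = conn.execute(
        "SELECT state, COUNT(*) FROM opendaylightjournal GROUP BY state"
    )
    return dict(cur.fetchall())


def pending_backlog_exceeds(conn, threshold):
    """True if more than `threshold` rows are still in the 'pending' state."""
    counts = journal_counts_by_state(conn)
    return counts.get("pending", 0) > threshold


def make_demo_db():
    """Build a throwaway in-memory table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE opendaylightjournal ("
        "id INTEGER PRIMARY KEY, state TEXT NOT NULL)"
    )
    # Five pending rows and two completed rows, purely as demo data.
    rows = [("pending",)] * 5 + [("completed",)] * 2
    conn.executemany(
        "INSERT INTO opendaylightjournal (state) VALUES (?)", rows
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    print(journal_counts_by_state(demo))
    print("backlog too large:", pending_backlog_exceeds(demo, threshold=3))
```

Grouping by state in a single query shows at a glance whether rows are being processed or are only accumulating as pending, which is the symptom described in this report.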