Bug 1709075
| Summary: | mysql process consuming 400% CPU | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Brendan Shephard <bshephar> | |
| Component: | openstack-tripleo-heat-templates | Assignee: | Martin Magr <mmagr> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Leonid Natapov <lnatapov> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 13.0 (Queens) | CC: | asimonel, dciabrin, emacchi, jschluet, maufart, mbayer, mburns, michele, mmagr, mrunge, mvalsecc, nsharabi, pkilambi, slinaber, sputhenp, ssmolyak, vkapalav | |
| Target Milestone: | z7 | Keywords: | Reopened, TestOnly, Triaged, ZStream | |
| Target Release: | 13.0 (Queens) | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-tripleo-heat-templates-8.3.1-79.el7ost | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1721557 1721647 (view as bug list) | Environment: | ||
| Last Closed: | 2019-09-19 10:44:11 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1721557 | |||
Description
Brendan Shephard
2019-05-13 01:29:03 UTC
It appears that a large Panko query is running longer than the connection stays open:
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1033, in _read_bytes
CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
Should we potentially try increasing net_read_timeout from 30 to 60 seconds here?
Right, so the problem here is:
There is a cron job for panko-expirer in the panko_api container, under /var/spool/cron/panko that looks like this:
# HEADER: This file was autogenerated at 2019-05-20 03:43:36 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: panko-expirer
PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh
1 0 * * * panko-expirer
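For reference, the schedule in that last line can be read field by field. The following is a minimal sketch, not part of panko or cron: `describe_cron_entry` is a hypothetical helper that splits a crontab entry into its five schedule fields and the command.

```shell
# Sketch only: describe_cron_entry is a hypothetical helper for illustration.
# It splits one crontab entry into the five schedule fields plus the command.
describe_cron_entry() {
    printf '%s\n' "$1" | {
        # read does not glob-expand its input, so the '*' fields stay literal.
        read -r minute hour dom month dow cmd
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow" "$cmd"
    }
}

describe_cron_entry '1 0 * * * panko-expirer'
```

With minute 1, hour 0 and wildcards elsewhere, the entry asks cron to run panko-expirer once a day at 00:01, provided a cron daemon is actually running.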
But no crond running:
()[root@overcloud-controller-0 /]# ps -ef | grep cron
root 11598 11552 0 03:39 ? 00:00:00 grep --color=auto cron
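Note that the only match above is the grep process itself. A stricter check matches the process name exactly. The sketch below assumes a list of command names on stdin, one per line; `crond_running` is a hypothetical helper, not an existing tool.

```shell
# Sketch only: crond_running is a hypothetical helper for illustration.
# Succeeds (exit 0) if a process named exactly "crond" is in the list on stdin.
crond_running() {
    grep -qx 'crond'
}

# Sample input resembling the situation above: no cron daemon in the list.
if printf '%s\n' bash ps grep | crond_running; then
    echo "crond is running"
else
    echo "crond is NOT running"
fi
```

On a live system the list could come from `ps -eo comm=` piped into the function. Where procps is installed, `pgrep -x crond` should give the same answer.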
Other containers do the same thing, but they all have accompanying cron containers:
[root@overcloud-controller-0 etc]# docker ps --filter name=_cron --format "{{.Names}}"
heat_api_cron
cinder_api_cron
logrotate_crond
nova_api_cron
keystone_cron
So the problem here is that we aren't deploying a panko_api_cron container. As a result, panko-expirer never runs and the Panko events just keep accumulating.
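The gap can be spotted by comparing the running container names against the cron containers one expects. This is a rough sketch under stated assumptions: `missing_cron_containers` is a hypothetical helper, and the expected list is made up for illustration from the names shown in this report plus panko_api_cron.

```shell
# Sketch only: missing_cron_containers is a hypothetical helper.
# stdin: running container names, one per line.
# args:  container names expected to exist.
# Prints every expected name that is absent from stdin.
# The body runs in a subshell so its variables do not leak to the caller.
missing_cron_containers() (
    running=$(cat)
    for name in "$@"; do
        printf '%s\n' "$running" | grep -qxF "$name" || printf '%s\n' "$name"
    done
)

# Sample input: the container names listed in this report.
printf '%s\n' heat_api_cron cinder_api_cron logrotate_crond nova_api_cron keystone_cron |
    missing_cron_containers heat_api_cron cinder_api_cron nova_api_cron keystone_cron panko_api_cron
```

On a controller, the input could instead be the `docker ps --filter name=_cron --format "{{.Names}}"` command shown earlier, piped into the function.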
To verify: once the fix is applied, log in to a controller and find the panko_api_cron container. Check that it is healthy and running. Also check the cron job:
sudo docker exec -ti panko_api_cron ps -elf
According to our records, this should be resolved by openstack-tripleo-heat-templates-8.3.1-54.el7ost. This build is available now.
openstack-tripleo-heat-templates-8.3.1-55 is not in the latest puddle of OSP13 (2019-06-28.1), failing QA:
(undercloud) [stack@undercloud-0 ~]$ rpm -qa | grep tripleo-heat
openstack-tripleo-heat-templates-8.3.1-54.el7ost.noarch
According to our records, this should be resolved by openstack-tripleo-heat-templates-8.3.1-79.el7ost. This build is available now.
Tested according to description in comment #35.