Bug 1947989
| Summary: | pmproxy hangs and consume 100% cpu if the redis datasource in grafana is configured with TLS | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Michele Casaburo <mcasabur> |
| Component: | pcp | Assignee: | Mark Goodwin <mgoodwin> |
| Status: | CLOSED ERRATA | QA Contact: | Jan Kurik <jkurik> |
| Severity: | medium | Docs Contact: | Apurva Bhide <abhide> |
| Priority: | unspecified | ||
| Version: | 8.3 | CC: | agerstmayr, jkurik, mgoodwin, nathans, patrickm |
| Target Milestone: | rc | Keywords: | Bugfix, Triaged |
| Target Release: | 8.5 | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | pcp-5.3.1-2.el8 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-11-09 17:49:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Comment 18
Mark Goodwin
2021-05-19 10:56:28 UTC
Fixed upstream (pcp-5.3.2 devel) with PR https://github.com/performancecopilot/pcp/pull/1321 which has the following fixes:

commit 3f5ba221842e6a02e9fb22e23c754854271c3c9a
Author: Mark Goodwin <mgoodwin>
Date: Wed Jun 9 16:44:30 2021 +1000

libpcp_web: add mutex to struct webgroup protecting the context dict

Add a mutex to the local webgroups structure in libpcp_web and use it to protect multithreaded parallel updates (dictAdd, dictDelete) to the groups->contexts dict and the dict traversal in the timer driven garbage collector.

Tested by qa/297 and related tests and also an updated version of qa/1457 (which now stress tests parallel http and https/tls pmproxy RESTAPI calls .. in a later commit).

Related: RHBZ#1947989
Resolves: https://github.com/performancecopilot/pcp/issues/1311

commit 2bad6aef10339f000f7cb578108db5ee80bd640c
Author: Mark Goodwin <mgoodwin>
Date: Wed Jun 9 17:04:33 2021 +1000

pmproxy: add mutex for client req lists, fix https/tls support, QA

Add a new mutex to struct proxy and use it to protect parallel multithreaded updates to the proxy->first client list.

Also use the same mutex to protect updates to the pending_writes client list and avoid the doubly linked list corruption that was causing parallel https/tls requests to get stuck spinning in flush_secure_module(), as reported in BZ#1947989.

qa/1457 is extensively updated to test parallel http, https/tls (and combinations of http and https/tls) RESTAPI calls. Previously it only tested a single https/tls call.

With these changes, parallel https/tls RESTAPI requests from the grafana-pcp datasource to pmproxy now work correctly whereas previously pmproxy would hang/spin.

Resolves: RHBZ#1947989 - pmproxy hangs and consume 100% cpu if the redis datasource is configured with TLS.
Related: https://github.com/performancecopilot/pcp/issues/1311

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pcp bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4171 |
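The class of fix described in the second commit (one mutex guarding every update to a doubly linked client list) can be illustrated with a minimal sketch. This is not the pmproxy code: `struct client`, `struct proxy`, `client_link`, `client_unlink` and `proxy_walk` below are hypothetical stand-ins, and POSIX threads are assumed for the mutex, whereas the real code may use a different mutex API. The point is only that the `prev`/`next` pointers and the list head are never modified or traversed without holding the lock, so a concurrent link and unlink cannot leave a cycle for a traversal to spin on.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the client and proxy structures. */
struct client {
    struct client *prev;
    struct client *next;
    int id;
};

struct proxy {
    pthread_mutex_t lock;   /* protects 'first', 'count' and all prev/next links */
    struct client *first;   /* head of the doubly linked client list */
    size_t count;
};

void proxy_init(struct proxy *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->first = NULL;
    p->count = 0;
}

/* Insert a client at the head of the list while holding the lock. */
void client_link(struct proxy *p, struct client *c)
{
    assert(p != NULL && c != NULL);
    pthread_mutex_lock(&p->lock);
    c->prev = NULL;
    c->next = p->first;
    if (p->first != NULL)
        p->first->prev = c;
    p->first = c;
    p->count++;
    pthread_mutex_unlock(&p->lock);
}

/* Remove a client that is currently on the list while holding the lock. */
void client_unlink(struct proxy *p, struct client *c)
{
    assert(p != NULL && c != NULL);
    pthread_mutex_lock(&p->lock);
    if (c->prev != NULL)
        c->prev->next = c->next;
    else
        p->first = c->next;
    if (c->next != NULL)
        c->next->prev = c->prev;
    c->prev = c->next = NULL;
    p->count--;
    pthread_mutex_unlock(&p->lock);
}

/* Traverse the list under the same lock and return the number of nodes. */
size_t proxy_walk(struct proxy *p)
{
    size_t n = 0;
    pthread_mutex_lock(&p->lock);
    for (struct client *c = p->first; c != NULL; c = c->next)
        n++;
    pthread_mutex_unlock(&p->lock);
    return n;
}
```

Without the lock, two threads interleaving the pointer updates in `client_link` and `client_unlink` can leave a node pointing at itself or at a freed neighbour, which is consistent with the spinning traversal reported in this bug.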