Bug 1755154

Summary: postgresql-libs - Deadlocks occur when using SSL in a multi-threaded environment
Product: Red Hat Enterprise Linux 7 Reporter: Matt Prahl <mprahl>
Component: postgresql Assignee: Patrik Novotný <panovotn>
Status: CLOSED ERRATA QA Contact: Vaclav Danek <vdanek>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.7 CC: databases-maint, dgregor, fjanus, hhorak, pkubat, praiskup, vdanek
Target Milestone: rc Keywords: Reproducer, TestCaseNeeded
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-31 20:11:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1716961    
Attachments:
Description Flags
A patch based on the upstream patch that resolves the issue
none
A video showing the deadlock using the Python reproducer script
none
A modified version of the reproducer Python 2 script
none

Description Matt Prahl 2019-09-24 21:42:30 UTC
Created attachment 1618713 [details]
A patch based on the upstream patch that resolves the issue

Description of problem:

A deadlock occurs when connecting to PostgreSQL over SSL with postgresql-libs in a multi-threaded process where other threads use SSL independently. This issue has been causing outages of the Module Build Service (https://pagure.io/fm-orchestrator), which is deployed on RHEL 7.4 with the latest postgresql-libs package installed.

You can find a reproducer script and a more in-depth description of the issue here:
https://postgrespro.com/list/thread-id/1861629

The upstream patch that resolves this issue is here:
https://commitfest.postgresql.org/4/140/

The patch does not apply cleanly, but I attached a patch that worked for our team.


Version-Release number of selected component (if applicable):

postgresql-libs-9.2.24-1.el7_5.x86_64


How reproducible:

Easily reproducible. See the reproducer script in this upstream discussion:
https://postgrespro.com/list/thread-id/1861629
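
For readers who want the general shape of such a test without opening the linked thread, here is a minimal sketch of a stall-detecting harness. It is not the upstream reproducer or the attached script: the function name `run_until_stalled` and its parameters are hypothetical, and it only assumes that a deadlocked thread shows up as a thread that stops making progress.

```python
import threading
import time


def run_until_stalled(tasks, duration=5.0, stall_timeout=2.0):
    """Run each callable in `tasks` in a loop on its own thread.

    Returns the indexes of tasks that made no progress for longer than
    `stall_timeout` seconds, or an empty list if every task kept
    running for the whole `duration`. (Hypothetical helper.)
    """
    stop = threading.Event()
    # Timestamp of the last completed call, one slot per task.
    last_progress = [time.monotonic()] * len(tasks)

    def worker(index, task):
        while not stop.is_set():
            task()
            last_progress[index] = time.monotonic()

    # Daemon threads, so a genuinely stuck worker cannot block exit.
    threads = [
        threading.Thread(target=worker, args=(i, task), daemon=True)
        for i, task in enumerate(tasks)
    ]
    for thread in threads:
        thread.start()

    deadline = time.monotonic() + duration
    stalled = []
    while time.monotonic() < deadline and not stalled:
        time.sleep(0.05)
        now = time.monotonic()
        stalled = [
            i for i, seen in enumerate(last_progress)
            if now - seen > stall_timeout
        ]
    stop.set()
    return stalled
```

In a setup like the one described in this report, one task would presumably open and close a PostgreSQL connection over SSL while another makes an unrelated HTTPS request from the same process; those callables are left out here because they depend on the local environment.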


Actual results:

A deadlock occasionally occurs.


Expected results:

A deadlock does not occur.

Comment 4 Filip Januš 2019-10-25 13:01:28 UTC
Hi,
I tried to reproduce this bug on several versions of RHEL (7.4, 7.7, 7.8) using the attached reproducer, but I was not able to trigger the deadlock. Could you share your configuration (openssl package version and pg_hba.conf)?

Comment 5 Matt Prahl 2019-10-25 13:54:32 UTC
Created attachment 1629194 [details]
A video showing the deadlock using the Python reproducer script

Comment 6 Matt Prahl 2019-10-25 13:58:14 UTC
Created attachment 1629195 [details]
A modified version of the reproducer Python 2 script

Comment 7 Matt Prahl 2019-10-25 14:40:09 UTC
Hi Filip,
I attached a short video that shows the deadlock, along with the slightly modified version of the Python 2 reproducer script that I used. It usually takes less than a couple of minutes for the deadlock to occur, but it took only a few seconds in the video I shared.

We are currently using openssl-1.0.2k-16.el7_6.1.x86_64 both on the server that ran the reproducer script and on the PostgreSQL server. As for the contents of pg_hba.conf, there is nothing special. We have one `hostssl` entry that MBS uses to connect with a password.

Comment 8 Filip Januš 2019-10-29 15:09:28 UTC
Hi,
Thank you for your advice, but I am still not able to reproduce it. However, I applied the attached patch and built a package, which you can download here: http://download.eng.bos.redhat.com/brewroot/work/tasks/7480/24307480/postgresql-libs-9.2.24-2.el7_7.x86_64.rpm . Could you test whether the deadlock still occurs after installing the new package?

Comment 9 Matt Prahl 2019-10-29 18:41:47 UTC
Hi Filip,
That RPM installed fine and I can't reproduce the deadlock with it.

Comment 11 Matt Prahl 2019-10-30 14:06:13 UTC
Filip, what environment do you have set up for testing the reproducer script?

Comment 12 Matt Prahl 2019-10-30 15:58:42 UTC
Filip, I created an environment using docker-compose that reproduces the issue. Please see the following repository:
https://github.com/mprahl/rhbz1755154-reproducer

Comment 20 errata-xmlrpc 2020-03-31 20:11:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1182