Bug 1347904 - Ceph RGW deadlocks in curl_multi_wait
Summary: Ceph RGW deadlocks in curl_multi_wait
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: curl
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Kamil Dudka
QA Contact: Stefan Dordevic
URL:
Whiteboard:
Duplicates: 1367614
Depends On:
Blocks: 1327142
 
Reported: 2016-06-18 04:35 UTC by Ken Dreyer (Red Hat)
Modified: 2016-11-03 17:44 UTC
CC List: 3 users

Fixed In Version: curl-7.29.0-32.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-03 17:44:44 UTC
Target Upstream Version:


Attachments
beaker regression test for curl_multi_wait (2.43 KB, application/x-bzip)
2016-06-20 19:13 UTC, Casey Bodley
curl-7.29.0-32.el7.src.rpm (2.17 MB, application/octet-stream)
2016-08-19 08:25 UTC, Kamil Dudka


Links
System ID: Red Hat Product Errata RHSA-2016:2575
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: curl security, bug fix, and enhancement update
Last Updated: 2016-11-03 12:06:39 UTC

Description Ken Dreyer (Red Hat) 2016-06-18 04:35:03 UTC
Description of problem:
Ceph's RGW uses curl_multi_wait and hits a deadlock in this code (bug 1327142). The following PR for RHEL 7.2's curl fixes this: https://github.com/ktdreyer/curl/pull/1

Version-Release number of selected component (if applicable):
curl-7.29.0-25.el7

How reproducible:
always

Steps to Reproduce:
1. See details at https://github.com/ktdreyer/curl/pull/1

Actual results:
Ceph RGW deadlocks

Expected results:
Ceph RGW does not deadlock
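
Editorial illustration (not part of the original report): this report does not spell out the deadlock mechanism, so the following is only a sketch of the general wait pattern involved. In libcurl, `curl_multi_wait(multi_handle, extra_fds, extra_nfds, timeout_ms, &numfds)` lets a caller pass its own descriptors as `struct curl_waitfd` entries, each with an `events` mask to watch and a `revents` field for what occurred. The stand-alone Python model below uses `select.poll` rather than libcurl. The wakeup pipe and the helper names `wait_with_extra_fd` and `drain` are hypothetical, and it is an assumption that this pattern matches what Ceph RGW does.

```python
import os
import select


def wait_with_extra_fd(extra_fd, timeout_ms):
    """Wait up to timeout_ms for extra_fd to become readable.

    Returns the event mask reported for extra_fd (0 on timeout). This is
    the role the revents field plays for a caller-supplied descriptor.
    """
    poller = select.poll()
    poller.register(extra_fd, select.POLLIN)
    for fd, revents in poller.poll(timeout_ms):
        if fd == extra_fd:
            return revents
    return 0


def drain(fd):
    """Read the pending wakeup bytes so that the next wait can block again."""
    return os.read(fd, 4096)


# Hypothetical wakeup pipe: another thread would write to it to wake the waiter.
read_end, write_end = os.pipe()

idle = wait_with_extra_fd(read_end, 10)        # nothing written yet: times out
os.write(write_end, b"x")                      # simulate a wakeup
woken = wait_with_extra_fd(read_end, 10)       # readable: POLLIN is reported
drained = drain(read_end)                      # consume the wakeup byte
idle_again = wait_with_extra_fd(read_end, 10)  # drained: times out again

os.close(read_end)
os.close(write_end)
```

In this model the caller depends entirely on the returned mask to decide when to drain the pipe. A wait call that did not report events for caller-supplied descriptors would leave such a caller unable to tell a wakeup from a timeout.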

Comment 3 Kamil Dudka 2016-06-20 16:05:06 UTC
Thanks a lot for identifying the fix and preparing the patches!  To summarize it here, this is a request to backport the following upstream commits:

https://github.com/curl/curl/commit/curl-7_29_0-273-g136a3a0
https://github.com/curl/curl/commit/curl-7_31_0-68-g6d30f8e
https://github.com/curl/curl/commit/curl-7_31_0-78-g513e587

Comment 5 Casey Bodley 2016-06-20 19:13:20 UTC
Created attachment 1169974
beaker regression test for curl_multi_wait

I've attached a beaker regression test to validate the fix.

Comment 7 Kamil Dudka 2016-06-21 07:36:13 UTC
(In reply to Casey Bodley from comment #5)
> I've attached a beaker regression test to validate the fix.

Works reliably for me.  Thank you for preparing the test!

@QE: Please make sure that libcurl-devel is installed for the test to run (unless it is installed somehow automatically).

Comment 12 Kamil Dudka 2016-08-17 07:40:01 UTC
*** Bug 1367614 has been marked as a duplicate of this bug. ***

Comment 13 Steven Haigh 2016-08-18 02:35:15 UTC
Just wondering if there is any chance of getting a copy of the curl-7.29.0-32.el7 packages for testing?

This problem is currently hitting our systems hard, with 100% CPU usage on everything. I would like to test this and feed the results back to the DotNet Core team.

Comment 14 Kamil Dudka 2016-08-19 08:25:31 UTC
Created attachment 1192069
curl-7.29.0-32.el7.src.rpm

(In reply to Steven Haigh from comment #13)
> Just wondering if there is any chance of getting a copy of the
> curl-7.29.0-32.el7 packages for testing?

I am attaching an *unsupported* source RPM for *TESTING PURPOSES ONLY*.  Please do not use it on production systems.  Feedback is appreciated!

Comment 15 Steven Haigh 2016-08-20 10:03:31 UTC
Just as an update - I've built these packages and pushed them to my testing repo for testing on the machine with this problem.

I don't want to restart the C# dotnet core app at the moment - but should probably have more info during the work week...

Comment 16 Steven Haigh 2016-08-24 05:50:31 UTC
I can confirm that this package fixes the issues we were seeing with the dotnet core applications as per BZ 1367614.

Comment 17 Kamil Dudka 2016-08-24 06:10:47 UTC
Perfect.  Thanks for confirmation!

Comment 19 errata-xmlrpc 2016-11-03 17:44:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2575.html

