Bug 479967 - curl uses 100% of CPU if upload connection is broken
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: curl
Version: 5.2
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Kamil Dudka
QA Contact: BaseOS QE
Blocks: 499522

Reported: 2009-01-14 05:36 EST by Martin Poole
Modified: 2010-10-23 03:00 EDT
Doc Type: Bug Fix
Clone Of: 479674
Last Closed: 2010-03-30 04:04:29 EDT

Attachments
patch for POLLHUP return code (377 bytes, patch)
2009-01-14 05:38 EST, Martin Poole

Description Martin Poole 2009-01-14 05:36:58 EST
+++ This bug was initially created as a clone of Bug #479674 +++

Escalated to Bugzilla from IssueTracker

--- Additional comment from tao@redhat.com on 2009-01-12 08:11:30 EDT ---

Description of problem:


curl uses 100% of CPU if connection is broken

How reproducible:

100 %

Steps to Reproduce:

1) A 200M file was created with dd to test the transfer.
2) On client computer, start curl like so:
   curl -F "ufile=@/simlux/dummy.file" http://server/upload.php?t=log &
3) During the upload, after 300 ms, HTTPd processes are killed on the server in order to terminate the connection

Actual results:

4) curl instantly jumps to 100 % CPU usage

Expected results:

4) curl should simply fail with an error (connection terminated or something similar)
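
For comparison, a client that treats a failed write as fatal stops as soon as the peer is gone. The following is a minimal sketch of that behaviour, not curl code: a local socketpair stands in for the HTTP connection, closing one end stands in for killing the server, and `upload_until_error` and its `budget` parameter are hypothetical names used only for this illustration.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: "upload" up to `budget` bytes over a socketpair whose
 * other end is closed after the first chunk has been written.
 * Returns the errno of the first failed write(), 0 if the whole budget was
 * written without a failure, or -1 if the socketpair could not be created.
 */
int upload_until_error(size_t budget)
{
  int sv[2];
  char buf[4096];
  size_t sent = 0;
  int err = 0;
  int peer_closed = 0;
  void (*old_handler)(int);

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    return -1;

  /* Ignore SIGPIPE so that a broken connection shows up as a write() error. */
  old_handler = signal(SIGPIPE, SIG_IGN);
  memset(buf, 'x', sizeof(buf));

  while (sent < budget) {
    ssize_t n = write(sv[0], buf, sizeof(buf));
    if (n < 0) {
      err = errno;          /* connection is broken: stop instead of retrying */
      break;
    }
    sent += (size_t)n;
    if (!peer_closed) {
      close(sv[1]);         /* the "server" disappears mid-upload */
      peer_closed = 1;
    }
  }

  if (!peer_closed)
    close(sv[1]);
  signal(SIGPIPE, old_handler);
  close(sv[0]);
  return err;
}
```

Calling `upload_until_error(1 << 20)` should return after the second write rather than loop, because the error is acted upon immediately.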

Additional info:

1) This was reported against RHEL 4.7. The latest version of curl we ship in RHEL4 is 7.12.1-11.
2) The problem reported here is fixed upstream (see http://curl.netmirror.org/changes.html - "Fixed in 7.16.1 - January 29 2007 - CPU 100% load when HTTP upload connection broke").
The curl changelog says:
- Matt Witherspoon fixed a problem case when the CPU load went to 100% when a HTTP upload was disconnected:

 "What appears to be happening is that my system (Linux 2.6.17 and 2.6.13) is setting *only* POLLHUP on poll() when the conditions in my previous mail occur. As you can see, select.c:Curl_select() does not check for POLLHUP. So basically what was happening, is poll() was returning immediately (with POLLHUP set), but when Curl_select() looked at the bits, neither POLLERR or POLLOUT was set. This still caused Curl_readwrite() to be called, which quickly returned. Then the transfer() loop kept continuing at full speed forever."

3) With the information from 2) and looking at the code (especially the diffs from 7.16.0 to 7.16.1), I came up with the following patch:

--- curl-7.12.1/lib/select.c    2008-11-28 04:19:25.000000000 -0500
+++ curl-7.12.1.new/lib/select.c        2008-11-28 04:13:38.000000000 -0500
@@ -82,7 +82,7 @@
  if (writefd != CURL_SOCKET_BAD) {
    if (pfd[num].revents & POLLOUT)
      ret |= CSELECT_OUT;
-    if (pfd[num].revents & POLLERR)
+    if (pfd[num].revents & (POLLERR|POLLHUP))
      ret |= CSELECT_ERR;
  }

However that does not appear to have solved the issue for the customer. 
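
To illustrate why ignoring POLLHUP produces a busy loop, here is a minimal, self-contained sketch. It is not curl's code: the SK_* flags, the two classify functions and the toy loop are hypothetical names made up for this illustration, and poll() is simulated by handing the loop the same revents value on every pass, as would happen on a hung-up socket.

```c
#include <poll.h>

/* Result flags local to this sketch (not curl's CSELECT_* macros). */
enum { SK_OUT = 1, SK_ERR = 2 };

/* Write-side check without the patch: POLLHUP is not looked at. */
int classify_write_old(short revents)
{
  int ret = 0;
  if (revents & POLLOUT)
    ret |= SK_OUT;
  if (revents & POLLERR)
    ret |= SK_ERR;
  return ret;
}

/* Write-side check with the patch: POLLHUP is reported as an error. */
int classify_write_patched(short revents)
{
  int ret = 0;
  if (revents & POLLOUT)
    ret |= SK_OUT;
  if (revents & (POLLERR | POLLHUP))
    ret |= SK_ERR;
  return ret;
}

/*
 * Toy stand-in for a transfer loop. Each pass models poll() returning
 * immediately with `revents`. The loop stops only when the classifier
 * reports an error. Returns the number of passes made; reaching
 * max_passes means the loop would have kept spinning.
 */
int passes_until_stop(int (*classify)(short), short revents, int max_passes)
{
  int passes = 0;
  while (passes < max_passes) {
    passes++;
    if (classify(revents) & SK_ERR)
      break;    /* error noticed: abort the transfer */
    /* no error flagged: a real loop would attempt I/O, then go round again */
  }
  return passes;
}
```

With revents set to POLLHUP only, the unpatched classifier never reports an error and the loop runs until the pass limit, while the patched classifier stops it on the first pass.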

Expected actions from/questions to SEG:

1) Are there plans to rebase curl in RHEL 4 and/or RHEL5?
2) Provide a patch to curl-7.12.1-11 which fixes the aforementioned problem.
This event sent from IssueTracker by mpoole  [Support Engineering Group]
 issue 253089

--- Additional comment from mpoole@redhat.com on 2009-01-13 10:59:44 EDT ---

Created an attachment (id=328878)
patch for POLLHUP return code

--- Additional comment from mpoole@redhat.com on 2009-01-13 11:03:39 EDT ---

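Just attached a patch against 7.12.1 which handles the POLLHUP return by aborting the upload/download.

The test CGI on the server can be simplified to

#!/bin/bash
# Discard the uploaded request body.
cat >/dev/null
# printf interprets the \n escapes; bash's "echo -n" would print them literally.
printf 'Content-Type: text/plain\n\nThanks\n'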


and then test the package with

curl -F "ufile=@/var/tmp/dummy.file" http://localhost/cgi-bin/dump.sh
Comment 1 Martin Poole 2009-01-14 05:38:42 EST
Created attachment 328964
patch for POLLHUP return code
Comment 2 Jindrich Novy 2009-01-14 07:30:28 EST
Created attachment 328978
Attempt to fix the 7.12.1

It doesn't seem to be an issue on RHEL5. We have curl-7.15.5 there. So only RHEL4 seems to be affected.

Please try to add the POLLHUP test also for the readfd part. Hopefully that helps.
Comment 3 Martin Poole 2009-01-14 07:50:36 EST
The complete patch for 7.12.1 is already attached to bz479674.

RHEL5 _does_ suffer from the same problem. It has _part_ of the fix for the connection going away, that being the check for POLLHUP on the read fd, and the error checking in the main transfer routine. What RHEL5 lacks is the check for the POLLHUP event on the writefd.
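
A minimal sketch of a poll() wrapper that checks for hang-up on both descriptors is shown below. It is written in the spirit of the select.c code discussed here but is not the curl source: `wait_ready`, `demo_write_hangup` and the READY_* flags are hypothetical names. Reporting a read-side POLLHUP as readable (so that the caller's read sees end-of-file) is an assumption of this sketch; the write-side POLLHUP is reported as an error, which is the part described above as missing.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Result flags local to this sketch. */
enum { READY_IN = 1, READY_OUT = 2, READY_ERR = 4 };

/*
 * Wait until readfd is readable and/or writefd is writable.
 * Pass -1 for a descriptor that should not be watched.
 * Returns a mask of READY_* flags, 0 on timeout, -1 if poll() fails.
 */
int wait_ready(int readfd, int writefd, int timeout_ms)
{
  struct pollfd pfd[2];
  int num = 0;
  int ret = 0;
  int r;

  if (readfd >= 0) {
    pfd[num].fd = readfd;
    pfd[num].events = POLLIN;
    pfd[num].revents = 0;
    num++;
  }
  if (writefd >= 0) {
    pfd[num].fd = writefd;
    pfd[num].events = POLLOUT;
    pfd[num].revents = 0;
    num++;
  }

  r = poll(pfd, (nfds_t)num, timeout_ms);
  if (r < 0)
    return -1;
  if (r == 0)
    return 0;

  num = 0;
  if (readfd >= 0) {
    /* Sketch assumption: hang-up counts as readable, read() then sees EOF. */
    if (pfd[num].revents & (POLLIN | POLLHUP))
      ret |= READY_IN;
    if (pfd[num].revents & (POLLERR | POLLNVAL))
      ret |= READY_ERR;
    num++;
  }
  if (writefd >= 0) {
    if (pfd[num].revents & POLLOUT)
      ret |= READY_OUT;
    /* Hang-up on the write side is an error: the upload cannot continue. */
    if (pfd[num].revents & (POLLERR | POLLHUP | POLLNVAL))
      ret |= READY_ERR;
  }
  return ret;
}

/* Usage example: watch the write side of a socket whose peer has closed. */
int demo_write_hangup(void)
{
  int sv[2];
  int result;

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    return -1;
  close(sv[1]);                         /* peer goes away */
  result = wait_ready(-1, sv[0], 1000);
  close(sv[0]);
  return result;
}
```

On Linux the demo is expected to include READY_ERR in its result, since the kernel flags the surviving end of the socketpair with POLLHUP; other systems may report the condition differently.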
Comment 4 Jindrich Novy 2009-01-14 09:31:39 EST
Ok, things are much clearer now.

So we now know curl-7.15.5 is affected as well. Has the customer tested RHEL5 packages with the POLLHUP mask added to the writefd check? Is the issue still present on RHEL5 as well as with the old curl on RHEL4?

Note that this whole bug report is very confusing, and it may be better to close it as NOTABUG and report it again for RHEL5 without the RHEL4 information, which is misleading here.
Comment 5 Martin Poole 2009-01-14 09:55:04 EST
The same bug is present in RHEL5 and RHEL4.

Which bit is confusing? The same code path is being used; RHEL5 just has a small part of the code needed for the fix.

Specifically RHEL5 copes with the case where the curl command is primarily handling a download.  The bug the customer reported is that it does not also handle upload cleanly.

I have confirmed the bug is present on the current released version on RHEL5.

I have tested the patch in this BZ and confirmed that it fixes the problem.
Comment 6 Jindrich Novy 2009-01-14 13:28:42 EST
Only this snippet from the bug description:

<snip>
3) With the information from 2) and looking at the code (especially the diffs
from 7.16.0 to 7.16.1), I came up with the following patch:

--- curl-7.12.1/lib/select.c    2008-11-28 04:19:25.000000000 -0500
+++ curl-7.12.1.new/lib/select.c        2008-11-28 04:13:38.000000000 -0500
@@ -82,7 +82,7 @@
  if (writefd != CURL_SOCKET_BAD) {
    if (pfd[num].revents & POLLOUT)
      ret |= CSELECT_OUT;
-    if (pfd[num].revents & POLLERR)
+    if (pfd[num].revents & (POLLERR|POLLHUP))
      ret |= CSELECT_ERR;
  }
</snip>

which says the patch was created from the diff between 7.16.0 and 7.16.1, while the patch header shows 7.12.1. But that's likely because it was then backported to RHEL4.

Another thing that confused me was this message right under it:
 
<snip>
However that does not appear to have solved the issue for the customer. 
</snip>

which made me believe the proposed patch doesn't fix the problem.

But you made it all clear in comment #5, so we can start looking for ACKs now.
Comment 9 RHEL Product and Program Management 2009-03-26 13:26:24 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 21 errata-xmlrpc 2010-03-30 04:04:29 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0273.html
