849177 – python-urlgrabber has timeout issue

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 849177 - python-urlgrabber has timeout issue

Summary: python-urlgrabber has timeout issue

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	python-urlgrabber
Sub Component:
Version:	6.6
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Valentina Mukhamedzhanova
QA Contact:	Martin Frodl
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1293363
TreeView+	depends on / blocked

Reported:	2012-08-17 15:24 UTC by Cris Rhea
Modified:	2016-05-10 21:34 UTC (History)
CC List:	7 users (show)
Fixed In Version:	python-urlgrabber-3.9.1-11.el6
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1293363 1310665 (view as bug list)
Environment:
Last Closed:	2016-05-10 21:34:32 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
strace for "curl" (53.13 KB, text/plain) 2013-04-25 17:53 UTC, Cris Rhea	no flags	Details
tcpdump for curl strace (14.19 KB, application/octet-stream) 2013-04-25 17:54 UTC, Cris Rhea	no flags	Details
strace with more options... (56.62 KB, text/plain) 2013-04-26 14:26 UTC, Cris Rhea	no flags	Details
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1305970	0	low	CLOSED	--disable-epsv option ignored for IPv6 hosts	2021-02-22 00:41:40 UTC
Red Hat Product Errata	RHEA-2016:0839	0	normal	SHIPPED_LIVE	python-urlgrabber bug fix update	2016-05-10 22:42:14 UTC

Internal Links: 1305970

Description Cris Rhea 2012-08-17 15:24:00 UTC

Description of problem: urlgrabber takes long time/timeout issue while wget/curl are sub-second.


Version-Release number of selected component (if applicable):

python-urlgrabber-3.9.1-8.el6.noarch.rpm

How reproducible:
I see this happen at all times on two systems. Running CentOS 6.3 x86_64.
yum prints timeout messages while doing simple operations like "yum update".
urlgrabber shows same behavior, which eliminates rest of yum package.

Steps to Reproduce:
1. Software repo uses anon-ftp for package download.
2. urlgrabber ftp://clustersw-212.mayo.edu/CentOS/6/os/x86_64/repodata/repomd.xml
3. Above command will succeed, but only after 1 min+. wget or curl are sub-second.
  
Actual results:
Appears code or related library has socket timeout issue. 

Expected results:
yum (and urlgrabber) able to access FTP repository at proper speed.

Additional info:

     0.000140 close(5)                  = 0
     0.000109 gettimeofday({1345216575, 228119}, NULL) = 0
     0.000097 clock_gettime(CLOCK_MONOTONIC, {150816, 341872742}) = 0
     0.000097 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
     0.000226 fcntl(5, F_GETFL)         = 0x2 (flags O_RDWR)
     0.000094 fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
     0.000089 connect(5, {sa_family=AF_INET, sin_port=htons(31574), sin_addr=inet_addr("129.176.212.87")}, 16) = -1 EINPROGRESS (Operation now in progress)
     0.000126 clock_gettime(CLOCK_MONOTONIC, {150816, 342506413}) = 0
     0.000098 poll([{fd=5, events=POLLOUT|POLLWRNORM}], 1, 299962) = 1 ([{fd=5, revents=POLLERR|POLLHUP}])
    62.999706 getsockopt(5, SOL_SOCKET, SO_ERROR, [8589934702], [4]) = 0
     0.000131 close(5)                  = 0
     0.000108 clock_gettime(CLOCK_MONOTONIC, {150879, 342549587}) = 0
     0.000120 sendto(4, "PASV\r\n", 6, MSG_NOSIGNAL, NULL, 0) = 6
     0.000137 clock_gettime(CLOCK_MONOTONIC, {150879, 342807916}) = 0

Comment 2 RHEL Program Management 2012-09-07 05:35:52 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.

Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.

Comment 3 Zdeněk Pavlas 2013-04-10 15:37:52 UTC

There's no explicit getsockopt() call in urlgrabber, and I have no idea why it blocks for 63 seconds, although the socket is O_NONBLOCK.

Note that the poll() call has returned {fd=5, revents=POLLERR|POLLHUP}, this indicates that the server has disconnected.  Could you provide a similiar strace of curl, too?  (tcpdump log would be nice, too).

Comment 4 Cris Rhea 2013-04-25 17:53:44 UTC

Created attachment 740006 [details]
strace for "curl"

Comment 5 Cris Rhea 2013-04-25 17:54:13 UTC

Created attachment 740007 [details]
tcpdump for curl strace

Comment 6 Cris Rhea 2013-04-25 17:57:59 UTC

Machine was updated to CentOS 6.4 (yum update) prior to running the newest tests.
Same issue occurs with lengthy delay in the middle...

wget of same URL is still sub-second.


curl is curl-7.19.7-36.el6_4.x86_64

Comment 7 Cris Rhea 2013-04-25 18:07:17 UTC

Just to be clear, NOW "curl" also has a 60+ second nap in the middle. So, whatever code changed between 6.3 -> 6.4 affected "curl" in addition to "urlgrabber".

Comment 8 Zdeněk Pavlas 2013-04-26 11:45:41 UTC

Thanks for the logs.  The FTP server (vsFTPd 2.0.1) does not support extended passive mode (it claims it does, but sends invalid responses).

     0.001410 recvfrom(3, "227 Entering Passive Mode (129,1"..., 16384, 0, NULL, NULL) = 52

According to RFC 2428:

   The first two fields contained in the parenthesis MUST be blank.  The
   third field MUST be the string representation of the TCP port number
   on which the server is listening for a data connection.

strace has truncated the response, and I can't connect to the FTP server, but the reply clearly seems to be bogus.  Curl however parsed port number 57566 somehow, and sent 7 tcp SYN packets that were silently dropped by the server (that's the timeout issue reported).

The curl handling of EPSV responses should be improved, and the client should use PASV immediately.  As a temporary workaround, you can disable it with --disable-epsv option.

In python-urlgrabber, a corresponding setopt(pycurl.FTP_USE_EPSV, 0) could be added to the _set_opts() method, but since that can break other use cases, I don't think it should be made permanent.

According to this http://curl.haxx.se/mail/lib-2011-02/0060.html the problem might not be in the server itself, but a nat/proxy that handles PASV responses fine, but mangles EPSV responses.

Comment 9 Zdeněk Pavlas 2013-04-26 12:29:57 UTC

Oh, sorry, swapped lines.. "227 Entering Passive Mode (129,1"... was the PASV response.  The problem is with the oher one:

     0.004663 recvfrom(3, "229 Entering Extended Passive Mo"..., 16384, 0, NULL, NULL) = 48

Could you re-run strace with "-s 64", to see the server reply in full?  Thanks.

Comment 10 Cris Rhea 2013-04-26 14:26:56 UTC

Created attachment 740437 [details]
strace with more options...

w/ strace -s 64

Comment 11 Zdeněk Pavlas 2013-04-29 08:49:01 UTC

> recvfrom(3, "229 Entering Extended Passive Mode (|||62284|)\r\n", 16384, 0, NULL, NULL) = 48
> connect(4, {sa_family=AF_INET, sin_port=htons(62284), sin_addr=inet_addr("129.176.212.87")}, 16) = -1 EINPROGRESS (Operation now in progress)
> poll([{fd=4, events=POLLOUT|POLLWRNORM}], 1, 299956) = 1 ([{fd=4, revents=POLLERR|POLLHUP}])

Ok, so the server reply was valid, and curl has correctly parsed it and tried to connect to port 62284.  There's no bug in the client side, neither in curl nor in the urlgrabber.

The connect has timed out (server or network dropping all SYN packets).  There's probably something wrong with the FTP server.

As I wrote before, this may be worked around by disabling EPSV in curl.  Could you confirm that "curl --disable-epsv ftp:///.." helps?

Comment 12 Cris Rhea 2013-04-29 14:10:11 UTC

Yes, adding the "--disable-epsv" does resolve the timeout issue. 

So, how does one configure Yum to do this?  

(Remember, my original issue is that the ability to perform yum installs/updates stopped in one of the 6.X updates. Yes, my FTP server is ancient: vsftpd-2.0.1-5.EL4.5.)

Comment 13 Zdeněk Pavlas 2013-04-29 16:13:45 UTC

I'm afraid you'll have to patch python-urlgrabber, as there's so such option yet, and I don't think we'll add one to support (presumably very few) broken FTP servers.

# patch /usr/lib/python2.7/site-packages/urlgrabber/grabber.py
--- urlgrabber-3.9.1/urlgrabber/grabber.py.old
+++ urlgrabber-3.9.1/urlgrabber/grabber.py
@@ -1159,6 +1159,9 @@ class PyCurlFileObject():
         if not opts:
             opts = self.opts
 
+        # --disable-epsv
+        self.curl_obj.setopt(pycurl.FTP_USE_EPSV, 0)
+
         # keepalives
         if not opts.keepalive:
             self.curl_obj.setopt(pycurl.FORBID_REUSE, 1)

Comment 14 Marc Riddle 2015-05-07 21:54:59 UTC

We have run into this same issue on our network. While there may be a limited number of FTP servers that have broken EPSV support, many enterprise firewalls (including Cisco ASRs, various CheckPoint products, and others) do not handle EPSV requests properly (as mentioned in comment #8). I suspect the number of users impacted by this is larger than you may expect. Consider this another request for this feature to be reconsidered, as we are unable to use yum with ftp repos at this point in time.

Comment 19 errata-xmlrpc 2016-05-10 21:34:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0839.html

Note You need to log in before you can comment on or make changes to this bug.