Bug 643656 - some FTP servers hang if curl closes the data connection too early
Summary: some FTP servers hang if curl closes the data connection too early
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: curl
Version: 14
Hardware: All
OS: Linux
low
medium
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-10-16 20:23 UTC by Pasi Karkkainen
Modified: 2014-01-21 23:04 UTC (History)
8 users (show)

Fixed In Version: curl-7.20.1-5.fc13
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-28 20:41:49 UTC
Type: ---


Attachments (Terms of Use)
fix proposed by Daniel Stenberg (1.76 KB, patch)
2010-10-17 23:11 UTC, Kamil Dudka
no flags Details | Diff
lzma compressed updates floppy image for Fedora 14 Anaconda (156.72 KB, application/octet-stream)
2010-10-18 18:05 UTC, Kamil Dudka
no flags Details
gzip compressed cpio archive with patched libcurl (x86_64) (542.99 KB, application/octet-stream)
2010-10-19 20:22 UTC, Kamil Dudka
no flags Details
gzip compressed cpio archive with patched libcurl (x86_64) (51 bytes, text/plain)
2010-10-19 21:40 UTC, Kamil Dudka
no flags Details
gzip compressed cpio archive with patched libcurl (x86_64) (543.05 KB, application/octet-stream)
2010-10-19 21:51 UTC, Kamil Dudka
no flags Details
/tmp/anaconda.log from my attempt (10.55 KB, text/plain)
2010-10-19 21:54 UTC, Kamil Dudka
no flags Details

Description Pasi Karkkainen 2010-10-16 20:23:49 UTC
Description of problem:
Curl fails to fetch files from some FTP servers if "--range" parameter is used.
Specifying --range makes curl use the ftp "REST" command to fetch the specified range of bytes.

Version-Release number of selected component (if applicable):
# rpm -qa|grep -i curl
libcurl-7.20.1-4.fc13.x86_64
curl-7.20.1-4.fc13.x86_64
libcurl-devel-7.20.1-4.fc13.x86_64
python-pycurl-7.19.0-7.fc13.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Run the command:
curl -o iputils-20071127-10.fc13.x86_64.rpm --range 1384-1400 ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/13/Fedora/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm

 
Actual results:
curl gives error: "(28) FTP response timeout" after around one minute.

Expected results:
File is transferred OK.

Additional info:
Removing the "--range" parameter makes curl work OK without problems.

This bug causes ftp:// URL installation of Fedora fail, since Fedora Anaconda installer urlgrabber uses PyCurl, which uses curl.

The FTP server in question (ftp.funet.fi) runs Pure-ftpd.

Comment 1 Pasi Karkkainen 2010-10-16 20:26:22 UTC
More info about the Fedora Anaconda installation failures here: https://bugzilla.redhat.com/show_bug.cgi?id=624431

Basicly Anaconda wants to skip the headers of rpm files using the range method, which makes curl use the ftp REST command, and thus triggers this bug and installation fails.

Comment 2 Kamil Dudka 2010-10-16 23:12:35 UTC
It looks like a server bug to me.  The server cowardly refuses to respond if we close the data channel before the server has sent entire file.  It does not seem to be anyhow related to REST.

$ curl -svo/dev/null --range 0-7 ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/13/Fedora/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm

...
* DO phase is complete
* DO-MORE phase starts
* FTP RANGE from 0 getting 8 bytes
* range-download from 0 to 7, totally 8 bytes
> TYPE I
* FTP 0x62e4c8 state change from STOP to RETR_TYPE
< 200 TYPE is now 8-bit binary
> SIZE iputils-20071127-10.fc13.x86_64.rpm
* FTP 0x62e4c8 state change from RETR_TYPE to RETR_SIZE
< 213 121012
> RETR iputils-20071127-10.fc13.x86_64.rpm
* FTP 0x62e4c8 state change from RETR_SIZE to RETR
< 150-Accepted data connection
< 150 118.2 kbytes to download
* Maxdownload = 8
* Getting file with size: 8
* FTP 0x62e4c8 state change from RETR to STOP
* DO-MORE phase ends with 0
{ [data not shown]
* Remembering we are in dir "pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/13/Fedora/x86_64/os/Packages/"

Then it continues to hang with the following backtrace:

#0  0x0000003a112d7e13 in poll () from /lib64/libc.so.6
#1  0x00007ffff7dba310 in Curl_socket_ready (readfd=7, writefd=-1, timeout_ms=1000) at select.c:217
#2  0x00007ffff7d83f14 in Curl_GetFTPResponse (nreadp=0x7fffffffd8c8, conn=0x62dff8, ftpcode=0x7fffffffd8f8) at ftp.c:505
#3  0x00007ffff7d89063 in ftp_done (conn=0x62dff8, status=CURLE_OK, premature=false) at ftp.c:3113
#4  0x00007ffff7d97748 in Curl_done (connp=0x7fffffffd990, status=CURLE_OK, premature=false) at url.c:5167
#5  0x00007ffff7da2722 in Curl_do_perform (data=0x623568) at transfer.c:2118
#6  0x00007ffff7da287b in Curl_perform (data=0x623568) at transfer.c:2188
#7  0x00007ffff7da33af in curl_easy_perform (curl=0x623568) at easy.c:567
#8  0x0000000000408e0f in operate (argc=0, argv=<value optimized out>) at main.c:5445
#9  main (argc=0, argv=<value optimized out>) at main.c:5775


If I stop curl for a second just before closing the data connection, the server is happy:

(gdb) break ftp.c:3095
(gdb) run

Breakpoint 1, ftp_done (conn=0x62dff8, status=CURLE_OK, premature=false) at ftp.c:3095
3095          sclose(conn->sock[SECONDARYSOCKET]);

(gdb) list
3090
3091          /* Note that we keep "use" set to TRUE since that (next) connection is
3092             still requested to use SSL */
3093        }
3094        if(CURL_SOCKET_BAD != conn->sock[SECONDARYSOCKET]) {
3095          sclose(conn->sock[SECONDARYSOCKET]);
3096          conn->sock[SECONDARYSOCKET] = CURL_SOCKET_BAD;
3097        }
3098      }
3099

(gdb) continue
< 226-File successfully transferred
< 226 0.288 seconds (measured here), 410.81 Kbytes per second
* Connection #0 to host ftp.funet.fi left intact
> QUIT
* FTP 0x62e4c8 state change from STOP to QUIT
< 221-Goodbye. You uploaded 0 and downloaded 119 kbytes.
< 221 Logout.
* FTP 0x62e4c8 state change from QUIT to STOP
* Closing connection #0

Comment 3 Pasi Karkkainen 2010-10-16 23:33:26 UTC
Thanks for looking at it!

What do you suggest to do about it? I'm sure ftp.funet.fi is not the only server running pure-ftpd.

I've tried every ftp client I can think of, and I've only seen this problem with curl.. wget, firefox, lftp, ftp, ncftp, etc all work just fine with ftp.funet.fi.

Some kind of workaround in curl would be nice..

Comment 4 Kamil Dudka 2010-10-16 23:42:58 UTC
(In reply to comment #3)
> I've tried every ftp client I can think of, and I've only seen this problem
> with curl.. wget, firefox, lftp, ftp, ncftp, etc all work just fine with
> ftp.funet.fi.

How can I limit amount of the data to download in those clients?

Comment 5 Pasi Karkkainen 2010-10-17 10:39:44 UTC
Well, I don't know if you can limit it actually with those clients. 

So ok, it might be just a broken server, but still workaround in the client would be nice..

Especially when this issue makes people unable to install Fedora from the official mirror!

Comment 6 Kamil Dudka 2010-10-17 11:32:59 UTC
Sure, workaround would be nice.  That's why I am asking how other clients solve the problem.  So far I am not aware of any existing solution.  The following two choices come into consideration:

1) Download more data than we actually need, in order to get the response from server.  This can be as well achieved at application level.

2) Do not wait for response from server if we're downloading only part of file and already have received enough data, and move forward to QUIT command immediately.  Such a solution is faster than 1) but isn't RFC compliant and therefore should not be enabled by default.  We would need to introduce a new option for it.

3) Any other ideas?

Comment 7 Pasi Karkkainen 2010-10-17 12:51:13 UTC
Yeah, or just download the whole file in Anaconda installer. Skipping 1483 bytes per rpm doesn't save much when we're installing the whole Fedora distro anyway. I don't know how likely it is to get the Anaconda people to change that behaviour though.. 

Thinking more about this. During Fedora installation anaconda wants to skip the header from the rpm files, and thus it issues REST command to start the download from specific offset (header size). All of the file after that offset will be downloaded. So there's no end of the range specified. If you test with range "0-8" you get a different problem, no?

I tried to play with wget. I downloaded 10 MB from a 100MB file, and then hit ctrl+c to quit the download. Then I did "wget -c" to continue the download, which makes wget use REST to continue from the offset it already has downloaded, but that seemed to work OK without problems.

In the wireshark capture of the Fedora Anaconda FTP session I can see it starts the download from offset 1483 with REST, and then it fails after one minute (timeout). That's the case that needs to be worked around.

Hopefully I didn't miss something.

Comment 8 Kamil Dudka 2010-10-17 13:28:23 UTC
As I said, REST is not problem here.  Problem is that data channel is closed before the server sends what it wanted to send.  Pick your original example and remove the upper bound from there: "--range 1384-".  Then it works in the same way as wget works.

Once again, skipping first N bytes of a file does not seem to be a problem here.

Comment 9 Pasi Karkkainen 2010-10-17 13:45:04 UTC
Ok, please forget my example, since that's not the exact case what happens with Anaconda! I was just trying to make up a similar easy-to-reproduce example.

Here's actual wireshark capture of the failing Anaconda/curl session:
http://pasik.reaktio.net/fedora/f13-install-port21.thark

Per my understanding it's trying to download the rpm, just skipping first N bytes (the header).

Comment 10 Pasi Karkkainen 2010-10-17 13:46:16 UTC
And I forgot to mention you should go to the end of that wireshark capture.. the failing rpm (iputils) is there.

Comment 11 Kamil Dudka 2010-10-17 14:03:50 UTC
Skipping first N bytes works for me with curl and the mentioned FTP server.

Does Anaconda anyhow limit the amount of data to download?

Comment 12 Pasi Karkkainen 2010-10-17 16:19:05 UTC
Indeed:

curl -svo/dev/null --range 1384-121012 ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/13/Fedora/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm

Seems to work. 121012 is the size of the file in question. So anaconda must do something else..

Comment 13 Daniel Stenberg 2010-10-17 21:41:43 UTC
A possible fix for curl that isn't too shabby would be to send an ABOR command when it has closed the data transfer, as that explicitly instructs the server to close the data connection.

Comment 14 Kamil Dudka 2010-10-17 23:01:27 UTC
Daniel, thanks for looking at the issue.  Could we rely on the support of ABOR by server?  I know it's in RFC959, but it might cause problems with some other FTP servers.  Is it safe to use it by default for range FTP transfers?

Comment 15 Kamil Dudka 2010-10-17 23:11:49 UTC
Created attachment 453972 [details]
fix proposed by Daniel Stenberg

The attached patch solves the problem from comment #0 and does not seem to break anything else at first glance...

Comment 16 Daniel Stenberg 2010-10-18 06:56:12 UTC
curl hasn't used ABOR before so I've not really experienced its use in a wide scale deployment but I'm not aware of any documented or known flaws with it.

Comment 17 Kamil Dudka 2010-10-18 07:20:05 UTC
Pasi, are you able to verify that the patch solves the Anaconda problem?

Comment 18 Pasi Karkkainen 2010-10-18 07:30:54 UTC
Good question. I guess testing requires building custom Fedora installer with the patched curl included.

Or I wonder if I could replace the curl libs from a running anaconda session..

Is there an easier way to try patched curl with anaconda?

Comment 19 Kamil Dudka 2010-10-18 07:43:25 UTC
It should be possible to create a floppy image and pass the "update" during boot.  Chris, is it documented somewhere?

Comment 20 Kamil Dudka 2010-10-18 07:43:48 UTC
s/update/updates/

Comment 21 Chris Lumens 2010-10-18 15:06:00 UTC
http://fedoraproject.org/wiki/Anaconda/Updates

Just create a disk image with the .so files in the root of it, and anaconda will deal with the rest.

Comment 22 Kamil Dudka 2010-10-18 18:05:18 UTC
Created attachment 454179 [details]
lzma compressed updates floppy image for Fedora 14 Anaconda

Thanks Chris.  That's exactly what I was looking for.  Attached is a floppy image with patched libcurl (x86_64).  I didn't test it myself yet...

Comment 23 Adam Williamson 2010-10-19 18:28:41 UTC
Proposing as NTH in case we want to pull this into RC1. Can everyone please vote on whether they think we should or not?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 24 Kamil Dudka 2010-10-19 20:22:08 UTC
Created attachment 454433 [details]
gzip compressed cpio archive with patched libcurl (x86_64)

I had created FAT image, which Anaconda is not able to mount.  Will try the recommended cpio archive instead...

Comment 25 Pasi Karkkainen 2010-10-19 20:27:44 UTC
I'm able to try it tomorrow.. maybe others can try it before that.

Comment 26 Adam Williamson 2010-10-19 20:29:27 UTC
when I asked about NTH status in #anaconda , clumens said 'seems reasonable'. any other votes?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 27 Adam Williamson 2010-10-19 20:43:02 UTC
I want to test this, but I'm not sure how: funet.fi doesn't seem to have the stage2 images for F14 on it anywhere, so how can I set up an FTP install of F14 from funet.fi to test?



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 28 Pasi Karkkainen 2010-10-19 20:50:32 UTC
There's F14 beta: ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/test/14-Beta/Fedora/x86_64/os/

Does that help?

Comment 29 Adam Williamson 2010-10-19 21:01:52 UTC
i'm not sure if anaconda 14.19 stage1 can live with stage2 from whatever anaconda we had on 14 Beta, but I can try, I guess...



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 30 Kamil Dudka 2010-10-19 21:40:39 UTC
Created attachment 454451 [details]
gzip compressed cpio archive with patched libcurl (x86_64)

The cpio archive has to be created with 'cpio -H newc', so that Anaconda is able to digest it...

Comment 31 Kamil Dudka 2010-10-19 21:50:36 UTC
So far I can't confirm the Anaconda problem is fixed.  It gives me "(28) FTP response timeout" while downloading ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/test/14-Beta/Fedora/x86_64/os/repodata/repomd.xml in both cases.

If I switch to console and fetch the same URL with urlgrabber, it works just fine.  I suspect that Anaconda refuses to use my patched library for some reason, but have no idea how to check if it does.

Comment 32 Kamil Dudka 2010-10-19 21:51:41 UTC
Created attachment 454454 [details]
gzip compressed cpio archive with patched libcurl (x86_64)

Comment 33 Kamil Dudka 2010-10-19 21:54:19 UTC
Created attachment 454456 [details]
/tmp/anaconda.log from my attempt

Comment 34 Pasi Karkkainen 2010-10-20 18:44:57 UTC
I just burned F14 Beta x86_64 CD1 to a CDR, and booted with "linux askmethod" and tried installing using URL ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/test/14-Beta/Fedora/x86_64/os/ .

The installation failed in the usual way, couldn't fetch rpms from the server.

Then I tried the updates.img from comment #32, but anaconda isn't able to mount it. There's an error about unsupported/invalid filesystem type.

I tried converting that to ext2 image, and then anaconda is able to mount it, but it doesn't seem to copy the included files to the system..

Should there be usr/lib64/ directory on the image, and the .so files in that directory? 

/mnt/runtime seems to be mounted read-only, so I cannot manually copy the libs there during installation..

Comment 35 Kamil Dudka 2010-10-20 18:58:18 UTC
(In reply to comment #34)
> Then I tried the updates.img from comment #32, but anaconda isn't able to mount
> it. There's an error about unsupported/invalid filesystem type.

That's just an error that Anaconda issues to scary users.  It worked for me (means unpacked the files) even with that error in the log, as you can see in the attachment #454456 [details] above.

> Should there be usr/lib64/ directory on the image, and the .so files in that
> directory? 

Nope, they were in /tmp/updates and the directory was included in LD_LIBRARY_PATH.

Comment 36 Pasi Karkkainen 2010-10-20 19:24:42 UTC
Hmm, true, files are copied to /tmp/updates/ .

running "export" in the shell doesn't show /tmp/updates in LD_LIBRARY_PATH though.. ? Maybe it's different inside the anaconda/python session.

I noticed one difference when the updates.img is in use! Anaconda still fails to fetch the rpms, but clicking Retry actually fixes it! It's just a bit annoying to click Retry for every rpm :)

Dunno why it doesn't work on the first try..

Anyway, I think that behaviour is different from without updates.img.. there it couldn't fetch the rpms even if I clicked Retry.

Comment 37 Pasi Karkkainen 2010-10-20 19:30:03 UTC
Running "lsof -n | grep -i curl" shows that anaconda actually is using the library from /tmp/updates/, so that's fine.

After clicking Retry around 20 times I got to an rpm that now fails and Retry doesn't help.. so I guess there's still a problem with curl..

Comment 38 Kamil Dudka 2010-10-20 19:47:03 UTC
It would be good to have a minimal example.  It's nearly impossible to debug the Anaconda code in place.  The original problem reported in comment #0 is addressed by the update, isn't it?

My guess is that Anaconda is doing something different than comment #0 suggests.

Comment 39 Pasi Karkkainen 2010-10-20 20:25:28 UTC
Rpm fetch errors happen during the "Starting installation process" phase.. clicking Retry for every error seemed to work this time.. I had to click Retry *many* times.. and then F14 installed OK.

I have a tshark capture of the installation session here:
http://pasik.reaktio.net/fedora/f14/ftp-problem/f14-test1-funet.tshark

And shorter with only port 21 traffic:
http://pasik.reaktio.net/fedora/f14/ftp-problem/f14-test1-port21.tshark

Take a look at that port21.tshark .. search for "rsyslog-4.6.3-1.fc14.x86_64.rpm".

First anaconda fetched it starting from offset 1384, and later the full file starting from the beginning .. 

Also search for "iputils-20100418-3.fc14.x86_64.rpm", or for "iproute-2.6.35-2.fc14.x86_64.rpm".

First:

12.370205 192.168.0.100 -> 193.166.3.2  FTP Request: MDTM iproute-2.6.35-2.fc14.x86_64.rpm

And then:

 13.533264 192.168.0.100 -> 193.166.3.2  FTP Request: MDTM iproute-2.6.35-2.fc14.x86_64.rpm
 13.562864 192.168.0.100 -> 193.166.3.2  FTP Request: SIZE iproute-2.6.35-2.fc14.x86_64.rpm
 13.568473  193.166.3.2 -> 192.168.0.100 FTP Response: 213 359116
 13.568660 192.168.0.100 -> 193.166.3.2  FTP Request: REST 1384
 13.574364  193.166.3.2 -> 192.168.0.100 FTP Response: 350 Restarting at 1384
 13.574548 192.168.0.100 -> 193.166.3.2  FTP Request: RETR iproute-2.6.35-2.fc14.x86_64.rpm
 13.580205  193.166.3.2 -> 192.168.0.100 FTP Response: 150-Accepted data connection
 13.593034 192.168.0.100 -> 193.166.3.2  FTP Request: ABOR
 13.618404  193.166.3.2 -> 192.168.0.100 FTP Response: 150 0.038 seconds (measured here), 3.30 Mbytes per second
 13.618420  193.166.3.2 -> 192.168.0.100 FTP Response: 226 Since you see this ABOR must've succeeded


And then later:

149.238505 192.168.0.100 -> 193.166.3.2  FTP Request: MDTM iproute-2.6.35-2.fc14.x86_64.rpm
149.394120  193.166.3.2 -> 192.168.0.100 FTP Response: 213 20100818191014
149.394385 192.168.0.100 -> 193.166.3.2  FTP Request: PASV
149.405007  193.166.3.2 -> 192.168.0.100 FTP Response: 227 Entering Passive Mode (193,166,3,2,153,123)
149.411402 192.168.0.100 -> 193.166.3.2  FTP Request: SIZE iproute-2.6.35-2.fc14.x86_64.rpm
149.417034  193.166.3.2 -> 192.168.0.100 FTP Response: 213 359116
149.417234 192.168.0.100 -> 193.166.3.2  FTP Request: RETR iproute-2.6.35-2.fc14.x86_64.rpm
149.422974  193.166.3.2 -> 192.168.0.100 FTP Response: 150-Accepted data connection
149.466546  193.166.3.2 -> 192.168.0.100 FTP Response: 226-File successfully transferred


I wonder if it's that first MDTM that fails and makes Anaconda present that Retry/reboot dialog? I was clicking Retry pretty quickly so the timestamps match with that.. I can't actually remember for sure if I got error dialog about exactly that iproute-2.6.35-2.fc14.x86_64.rpm but probably, since most rpms failed during the "Starting installation process" phase..

Comment 40 Kamil Dudka 2010-10-21 13:16:34 UTC
(In reply to comment #39)
> Rpm fetch errors happen during the "Starting installation process" phase..

You were definitely further.  In my case, Anaconda didn't let me to even add the repo.

> I wonder if it's that first MDTM that fails and makes Anaconda present that

What exactly you mean by "MDTM fails"?

Comment 41 Kamil Dudka 2010-10-21 13:25:21 UTC
Is the error specific to Anaconda?  Could it be somehow reproduced with yum only?

Comment 42 Chris Lumens 2010-10-21 14:51:14 UTC
Attempting to separate out the downloading code is an exercise in madness.  What I usually do is first try things using urlgrabber from within python on tty2 and if that fails to show the problem, I start stepping through the code myself.  It's extremely tedious.

Comment 43 Kamil Dudka 2010-10-21 15:14:10 UTC
As stated in comment #31 , I already tried urlgrabber on tty2.  I picked up the URL that Anaconda GUI was complaining about and gave it to urlgrabber.  It worked fine on tty2.  Any other ways to debug it?

I also tried to set ftp://ftp.funet.fi as update repository on my workstation and perform 'yum update'.  It successfully downloaded bunch of files from there without any problems.

Comment 44 Pasi Karkkainen 2010-10-21 18:07:53 UTC
Kamil: Yeah actually I'm not sure if that "MDTM" fails, but at least that's executed twice.. and time diffence matches what it takes for me to click "Retry". So I was just speculating..

Chris: wget/curl/urlgrabber works OK, because those grab the whole file when you ran them from cmdline. If you check the wireshark capture above (comment 39) you can see the "REST 1384" stuff etc.. so anaconda seems to do various "non-basic" things..

Comment 45 Chris Lumens 2010-10-21 19:04:03 UTC
anaconda downloads the package headers before installing the packages in order to do transaction checks, but so does yum.  anaconda has very little special downloading code in it - most things are just pass-through to yum or handled by yum entirely.

Comment 46 Pasi Karkkainen 2010-10-21 19:16:40 UTC
These failures have *never* happened with yum in an already installed system, but they're 100% reproducible with anaconda installer. I don't really know what's the difference..

Comment 47 Pasi Karkkainen 2010-10-21 19:40:41 UTC
I tried installing F14 Beta again and now it failed in a different way. During the "Starting installation process" phase I get error about rsyslog and iputils rpms, but both of those worked after clicking Retry button. Then selinux-policy-targeted rpm failed, but Retry didn't help for that one.. I clicked Retry multiple times but anaconda couldn't fetch it. 

So now the logs are shorter. Maybe someone understands what's happening. 
"selinux-policy-targeted-3.9.3-1.fc14.noarch.rpm" is the failing rpm that couldn't be fetched, even after multiple Retry attemps.

tshark/wireshark capture:
http://pasik.reaktio.net/fedora/f14/ftp-problem/f14-test2-all.tshark

And shorter version with only port 21 traffic:
http://pasik.reaktio.net/fedora/f14/ftp-problem/f14-test2-port21.tshark

I boot the test-box from F14 Beta x86_64 CD1 by appending "askmethod updates=http://server/updates.img" to the default boot options. I choose URL install and use ftp://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/test/14-Beta/Fedora/x86_64/os/ as the install URL. Should be easy to reproduce by others. I'm using DHCP for Internet access.

Comment 48 seth vidal 2010-10-21 19:41:15 UTC
Yum doesn't download the pkg headers to do file conflict checking. It just downloads the pkgs.

Anaconda downloads the headers and then downloads the pkgs when it needs them.
That is, almost undoubtedly, where the difference lies. B/c the pkg hdr is a byte-range.

Comment 49 Chris Lumens 2010-10-21 19:48:03 UTC
My point is that the header downloading code is not in anaconda.  From pyanaconda/yuminstall.py:

    def downloadHeader(self, po):
        while True:
            # retrying version of download header
            try:
                YumSorter.downloadHeader(self, po)
                break
            except yum.Errors.NoMoreMirrorsRepoError:
                self._handleFailure(po)
            except IOError:
                self._handleFailure(po)
            except yum.Errors.RepoError, e:
                continue

where YumSorter is just yum.YumBase.

Comment 50 seth vidal 2010-10-21 19:55:14 UTC
Yes - I wasn't saying it was anaconda-specific.

I was saying the code which triggers this is only ever exercised by anaconda.

Pasi - I can show you what to call from your system via yum to test the same code path.

Comment 51 Pasi Karkkainen 2010-10-21 20:01:06 UTC
Seth: Ok, feel free :)

Comment 52 Kamil Dudka 2010-10-21 20:06:32 UTC
(In reply to comment #50)
> Pasi - I can show you what to call from your system via yum to test the same
> code path.

Great!  That would be much appreciated.  I am also interested.  TIA.

Comment 53 seth vidal 2010-10-21 20:09:52 UTC
from python or ipython run this (as root)


import yum
my = yum.YumBase()
pkg = my.pkgSack.searchNevra(name='zziplib')[0] # replace zziplib with whatever
my.downloadHeader(pkg)
print(pkg.localHdr())

Comment 54 Pasi Karkkainen 2010-10-25 12:20:37 UTC
Here we go. This is on Fedora 13, no extra patches added to curl. I modified /etc/yum.repos.d/fedora.repo and fedora-updates.repo to only use ftp://ftp.funet.fi mirror.

[root@f13 ~]# python
Python 2.6.4 (r264:75706, Jun  4 2010, 18:20:31)
[GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import yum
>>> my = yum.YumBase()
>>> pkg = my.pkgSack.searchNevra(name='iputils')[0]
Loaded plugins: refresh-packagekit
>>> my.downloadHeader(pkg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 1917, in downloadHeader
    raise Errors.RepoError, saved_repo_error
yum.Errors.NoMoreMirrorsRepoError: failure: Packages/iputils-20071127-10.fc13.x86_64.rpm from fedora: [Errno 256] No more mirrors to try.
>>> print(pkg.localHdr())
/var/cache/yum/x86_64/13/fedora/headers/iputils-20071127-10.fc13.x86_64.hdr
>>>
[root@f13 ~]#


"my.downloadHeader(pkg)" takes around 1 minute before it errors out.
wireshark/tshark capture of the failing session here:

http://pasik.reaktio.net/fedora/f14/ftp-problem/f13-testyum-all.tshark

Comment 55 seth vidal 2010-10-25 16:56:32 UTC
okay - so we should be able to drill down one layer deeper and make this request happen SOLELY with urlgrabber.

I'm going to see if I can make something wonky happen and post the trial code for you to repeat it.

thanks

Comment 56 seth vidal 2010-10-25 17:09:20 UTC
okay

try this in ipython
import urlgrabber
ug = urlgrabber.grabber.URLGrabber()
help(ug.urlgrab)
ug.urlgrab(url='ftp://ftp.funet.fi/ftp/pub/mirrors/ftp.redhat.com/pub/fedora/linux/releases/13/Everything/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm', reget=None, start=1384, end=18096, size=16712)



I did and i got:
URLGrabError: [Errno 14] Downloaded more than max size for : 18824 > 16712

Comment 57 Kamil Dudka 2010-10-25 17:53:00 UTC
Thanks!  Could you please explain more verbosely what the code is supposed to do?  It does not seem to use CURLOPT_RANGE at all:

gdb -q --args python bz643656.py

(gdb) break curl_easy_setopt
Breakpoint 1 at 0x3db99a8: file easy.c, line 397.

(gdb) run
[Thread debugging using libthread_db enabled]

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_ERRORBUFFER) at easy.c:397
397     {
(gdb) cont

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_PRIVATE) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_NOPROGRESS) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_VERBOSE) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_FTP_ACCOUNT) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_USERAGENT) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_NOPROGRESS) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_NOSIGNAL) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_WRITEFUNCTION) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_FILE) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_HEADERFUNCTION) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_WRITEHEADER) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_PROGRESSFUNCTION) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_PROGRESSDATA) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_FAILONERROR) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_FILETIME) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_FOLLOWLOCATION) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_USERAGENT) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_FOLLOWLOCATION) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_MAXREDIRS) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_CONNECTTIMEOUT) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_LOW_SPEED_LIMIT) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_LOW_SPEED_TIME) at easy.c:397
397     {
(gdb)

Breakpoint 1, curl_easy_setopt (curl=0x82604c0, tag=CURLOPT_URL) at easy.c:397
397     {
(gdb)
[New Thread 0xb7d9db70 (LWP 4318)]
[Thread 0xb7d9db70 (LWP 4318) exited]
[New Thread 0xb7d9db70 (LWP 4319)]
[Thread 0xb7d9db70 (LWP 4319) exited]
Traceback (most recent call last):
  File "bz643656.py", line 3, in <module>
    ug.urlgrab(url='ftp://ftp.funet.fi/ftp/pub/mirrors/ftp.redhat.com/pub/fedora/linux/releases/13/Everything/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm', reget=None, start=1384, end=18096, size=16712)
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 982, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 886, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 968, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1063, in __init__
    self._do_open()
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1351, in _do_open
    self._do_grab()
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1481, in _do_grab
    self._do_perform()
  File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1323, in _do_perform
    raise err
urlgrabber.grabber.URLGrabError: [Errno 14] Downloaded more than max size for : 29416 > 16712

Program exited with code 01.

Comment 58 seth vidal 2010-10-25 17:55:26 UTC
okay  - I tested multiple other ftp servers for the same file and using the latest version of urlgrabber and they all output the same error - giving back more data than asked for.

I'm going to be blunt and maybe not kind here but:

Maybe it is time for us to put ftp out to pasture.

Comment 59 seth vidal 2010-10-25 18:01:52 UTC
you're right - that example code was incorrect. I was combining two different interfaces in my head.

Comment 60 seth vidal 2010-10-25 18:03:42 UTC
here's the right example

import urlgrabber

ug = urlgrabber.grabber.URLGrabber()
url = 'ftp://ftp.funet.fi/ftp/pub/mirrors/ftp.redhat.com/pub/fedora/linux/releases/13/Everything/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm'

ug.urlgrab(url=url, reget=None, range=(1384,18096), size=16712)


on non-funet ftp servers it appears to work.

Comment 61 seth vidal 2010-10-25 18:05:02 UTC
and for a bit more explanation

range=(start, end) sets pycurl.RANGE

Comment 62 seth vidal 2010-10-25 18:08:03 UTC
So comment 58 was incorrect b/c i was running the wrong test. Sorry for that.

However, the conclusion in comment 58 may still bear some consideration. Why do we continue supporting ftp?

Comment 63 Kamil Dudka 2010-10-25 18:14:32 UTC
(In reply to comment #60)
> here's the right example
> 
> import urlgrabber
> 
> ug = urlgrabber.grabber.URLGrabber()
> url =
> 'ftp://ftp.funet.fi/ftp/pub/mirrors/ftp.redhat.com/pub/fedora/linux/releases/13/Everything/x86_64/os/Packages/iputils-20071127-10.fc13.x86_64.rpm'
> 
> ug.urlgrab(url=url, reget=None, range=(1384,18096), size=16712)
> 
> 
> on non-funet ftp servers it appears to work.

Right, however that failure seems to be already addressed by attachment #453972 [details] -- I am not able to repeat the failure with updated libcurl:

$ curl -s https://bugzilla.redhat.com/attachment.cgi?id=454454 \
    | gzip -cd \
    | cpio -idv
libcurl.so.4.2.0
pkgconfig
pkgconfig/libcurl.pc
libcurl.so
libcurl.so.4
2340 blocks

$ LD_PRELOAD=./libcurl.so python /tmp/bz643656.py

I have no idea why it works on my box (and Anaconda's tty2), but not with the Anaconda GUI.  Any idea?

Comment 64 Adam Williamson 2010-10-25 20:31:36 UTC
"However, the conclusion in comment 58 may still bear some consideration. Why do
we continue supporting ftp?"

I guess because it's the worst option except for all the others. there aren't a lot of public rsync servers, and http is hardly a better file transfer protocol than ftp...



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 65 seth vidal 2010-10-25 20:36:45 UTC
http is vastly better if only b/c the servers have something that vaguely resembles a standard and handling the error codes from http seems to be at least understood better than ftp.

Comment 67 Kamil Dudka 2010-10-27 09:32:26 UTC
fix proposed upstream:

http://curl.haxx.se/mail/lib-2010-10/0196.html

Comment 68 Pasi Karkkainen 2010-10-27 13:31:43 UTC
Seth: Many mirrors prefer/suggest FTP as the primary/best access method. It's easier to queue multiple files with FTP etc.. 

http mirrors have their problems aswell. I remember virt-install failing because Fedora mirror had customized http error pages for non-existing files..

Anyway, maybe we should keep on topic :)

Kamil: Can we get that proposed fix to curl rpm in Fedora? Is it too late for F14?

Comment 69 Adam Williamson 2010-10-27 21:19:09 UTC
"Is it too late for F14?"

Yes. It went gold yesterday. Practically speaking, we stopped taking non-blocker fixes as soon as RC1 went out.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 70 Kamil Dudka 2010-10-27 22:32:11 UTC
(In reply to comment #68)
> Kamil: Can we get that proposed fix to curl rpm in Fedora? Is it too late for
> F14?

The patch solves the problem you originally reported, so that I considered it useful and proposed it upstream.  It's pending a review now.  If no problem appears with it, I'll build it for Fedora.

As for Anaconda, the patch didn't solve the problem AFAIK.  We should track the issue separately, hopefully with some help of Anaconda developers.  I am not really a python programmer to dive into its code.

Comment 71 Kamil Dudka 2010-10-29 11:14:57 UTC
fixed in curl-7.21.2-3.fc15

Comment 72 Pasi Karkkainen 2010-10-29 13:08:04 UTC
Ok, thanks. I'll try with curl-7.21.2-3.fc15 and see how it goes.

Comment 73 Pasi Karkkainen 2010-10-30 12:11:06 UTC
Here's my test report.

F13 with the following default rpms:
# rpm -qa|grep -i curl
libcurl-7.20.1-4.fc13.x86_64
curl-7.20.1-4.fc13.x86_64
python-pycurl-7.19.0-7.fc13.x86_64

Seth's python test from comment #60 fails with this default F13 setup. After one minute there's the usual timeout error 28 and download fails. 

Then I grabbed curl-7.21.2-3.fc15 from koji and rebuilt it for F13:

# rpm -qa|grep -i curl
libcurl-7.21.2-3.fc13.x86_64
curl-7.21.2-3.fc13.x86_64
python-pycurl-7.19.0-7.fc13.x86_64

And now Seth's test from comment #60 works OK without problems! It downloads 16712 bytes from ftp.funet.fi.

So at least one problem fixed. I wonder if we should re-test with anaconda aswell..

Comment 74 Kamil Dudka 2010-10-30 15:13:52 UTC
Pasi, thank you for testing the package.

(In reply to comment #73)
> So at least one problem fixed. I wonder if we should re-test with anaconda
> aswell..

I am afraid we already did.  It didn't solve the problem, at least not through the updates image.  So there are two possibilities:

1) The problem was solved, but we tested it in a wrong way.

2) There persists another problem that breaks Anaconda downloads.  It may be in Anaconda code, but also in yum, urlgrabber, pycurl, or libcurl.  We don't have any standalone test-case for this one.

Comment 75 Adam Williamson 2010-11-01 23:03:53 UTC
Removing F14 NTH status as F14 is now out.

Remaining open nice-to-have issues do NOT automatically become nice-to-have issues for Fedora 15. If you believe a Fedora 14 issue which was accepted as
nice-to-have but not resolved in time for release should also qualify for
nice-to-have status for Fedora 15, please re-propose it as nice-to-have for
Fedora 15.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 76 Kamil Dudka 2010-11-05 12:52:42 UTC
Pasi, could you please retest with attachment #458054 [details] ?

It works for me with ftp://ftp.funet.fi this time...

Comment 77 Pasi Karkkainen 2010-11-06 12:40:23 UTC
Yeah sure, I can test again. Did you try with F14 final?

Comment 78 Kamil Dudka 2010-11-06 18:14:41 UTC
(In reply to comment #77)
Did you try with F14 final?

yes

Comment 79 Kamil Dudka 2010-11-09 13:39:24 UTC
possibly related issues:

bug 649347
bug 650255

Comment 80 Kamil Dudka 2010-11-09 17:33:18 UTC
fixed in curl-7.21.2-4.fc15

Comment 81 Kamil Dudka 2010-11-10 09:53:36 UTC
Created attachment 459386 [details]
ext2 floppy image with latest libcurl (x86_64)

Comment 82 Pasi Karkkainen 2010-11-14 17:40:08 UTC
I just tried with Fedora 14 (GA) using ftp://ftp.funet.fi.

- Default F14 installation: fails.
- F14 installation with updates.img from comment #81: works OK!

So yes, I can confirm this curl patch fixes the problem with Fedora installation!

Comment 83 Kamil Dudka 2010-11-14 18:16:23 UTC
Great!  Thank you for being patient during all those attempts.

Since installation images are already out, I'll probably postpone the update for F-14 till bug 650255 is also explained and/or fixed.  Please let me know if you need it sooner.

Comment 84 Pasi Karkkainen 2010-11-14 18:32:40 UTC
Kamil: Big thanks to you aswell!

I'm happy to wait until 650255 is sorted out, no hurries.. as long as F15 GA installer will have the fixes included :)

Comment 85 Fedora Update System 2010-11-26 14:31:50 UTC
curl-7.21.0-6.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/curl-7.21.0-6.fc14

Comment 86 Fedora Update System 2010-11-26 15:10:45 UTC
curl-7.20.1-5.fc13 has been submitted as an update for Fedora 13.
https://admin.fedoraproject.org/updates/curl-7.20.1-5.fc13

Comment 87 Fedora Update System 2010-11-26 21:05:06 UTC
curl-7.21.0-6.fc14 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update curl'.  You can provide feedback for this update here: https://admin.fedoraproject.org/updates/curl-7.21.0-6.fc14

Comment 88 Fedora Update System 2010-11-28 20:41:23 UTC
curl-7.21.0-6.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 89 Fedora Update System 2010-12-03 20:38:52 UTC
curl-7.20.1-5.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 90 Kamil Dudka 2010-12-22 00:23:09 UTC
FYI a new FTP command has been proposed by Anthony Bryan, Tatsuhiro Tsujikawa and Daniel Stenberg for exactly this purpose:

http://daniel.haxx.se/blog/2010/12/20/byte-ranges-for-ftp

The IETF draft is available at:

http://www.ietf.org/id/draft-bryan-ftp-range-00.txt


Note You need to log in before you can comment on or make changes to this bug.