Created attachment 457469 [details]
FTP server network trace during install attempt
Description of problem:
Anaconda fails to retrieve filelists.xml.gz from FTP repository and doesn't send FTP request to server, but simply reports [Error 12] Timeout
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Started F14 via PXE boot with kickstart file pointing at local ftp server
2. After making package selection, starts to check package dependencies
3. Fails to retrieve filelists
Graphical window: "Unable to read package metadata."
Log console: Try 1/10 for ftp://....filelists.xml.gz failed: [Error 12] Timeout on ....
Downloads from FTP and continues installation.
Wireshark trace attached.
Having changed the kickstart url from "ftp://ftp//install/F14/...." to "ftp://ftp.cfs/install/F14/...." (fqdn and no extra forward slash after host name), this now seems to be working.
However, I'm still mystified by the network behaviour which clearly saw anaconda connect to the correct FTP server, but then fail to issue a request....?
I'm assigning this to curl since pycurl is pretty much just a thin shim over curl.
OK. Now its failing to download the first RPM. Seems to be a problem when the installation takes a while (because I'm choosing the specific packages to install). My package selection takes about 200 seconds, with no FTP traffic, then the client sends FTP "Request: QUIT" followed by [RST, ACK], followed by a new [SYN] connection.
Created attachment 457485 [details]
New network trace
Adding attachment showing connection history between client (.118) and FTP server (.121)
It may be already fixed:
Are you able to test the following updates image? attachment #454454 [details]
I've tried the updates image bz643656.img and this started retrieving RPMs but again fails, this time with pam_pkcs11. The anaconda log reports:
Try 1/10 for ... failed: [Errno 14] FTP Error 229
I'll attach another network trace.
Created attachment 457738 [details]
Network trace with bz643656.img
Note that the server is running vsftpd-2.0.5-16.el5_5.1 from CentOS 5.5
Created attachment 457741 [details]
Network trace on Retry
Attaching network trace of what happens when clicking "Retry" in anaconda GUI. Doesn't seem to send _any_ FTP requests for the supposedly corrupt/missing file.
Chris, are you able to narrow it down to a curl-based minimal example?
Another symptom might have been reported in https://bugzilla.redhat.com/show_bug.cgi?id=623067
Duplicate issue: https://bugzilla.redhat.com/show_bug.cgi?id=624431
the corresponding thread at upstream ML:
Created attachment 457943 [details]
gzip compressed cpio archive with patched libcurl (x86_64)
Thanks in advance for testing it!
I getting the same behaviour as originally; can't FTP the filelists.xml.gz. I'm not sure the updates mechanism is working.
09:08:49,828 INFO loader: file location: ftp://ftp/install/F14/x86_64/images/bz649347.img
09:08:49,828 INFO loader: transferring ftp://ftp/install/F14/x86_64/images/bz649347.img
09:08:49,871 ERR loader: failed to mount loopback device /dev/loop7 on /tmp/update-disk as /tmp/updates.img: mount: you must specify the filesystem type
09:08:50,065 INFO loader: Running anaconda script /usr/bin/anaconda
But /tmp/updates does contain the new libcurl libraries (.so size 402456 bytes). How do I know if anaconda is using the new updates?
Created attachment 458038 [details]
New network trace with latest updates image
Latest network trace for start of installation phase. Noticed that the first packet is a FTP QUIT from a previous connection that must have been idle for the 3 or 4 minutes that I took choosing packages, etc.
(In reply to comment #16)
> 09:08:49,871 ERR loader: failed to mount loopback device /dev/loop7 on
> /tmp/update-disk as /tmp/updates.img: mount: you must specify the filesystem
The error message above is expected. I guess Anaconda first tries to mount ext2 using a loopback device. If it fails, it tries cpio ... but I am just guessing, Anaconda is a black box for me.
(In reply to comment #17)
> Created attachment 458038 [details]
> New network trace with latest updates image
> Latest network trace for start of installation phase. Noticed that the first
> packet is a FTP QUIT from a previous connection that must have been idle for
> the 3 or 4 minutes that I took choosing packages, etc.
I can't see anything wrong in the trace, but it reminded me that we should probably send QUIT after ABOR before we close the connection. I'll send an updated patch to the upstream mailing list.
Any chance to repeat the failure out of Anaconda? It's nearly impossible to debug it in there...
Strange, I've just successfully installed F-14 from ftp://ftp.funet.fi with that image. It wasn't possible with the previous one. Is there any public FTP server I can try the update against?
Yes, I've just completed a successful install upon a second attempt. I'm now trying the reproduce the error ;-)
Created attachment 458054 [details]
ext2 floppy image with patched libcurl (x86_64)
The attached image uses the latest version of the patch and should avoid mount error in anaconda's log.
Created attachment 458056 [details]
Network trace showing timeout-related FTP issue
OK, I've reproduced the problem and it seems to be related to a timeout.
1. Start installation process, last to download from FTP is *-Fedora-14-comps.xml.gz
2. No ABOR is sent, and the RETR finishes with server "226 File Send OK".
3. FTP session remains connected for another 5 minutes whilst I make package selection/twiddle thumbs
4. 300 seconds later the server sends "421 Timeout" and pushes [FIN, ACK].
5. Eventually I finish package selection. Client sends QUIT request to a closed server port and gets a [RST] response.
6. Client sends new SYN packet to FTP, but now seems to be out of sync with expected server responses and never sends a RETR
If I'm quick and never let the server timeout, installation is successful.
"Retry" button in anaconda make no difference, the client never sends any requests/login to the server after the initial connection.
I'll bump the local servers idle_session_timeout for now...
I can confirm that with the latest updates image,
the timeout issue still remains. If the server session timeout triggers, libcurl doesn't seem to be in the correct state.
If there is no ABOR sent, the patch can't make any difference. It's simply yet another issue. We should track it separately...
I agree, this is an orthogonal issue to the ABOR, but it probably is the root cause of the particular symptoms I observed. My FTP server never original hung on the disconnects unlike the Pure-FTPD of https://bugzilla.redhat.com/show_bug.cgi?id=643656.
The timeout problem explains my first and third comments; https://bugzilla.redhat.com/show_bug.cgi?id=649347#c1
(In reply to comment #26)
> I agree, this is an orthogonal issue to the ABOR, but it probably is the root
> cause of the particular symptoms I observed. My FTP server never original hung
> on the disconnects unlike the Pure-FTPD of
Do you mean it works also without any updates image, if only you are fast enough in selection of the packages to install?
(In reply to comment #27)
> Do you mean it works also without any updates image, if only you are fast
> enough in selection of the packages to install?
Yes, I did manage to install successfully once when not customizing packages. It's always confusing when there are two separate bugs. The original updates image which I assumed would help, https://bugzilla.redhat.com/attachment.cgi?id=454454 , failed not because of the timeout (when trying to download filelists.xml.gz), but when trying to download some packages (crontab or pam-something), because of the new ABOR.
So to clarify, the original symptoms were most likely caused by the server session timeout not being handled gracefully by curl (can't start installation). The other symptoms (can't finish installation), were cause by the ABOR patch....
So the patch fixes the ABOR issue successfully (without re-introducing bug 643656). Please open a new ticket for the timeout issue with some steps to reproduce. I'll give it a try next week.
New bug at:
For now, I can successfully install having raised the FTP idle timeout. I'm using your latest patch without any issues.
Thank you for opening the new bug. I'll get to it probably later this week. For now, I've pushed the fix for the already resolved one:
fixed in curl-7.21.2-4.fc15
Created attachment 459386 [details]
ext2 floppy image with latest libcurl (x86_64)
curl-7.21.0-6.fc14 has been submitted as an update for Fedora 14.
curl-7.20.1-5.fc13 has been submitted as an update for Fedora 13.
curl-7.21.0-6.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update curl'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/curl-7.21.0-6.fc14
curl-7.21.0-6.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
curl-7.20.1-5.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.
FYI a new FTP command has been proposed by Anthony Bryan, Tatsuhiro Tsujikawa and Daniel Stenberg for exactly this purpose:
The IETF draft is available at: