Description of problem:

Has anyone looked at the performance of libcurl-based tools on RHEL7.0 and 7.1 when fetching files using FTP? I started investigating this after finding that RHEL7 installs were taking massively longer than RHEL5 or 6 ones. Further investigation showed that the unexpected difference went away when switching to HTTP instead of FTP to talk to the file server.

I then repeated my tests using yum on an installed system. I installed the default base 7.1 configuration and then loaded the "Server with GUI" environment on top with yum. Over FTP the download of 890 RPMs takes 7m40s, whereas over HTTP the same download from the same server completes in 7s, a speed difference of over 65 times.

I believe that yum uses urlgrabber (python-urlgrabber.noarch 3.10-6.el7), which in turn uses pycurl (python-pycurl.x86_64 7.19.0-17.el7), which uses libcurl (libcurl.x86_64 7.29.0-19.el7). So next I compared the behaviour of curl using FTP and HTTP, and compared these with wget and also with RHEL6.4. I used a simple shell loop to download a small file (less than 1K) 10 times. I originally planned to do it 1000 times, but this proved to take far too long.

                       real        user       sys
RHEL 7.1  curl  FTP    0m11.610s   0m0.026s   0m0.041s
                HTTP   0m1.551s    0m0.017s   0m0.039s
          wget  FTP    0m0.091s    0m0.011s   0m0.040s
                HTTP   0m0.049s    0m0.014s   0m0.031s
RHEL 6.4  curl  FTP    0m0.099s    0m0.016s   0m0.031s
                HTTP   0m0.096s    0m0.013s   0m0.042s
          wget  FTP    0m0.087s    0m0.007s   0m0.036s
                HTTP   0m0.048s    0m0.008s   0m0.028s

So comparing curl on RHEL7.1 between FTP and HTTP, there is a speed difference of about 7.5 times. Comparing RHEL7.1 curl and wget for FTP, there is a speed difference of 127 times, and comparing curl using FTP between RHEL7.1 and 6.4, I see a speed difference of 117 times.

All of these measurements were made using HP ProLiant bl460c Gen8 blades with a Broadcom Corporation NetXtreme II BCM57810 running at 10Gb, talking to a fileserver running RHEL6.4 with vsftpd and Apache.
I have also experienced the same problems with a variety of other HP ProLiant servers using a number of different network cards.

This issue significantly impacts the install speed of RHEL7 when using FTP. I run training classes and for years have been used to scripted rebuilds of whole classrooms taking less than 10 minutes. With RHEL7 I was seeing install times of about 1 hour.

Version-Release number of selected component (if applicable):

The install software for both RHEL7.0 and 7.1
yum-3.4.3-125.el7.noarch
python-urlgrabber.noarch 3.10-6.el7
python-pycurl.x86_64 7.19.0-17.el7
curl-7.29.0-19.el7.x86_64

How reproducible:

Totally.

Steps to Reproduce:

1. Write a script like

    #!/bin/sh
    typeset -i n=0
    while ((n<10))
    do
        curl -s ftp://192.168.73.24/pub/vm1 > /dev/null
        # curl -s http://192.168.73.24/ftp/pub/vm1 > /dev/null
        # wget --quiet ftp://192.168.73.24/pub/vm1
        # wget --quiet http://192.168.73.24/ftp/pub/vm1 > /dev/null
        let n=n+1
        echo $n
    done

   You'll obviously need to use your own file server and filename.
2. Time the script.
3. Try the different download options.

Actual results:

                       real        user       sys
RHEL 7.1  curl  FTP    0m11.610s   0m0.026s   0m0.041s
                HTTP   0m1.551s    0m0.017s   0m0.039s
          wget  FTP    0m0.091s    0m0.011s   0m0.040s
                HTTP   0m0.049s    0m0.014s   0m0.031s

Expected results:

No significant difference between curl and wget, and only a small difference between FTP and HTTP, as in the wget example. FTP needs to establish a second network connection, so it is likely to be a little slower on this test.

Additional info:

I think this is likely to be in libcurl rather than just python-pycurl, but this tool wouldn't let me put libcurl in the component field.
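The reproduction loop and the speed ratios quoted in the description can be sketched as two small shell helpers. This is only a minimal sketch, not the reporter's script: `repeat_timed` and `ratio` are hypothetical names introduced here for illustration, and `date +%s%N` assumes GNU date (as shipped on RHEL).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither name comes from the report.

# repeat_timed COUNT COMMAND [ARGS...]
# Runs COMMAND COUNT times, discarding its output, and prints the total
# wall-clock time in seconds. Returns non-zero as soon as one run fails.
# The body is a subshell, so its variables do not leak into the caller.
repeat_timed() (
    count=$1
    shift
    start=$(date +%s%N)            # nanoseconds since the epoch (GNU date)
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" > /dev/null 2>&1 || exit 1
        i=$((i + 1))
    done
    end=$(date +%s%N)
    awk -v ns="$((end - start))" 'BEGIN { printf "%.3f\n", ns / 1e9 }'
)

# ratio SLOW FAST
# Prints how many times larger SLOW is than FAST, to one decimal place.
ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Ratios recomputed from the "real" column of the reported timings.
ratio 11.610 1.551   # RHEL 7.1 curl: FTP vs HTTP      -> 7.5
ratio 11.610 0.091   # RHEL 7.1 FTP:  curl vs wget     -> 127.6
ratio 11.610 0.099   # curl FTP:      RHEL 7.1 vs 6.4  -> 117.3
```

With these helpers, `repeat_timed 10 curl -s ftp://192.168.73.24/pub/vm1` would correspond to the curl FTP case from the report, with the server address and filename replaced by your own. Passing the command as arguments keeps the curl and wget variants on one code path instead of commenting lines in and out.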
Thanks for the bug report! It seems to be triggered by the following upstream commit: https://github.com/bagder/curl/commit/7cc00d9a ... and I believe that this upstream commit will fix it: https://github.com/bagder/curl/commit/29bf0598
*** Bug 1152628 has been marked as a duplicate of this bug. ***
(In reply to Kamil Dudka from comment #2)
> ... and I believe that this upstream commit will fix it:
>
> https://github.com/bagder/curl/commit/29bf0598

One more upstream commit is needed to make the internal blocking logic work nicely with the FTP protocol implementation:

https://github.com/bagder/curl/commit/c4a7ca03
... and this commit will be needed to restore the functionality of HTTP PUT: https://github.com/bagder/curl/commit/0bf5ce77 Upstream tests 154 and 155 tend to hang without the above patch applied.
Thanks everyone, it's great to see so much attention on this. Given the impact it has on the installation of RHEL 7.1, is the fix likely to make it into an ISO at any point? Or am I better off just advising students to use HTTP for setting up install servers?
(In reply to Ken Green from comment #12)
> Thanks everyone, it's great to see so much attention on this.
> Given the impact it has on the installation of RHEL 7.1, is the fix likely to
> make it into an ISO at any point? Or am I better off just advising students to
> use HTTP for setting up install servers?

I am not sure about this. Release engineers would be able to give you a precise answer. Nevertheless, installation via HTTP sounds like a reasonable default. It will work fast enough regardless of the version of libcurl on the installation images.
Install via HTTP works fine. No problems there. But install via FTP is scarily slow.

Chapter 2.3 of the RHEL Install guide shows setting up an FTP server as an install source. For an IBM LPAR I think they don't show HTTP (I've never touched a mainframe). Personally I've always tended to set up my repositories using FTP, partly because I preferred the logging but mostly because wget allows wildcards with FTP but not with HTTP.

When researching this problem I found a few HowTo blogs on setting up for installing RHEL7 (or CentOS) which showed using FTP. I can't find any of those currently, but Googling for "RHEL7 network install server" returns http://www.tecmint.com/multiple-centos-installations-using-kickstart/ as the 3rd hit for me this morning.

As I said, for years I've set up install servers using FTP, and for RHEL 5 & 6 the installs were typically under 10 minutes, with the download and install phase taking about 4 minutes. Suddenly I was seeing times of about an hour, which doesn't show 7 in a good light. It was only when I didn't find other people complaining that I thought to look further, tried using HTTP, and spotted the vast difference in speed.
Thanks for the explanation! I will try to get the fixed version of libcurl on the installation ISO images. However, I cannot guarantee that through Bugzilla. Please open a customer case if it is important for your business.
*** Bug 1225836 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-2159.html
*** Bug 1269086 has been marked as a duplicate of this bug. ***