Bug 431757
Summary: | caching needed when "Starting install process." | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | John Reiser <jreiser> | ||||||||||||||||
Component: | anaconda | Assignee: | Anaconda Maintenance Team <anaconda-maint-list> | ||||||||||||||||
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||||||||
Severity: | low | Docs Contact: | |||||||||||||||||
Priority: | low | ||||||||||||||||||
Version: | 9 | CC: | dtimms, wwoods | ||||||||||||||||
Target Milestone: | --- | ||||||||||||||||||
Target Release: | --- | ||||||||||||||||||
Hardware: | i386 | ||||||||||||||||||
OS: | Linux | ||||||||||||||||||
Whiteboard: | |||||||||||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||||||
Clone Of: | Environment: | ||||||||||||||||||
Last Closed: | 2008-08-01 15:36:12 UTC | Type: | --- | ||||||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||||||
Documentation: | --- | CRM: | |||||||||||||||||
Verified Versions: | Category: | --- | |||||||||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||||
Embargoed: | |||||||||||||||||||
Bug Depends On: | |||||||||||||||||||
Bug Blocks: | 235705 | ||||||||||||||||||
Attachments: |
|
Description
John Reiser
2008-02-06 19:44:38 UTC
Created attachment 294135 [details]
anaconda.log
Created attachment 294136 [details]
X.log
The box already runs Fedora 8 i386 and x86_64 and Ubuntu 7.04, all installed in the last 7 days. Created attachment 294146 [details]
syslog
Created attachment 294147 [details]
anaconda.log
Created attachment 294148 [details]
X.log
Package install started after 20 minutes. TWENTY MINUTES. Something is *horribly* slow. 2.5GHz (Athlon X2 4800+) and 2GB RAM is not the fastest box today, but it is a very long ways away from the advertised minimums. I *suspect* that your latency was network based. We have to do a lot of header grabbing to actually start the install. And that's not exactly fast Created attachment 294659 [details] follow tcp stream - child died (In reply to comment #8) > I *suspect* that your latency was network based. We have to do a lot of > header grabbing to actually start the install. And that's not exactly fast If it is going to take more than 1 second - please give good tui/gui. Eg. retrieved header 1 of 1345, or a progress bar. Est time remaing after the first two items complete...to give some impression. ==== However, my opinion: Based on a vmware-server 1.0.4 test on a F8 kernel-2.6.23.14-115.fc8 host. My machine is still saying "Starting Install Process - this may take several minutes" after 3 hours. On VT3, I can see WARNING : Try 1/10 ftp://gg/g.rpm Failed. This seems to be happening every couple of files from download.fedora.redhat.com Since it is in a vm, I ran wireshark on the host and see: wait {30 seconds} ftp login|cd|test|RESU {4 seconds <1-2kB/sec via gkrellm} ftp data {< 2 seconds 5-20kB/sec} - repeat wait {30 seconds} and so on. Note: 1. the ftp connection is not kept alive. it logs in|cd's back to the same folder for every file. 2. often the "follow tcp stream" shows: - 500 OOPS: vsf_sysutil_recv_peek: no data - 500 OOPS: child died during the download. In this case the data is downloading nicely, when the above occurs. capture shows the ftp client sending a heap eg 10-30 TCP RSTs, which the FTP server ignores for a bit, then aborts transfer. The combination looks like it will take the machine all night++ to complete. Very occasionally I see: 421 There are too many connections from your internet address. in the ftp stream. However, the client waits only 0.1s of seconds, connects logs in and starts the cd|RESU process. John: did you install actually complete ? did you give it 8 hours or more ? I succesfully completed two installs using the original F-9(Alpha) rescue CD and http. After the clue about network latency (Comment #8 above), I watched the activity light on the ethernet port during "Starting install process." There was an obvious cycle of about 2 or 3 seconds in length, repeated over and over. The number of packages is about 870, which would be 30 to 45 minutes for all the <package>.hdr. [Perhaps the remote end got faster from time to time.] This is an obvious candidate for pipelining, parallelism (3 or 4 threads), or for aggregating the *.hdr into a single file on the distribution medium in advance. (In reply to comment #9) For some ranges of network difficulties (such as moderate traffic congestion on route, high load on server or client, etc.) then http can be noticeably faster than ftp. http requires fewer network round trips per file (example: no separate 'cd' command; the full path is in the original http GET request) and has less state in a connection. Sometimes http servers degrade more gracefully (they just increase latency) than ftp servers (total cutoff based on number of simultaneous connections.) (In reply to comment #10) > The combination looks like it will take the machine all night++ to complete. The install completed; end=06:50, start=21:30 ie: 9hr:20. Installed size=2.5GB. In a rough way that works out to 595kbit/sec for the duration {on a 10Mb adsl connection that is traversing the pacific}. For an additional data point, I will repeat with nearest .au mirror, then from internal web server. Attaching strace [strace -o "|gzip -dc >strace.out.gz" -p <pid_of_anaconda>] and analyzing the output suggests: 1. Cache the patience of the human by using a progress bar (bug #186789). 2. Re-use the connection for more than one .hdr; do not close the connection every time. Put a limit of something like one minute elapsed or 20 .hdr per connection on this caching. 3. Cache the DNS: don't translate the mirror name into numeric IPx for every .hdr. Even if there is a caching DNS server on the same local network, communicating with it is slow; and in many cases the nearest DNS is even slower because it is via DSL or cable modem. Put a limit of something like five minutes on this caching. 4. Cache CPU time: use a 64-character local buffer to read http and ftp command protocol. The current 1-character reads on http require two system calls per character, which is slower than a 150KB/s ADSL network! (Evidence: the first 8192-byte read for payload [after the http GetPartialContent protocol overhead] often returns more than the 1388 bytes [hint: MTU] of subsequent reads.) 5. On DVD or CD, optimize usage of the device's buffer (often 2MB sequential) and of the head position. This can be done by splitting each .rpm into two File Sections [see 6.5.1 of the Ecma-119.pdf ISO 9660 specification], with the first section [extent] covering the .hdr sectors. Group the .hdr sections together sequentially; put the remainder of each file somewhere else. This will reduce seeks dramatically (bug #352481). 6. Fix .rpm to allow compressed ChangeLog and compressed FileList. These are *not* compressed in current .rpm! Created attachment 295440 [details]
strace for one .hdr while "Starting install process"
Here is current system call activity (763 calls) to retrieve one .hdr via http.
Note the DNS resolution of mirrors.kernel.org, reading the http response using
a separate poll+recv for each character, payload read gets 3844 bytes the first
time and only 1388 the second, connection closed immediately.
#1: Done. Since there's a progress bar for this now, I wouldn't consider it a release blocker. But there's still some work that could be done here. I suspect most of the work that will need to happen will be deep in the bowels of urlgrabber. We might even consider moving to pycurl.. but these are both F10 items, at the earliest. I'm moving this to F9Target, in hopes that someone will find a simple speedup that we can use for F9. Oh yeah - see also bug 344301, regarding caching headers from the CD. and bug 344301 about poor FTP performance. Oops, sorry - bug 352481 is the one about CD caching. It was already mentioned in comment #14. Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping Item 1 has been taken care of, and we will consider any patches you submit for items 3, 4, and 5. Please submit items 2 and 6 against the proper components (urlgrabber and rpm, respectively). |