Bug 286161

Summary: "wget --recursive" only retrieves index.html instead of all files
Product: Red Hat Enterprise Linux 5
Reporter: Paul Bijnens <paul.bijnens>
Component: wget
Assignee: Karsten Hopp <karsten>
Status: CLOSED ERRATA
QA Contact:
Severity: high
Docs Contact:
Priority: high
Version: 5.0
CC: acase, berryd, pertusus, pknirsch, psplicha, rlerch, syeghiay
Target Milestone: ---
Keywords: Rebase
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Rebase: Bug Fixes and Enhancements
Doc Text:
Major features + changes between wget 1.10.2 (RHEL5.3) and rebased wget 1.11.4:
- The combination of -r or -p with -O emits a warning, as all downloaded content will be placed in the single file specified with -O.
- The combination of -N with -O emits a warning, as timestamping does nothing in combination with -O.
- Wget 1.11 no longer allows ".." to persist at the beginning of URLs, for improved conformance with RFC 3986. As this behavior presents problems for some FTP setups, this is still allowed for FTP URLs only.
- No authentication credentials are sent until a challenge is issued, for improved security.
- Added option --auth-no-challenge to support some obscure servers which never send HTTP authentication challenges, but accept unsolicited auth info.
- Timestamping now uses the value from the most recent HTTP response, rather than the first one it got.
- Authentication information is no longer sent as part of the Referer header in recursive fetches.
- Added --max-redirect option, allowing the user to specify what should be the maximum number of HTTP redirects to follow.
- Wget now supports saving HTTP downloads using file names specified by the `Content-Disposition' header.
- The new option `--ignore-case' makes Wget ignore case when matching files, directories, and wildcards. This affects the -X, -I, -A, and -R options, as well as globbing in FTP URLs.
Last Closed: 2009-09-02 09:38:34 UTC
Type: ---

Description Paul Bijnens 2007-09-11 15:19:05 UTC
Description of problem:

"wget --recursive ..." fails to download any files.  It downloads only the
directory listing and then stops.  Using the exact same args on RHEL4 works.

Version-Release number of selected component (if applicable):
wget-1.10.2-7.el5
(working version on RHEL4: wget-1.10.2-0.40E)

How reproducible:
Invoke recursive download of some small directory:


Steps to Reproduce:
1. Run:
   wget --recursive ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/x86_64/Debuginfo/repodata/
  
Actual results:

one file named "index.html"

Expected results:

All the files in the directory, just as RHEL4 version of wget did.
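
The difference between the actual and expected results can be checked mechanically. The following is a minimal sketch, not part of wget or of the original report: `check_recursive_result` is a hypothetical helper name, and it assumes the downloaded files end up somewhere below a single directory (by default, wget -r saves under a directory named after the host).

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): classify the outcome of a
# recursive download by looking at what ended up below a directory.
check_recursive_result() {
    dir=$1
    # Count all regular files below the directory.
    total=$(find "$dir" -type f | wc -l | tr -d ' ')
    # Count regular files that are not named index.html.
    others=$(find "$dir" -type f ! -name 'index.html' | wc -l | tr -d ' ')
    if [ "$total" -eq 0 ]; then
        echo "empty"
    elif [ "$others" -eq 0 ]; then
        echo "listing-only"   # matches the symptom described in this bug
    else
        echo "ok"
    fi
}

# Example use after running the reproduction command (directory name assumed):
# check_recursive_result ftp.redhat.com
```

A result of "listing-only" would correspond to the behaviour reported on RHEL5, and "ok" to the behaviour seen on RHEL4.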

Additional info:

Comment 1 Paul Bijnens 2007-09-17 15:48:00 UTC
Probably same as bug 239420 in fc6.

Comment 2 Phil Knirsch 2008-04-28 13:30:10 UTC
Proposing a rebase for wget to version 1.11 for RHEL-5.3 and granting Devel ACK.
The wget-1.10.2-to11.patch is basically the problem, bringing wget-1.10.2 close
to wget-1.11. A rebase will fix this and several other issues that the pre11
patch had.

Read ya, Phil


Comment 3 Karsten Hopp 2008-07-15 09:26:50 UTC
Will be fixed together with #387321 and #436822, but a rebase is not allowed in
FastTrack. Moved to 5.4


Comment 4 Phil Knirsch 2008-12-16 13:12:00 UTC
Marking as a higher prio and severity as the current state of wget in RHEL-5 needs definite improvement and a rebase to wget-1.11 will cover all those bugs and more.

Adding keyword Rebase as well.

Thanks & regards, Phil

Comment 9 Karsten Hopp 2009-06-26 10:53:39 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Major features + changes between wget 1.10.2 (RHEL5.3) and rebased
wget 1.11.4:

- The combination of -r or -p with -O emits a warning, as all downloaded
  content will be placed in the single file specified with -O

- The combination of -N with -O emits a warning, as timestamping does nothing
  in combination with -O

- Wget 1.11 no longer allows ".." to persist at the beginning of URLs,
  for improved conformance with RFC 3986. As this behavior presents
  problems for some FTP setups, this is still allowed for FTP URLs only.

- No authentication credentials are sent until a challenge is issued,
  for improved security.

- Added option --auth-no-challenge to support some obscure servers which
  never send HTTP authentication challenges, but accept unsolicited auth info.

- Timestamping now uses the value from the most recent HTTP response,
  rather than the first one it got.

- Authentication information is no longer sent as part of the Referer
  header in recursive fetches.

- Added --max-redirect option, allowing the user to specify what should
  be the maximum number of HTTP redirects to follow.

- Wget now supports saving HTTP downloads using file names specified by
  the `Content-Disposition' header.

- The new option `--ignore-case' makes Wget ignore case when
  matching files, directories, and wildcards.  This affects the -X, -I,
  -A, and -R options, as well as globbing in FTP URLs.
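
Since options such as --auth-no-challenge, --max-redirect and --ignore-case only exist from wget 1.11 onwards, a script that has to run on both the old and the rebased package may want to check the version first. The sketch below is one possible way to do that; `version_ge` and `banner_version` are hypothetical helper names, the version fields are assumed to be purely numeric, and the first line of `wget --version` is assumed to look like "GNU Wget 1.11.4".

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 is >= dotted version $2.
# Assumes every dot-separated field is a plain number.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; a missing field counts as 0.
        x=${a%%.*}
        y=${b%%.*}
        x=${x:-0}
        y=${y:-0}
        if [ "$x" -gt "$y" ]; then return 0; fi
        if [ "$x" -lt "$y" ]; then return 1; fi
        # Drop the field that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Hypothetical helper: pull the version out of a banner line such as
# "GNU Wget 1.11.4" (assumed format: version is the third word).
banner_version() {
    echo "$1" | awk '{ print $3 }'
}

# Example use (commented out because it needs wget and network access;
# $url is a placeholder):
# if version_ge "$(banner_version "$(wget --version | head -n 1)")" 1.11; then
#     wget --max-redirect=5 --ignore-case -r -A '*.XML' "$url"
# fi
```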

Comment 12 errata-xmlrpc 2009-09-02 09:38:34 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1280.html