Bug 445862 - Better url sanitizing during ftp/http installs
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: anaconda
All Linux
Priority: high, Severity: high
: rc
: ---
Assigned To: Anaconda Maintenance Team
: Reopened
Duplicates: 411931 452484
Depends On: 482952
Blocks: 391501
Reported: 2008-05-09 10:34 EDT by Adam Stokes
Modified: 2010-10-22 20:52 EDT
Doc Type: Bug Fix
Last Closed: 2009-09-03 12:56:20 EDT

Description Adam Stokes 2008-05-09 10:34:57 EDT
Description of problem:
  I cannot install a PV guest domain: the installation stops during package
 installation. I set up the following directories on a network server and
 used the second one to create a RHEL-5.2 PV guest domain.
    On the FTP server:
    (1) ftp://blah.bah/test/work1
        -> no image is mounted
    (2) ftp://blah.bah/test/work2
        -> DVD iso image of RHEL5.2 is mounted
    (3) ftp://blah.bah/test/work3
        -> no image is mounted
 I saw the following error message repeated in 'anaconda.log':
    WARNING : Try x/10 for ftp://blah.bah/%2F/test/work1/Server/emacs-leim-21.4-20.el5.i386.rpm failed
 It seems the installer is looking for the 'emacs-leim' package in 'work1'
 even though I specified 'work2' with the --location option.
 I guess the installer searches in the following order, and that is why
 the installation fails:
    first  -> work1 (no image mounted)
    second -> work2 (RHEL5.2 iso image mounted)
    third  -> work3 (no image mounted)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. create /pub/{work1,work2,work3}
2. mount iso on work2
3. install using ftp://blah.bah/pub/work2
Actual results:
06:53:08 WARNING : Try 1/10 for ftp://blah.bah/%2F/test/work1/Server/emacs-leim-21.4-20.el5.i386.rpm failed
06:53:08 WARNING : Try 2/10 for ftp://blah.bah/%2F/test/work1/Server/emacs-leim-21.4-20.el5.i386.rpm failed

Expected results:
Resolve dependencies.

Additional info:
Appending a / to the end of the ftp URL works around this.


I've noticed some URL sanitizing in the upstream code:


/* convert a UI to a URL, returns allocated string */
char *convertUIToURL(struct iurlinfo *ui) {
   char *login, *finalPrefix, *url, *p;

   if (!strcmp(ui->prefix, "/"))
       finalPrefix = "/.";
   else
       finalPrefix = ui->prefix;

   login = "";
   login = getLoginName(login, ui);

   url = malloc(strlen(finalPrefix) + 25 + strlen(ui->address) + strlen(login));

   /* sanitize url so we dont have problems like bug #101265 */
   /* basically avoid duplicate /'s                          */
   if (ui->protocol == URL_METHOD_HTTP) {
       for (p=finalPrefix; *p == '/' && *(p+1) && *(p+1) == '/'; p++);
       finalPrefix = p;
   }

   sprintf(url, "%s://%s%s%s",
           ui->protocol == URL_METHOD_FTP ? "ftp" : "http",
           login, ui->address, finalPrefix);

   return url;
}

I think it would be beneficial to backport this into previous anaconda versions,
as the way we encode URLs now is kind of crazy.
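
For illustration only, here is a rough Python 3 sketch of the "avoid duplicate /'s" idea from the comment in the code above. It is not anaconda code: `collapse_slashes` is a hypothetical helper, and unlike the C snippet it collapses every run of slashes in the path rather than only the leading ones. The scheme separator and any percent-encoded `%2F` are left untouched.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url):
    """Collapse runs of '/' in the path part of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Only the path is rewritten; "://" lives outside parts.path, and
    # urlsplit does not decode "%2F", so an encoded slash is preserved.
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(collapse_slashes("http://blah.bah//test///work2"))
```

Whether collapsing is safe for ftp URLs depends on how the server interprets the path, so this sketch should be read as a starting point rather than a drop-in fix.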

Comment 1 Chris Lumens 2008-06-17 15:47:15 EDT
Unfortunately we're running up against the devel capacity limit for 5.3, so I am
going to have to NAK this bug for now.  We can consider it for 5.4, however,
since we'll have lots of the current high priority issues cleared out and should
have some capacity freed up.
Comment 2 RHEL Product and Program Management 2008-06-17 15:58:31 EDT
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request. 
Comment 3 Chris Lumens 2008-07-11 15:55:21 EDT
*** Bug 393891 has been marked as a duplicate of this bug. ***
Comment 4 Chris Lumens 2008-07-11 21:54:06 EDT
*** Bug 411931 has been marked as a duplicate of this bug. ***
Comment 5 Chris Lumens 2008-08-29 15:01:35 EDT
*** Bug 452484 has been marked as a duplicate of this bug. ***
Comment 7 Jeff Bastian 2009-01-28 14:12:39 EST
From urlinstall.py in anaconda-

    def __checkUrlForIsoMounts(self):
        # account for multiple mounted ISOs on loopback...bleh
        # assumes ISOs are mounted as AAAAN where AAAA is some alpha text
        # and N is an integer.  so users could have these paths:
        #     CD1, CD2, CD3
        #     disc1, disc2, disc3
        #     qux1, qux2, qux3
        # as long as the alpha text is consistent and the ints increment
        # NOTE: this code is basically a guess. we don't really know if
        # they are doing a loopback ISO install, but make a guess and
        # shove all that at yum and hope for the best   --dcantrell

This describes the above setup exactly, where AAAA = "work" and N is 1, 2, and 3.

Will fixing this bug mean breaking the above setup?
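
To make the AAAAN guess concrete, here is a small Python 3 sketch of how a URL ending in "work2" could expand into sibling candidates. This is an assumption-laden illustration, not the urlinstall.py logic: `sibling_mount_urls` and its `count` parameter are hypothetical, and the sketch enumerates a fixed number of siblings instead of probing the server.

```python
import re


def sibling_mount_urls(url, count):
    """Guess sibling ISO mount URLs for a URL ending in AAAAN (hypothetical helper).

    AAAA is a run of letters and N an integer, as in the quoted comment.
    """
    match = re.match(r"^(.*?[A-Za-z]+)(\d+)$", url)
    if match is None:
        # The URL does not end in letters followed by digits: nothing to guess.
        return [url]
    prefix = match.group(1)
    return ["%s%d" % (prefix, n) for n in range(1, count + 1)]


if __name__ == "__main__":
    for candidate in sibling_mount_urls("ftp://blah.bah/test/work2", 3):
        print(candidate)
```

With the setup from the description, the candidates are work1, work2 and work3, which matches the order the reporter suspected.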
Comment 8 Jeff Bastian 2009-01-28 17:43:16 EST
This may not be a problem with URL sanitizing after all.

In urlinstall.py, __checkUrlForIsoMounts() tries to check if there's a .discinfo file in the AAAAN directory, and if so, it assumes that it's one of the CD ISO images and adds it to the list of base URLs.  But, if there's no .discinfo file, then it skips the directory.

                while True:
                        filename = self.__copyFileToTemp(dirpath, ".discinfo",
                    baseurls.append("%s" % (dirpath,))

Well, it's not skipping the directory.

The ftp://blah.bah/test/work1 directory is empty -- there is no .discinfo file nor any other file -- and yet it still gets added to the list.

I added some extra debugging statements and got:
22:28:22 DEBUG   : __copyFileToTemp(ftp://rx2620.gsslab.rdu.redhat.com/%2F/pub/work1, ".discinfo")
22:28:22 WARNING : Unable to find temp path, going to use ramfs path
22:28:22 DEBUG   : Trying urlretrieve(ftp://rx2620.gsslab.rdu.redhat.com/%2F/pub/work1/.discinfo, /tmp//.discinfo)
22:28:22 DEBUG   : grabber.urlopen(ftp://rx2620.gsslab.rdu.redhat.com/%2F/pub/work1/.discinfo)
22:28:22 DEBUG   : no problems with grabber.urlopen(ftp://rx2620.gsslab.rdu.redhat.com/%2F/pub/work1/.discinfo)
22:28:22 DEBUG   : unlinked file /tmp//.discinfo
22:28:22 DEBUG   : Adding baseurl: ftp://rx2620.gsslab.rdu.redhat.com/%2F/pub/work1

The "no problems with grabber.urlopen(...)" log entry means that grabber.urlopen() did not raise an exception.  So, either the urlgrabber Python module has a bug, or anaconda is not catching the exception correctly.
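
One defensive option on the anaconda side, sketched here in Python 3 under stated assumptions, would be to treat an empty .discinfo as missing, so that a download which "succeeds" with no content does not add the directory. The names `has_discinfo`, `filter_base_urls` and `fake_fetch` are hypothetical, and `fetch` stands in for whatever download call is used; the sketch assumes it either returns the file contents or raises an OSError-derived exception.

```python
def has_discinfo(dirpath, fetch):
    """Return True only if dirpath/.discinfo downloads with non-empty content."""
    try:
        data = fetch(dirpath + "/.discinfo")
    except OSError:
        # Assumed failure mode of the download call.
        return False
    # An empty result is treated the same as a missing file.
    return bool(data)


def filter_base_urls(candidates, fetch):
    """Keep only the candidate directories that really contain a .discinfo."""
    return [url for url in candidates if has_discinfo(url, fetch)]


def fake_fetch(url):
    """Stand-in downloader: work2 has content, work3 errors, anything else is empty."""
    if url == "ftp://blah.bah/test/work2/.discinfo":
        return b"placeholder discinfo contents\n"
    if url.startswith("ftp://blah.bah/test/work3/"):
        raise OSError("550 Failed to open file.")
    return b""  # mimics the silent empty download seen for work1


if __name__ == "__main__":
    urls = ["ftp://blah.bah/test/work%d" % n for n in (1, 2, 3)]
    print(filter_base_urls(urls, fake_fetch))
```

This only masks the symptom; if the download library is failing to raise an error, that still needs fixing at the source.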

I'll keep investigating.
Comment 9 Jeff Bastian 2009-01-28 18:14:39 EST
This does indeed appear to be a problem with the urlgrabber module.

The short Python script below tries to use the urlgrabber module to get the non-existing .discinfo file.  It does not raise any exceptions, and it creates an empty .discinfo file.

    $ ./get-discinfo.py
    Saved as .discinfo
    $ cat .discinfo

And as proof the file really does not exist:

    $ lftp -e 'get pub/work1/.discinfo; exit' rx2620.gsslab.rdu.redhat.com
    get: Access failed: 550 Failed to open file. (pub/work1/.discinfo)


from urlgrabber import urlgrab
from urlgrabber.grabber import URLGrabError

try:
        fn = urlgrab("ftp://rx2620.gsslab.rdu.redhat.com/pub/work1/.discinfo")
except URLGrabError, e:
        print e.strerror
except:
        print "unexpected error with urlopen"
else:
        print("Saved as %s" % (fn))
Comment 10 Jeff Bastian 2009-01-28 18:27:55 EST
See bug 482952 for the urlgrab bug from comment #9
Comment 11 Chris Lumens 2009-09-03 11:57:57 EDT
In light of comment #9 and comment #10, is this still a valid anaconda bug?
Comment 13 Chris Lumens 2009-09-03 12:56:20 EDT
Yeah, it looks like the underlying problem here is the lack of an exception coming up from urlgrab.  Let's close this one out for now since there's a bug filed in the right place to track that, and if we require additional anaconda work this one can always get reopened later.
