Common Vulnerabilities and Exposures assigned the identifier CVE-2007-4559 to the following vulnerability: Directory traversal vulnerability in the (1) extract and (2) extractall functions in the tarfile module in Python allows user-assisted remote attackers to overwrite arbitrary files via a .. (dot dot) sequence in filenames in a TAR archive, a related issue to CVE-2001-1267.

References:
The issue and additional attack vectors were discussed in the following thread on the python-dev mailing list: http://mail.python.org/pipermail/python-dev/2007-August/074290.html
Upstream bug tracking possible fixes for the issue: http://bugs.python.org/issue1044
Ok, so they seem confused about whether they wanted to fix anything or just define what it currently does as "correct". Also the patches they are proposing only "fix" paths in a tarfile prefixed with "../" or "/" ... they are trying to fix the symlink attacks by just checking the result of the link (which is very different from what GNUtar does). And I'm not sure that's all of the known tar attacks, is it?

Summary:
. symlinks linking to just ".." work (can be used repeatedly to walk up the tree).
. symlinks pointing to absolute paths can't be used.
. path checking doesn't check for either "./../foo" or "xyz/../../foo" type attacks.
. failure against the security checks results in an exception being thrown.
. I'm pretty sure self._check_path(os.path.join(tarinfo.name, tarinfo.linkname)) is wrong, I think they meant "current path inside the tarfile" not "path of symlink" ... so that "foo.html -> ../src/foo.html" works if the link is in a docs directory.
James, thanks for the feedback. I've re-checked the list of directory traversal vulnerabilities reported against tar in the past few years. Not very surprisingly, the reported attack vectors use either absolute paths or paths with '..'s. Such a path can be used in a file/directory name or in a (directory) symlink. Additionally, there were some issues specific to the implementation of the checks ('/../' detected correctly, but not '//../').

> . symlinks linking to just ".." work (can be used repeatedly to walk up the tree).

Good point. You seem to be right about that.

> . symlinks pointing to absolute paths can't be used.

All "suspicious" symlinks are ignored / not extracted.

> . path checking doesn't check for either "./../foo" or "xyz/../../foo" type attacks.

It does. Patch is using normpath() to normalize paths, hence both of your example paths get normalized to ../foo and are tested after normalization.

> . I'm pretty sure self._check_path(os.path.join(tarinfo.name, tarinfo.linkname)) is wrong, I think they meant "current path inside the tarfile" not "path of symlink" ... so that "foo.html -> ../src/foo.html" works if the link is in a docs directory.

Yes, that part seems to be incorrect... I assume tarinfo.name is dir/in/tarfile/symlink. So in your example tarinfo.name would be docs/foo.html, and join + normpath would yield docs/src/foo.html ... not correct.

Current upstream opinion seems to be that module's behavior conforms to standards and it just should not be used to extract archives from untrusted sources.
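To illustrate the two path-handling points above, here is a minimal Python sketch, assuming POSIX path semantics (posixpath is used so the results do not depend on the host OS). The helper names naive_link_target and resolve_link_target are hypothetical and are not taken from the proposed patch: the first mirrors the os.path.join(tarinfo.name, tarinfo.linkname) call under discussion, the second is one possible way to resolve a link relative to the directory that contains it.

```python
import posixpath

# Both example paths from the discussion normalize to the same escaping path.
for name in ("./../foo", "xyz/../../foo"):
    print(name, "->", posixpath.normpath(name))


def naive_link_target(member_name, linkname):
    """Join the link's own path with its target, as the discussed call does."""
    return posixpath.normpath(posixpath.join(member_name, linkname))


def resolve_link_target(member_name, linkname):
    """Hypothetical alternative: resolve relative to the link's parent directory."""
    parent = posixpath.dirname(member_name)
    return posixpath.normpath(posixpath.join(parent, linkname))


# Symlink docs/foo.html -> ../src/foo.html
print(naive_link_target("docs/foo.html", "../src/foo.html"))
print(resolve_link_target("docs/foo.html", "../src/foo.html"))
```

Note that normpath() is purely lexical: it never consults the filesystem, so by itself it cannot account for symlinks that an earlier archive member has already created on disk.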
> . symlinks pointing to absolute paths can't be used.
> All "suspicious" symlinks are ignored / not extracted.

Right, it's just that I could see it being "common" to have symlinks to /home/* or /var/*. So people might treat that as a regression. But, yeah, it's also possible people have a use for tarfiles with symlinks to ../foo in them.

> . path checking doesn't check for either "./../foo" or "xyz/../../foo" type attacks.
> It does. Patch is using normpath()

Ahh, my bad. I don't see how that can do the right thing in the general case, but it'll be secure :).

> Current upstream opinion seems to be that module's behavior conforms to standards and it just should not be used to extract archives from untrusted sources.

Fair enough, I'm not sure we should care any more than upstream then (although maybe we should say more specifically: don't use the module with any tarfiles you haven't created).
Upstream bug report resolved with the following update to the documentation: Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of 'path', e.g. members that have absolute filenames starting with "/" or filenames with two dots "..". There is probably not much we can do without diverging significantly from the upstream version.
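As a rough illustration of what "prior inspection" could look like, here is a minimal Python sketch. The function suspicious_members and the in-memory demo archive are hypothetical and not part of the tarfile module. The sketch assumes that only member names are checked; it does not look at symlink or hardlink targets, so it is not a complete defence.

```python
import io
import posixpath
import tarfile


def suspicious_members(tar):
    """Return names of members that are absolute or escape upwards via '..'."""
    flagged = []
    for member in tar.getmembers():
        normalized = posixpath.normpath(member.name)
        if (member.name.startswith("/")
                or normalized == ".."
                or normalized.startswith("../")):
            flagged.append(member.name)
    return flagged


def build_demo_archive():
    """Build a small in-memory archive with one benign and two risky names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("docs/readme.txt", "../outside.txt", "/tmp/absolute.txt"):
            data = b"demo"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


# Inspect only; nothing is extracted to disk here.
with tarfile.open(fileobj=build_demo_archive(), mode="r") as tar:
    flagged = suspicious_members(tar)
print(flagged)
```

A caller would then decide whether to refuse the whole archive or skip the flagged members; which of the two is appropriate depends on the application.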
Upstream has resolved this issue (http://bugs.python.org/issue1044#msg55464): "After careful consideration and a private discussion with Martin I do no longer think that we have a security issue here. tarfile.py does nothing wrong, its behaviour conforms to the pax definition and pathname resolution guidelines in POSIX. There is no known or possible practical exploit. I update the documentation with a warning, that it might be dangerous to extract archives from untrusted sources. That is the only thing to be done IMO." And the documentation "fix": http://svn.python.org/view/python/trunk/Doc/library/tarfile.rst?r1=57764&r2=57763&pathrev=57764 If upstream does not feel this is a security issue, as stated in comment #4, neither should we.
Due to recent media attention on this vulnerability, upstream is considering fixing this issue. Upstream discussion about the fix: https://github.com/python/cpython/issues/73974
Created mingw-python3 tracking bugs for this issue:
Affects: fedora-all [bug 2141078]

Created python2.7 tracking bugs for this issue:
Affects: fedora-all [bug 2141079]

Created python3.10 tracking bugs for this issue:
Affects: fedora-all [bug 2141084]

Created python3.11 tracking bugs for this issue:
Affects: fedora-all [bug 2141085]

Created python3.12 tracking bugs for this issue:
Affects: fedora-all [bug 2141086]

Created python3.6 tracking bugs for this issue:
Affects: fedora-all [bug 2141080]

Created python3.7 tracking bugs for this issue:
Affects: fedora-all [bug 2141081]

Created python3.8 tracking bugs for this issue:
Affects: fedora-all [bug 2141082]

Created python3.9 tracking bugs for this issue:
Affects: fedora-all [bug 2141083]

Created python34 tracking bugs for this issue:
Affects: epel-all [bug 2141077]
I've posted a technical solution for upstream approval: https://peps.python.org/pep-0706/ (It is important, in the long term, that this is coordinated with the larger Python ecosystem.) For RHEL we're planning additional (system-specific) ways to configure the default. Refusing to unpack some (potentially dangerous) tarballs is, obviously, a behavior change, so there's a trade-off between security vs. backwards compatibility.
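For readers who want to see the trade-off in practice, here is a minimal Python sketch of the extraction filter described in PEP 706. It assumes a Python build that carries the filter API (the filter= argument and tarfile.data_filter, as shipped in Python 3.12); on builds without it the demo is skipped. The helper names build_archive and try_extract are hypothetical, and the system-specific RHEL configuration mentioned above is not shown.

```python
import io
import tarfile
import tempfile


def build_archive(names):
    """Build an in-memory archive of regular files with the given member names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def try_extract(fileobj, dest):
    """Extract with the 'data' filter; return the FilterError if a member is refused."""
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        try:
            tar.extractall(dest, filter="data")
        except tarfile.FilterError as exc:
            return exc
    return None


# Assumption: filters are present only in builds that include PEP 706.
if hasattr(tarfile, "data_filter"):
    with tempfile.TemporaryDirectory() as dest:
        archive = build_archive(["docs/readme.txt", "../outside.txt"])
        error = try_extract(archive, dest)
        print(type(error).__name__ if error else "extracted without refusal")
```

The refusal is the behavior change in question: an archive that older code unpacked silently now raises an error, so callers that rely on such archives need to opt out or handle the exception.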
This issue has been addressed in the following products: Red Hat Enterprise Linux 9 Via RHSA-2023:6324 https://access.redhat.com/errata/RHSA-2023:6324
This issue has been addressed in the following products: Red Hat Enterprise Linux 9 Via RHSA-2023:6494 https://access.redhat.com/errata/RHSA-2023:6494
This issue has been addressed in the following products: Red Hat Enterprise Linux 9 Via RHSA-2023:6659 https://access.redhat.com/errata/RHSA-2023:6659
This issue has been addressed in the following products: Red Hat Enterprise Linux 9 Via RHSA-2023:6694 https://access.redhat.com/errata/RHSA-2023:6694
This issue has been addressed in the following products: Red Hat Software Collections for Red Hat Enterprise Linux 7 Via RHSA-2023:6793 https://access.redhat.com/errata/RHSA-2023:6793
This issue has been addressed in the following products: Red Hat Enterprise Linux 8 Via RHSA-2023:6914 https://access.redhat.com/errata/RHSA-2023:6914
This issue has been addressed in the following products: Red Hat Enterprise Linux 8 Via RHSA-2023:7024 https://access.redhat.com/errata/RHSA-2023:7024
This issue has been addressed in the following products: Red Hat Enterprise Linux 8 Via RHSA-2023:7034 https://access.redhat.com/errata/RHSA-2023:7034
This issue has been addressed in the following products: Red Hat Enterprise Linux 8 Via RHSA-2023:7050 https://access.redhat.com/errata/RHSA-2023:7050
This issue has been addressed in the following products: Red Hat Enterprise Linux 8 Via RHSA-2023:7151 https://access.redhat.com/errata/RHSA-2023:7151
This issue has been addressed in the following products: Red Hat Enterprise Linux 8 Via RHSA-2023:7176 https://access.redhat.com/errata/RHSA-2023:7176