Bug 2217906

Summary: sos: Python tarfile extraction needs change to avoid a warning (CVE-2007-4559 mitigation)
Product: Red Hat Enterprise Linux 9
Component: sos
Version: 9.3
Status: NEW
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Keywords: Triaged
Type: Bug
Reporter: Petr Viktorin <pviktori>
Assignee: Pavel Moravec <pmoravec>
QA Contact: Supportability QE <supportability-qe>
CC: agk, jcastillo, plambri, sbradley, theute
Doc Type: If docs needed, set a value
Bug Blocks: 2218238

Description Petr Viktorin 2023-06-27 12:41:35 UTC
Hello,
In RHEL 9.3 and 8.9, we're planning to fix the long-standing CVE-2007-4559: Python's `tarfile` module makes it too easy to extract tarballs in an unsafe way.
Unfortunately, for the CVE to be considered fixed, this needs a behavior change. (If you don't think this is the case, let's bring it up with the security team.)
Upstream, Python will emit deprecation warnings for two releases before changing the default. In RHEL, we are changing the behavior now, emitting warnings, and providing ways for customers to restore the earlier behavior.
To avoid the warning, software shipped by Red Hat will need a change.

For more details see upstream PEP 706: https://peps.python.org/pep-0706
and the Red Hat knowledge base draft: https://access.redhat.com/articles/7004769

---

In /usr/lib/python3.9/site-packages/sos/cleaner/archives/__init__.py, sos calls `archive.extractall(path)`. The call will emit a warning by default, and should be changed to something like:

if hasattr(tarfile, 'data_filter'):
    # Python with CVE-2007-4559 mitigation (PEP 706)
    archive.extractall(path, filter='data')
else:
    # Fallback to a possibly dangerous extraction (before PEP 706)
    archive.extractall(path)
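
For anyone who wants to try the pattern above outside of sos, here is a minimal, self-contained sketch. The helper name `extract_all` and the in-memory test archive are made up for illustration and are not part of sos; only `tarfile.data_filter` and the `filter` argument come from PEP 706.

```python
import io
import os
import tarfile
import tempfile


def extract_all(archive, path):
    """Extract every member of an open TarFile into path.

    Hypothetical helper (not sos code): uses the 'data' filter when the
    running Python provides it, otherwise falls back to the legacy call.
    """
    if hasattr(tarfile, 'data_filter'):
        # Python with CVE-2007-4559 mitigation (PEP 706)
        archive.extractall(path, filter='data')
    else:
        # Fallback to a possibly dangerous extraction (before PEP 706)
        archive.extractall(path)


def demo():
    """Build a one-file tarball in memory, extract it, return the file bytes."""
    payload = b"hello from the archive\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        info = tarfile.TarInfo(name='report/hello.txt')
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    with tempfile.TemporaryDirectory() as dest:
        with tarfile.open(fileobj=buf, mode='r') as tar:
            extract_all(tar, dest)
        with open(os.path.join(dest, 'report', 'hello.txt'), 'rb') as fh:
            return fh.read()
```

Calling `demo()` should behave the same with or without the mitigation, since the archive contains only a plain data file.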

The 'data' filter above attempts a "safe" extraction, intended for pure data archives. For example, it:
- prevents extracting outside the target directory, and to absolute paths (by raising an exception)
- prevents symlinks pointing outside the target directory, and to absolute paths
- adjusts permissions (for the owner, only the executable bit is honored)
See PEP 706 for details: https://peps.python.org/pep-0706/#filters
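
To illustrate the first point, the sketch below builds a tarball whose only member is named `../escape.txt` and tries to extract it with `filter='data'`. The function names and the archive are invented for this example. On a Python without PEP 706 support the function just reports that; with support, I would expect the extraction to be refused with a `tarfile.FilterError` subclass.

```python
import io
import tarfile
import tempfile


def build_traversal_archive():
    """Return an in-memory tarball with one member that points outside
    the extraction directory (illustrative input only)."""
    payload = b"x"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        info = tarfile.TarInfo(name='../escape.txt')
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def try_data_extract():
    """Attempt extraction with the 'data' filter and describe the outcome."""
    if not hasattr(tarfile, 'data_filter'):
        # Python without the PEP 706 filters; nothing to demonstrate
        return 'unsupported'
    with tempfile.TemporaryDirectory() as dest:
        with tarfile.open(fileobj=build_traversal_archive(), mode='r') as tar:
            try:
                tar.extractall(dest, filter='data')
            except tarfile.FilterError as exc:
                # The filter rejected the member; report which error it was
                return type(exc).__name__
    return 'extracted'
```

Code that extracts untrusted archives this way should be prepared to handle `tarfile.FilterError` instead of assuming `extractall` always succeeds.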

If you trust that the archive is not malicious, use `filter='fully_trusted'` instead. That will preserve the existing behavior.
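
To see what the choice between the two filters means for a single member, the filter functions can also be called directly, without touching the filesystem. This is only a sketch: the member name, mode, and destination directory are arbitrary values picked for the example.

```python
import tarfile


def compare_filters(mode=0o666, dest='extract-dir'):
    """Return the member mode as seen by 'fully_trusted' and by 'data'.

    Returns None on a Python that lacks the PEP 706 filters.
    """
    if not hasattr(tarfile, 'data_filter'):
        return None
    member = tarfile.TarInfo(name='notes.txt')
    member.mode = mode
    # 'fully_trusted' passes the member through unchanged
    trusted = tarfile.fully_trusted_filter(member, dest)
    # 'data' returns a copy with restricted metadata
    data = tarfile.data_filter(member, dest)
    return oct(trusted.mode), oct(data.mode)
```

With the default arguments, a regular file stored as 0o666 keeps that mode under 'fully_trusted', while 'data' drops the group/other write bits, giving `('0o666', '0o644')`.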

---

Let me know if you have any questions!