Bug 1759140
Summary: | Tar extraction consumes several GB of memory as if a memory leak was occurring | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Renaud Métrich <rmetrich> | ||||
Component: | tar | Assignee: | Ondrej Dubaj <odubaj> | ||||
Status: | CLOSED NOTABUG | QA Contact: | RHEL CS Apps Subsystem QE <rhel-cs-apps-subsystem-qe> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 7.7 | CC: | databases-maint, odubaj, panovotn, praiskup | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2019-12-04 06:41:11 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Renaud Métrich
2019-10-07 13:16:19 UTC
Digging further, it appears that the issue can be worked around by extracting the archive with the "-P" flag ("--absolute-names", don't strip leading `/'s from file names).
In that case, no placeholder is created and the symlink is created immediately, which leaves the symlink "dangling" until its target is extracted, but this is not an issue.

The code creating the placeholder has always been there (at least since 2005, when it was refactored).
Apparently, when the "-P" flag is not used, there are cases where symlinks are "potentially dangerous", but I don't understand exactly why.

Related commit doing the refactoring (the code already existed before, it just was not yet moved into a new function named "create_placeholder_file"):

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
commit af5d05729ade7f6d6f1552df31bce4c3dbc37247
Author: Paul Eggert <eggert.edu>
Date:   Mon Sep 12 18:45:59 2005 +0000

    Treat fishy-looking hard links like fishy-looking symlinks.
    (struct delayed_set_stat): Rename after_symlinks member to after_links.
    All uses changed.
    (struct delayed_link): Renamed from struct delayed_symlink.
    All uses changed.  New member is_symlink.
    (delayed_link_head): Renamed from delayed_symlink_head.
    All uses changed.
    (create_placeholder_file): New function, taken from extract_symlink.
    (extract_link): Create placeholders for fishy-looking hard links.
    (extract_symlink): Move code into create_placeholder_file.
    (apply_delayed_links): Renamed from apply_delayed_symlinks.
    All uses changed.  Create both hard links and symlinks.
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Related new code (refactoring):

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
static int
extract_symlink (char *file_name, int typeflag)
{
#ifdef HAVE_SYMLINK
  int status;
  int interdir_made = 0;

  if (! absolute_names_option
      && (IS_ABSOLUTE_FILE_NAME (current_stat_info.link_name)
          || contains_dot_dot (current_stat_info.link_name)))
    return create_placeholder_file (file_name, true, &interdir_made);
    --> WITHOUT "-P", WHEN THE LINK TARGET IS ABSOLUTE OR CONTAINS A ".." COMPONENT: DELAY CREATION
  ...
}
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Related old code showing the comment about "potentially dangerous symlinks":

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
static int
extract_symlink (char *file_name, int typeflag)
{
#ifdef HAVE_SYMLINK
  int status, fd;
  int interdir_made = 0;

  if (absolute_names_option
      || ! (IS_ABSOLUTE_FILE_NAME (current_stat_info.link_name)
            || contains_dot_dot (current_stat_info.link_name)))
    {
      --> WITH "-P", OR WHEN THE LINK TARGET IS NOT ABSOLUTE AND HAS NO ".." COMPONENT: CREATE IMMEDIATELY
      ...
    }
  else
    {
      /* This symbolic link is potentially dangerous.  Don't
         create it now; instead, create a placeholder file, which
         will be replaced after other extraction is done.  */
      struct stat st;
      --> WITHOUT "-P", WHEN THE LINK TARGET IS ABSOLUTE OR CONTAINS A ".." COMPONENT: DELAY CREATION
      ...
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

Some statistics:
- Creating the archive (see reproducer) took up to 674696 kB of memory
- Extracting the archive without -P took more than 2 GB of memory (OOM killed tar)
  - At 50%, it was already taking 1192072 kB
  - There were ~150K inodes created per minute
- Extracting the archive with -P took 1172 kB of memory ;-)
  - There were ~1.5M inodes created per minute

Created attachment 1623129 [details]
Symlink generator
This symlink generator creates:
- 1000 plain files "X" (X=[0-999]) in "./resolved" directory
- 1000 directories "X" (X=[0-999]) under each of the 10 "./linksY" directories (Y=[0-9])
- 1000 symlinks "X" (X=[0-999]) in each of those directories, pointing to the corresponding "../../resolved/X" files
This builds a directory tree with 10M symlinks (10 x 1000 x 1000).
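The layout described above can be sketched in C as follows. This is not the attached generator, whose contents are not reproduced in this report; the function names `make_dir` and `generate_tree` and their parameters are hypothetical, and the sizes are parameters so that a small tree can be built first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a directory, tolerating one that already exists. */
static int make_dir(const char *path)
{
    if (mkdir(path, 0755) == 0 || errno == EEXIST)
        return 0;
    return -1;
}

/*
 * Hypothetical re-creation of the described tree under ROOT:
 *   ROOT/resolved/X    empty plain files, X in [0, n_files)
 *   ROOT/linksY/D/X    symlinks to "../../resolved/X",
 *                      Y in [0, n_link_dirs), D in [0, n_subdirs)
 * Returns the number of symlinks created, or -1 on the first error
 * (including a symlink that already exists from an earlier run).
 */
long generate_tree(const char *root, int n_link_dirs, int n_subdirs, int n_files)
{
    char path[4096];
    char target[64];
    long created = 0;

    assert(n_link_dirs >= 0 && n_subdirs >= 0 && n_files >= 0);

    snprintf(path, sizeof path, "%s/resolved", root);
    if (make_dir(path) != 0)
        return -1;
    for (int x = 0; x < n_files; x++) {
        snprintf(path, sizeof path, "%s/resolved/%d", root, x);
        FILE *f = fopen(path, "w");
        if (f == NULL)
            return -1;
        fclose(f);
    }

    for (int y = 0; y < n_link_dirs; y++) {
        snprintf(path, sizeof path, "%s/links%d", root, y);
        if (make_dir(path) != 0)
            return -1;
        for (int d = 0; d < n_subdirs; d++) {
            snprintf(path, sizeof path, "%s/links%d/%d", root, y, d);
            if (make_dir(path) != 0)
                return -1;
            for (int x = 0; x < n_files; x++) {
                /* The target contains ".." components, which is the case
                   that makes tar defer link creation when -P is absent. */
                snprintf(path, sizeof path, "%s/links%d/%d/%d", root, y, d, x);
                snprintf(target, sizeof target, "../../resolved/%d", x);
                if (symlink(target, path) != 0)
                    return -1;
                created++;
            }
        }
    }
    return created;
}
```

Calling `generate_tree(".", 10, 1000, 1000)` would correspond to the sizes listed above; much smaller arguments are enough to check the layout before archiving it with tar.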
After discussion with the tar upstream, we came to the conclusion that this is expected behaviour.

When extracting a symlink to an absolute file name or to a file name in a parent directory, tar first creates a placeholder (a regular file of zero length) in its place and records the fact in a list of such "delayed links". The placeholder is replaced with the actual link when it becomes certain that it cannot be used to place another file at an absolute location unknown to the user. Quite often this becomes certain only at the end of extraction. The delayed link list is kept in memory, and that is the reason for the excessive memory usage.

As you already mentioned, delayed link creation can be disabled by using the -P (--absolute-names) option.

For the above-mentioned reasons, I am closing this bug as CLOSED NOTABUG.

OK, created a KCS.
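The delayed-link behaviour discussed in this report can be modelled with a minimal C sketch. It is an illustration only, not tar's code: `has_dot_dot_component`, `needs_placeholder`, `delay_link`, `free_delayed_list` and the reduced `struct delayed_link` are hypothetical stand-ins, and the real structure in tar holds more fields. The sketch mirrors the quoted condition (no "-P", and the link target is absolute or has a ".." component) and shows that every deferred link adds one heap record that lives until the list is processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for tar's check: true if NAME has a ".." component. */
static bool has_dot_dot_component(const char *name)
{
    const char *p = name;
    while (*p != '\0') {
        size_t len = strcspn(p, "/");   /* length of the component at p */
        if (len == 2 && p[0] == '.' && p[1] == '.')
            return true;
        p += len;
        while (*p == '/')               /* skip separators */
            p++;
    }
    return false;
}

/* Mirrors the condition quoted from extract_symlink in this report. */
bool needs_placeholder(const char *link_target, bool absolute_names)
{
    if (absolute_names)                 /* -P given: never defer */
        return false;
    return link_target[0] == '/' || has_dot_dot_component(link_target);
}

/* Hypothetical, reduced list node. */
struct delayed_link {
    struct delayed_link *next;
    char *file_name;                    /* where the placeholder was created */
    char *link_target;                  /* what the symlink will point to */
};

struct delayed_list {
    struct delayed_link *head;
    size_t count;
    size_t bytes;                       /* bytes requested for nodes and strings */
};

/* Record one deferred link; returns false on allocation failure. */
bool delay_link(struct delayed_list *list, const char *file_name,
                const char *link_target)
{
    size_t name_len = strlen(file_name) + 1;
    size_t target_len = strlen(link_target) + 1;
    struct delayed_link *node = malloc(sizeof *node);
    if (node == NULL)
        return false;
    node->file_name = malloc(name_len);
    node->link_target = malloc(target_len);
    if (node->file_name == NULL || node->link_target == NULL) {
        free(node->file_name);
        free(node->link_target);
        free(node);
        return false;
    }
    memcpy(node->file_name, file_name, name_len);
    memcpy(node->link_target, link_target, target_len);
    node->next = list->head;
    list->head = node;
    list->count++;
    list->bytes += sizeof *node + name_len + target_len;
    return true;
}

/* Release the whole list, as would happen once the links are applied. */
void free_delayed_list(struct delayed_list *list)
{
    while (list->head != NULL) {
        struct delayed_link *node = list->head;
        list->head = node->next;
        free(node->file_name);
        free(node->link_target);
        free(node);
    }
    list->count = 0;
    list->bytes = 0;
}
```

In this model the list grows by one record per symlink whose target is absolute or contains "..", so memory use rises linearly with the number of such links until the end of extraction; with `absolute_names` set, `needs_placeholder` returns false and nothing is queued.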