Bug 825568 - Midnight commander lists whole archive contents before extracting each file
Summary: Midnight commander lists whole archive contents before extracting each file
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: mc
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Denys Vlasenko
QA Contact: BaseOS QE - Apps
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-05-27 21:37 UTC by Jiri Pospisil
Modified: 2013-07-04 08:13 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-25 11:20:21 UTC
Target Upstream Version:
Embargoed:



Description Jiri Pospisil 2012-05-27 21:37:05 UTC
Description of problem:
When extracting a couple of files from a zip archive via mc, mc calls unzip -l on the archive before extracting each file (it's probably checking whether the file is present). This makes extracting files a lot slower when working with large archives.

Version-Release number of selected component (if applicable):
4.7.0.2

How reproducible:

Steps to Reproduce:
1. create a huge zip archive (e.g. a million small files)
2. extract a few (e.g. 5) files from the archive via unzip
3. navigate to the archive in mc, mark the same files and extract them
4. compare how long steps 2 and 3 took
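
The steps above can be approximated without mc by a small script. This is only a rough model, not mc's actual code path: it uses Python's zipfile module instead of unzip, and the function names (make_archive, extract_once, extract_with_relisting) are made up for illustration. It compares opening the archive once against re-reading the whole file list before every single extraction.

```python
import os
import tempfile
import time
import zipfile


def make_archive(path, n_files):
    """Create a zip archive holding n_files tiny members named z/f0, z/f1, ..."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_STORED) as zf:
        for i in range(n_files):
            zf.writestr(f"z/f{i}", b"x")


def extract_once(path, names, dest):
    """Read the archive's file list once, then extract every requested member."""
    with zipfile.ZipFile(path) as zf:
        for name in names:
            zf.extract(name, dest)


def extract_with_relisting(path, names, dest):
    """Re-open the archive and walk its full file list before each member.

    This models the reported behaviour (a full listing per extracted file).
    """
    for name in names:
        with zipfile.ZipFile(path) as zf:
            if name in zf.namelist():
                zf.extract(name, dest)


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Raise n_files to get closer to the archive sizes discussed in this report.
    n_files = 20000
    wanted = ["z/f1", "z/f2", "z/f3"]
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "z.zip")
        make_archive(archive, n_files)
        t_once = timed(extract_once, archive, wanted, os.path.join(tmp, "a"))
        t_relist = timed(extract_with_relisting, archive, wanted, os.path.join(tmp, "b"))
        print(f"single listing: {t_once:.3f}s, listing per file: {t_relist:.3f}s")
```

The absolute numbers will differ from what mc and unzip show, since the listing cost here is only the parsing of the central directory inside one process.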

Actual results:
Extraction via mc is a lot slower - in my case (1.6M files, extracting three of them) unzip is instant, extraction via mc takes about one minute

Expected results:
Extraction takes approximately the same time

Additional info:

Comment 2 RHEL Program Management 2012-09-07 05:36:20 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.

Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.

Comment 3 Denys Vlasenko 2013-02-22 16:50:37 UTC
Unpacking zip files is done by /usr/libexec/mc/extfs.d/uzip script (in mc source tree it is src/vfs/extfs/helpers/uzip.in).

This script will run the following command:

/usr/bin/unzip -p FILE.zip NAME_IN_ARCHIVE >/tmp/mc-root/extRANDOM

for each unpacked file. This shouldn't be much slower than unzip in the "extract a few files" scenario: at worst, it will be 5 times slower than using unzip to extract all 5 files in one command invocation. I will take a look now at why it is slow (maybe processing of the huge file list is slow?)

Meanwhile, I have my doubts about that script's safety wrt names with spaces,
like "file >/etc/passwd" :(
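
To illustrate the concern about such names, here is a minimal sketch of how a command line assembled by plain string interpolation differs from one that quotes each argument. It is written in Python with shlex purely for demonstration and says nothing about what uzip itself does; the helper names unsafe_command and quoted_command are hypothetical.

```python
import shlex


def unsafe_command(archive, member, dest):
    """Build the extraction command by naive interpolation (illustration only)."""
    return f"unzip -p {archive} {member} >{dest}"


def quoted_command(archive, member, dest):
    """Build the same command with every argument quoted for a POSIX shell."""
    return "unzip -p {} {} >{}".format(
        shlex.quote(archive), shlex.quote(member), shlex.quote(dest)
    )


if __name__ == "__main__":
    member = "file >/etc/passwd"
    # The unquoted form lets the shell see a second redirection target.
    print(unsafe_command("FILE.zip", member, "/tmp/out"))
    # The quoted form keeps the whole member name as one shell word.
    print(quoted_command("FILE.zip", member, "/tmp/out"))
```

Passing an argument list to the process instead of a shell string would sidestep shell quoting entirely, although the redirection would then have to be done by the caller.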

Comment 4 Denys Vlasenko 2013-02-22 18:14:25 UTC
Copying out one file from 100000 file archive.

mc executes this:
/usr/libexec/mc/extfs.d/uzip copyout /path/z.zip z/f1054 /tmp/mc-root/extfsQmUzPaf1054

which in turn executes:
/usr/bin/unzip -Z -l -T \/path\/z\.zip

which is a command to list all files in the archive, and that is slow. It isn't helped one iota by unzip not buffering its output at all and executing three syscalls to output one line:

19:05:52.130997 write(1, "z/f31064", 8) = 8
19:05:52.131100 ioctl(1, TIOCGWINSZ, 0xbfcd8ab8) = -1 ENOTTY (Inappropriate ioctl for device)
19:05:52.131212 write(1, "\n", 1)       = 1
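
As a rough illustration of why per-line writes hurt, the sketch below counts how many low-level write calls are needed to emit a file list with and without a buffer in front of the output. CountingRaw is a hypothetical stand-in for the real file descriptor, and the sketch ignores the extra ioctl seen in the trace above.

```python
import io


class CountingRaw(io.RawIOBase):
    """Stand-in for a file descriptor that counts the write calls it receives."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.data += bytes(b)
        return len(b)


def list_unbuffered(names):
    """Write each name and each newline separately, as in the trace above."""
    raw = CountingRaw()
    for name in names:
        raw.write(name.encode())
        raw.write(b"\n")
    return raw


def list_buffered(names, buffer_size=65536):
    """Collect output in a buffer and hand it to the raw stream in large chunks."""
    raw = CountingRaw()
    out = io.BufferedWriter(raw, buffer_size=buffer_size)
    for name in names:
        out.write(name.encode())
        out.write(b"\n")
    out.flush()
    return raw


if __name__ == "__main__":
    names = [f"z/f{i}" for i in range(100000)]
    print("unbuffered write calls:", list_unbuffered(names).calls)
    print("buffered write calls:  ", list_buffered(names).calls)
```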

This listing is triggered by the zipfs_realpathname() call in uzip:
sub mczipfs_copyout {
        # HERE: zipfs_realpathname() triggers the full archive listing
        my ($qafile, $qfsfile) = map { &zipquotemeta(zipfs_realpathname($_)) } @_;
        &checkargs(1, 'archive file', @_);
        &checkargs(2, 'local file', @_);
        &safesystem("$cmd_extract $qarchive $qafile >$qfsfile", 11);
        exit;
}
...

Apparently zipfs_realpathname() is needed to fix some problem with mangled names:

# The Midnight Commander never calls this script with archive pathnames
# starting with either "./" or "../". Some ZIP files contain such names,
# so we need to build a translation table for them.
my $zipfs_realpathname_table = undef;
sub zipfs_realpathname($) {
    my ($fname) = @_;

    if (!defined($zipfs_realpathname_table)) {
        $zipfs_realpathname_table = {};
        if (!open(ZIP, "$cmd_list $qarchive |")) {
            return $fname;
        }
...
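
The idea behind such a lazily built translation table can be sketched as follows. This is a Python illustration rather than the script's Perl, the class name RealPathResolver and the prefix-stripping rule are assumptions made for the example, and the elided part of the original function is not reproduced here. The point is that the table is cached only for the lifetime of one process, so a helper that is started once per extracted file pays for a full listing every time.

```python
class RealPathResolver:
    """Map names as mc passes them to the names actually stored in the archive.

    list_archive is any callable returning the archive's member names; in the
    real helper this corresponds to running the listing command.
    """

    def __init__(self, list_archive):
        self._list_archive = list_archive
        self._table = None
        self.listings = 0  # how many times the full listing was requested

    def resolve(self, name):
        if self._table is None:
            # First call in this process: list the whole archive once.
            self.listings += 1
            self._table = {}
            for real in self._list_archive():
                stripped = real
                # Assumed rule: drop leading "./" and "../" components.
                while stripped.startswith("./") or stripped.startswith("../"):
                    stripped = stripped.split("/", 1)[1]
                if stripped != real:
                    self._table[stripped] = real
        return self._table.get(name, name)


if __name__ == "__main__":
    members = ["./a/b", "../c", "plain"]
    wanted = ["a/b", "c", "plain"]

    # One long-lived process: a single listing serves every lookup.
    shared = RealPathResolver(lambda: members)
    print([shared.resolve(n) for n in wanted], "listings:", shared.listings)

    # One fresh process per file, modelled by a fresh resolver each time.
    total = 0
    for n in wanted:
        fresh = RealPathResolver(lambda: members)
        fresh.resolve(n)
        total += fresh.listings
    print("listings with one helper run per file:", total)
```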

Comment 5 Denys Vlasenko 2013-02-22 18:21:49 UTC
.tar.gz archives work even more slowly.

Comment 6 Denys Vlasenko 2013-02-25 11:20:21 UTC
I think fixing this bug requires a serious overhaul of mc's virtual filesystem code.

Not only do the archive handling helpers need to be better (say, they need to handle filenames with all the weird characters such as newline), but it also looks like the data structures mc uses to represent the list of files are not efficient for millions of files.

I think this needs to be fixed in upstream first.

Closing as WONTFIX.

Please reopen and explain if you think this really is an urgent thing to fix.

