Bug 125523 - Need a way to slow writes from tar
Summary: Need a way to slow writes from tar
Alias: None
Product: Fedora
Classification: Fedora
Component: tar
Version: 1
Hardware: All Linux
Target Milestone: ---
Assignee: Jeff Johnson
QA Contact:
Depends On:
Reported: 2004-06-08 14:56 UTC by Aaron Sherman
Modified: 2007-11-30 22:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2004-06-16 13:21:47 UTC

Attachments
The command-line option added, plus docs (5.39 KB, patch)
2004-06-08 14:58 UTC, Aaron Sherman

Description Aaron Sherman 2004-06-08 14:56:36 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Gecko/20040124 Galeon/1.3.14

Description of problem:
tar does an admirable job of writing out files as fast as it can, but
when you wish to restrict how fast it writes (because of network
constraints over a networked filesystem, or due to scheduling issues
and device IO constraints), simply slowing its read times (e.g. by
buffering its input file through a program that pauses periodically)
is often not granular enough.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. create a large tar file with many files
2. extract it over a networked filesystem
3. watch network bandwidth go away

Actual Results:  network bandwidth or local device throughput impact.

Expected Results:  ability to constrain that impact

Additional info:

To solve this, I propose (and will attach a suggested patch for)
adding a command-line option to add a variable number of milliseconds
to pause after each record is written. Combining this with options to
change the record size (or number of blocks per record) allows you to
set any number of bytes-per-second throughput that you like.
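
As a rough illustration of that arithmetic (not part of the attached patch), the sketch below relates the pause per record to a throughput ceiling. It assumes GNU tar's 512-byte block and default blocking factor of 20, and it ignores the time the write itself takes, so the real rate will be somewhat lower than the ceiling. The function names are hypothetical.

```python
BLOCK_BYTES = 512  # size of one tar block


def record_bytes(blocking_factor=20):
    """Bytes per record; 20 blocks per record is GNU tar's default."""
    return blocking_factor * BLOCK_BYTES


def max_rate_bytes_per_sec(pause_ms, blocking_factor=20):
    """Upper bound on throughput when pausing pause_ms after each record.

    Assumption: the write itself takes no time, so this is a ceiling only.
    """
    if pause_ms <= 0:
        raise ValueError("pause_ms must be positive")
    return record_bytes(blocking_factor) / (pause_ms / 1000.0)


def pause_ms_for_rate(target_bytes_per_sec, blocking_factor=20):
    """Pause per record (in milliseconds) that caps throughput at the target."""
    if target_bytes_per_sec <= 0:
        raise ValueError("target_bytes_per_sec must be positive")
    return record_bytes(blocking_factor) * 1000.0 / target_bytes_per_sec
```

With the default 10240-byte record, a 10 ms pause caps throughput at about 1,024,000 bytes per second; a larger blocking factor raises the ceiling for the same pause.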

Hopefully Red Hat will adopt this change across products and/or
contribute it back to the project that maintains GNU tar. It would be
very useful to me and I assume to others.

Comment 1 Aaron Sherman 2004-06-08 14:58:21 UTC
Created attachment 100967
The command-line option added, plus docs

This is the patch to add the --write-pause-time=TIME command-line flag, along
with the "man" and "info" documentation change for the option.
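
For readers without the attachment, the behaviour described can be sketched roughly as follows. This is an illustrative Python loop under stated assumptions, not the patch's C code: the function name, its parameters, and the injectable sleep callable are all hypothetical.

```python
import io
import time


def write_records_with_pause(src, dst, record_bytes=10240, pause_ms=0,
                             sleep=time.sleep):
    """Copy src to dst one record at a time, pausing after each record.

    Hypothetical helper: it mimics the idea of pausing after every record
    written, not the actual tar implementation.
    """
    written = 0
    while True:
        record = src.read(record_bytes)
        if not record:
            break  # end of input
        dst.write(record)
        written += 1
        if pause_ms > 0:
            sleep(pause_ms / 1000.0)  # throttle: give other IO a chance to run
    return written


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 25600)
    sink = io.BytesIO()
    count = write_records_with_pause(source, sink, pause_ms=5)
    print(count, "records written")
```

Passing the sleep function as a parameter is only a convenience for testing the loop without real delays.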

Comment 2 Jeff Johnson 2004-06-16 13:21:47 UTC
Managing network bandwidth with a tar CLI option will "work",
but isn't a general solution, nor is it of sufficiently
general interest to add to the tar package and diverge from
upstream sources.

Try sending the patch to the upstream tar maintainers.

Comment 3 Aaron Sherman 2004-06-26 18:29:15 UTC
Actually, network utilization was just an example off the top of my
head (and yes, there are other ways to attack that particular
problem). Scheduling issues and IO constraints were my primary
problem, and this patch HAS solved those problems.

To reproduce this specifically, store a large tar file with many large
(preferably 1GB+) sub-files. Then attempt to un-tar this file while
most (say, 90% or so) of the system's physical memory is in use by a
program that is routinely using that memory.

Don't plan on being able to use the machine for much until it's done
unless you've applied this patch and used the given command-line argument.

Oh, and no it's not just memory starvation. The core problem is really
the fact that the scheduler is unable to look at system resources as a
whole and determine that the combination of swapping and large numbers
of user-generated reads and writes will starve the IO controller in
question and result in deadlocks that in turn result in near 0% CPU
utilization and massive redundancy in IO operations. This could be
shrugged off as simply "loading the system" if my patch (and the use
of the option in question) did not remove the problem by nudging the
scheduler in the right direction.

Now, I don't have the opportunity to pay Red Hat any more because I
can't afford RHEL, but I bought every retail release of RHL from 4.0
to 9.0 and I'm a stock-holder and this is the first significant thing
I've asked for. I don't think it's that harsh a request that Red Hat
(who have far more weight with the GNU tar folks than I do) apply and
push upstream this relatively simple patch that doesn't affect anyone
who chooses not to use it... is it?

Thank you for your time.
