Red Hat Bugzilla – Bug 125523
Need a way to slow writes from tar
Last modified: 2007-11-30 17:10:44 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Description of problem:
tar does an admirable job of writing out files as fast as it can, but
when you wish to restrict how fast it writes (because of network
constraints over a networked filesystem, or due to scheduling issues
and device IO constraints), simply slowing its read times (e.g. by
buffering its input file through a program that pauses periodically)
is often not granular enough.
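The read-side workaround described above can be sketched as a small buffer program that sleeps between chunks. This is a hypothetical illustration of that approach (not the attached patch); the function name and parameters are my own:

```python
import time

def throttled_copy(src, dst, chunk_size=64 * 1024, pause_s=0.05):
    """Copy src to dst, sleeping after each chunk.

    Throughput is capped at roughly chunk_size / pause_s bytes per
    second, but only in coarse chunk_size steps -- the granularity
    problem described above.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        time.sleep(pause_s)
    return total
```

Feeding tar's stdin through such a filter slows its reads, but tar's own internal buffering means its writes still arrive in bursts, which is why this is often not granular enough.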
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. create a large tar file with many files
2. extract it over a networked filesystem
3. watch network bandwidth go away
Actual Results: network bandwidth or local device throughput impact.
Expected Results: ability to constrain that impact
To solve this, I propose (and will attach a suggested patch for)
adding a command-line option that inserts a configurable pause, in
milliseconds, after each record is written. Combined with the existing
options to change the record size (the number of blocks per record),
this lets you cap throughput at any bytes-per-second rate you like.
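The arithmetic behind that claim can be shown with tar's fixed 512-byte block size and its default record of 20 blocks. The 10 ms pause below is just an illustrative value, not anything from the patch:

```python
BLOCK_SIZE = 512  # tar's fixed block size, in bytes

def max_throughput(blocks_per_record, pause_ms):
    """Upper bound on bytes/second when pausing after each record.

    Ignores the (small) time spent actually writing the record, so
    the real rate is slightly lower than this ceiling.
    """
    record_bytes = blocks_per_record * BLOCK_SIZE
    return record_bytes / (pause_ms / 1000.0)

# Default record size (20 blocks = 10240 bytes) with a 10 ms pause
# caps throughput at about 1 MB/s.
print(max_throughput(20, 10))  # 1024000.0
```

Shrinking the record or lengthening the pause scales the ceiling down proportionally, which is what makes the two options composable.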
Hopefully Red Hat will adopt this change across products and/or
contribute it back to the project that maintains GNU tar. It would be
very useful to me and I assume to others.
Created attachment 100967 [details]
The command-line option added, plus docs
This is the patch to add the --write-pause-time=TIME command-line flag, along
with the "man" and "info" documentation change for the option.
Managing network bandwidth with a tar CLI option will "work",
but it isn't a general solution, nor is it of sufficiently
general interest to add to the tar package and diverge from upstream.
Try sending the patch to the upstream tar maintainers.
Actually, network utilization was just an example off the top of my
head (and yes, there are other ways to attack that particular
problem); scheduling issues and IO constraints were my primary
problem, and this patch HAS solved those problems.
To replicate specifically, store a large tar file with many large
(preferably 1GB+) sub-files. Then attempt to un-tar this file while
most (say, 90% or so) of the system's physical memory is in use by a
program that is routinely using that memory.
Don't plan on being able to use the machine for much until it's done
unless you've applied this patch and used the given command-line argument.
Oh, and no it's not just memory starvation. The core problem is really
the fact that the scheduler is unable to look at system resources as a
whole and determine that the combination of swapping and large numbers
of user-generated reads and writes will starve the IO controller in
question and result in deadlocks that in turn result in near 0% CPU
utilization and massive redundancy in IO operations. This could be
shrugged off as simply "loading the system" if my patch (and the use
of the option in question) did not remove the problem by nudging the
scheduler in the right direction.
Now, I don't have the opportunity to pay Red Hat any more because I
can't afford RHEL, but I bought every retail release of RHL from 4.0
to 9.0 and I'm a stock-holder and this is the first significant thing
I've asked for. I don't think it's that harsh a request that Red Hat
(who have far more weight with the GNU tar folks than I do) apply and
push upstream this relatively simple patch that doesn't affect anyone
who chooses not to use it... is it?
Thank you for your time.