Bug 64410
Summary: | Tar produces corrupt files in some situations. | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Ben Woodard <woodard> | ||||
Component: | tar | Assignee: | Jeff Johnson <jbj> | ||||
Status: | CLOSED NOTABUG | QA Contact: | Ben Levenson <benl> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 7.2 | CC: | pbrown, rm.riches, tao, zaitcev | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i386 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2002-11-20 19:10:22 UTC | Type: | --- | ||||
Description
Ben Woodard
2002-05-03 20:41:01 UTC
I'm seeing the same symptoms on an Alpha machine running RH7.1 Linux with tar-1.13.19-4 and bzip2-1.0.1-4. (It affects my nightly backups--yikes! Time to add a verification step to the script.) After a little testing, using about 88MB of data, I have found that the 'trailing garbage' is apparently real but benign.

I did four runs: 1) with the 'z' switch, 2) with the 'j' switch, 3) no switch but piped to bzip2, and 4) no switch but piped to gzip. In both cases, the switch produced a slightly larger file than no switch plus a pipe. Only the 'j' switch complained about 'trailing garbage'. Extracting the data from all four runs produced identical results (per 'diff -r'). So, it appears one workaround would be to use '-' as the output file and manually pipe (with a shell script) the output into bzip2 (or gzip).

Examination of the file produced with the 'j' switch showed a block of \0 bytes at the end of the file. Removing them produced a file identical to the one produced by no switch and manually piping the output through bzip2. So, it would appear the bug is in 'tar' and involves padding the (compressed) output file with a bunch of null bytes after the compression program has finished its job. If I had time, I'd love to dig into the code and submit a patch. But I'll have to leave that to the professionals for now.

I believe that I saw this on a 7.1 based machine as well. It could be that the problem was with the version of tar bundled with 7.1.

ben: have you tried tar 1.13.25-4 from Red Hat Linux 7.3?

Preston, I tried this with 1.13.25, 1.13.25-4.7 and 1.13.25-7 (8.0 version). It is busted in all these versions.

Created attachment 85742: description of use cases for tar, bzip2 and output files
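
The manual check reported in this thread (look for bytes after the end of the bzip2 stream and see whether they are all \0) could be scripted as a backup verification step. The following is only a minimal sketch, not anything taken from tar or from the attachment: the function names `trailing_bytes_after_bz2` and `is_benign_padding` are hypothetical, and it assumes the archive is small enough to read into memory. It relies on Python's standard `bz2.BZ2Decompressor`, which exposes any data found after the end of the stream as `unused_data`.

```python
import bz2


def trailing_bytes_after_bz2(data: bytes) -> bytes:
    """Return whatever follows the first complete bzip2 stream in `data`."""
    decomp = bz2.BZ2Decompressor()
    decomp.decompress(data)  # raises OSError if data is not valid bzip2
    if not decomp.eof:
        raise ValueError("bzip2 stream is truncated or incomplete")
    # Bytes found after the end-of-stream marker (empty if there are none).
    return decomp.unused_data


def is_benign_padding(trailing: bytes) -> bool:
    """True if the trailing bytes are all NUL (or there are none)."""
    return trailing.count(0) == len(trailing)


# Stand-in for a padded archive: a valid bzip2 stream followed by NUL bytes.
payload = bz2.compress(b"example archive contents" * 100)
padded = payload + b"\0" * 512

trailing = trailing_bytes_after_bz2(padded)
print(len(trailing), is_benign_padding(trailing))

# Stripping the padding recovers the unpadded stream byte for byte.
stripped = padded[: len(padded) - len(trailing)]
print(stripped == payload)
```

For a real backup file, the same functions could be applied to the bytes read from the archive; anything other than NUL padding after the stream would deserve a closer look.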
Tar is working as designed. If the filename is given as '-', tar assumes the user knows what they are doing, and that the output is going to be sent (eventually) to an actual tape device. Many tape devices will only accept input in multiples of their physical blocksize, so this reblocking is required to allow compression to work with those tape devices.

Upshot: to create a bzipped tar file that is not padded out to a multiple of tar's blocksize, use either

    tar -c -j -f {output filename}

(where output filename is not a device file) or

    tar -c -f - | bzip2 > {output filename}

This is all described in way too much detail in the tar documentation. Read the "Blocking Factor" section.
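
To make the padding arithmetic concrete, here is a small sketch of how much NUL padding reblocking would add to a compressed stream of a given size. It assumes GNU tar's default blocking factor of 20 blocks of 512 bytes (a 10240-byte record); the helper names `padded_size` and `padding_added` are hypothetical and are not part of tar.

```python
BLOCK_SIZE = 512               # size of one tar block, in bytes
DEFAULT_BLOCKING_FACTOR = 20   # GNU tar default: 20 blocks per record


def padded_size(n_bytes: int, blocking_factor: int = DEFAULT_BLOCKING_FACTOR) -> int:
    """Size after padding n_bytes up to a whole number of records."""
    record = blocking_factor * BLOCK_SIZE
    records = -(-n_bytes // record)  # ceiling division
    return records * record


def padding_added(n_bytes: int, blocking_factor: int = DEFAULT_BLOCKING_FACTOR) -> int:
    """Number of padding bytes appended to reach a record boundary."""
    return padded_size(n_bytes, blocking_factor) - n_bytes


# A 10000-byte compressed stream is padded up to one full 10240-byte record.
print(padded_size(10000), padding_added(10000))
```

Under these assumptions the padding is always smaller than one record, which is consistent with the earlier observation that the switch produced only slightly larger files.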