Created attachment 816894
sample file that tar --sparse does not archive correctly
Description of problem: When I create tar archives with the --sparse flag, some files are corrupted silently. I do not see this bug in 1.23, but it is present in 1.26 and in 1.27.
Version-Release number of selected component (if applicable):
A sample file is attached. Try the following:
bash-4.2$ tar --version
tar (GNU tar) 1.27
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
bash-4.2$ mkdir out
bash-4.2$ tar --sparse -cf - tar.sparse.broken.file | tar -C out --sparse -xpBf -
bash-4.2$ ls -l tar.sparse.broken.file out/tar.sparse.broken.file
-r--r--r-- 1 ajs ead 35 Oct 28 15:12 out/tar.sparse.broken.file
-r--r--r-- 1 schorr ead 35 Oct 28 15:12 tar.sparse.broken.file
bash-4.2$ md5sum tar.sparse.broken.file out/tar.sparse.broken.file
bash-4.2$ od -c tar.sparse.broken.file
0000000 037 213 \b \b 274 243 u Q 002 003 c u s t _ a
0000020 u d i t . t a g \0 003 \0 \0 \0 \0 \0 \0
0000040 \0 \0 \0
bash-4.2$ od -c out/tar.sparse.broken.file
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 \0 \0 \0
Steps to Reproduce:
1. create a tar archive of the attached file using the --sparse flag
2. extract the archive
3. note that the extracted file does not match the archived file
Actual results:
The files are different. The extracted file contains only zero bytes.
Expected results:
The files should match.
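The comparison in the steps above can also be scripted. This is only a minimal sketch using the Python standard library: the helper names `digest` and `is_all_zero` are made up for illustration and are not part of tar, and SHA-256 is used in place of md5sum since any digest works for this check.

```python
import hashlib


def digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_all_zero(path, chunk_size=65536):
    """Return True if the file is non-empty and every byte is NUL."""
    seen_data = False
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            seen_data = True
            if chunk.count(0) != len(chunk):
                return False  # found a nonzero byte
    return seen_data
```

With the files from the transcript, a mismatch between `digest("tar.sparse.broken.file")` and `digest("out/tar.sparse.broken.file")`, together with `is_all_zero("out/tar.sparse.broken.file")` returning True, would correspond to the failure reported here.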
This worked in 1.23, but not in 1.24. I'd run git bisect except that I can't get the autotools to run properly.
I think I see the problem. The ChangeLog says, in part:
2010-08-25 Paul Eggert <email@example.com>
tar: optimize -c --sparse when file is entirely sparse
* src/sparse.c (sparse_scan_file): If the file is entirely sparse,
that is, if ST_NBLOCKS is zero, don't bother scanning for nonzero
blocks. Idea by Kit Westneat, communicated by Bernd Schubert in
Also, omit unnecessary lseek at start of file.
On my Network Appliance fileserver, a small file may have zero blocks
even though it is not empty. In other words, this patch is not correct
on some filesystems. The bug is occurring only on my Netapp filesystem,
not on ext4.
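To illustrate why the shortcut loses data, here is a rough Python sketch of the idea, not the actual code in src/sparse.c. `FakeStat`, `data_regions`, and `regions_with_shortcut` are hypothetical names, and the 512-byte block scan is a simplified stand-in for what sparse_scan_file does.

```python
from collections import namedtuple

BLOCK = 512  # tar works in 512-byte blocks

# Hypothetical stand-in for the two stat fields that matter here.
FakeStat = namedtuple("FakeStat", "st_size st_blocks")


def data_regions(data, block=BLOCK):
    """Scan block by block and return (offset, length) pairs for the
    runs of blocks that contain at least one nonzero byte."""
    regions = []
    for off in range(0, len(data), block):
        piece = data[off:off + block]
        if piece.count(0) == len(piece):
            continue  # all-zero block: treated as a hole
        if regions and regions[-1][0] + regions[-1][1] == off:
            # Adjacent to the previous run, so extend it.
            regions[-1] = (regions[-1][0], regions[-1][1] + len(piece))
        else:
            regions.append((off, len(piece)))
    return regions


def regions_with_shortcut(st, data):
    """Same scan, but trusting a zero block count and skipping the scan,
    as the ChangeLog entry describes."""
    if st.st_blocks == 0:
        return []  # file assumed to be entirely sparse
    return data_regions(data)
```

For a 35-byte file that the filesystem reports with `st_blocks == 0`, `regions_with_shortcut` returns no data regions even though `data_regions` finds one, so the archive would record the file as a single hole and extraction would produce only zero bytes.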
Created attachment 816928
fix --sparse on filesystems where small files may appear to have zero blocks
This patch reverts the shortcut added here to decide that a file is empty
if st_blocks is zero:
On some filesystems, such as NetApp, small files are stored in the inode and have st_blocks set to zero, so this test is not reliable.
Note: the ST_IS_SPARSE macro in lib/system.h will also give some false
positives for the same reason. I don't know if that matters...
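To check whether a given filesystem reports zero blocks for small non-empty files, something like the following Python sketch could be used. The function name `zero_block_nonempty_files` is made up for illustration, and `st_blocks` is only available on Unix-like systems.

```python
import os
import stat


def zero_block_nonempty_files(root):
    """Yield regular files under root with nonzero size but st_blocks == 0."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            if stat.S_ISREG(st.st_mode) and st.st_size > 0 and st.st_blocks == 0:
                yield path
```

A genuinely sparse file with no data blocks at all matches this test too, so each hit still needs its contents inspected (for example with od -c) before concluding that the filesystem stores small files in the inode.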
Thanks for the report here and upstream. I see it was already fixed there:
I'll backport that fix and submit a Bodhi update.
(In reply to Andrew J. Schorr from comment #4)
> Note: the ST_IS_SPARSE macro in lib/system.h will also give some false
> positives for the same reason. I don't know if that matters...
Sorry, I missed this note. That false positive should only result in the file
being dumped into the archive as-is, rather than being recognized and stored as
a sparse file. As this only concerns files smaller than 512 bytes, it should be OK.
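The definition of ST_IS_SPARSE is not quoted in this thread. The Python sketch below assumes it compares the allocated block count against the number of 512-byte blocks that the file size would need. `blocks_needed` and `looks_sparse` are hypothetical names for that arithmetic.

```python
BLOCK = 512  # assumed unit of st_blocks


def blocks_needed(size, block=BLOCK):
    """Number of blocks required to hold `size` bytes (ceiling division)."""
    return (size + block - 1) // block


def looks_sparse(size, allocated_blocks):
    """Assumed heuristic: fewer blocks allocated than the size requires."""
    return allocated_blocks < blocks_needed(size)
```

Under this assumption, the 35-byte sample file needs one block, so a filesystem that reports zero allocated blocks for it makes `looks_sparse(35, 0)` true, which is the false positive discussed above.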
tar-1.26-29.fc20 has been submitted as an update for Fedora 20.
tar-1.26-27.fc19 has been submitted as an update for Fedora 19.
Thanks for the prompt attention. I think the patch to ST_IS_SPARSE is probably
OK, although I'm not 100% confident that the code in sparse.c:sparse_scan_file
shouldn't be fixed as well. I guess if ST_IS_SPARSE is fixed, it may prevent
the code from ever getting there. So maybe fixing ST_IS_SPARSE is enough...
Package tar-1.26-27.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing tar-1.26-27.fc19'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
tar-1.26-27.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
tar-1.26-29.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.