Bug 1024095 - tar --sparse silently corrupts files on filesystems where non-empty files may have zero blocks
Summary: tar --sparse silently corrupts files on filesystems where non-empty files may...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: tar
Version: 19
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Ondrej Vasik
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1024268
 
Reported: 2013-10-28 19:35 UTC by Andrew J. Schorr
Modified: 2016-07-01 14:11 UTC (History)

Fixed In Version: tar-1.26-27.fc19
Clone Of:
: 1024268 (view as bug list)
Environment:
Last Closed: 2013-10-31 02:59:59 UTC
Type: Bug
Embargoed:


Attachments
sample file that tar --sparse does not archive correctly (35 bytes, application/x-gzip)
2013-10-28 19:35 UTC, Andrew J. Schorr
fix --sparse on filesystems where small files may appear to have zero blocks (555 bytes, patch)
2013-10-28 21:06 UTC, Andrew J. Schorr


Links
System: Red Hat Bugzilla
ID: 1352061
Private: 0
Priority: unspecified
Status: CLOSED
Summary: btrfs: stat reports the st_blocks with delay (data loss in archivers)
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1352061

Description Andrew J. Schorr 2013-10-28 19:35:53 UTC
Created attachment 816894
sample file that tar --sparse does not archive correctly

Description of problem: When I create tar archives with the --sparse flag, some files are corrupted silently.  I do not see this bug in 1.23, but it is present in 1.26 and in 1.27.


Version-Release number of selected component (if applicable):
tar-1.26-24.fc19.x86_64


How reproducible:
A sample file is attached.  Try the following:
bash-4.2$ tar --version
tar (GNU tar) 1.27
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.
bash-4.2$ mkdir out
bash-4.2$ tar --sparse -cf - tar.sparse.broken.file | tar -C out --sparse -xpBf -
bash-4.2$ ls -l tar.sparse.broken.file out/tar.sparse.broken.file 
-r--r--r-- 1 ajs    ead 35 Oct 28 15:12 out/tar.sparse.broken.file
-r--r--r-- 1 schorr ead 35 Oct 28 15:12 tar.sparse.broken.file
bash-4.2$ md5sum tar.sparse.broken.file out/tar.sparse.broken.file 
20b4497c7bdc00effbb5ad65d04a3bc3  tar.sparse.broken.file
c54104d7894a1941ca710981da437f9f  out/tar.sparse.broken.file
bash-4.2$ od -c tar.sparse.broken.file
0000000 037 213  \b  \b 274 243   u   Q 002 003   c   u   s   t   _   a
0000020   u   d   i   t   .   t   a   g  \0 003  \0  \0  \0  \0  \0  \0
0000040  \0  \0  \0
0000043
bash-4.2$ od -c out/tar.sparse.broken.file 
0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000040  \0  \0  \0
0000043



Steps to Reproduce:
1. create a tar archive of the attached file using the --sparse flag
2. extract the archive
3. note that the extracted file does not match the archived file

Actual results:
The files are different.  The extracted file contains only zero bytes.

Expected results:
The files should match.
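
The steps above can be scripted. What follows is a minimal sketch, not part of the original report: it pipes a file through the same tar --sparse create/extract pair used in the transcript and compares digests. The names `digest_of` and `sparse_roundtrip_matches` are hypothetical, SHA-256 stands in for md5sum, and a GNU tar binary named `tar` is assumed to be on PATH.

```python
import hashlib
import os
import subprocess
import tempfile


def digest_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def sparse_roundtrip_matches(path):
    """Archive `path` with tar --sparse, extract it elsewhere, compare digests.

    Hypothetical helper; assumes GNU tar is installed as `tar`.
    """
    src_dir, name = os.path.split(os.path.abspath(path))
    with tempfile.TemporaryDirectory() as out:
        # Same pipeline as the transcript: create to stdout, extract from stdin.
        create = subprocess.Popen(
            ["tar", "--sparse", "-cf", "-", name],
            cwd=src_dir, stdout=subprocess.PIPE)
        extract = subprocess.run(
            ["tar", "-C", out, "--sparse", "-xpBf", "-"],
            stdin=create.stdout)
        create.stdout.close()
        if create.wait() != 0 or extract.returncode != 0:
            raise RuntimeError("tar exited with an error")
        return digest_of(path) == digest_of(os.path.join(out, name))
```

On an affected filesystem, `sparse_roundtrip_matches("tar.sparse.broken.file")` would be expected to return False with the tar versions named in this report.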

Additional info:

Comment 1 Andrew J. Schorr 2013-10-28 20:25:48 UTC
This worked in 1.23, but not in 1.24.  I'd run git bisect except that I can't get the autotools to run properly.

Comment 2 Andrew J. Schorr 2013-10-28 20:31:14 UTC
I think I see the problem.  The ChangeLog says, in part:


2010-08-25  Paul Eggert  <eggert.edu>

        tar: optimize -c --sparse when file is entirely sparse
        * src/sparse.c (sparse_scan_file): If the file is entirely sparse,
        that is, if ST_NBLOCKS is zero, don't bother scanning for nonzero
        blocks.  Idea by Kit Westneat, communicated by Bernd Schubert in
        <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>.
        Also, omit unnecessary lseek at start of file.

On my Network Appliance fileserver, a small file may have zero blocks
even though it is not empty.  In other words, this patch is not correct
on some filesystems.  The bug is occurring only on my Netapp filesystem,
not on ext4.
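
To illustrate the failure mode, here is a minimal Python sketch (not tar's actual C code) of the shortcut the ChangeLog entry describes, applied to stat-like values. `FakeStat`, `shortcut_says_all_holes`, and `guarded_shortcut` are hypothetical names. The 512-byte guard is an assumption for illustration only, and it would not help on a filesystem that reports zero blocks for larger files that contain data.

```python
from collections import namedtuple

# Stand-in for the two stat fields of interest (hypothetical helper type).
FakeStat = namedtuple("FakeStat", ["st_size", "st_blocks"])


def shortcut_says_all_holes(st):
    """The shortcut as described: zero allocated blocks means nothing to scan."""
    return st.st_blocks == 0


def guarded_shortcut(st, block_size=512):
    """Illustrative guard (an assumption): ignore the hint for sub-block files."""
    return st.st_blocks == 0 and st.st_size >= block_size


# Values modelled on the report: a 35-byte file whose filesystem reports 0 blocks.
inline_file = FakeStat(st_size=35, st_blocks=0)
print(shortcut_says_all_holes(inline_file))  # True: the data would be skipped
print(guarded_shortcut(inline_file))         # False: the file is read normally
```

Both functions also accept a real `os.stat(path)` result on platforms that expose `st_blocks`, since they only read `st_size` and `st_blocks`.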

Comment 3 Andrew J. Schorr 2013-10-28 21:06:31 UTC
Created attachment 816928
fix --sparse on filesystems where small files may appear to have zero blocks

This patch reverts the shortcut, added in the commit below, that treats a file
as entirely sparse if st_blocks is zero:

http://git.savannah.gnu.org/cgit/tar.git/commit/?id=a9895fd20c957ce184091672f1623a5bedd82407

On some filesystems, such as NetApp, small files are stored in the inode and have st_blocks set to zero, so this test is not reliable.

Comment 4 Andrew J. Schorr 2013-10-28 21:38:32 UTC
Note: the ST_IS_SPARSE macro in lib/system.h will also give some false
positives for the same reason.  I don't know if that matters...
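
For context, a sparseness check of this kind typically compares the allocated block count against the number of blocks the file size would need. The sketch below is an assumption about the general shape of such a test, not a copy of the macro in lib/system.h. `appears_sparse` is a hypothetical name, and 512 bytes is assumed as the unit of st_blocks.

```python
def appears_sparse(st_size, st_blocks, unit=512):
    """Report a file as sparse if it has fewer blocks than its size requires."""
    blocks_needed = (st_size + unit - 1) // unit  # ceiling division
    return st_blocks < blocks_needed


# A 35-byte file stored inline with 0 reported blocks looks sparse (false positive).
print(appears_sparse(35, 0))  # True
# The same file with one 4 KiB block allocated (8 units of 512 bytes) does not.
print(appears_sparse(35, 8))  # False
```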

Comment 5 Pavel Raiskup 2013-10-29 07:36:37 UTC
Thanks for the report, and for also reporting it upstream.  I see it has already been fixed there:
http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00031.html

I'll backport that fix and submit a Bodhi update.

Comment 7 Pavel Raiskup 2013-10-29 09:37:52 UTC
(In reply to Andrew J. Schorr from comment #4)
> Note: the ST_IS_SPARSE macro in lib/system.h will also give some false
> positives for the same reason.  I don't know if that matters...

Sorry, I missed this note.  That false positive should result in the file being
dumped into the archive as-is, not recognized and stored as a sparse file.  Since
this only affects files smaller than 512 bytes, it should be OK.

Comment 8 Fedora Update System 2013-10-29 09:42:02 UTC
tar-1.26-29.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/tar-1.26-29.fc20

Comment 9 Fedora Update System 2013-10-29 09:54:30 UTC
tar-1.26-27.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/tar-1.26-27.fc19

Comment 10 Andrew J. Schorr 2013-10-29 12:12:48 UTC
Thanks for the prompt attention.  I think the patch to ST_IS_SPARSE is probably OK, although I'm not 100% confident that the code in sparse.c:sparse_scan_file
shouldn't be fixed as well.  I guess if ST_IS_SPARSE is fixed, it may prevent
the code from ever getting there.  So maybe fixing ST_IS_SPARSE is enough...

Comment 11 Fedora Update System 2013-10-30 01:50:06 UTC
Package tar-1.26-27.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing tar-1.26-27.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-20256/tar-1.26-27.fc19
then log in and leave karma (feedback).

Comment 12 Fedora Update System 2013-10-31 02:59:59 UTC
tar-1.26-27.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2013-11-10 06:11:36 UTC
tar-1.26-29.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

