Bug 1024095

Summary: tar --sparse silently corrupts files on filesystems where non-empty files may have zero blocks
Product: [Fedora] Fedora
Component: tar
Version: 19
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Andrew J. Schorr <aschorr>
Assignee: Ondrej Vasik <ovasik>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: kdudka, ovasik, praiskup, registros.it
Fixed In Version: tar-1.26-27.fc19
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-10-31 02:59:59 UTC
Clones: 1024268
Bug Blocks: 1024268
Attachments:
  sample file that tar --sparse does not archive correctly
  fix --sparse on filesystems where small files may appear to have zero blocks

Description Andrew J. Schorr 2013-10-28 19:35:53 UTC
Created attachment 816894
sample file that tar --sparse does not archive correctly

Description of problem: When I create tar archives with the --sparse flag, some files are corrupted silently.  I do not see this bug in 1.23, but it is present in 1.26 and in 1.27.


Version-Release number of selected component (if applicable):
tar-1.26-24.fc19.x86_64


How reproducible:
A sample file is attached.  Try the following:
bash-4.2$ tar --version
tar (GNU tar) 1.27
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.
bash-4.2$ mkdir out
bash-4.2$ tar --sparse -cf - tar.sparse.broken.file | tar -C out --sparse -xpBf -
bash-4.2$ ls -l tar.sparse.broken.file out/tar.sparse.broken.file 
-r--r--r-- 1 ajs    ead 35 Oct 28 15:12 out/tar.sparse.broken.file
-r--r--r-- 1 schorr ead 35 Oct 28 15:12 tar.sparse.broken.file
bash-4.2$ md5sum tar.sparse.broken.file out/tar.sparse.broken.file 
20b4497c7bdc00effbb5ad65d04a3bc3  tar.sparse.broken.file
c54104d7894a1941ca710981da437f9f  out/tar.sparse.broken.file
bash-4.2$ od -c tar.sparse.broken.file
0000000 037 213  \b  \b 274 243   u   Q 002 003   c   u   s   t   _   a
0000020   u   d   i   t   .   t   a   g  \0 003  \0  \0  \0  \0  \0  \0
0000040  \0  \0  \0
0000043
bash-4.2$ od -c out/tar.sparse.broken.file 
0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000040  \0  \0  \0
0000043



Steps to Reproduce:
1. create a tar archive of the attached file using the --sparse flag
2. extract the archive
3. note that the extracted file does not match the archived file

Actual results:
The files are different.  The extracted file contains only zero bytes.

Expected results:
The files should match.
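The three steps above could be scripted roughly as follows. This is only a sketch: the helper names md5_of and survives_sparse_round_trip are made up for illustration, it assumes GNU tar is on PATH, and the file being checked has to live on the affected filesystem for the problem to show up.

```python
import hashlib
import os
import subprocess
import tempfile


def md5_of(path):
    """Return the hex MD5 digest of a file's contents (same check as md5sum)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def survives_sparse_round_trip(path):
    """Archive `path` with `tar --sparse`, extract it, and compare digests.

    Hypothetical helper: returns True when the extracted copy matches.
    """
    directory, name = os.path.split(os.path.abspath(path))
    with tempfile.TemporaryDirectory() as scratch:
        archive = os.path.join(scratch, "check.tar")
        out_dir = os.path.join(scratch, "out")
        os.mkdir(out_dir)
        # Step 1: create the archive with --sparse.
        subprocess.run(
            ["tar", "--sparse", "-cf", archive, "-C", directory, name],
            check=True,
        )
        # Step 2: extract it into a separate directory.
        subprocess.run(
            ["tar", "--sparse", "-xpf", archive, "-C", out_dir],
            check=True,
        )
        # Step 3: compare the original with the extracted copy.
        return md5_of(path) == md5_of(os.path.join(out_dir, name))
```

On an unaffected filesystem such as ext4 the function would be expected to return True for the attached file; a False result corresponds to the mismatch shown in the md5sum output above.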

Additional info:

Comment 1 Andrew J. Schorr 2013-10-28 20:25:48 UTC
This worked in 1.23, but not in 1.24.  I'd run git bisect except that I can't get the autotools to run properly.

Comment 2 Andrew J. Schorr 2013-10-28 20:31:14 UTC
I think I see the problem.  The ChangeLog says, in part:


2010-08-25  Paul Eggert  <eggert.edu>

        tar: optimize -c --sparse when file is entirely sparse
        * src/sparse.c (sparse_scan_file): If the file is entirely sparse,
        that is, if ST_NBLOCKS is zero, don't bother scanning for nonzero
        blocks.  Idea by Kit Westneat, communicated by Bernd Schubert in
        <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>.
        Also, omit unnecessary lseek at start of file.

On my Network Appliance fileserver, a small file may have zero blocks
even though it is not empty.  In other words, this patch is not correct
on some filesystems.  The bug is occurring only on my Netapp filesystem,
not on ext4.
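To illustrate why the shortcut is unsafe, here is a small sketch of the condition it relies on. It is not tar's code; the function names are hypothetical, and the only inputs are the st_size and st_blocks fields that stat() reports.

```python
import os


def shortcut_says_entirely_sparse(st):
    """Mimic the ChangeLog shortcut: zero allocated blocks means 'all holes'."""
    return st.st_blocks == 0


def zero_blocks_but_nonempty(st):
    """True when the shortcut fires for a file whose size is not zero.

    Such a file is either genuinely all holes or, on a filesystem that
    keeps small files inside the inode, real data that would be lost.
    The stat fields alone cannot tell the two cases apart.
    """
    return st.st_size > 0 and shortcut_says_entirely_sparse(st)


def find_suspects(root):
    """Yield regular files under `root` that report zero blocks but a size."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if os.path.isfile(path) and zero_blocks_but_nonempty(st):
                yield path
```

Running find_suspects over a directory tree on the fileserver would list the files that need a closer look; on ext4 the list may well be empty unless the tree contains files made entirely of holes.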

Comment 3 Andrew J. Schorr 2013-10-28 21:06:31 UTC
Created attachment 816928
fix --sparse on filesystems where small files may appear to have zero blocks

This patch reverts the shortcut added here to decide that a file is empty
if st_blocks is zero:

http://git.savannah.gnu.org/cgit/tar.git/commit/?id=a9895fd20c957ce184091672f1623a5bedd82407

On some filesystems, such as NetApp, small files are stored in the inode itself and have st_blocks set to zero, so this test is not reliable.
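For comparison, a check that does not trust st_blocks has to look at the contents. The sketch below is not the attached patch (its contents are not reproduced in this report); it only shows the idea of scanning for nonzero data, with a hypothetical function name and an assumed 512-byte block size.

```python
def file_is_all_zero(path, block_size=512):
    """Return True only if every byte in the file is NUL.

    Reads the file block by block instead of relying on st_blocks,
    so inline-stored small files are classified by their real content.
    """
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                # Reached end of file without seeing a nonzero byte.
                return True
            if any(block):
                # At least one byte differs from NUL: real data present.
                return False
```

For the attached sample file this returns False, because its first bytes are nonzero, whereas the st_blocks test treats it as entirely sparse on the NetApp filesystem.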

Comment 4 Andrew J. Schorr 2013-10-28 21:38:32 UTC
Note: the ST_IS_SPARSE macro in lib/system.h will also give some false
positives for the same reason.  I don't know if that matters...

Comment 5 Pavel Raiskup 2013-10-29 07:36:37 UTC
Thanks for the report here and for reporting it upstream.  I see it has already been fixed there:
http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00031.html

I'll backport that fix and submit a Bodhi update.

Comment 7 Pavel Raiskup 2013-10-29 09:37:52 UTC
(In reply to Andrew J. Schorr from comment #4)
> Note: the ST_IS_SPARSE macro in lib/system.h will also give some false
> positives for the same reason.  I don't know if that matters...

Sorry, I missed this note.  That false positive should only result in the file
being dumped into the archive as-is, rather than being recognized and stored as
a sparse file.  As this only concerns files smaller than 512 bytes, it should be OK.
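The heuristic under discussion can be sketched like this. The macro's definition is not quoted in this report, so the form below is an assumption: a file "looks sparse" when it has fewer allocated blocks than its size would require, with a 512-byte block unit. The names are illustrative only.

```python
BLOCK_SIZE = 512  # assumed block unit for st_blocks


def blocks_needed(size, block_size=BLOCK_SIZE):
    """Number of blocks required to hold `size` bytes, rounding up."""
    return size // block_size + (size % block_size != 0)


def looks_sparse(size, allocated_blocks, block_size=BLOCK_SIZE):
    """Assumed form of the heuristic: fewer blocks allocated than needed."""
    return allocated_blocks < blocks_needed(size, block_size)
```

Under that assumption the 35-byte sample file with zero blocks gives looks_sparse(35, 0) == True, which is the false positive mentioned in comment #4, while the same file on a filesystem that allocates at least one block gives False.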

Comment 8 Fedora Update System 2013-10-29 09:42:02 UTC
tar-1.26-29.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/tar-1.26-29.fc20

Comment 9 Fedora Update System 2013-10-29 09:54:30 UTC
tar-1.26-27.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/tar-1.26-27.fc19

Comment 10 Andrew J. Schorr 2013-10-29 12:12:48 UTC
Thanks for the prompt attention.  I think the patch to ST_IS_SPARSE is probably OK, although I'm not 100% confident that the code in sparse.c:sparse_scan_file
shouldn't be fixed as well.  I guess if ST_IS_SPARSE is fixed, it may prevent
the code from ever getting there.  So maybe fixing ST_IS_SPARSE is enough...

Comment 11 Fedora Update System 2013-10-30 01:50:06 UTC
Package tar-1.26-27.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing tar-1.26-27.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-20256/tar-1.26-27.fc19
then log in and leave karma (feedback).

Comment 12 Fedora Update System 2013-10-31 02:59:59 UTC
tar-1.26-27.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2013-11-10 06:11:36 UTC
tar-1.26-29.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.