Bug 757557
Summary: | GNU tar -S eats data when storing files from btrfs
---|---
Product: | [Fedora] Fedora
Component: | tar
Version: | 16
Hardware: | x86_64
OS: | Linux
Status: | CLOSED UPSTREAM
Severity: | high
Priority: | unspecified
Reporter: | Michael Stahl <mstahl>
Assignee: | Ondrej Vasik <ovasik>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | dtardon, jbacik, kdudka, ovasik, praiskup
Doc Type: | Bug Fix
Last Closed: | 2012-04-20 18:07:19 UTC

Description
Michael Stahl
2011-11-27 19:15:20 UTC
don't know if this problem is in GNU tar or in btrfs...

This could be related to the following optimization:
http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html

Does stat return non-zero count of blocks for the file that causes problems?

very interesting find, Kamil!

a bit of googling finds this:
http://pubs.opengroup.org/onlinepubs/009604599/basedefs/sys/stat.h.html

blkcnt_t st_blocks Number of blocks allocated for this object.

AFAIK btrfs stores "small" files inside the metadata tree, so they take up 0 filesystem data blocks. so it is entirely plausible that this patch which you found is the reason why these files are mis-detected as entirely sparse files.

perhaps it could be fixed by handling files whose size is < 1 blocksize as not sparse? there aren't big savings in that case anyway...

or perhaps the better solution would be that btrfs stat reports 1 block allocated in this case?

sorry but i can't try anything out because the reason why i did the backup and why i actually verified it is that i replaced the btrfs on my laptop with ext4 because it was unusably slow, and right now i don't have a btrfs anywhere...

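For readers following along, here is a minimal Python sketch of the check under discussion. It is not tar's actual code. The function names, the 512-byte threshold and the sample stat values are assumptions made for illustration. The first function models "zero allocated blocks means entirely sparse". The second models the workaround suggested above, which never treats files below a minimum size as sparse.

```python
from types import SimpleNamespace

# Assumption: st_blocks counts 512-byte units, as is conventional on Linux.
BLOCK_SIZE = 512


def zero_block_heuristic(st):
    """Treat a non-empty file that reports no allocated blocks as entirely sparse."""
    return st.st_size > 0 and st.st_blocks == 0


def guarded_heuristic(st, min_size=BLOCK_SIZE):
    """Same check, but never classify files smaller than min_size as sparse."""
    return st.st_size >= min_size and st.st_blocks == 0


# Hypothetical stat results, modelled on the cases discussed in this bug.
small_inline = SimpleNamespace(st_size=6, st_blocks=0)      # 6 bytes of data, 0 blocks reported
one_big_hole = SimpleNamespace(st_size=1 << 20, st_blocks=0)  # 1 MiB file that really is one hole
ordinary = SimpleNamespace(st_size=6, st_blocks=8)          # 6 bytes of data, 8 blocks reported

for name, st in [("small_inline", small_inline),
                 ("one_big_hole", one_big_hole),
                 ("ordinary", ordinary)]:
    print(name, zero_block_heuristic(st), guarded_heuristic(st))
```

With the unguarded check, the 6-byte file is classified as entirely sparse even though it holds data, which is the failure described in this report. The guarded variant avoids that for small files, but only helps if every file stored without data blocks is below the chosen threshold.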
dtardon->kdudka: I tested that (on a newly created, loop-mounted btrfs filesystem; I am not crazy enough to use btrfs on my machine .-), with the following results:

echo hello > hello.txt
stat hello.txt
File: `hello.txt'
Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: 29h/41d Inode: 259 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 501/ dtardon) Gid: ( 501/ dtardon)
Access: 2011-12-13 06:25:36.928032479 +0100
Modify: 2011-12-13 06:26:41.111524638 +0100
Change: 2011-12-13 06:26:41.111524638 +0100
Birth: -

vim hello.txt # edit & save
stat hello.txt
File: `hello.txt'
Size: 6 Blocks: 0 IO Block: 4096 regular file
Device: 29h/41d Inode: 262 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 501/ dtardon) Gid: ( 501/ dtardon)
Access: 2011-12-13 06:32:51.581716486 +0100
Modify: 2011-12-13 06:32:51.581716486 +0100
Change: 2011-12-13 06:32:51.581716486 +0100
Birth: -

(In reply to comment #4)
> stat hello.txt
> File: `hello.txt'
> Size: 6 Blocks: 0 IO Block: 4096 regular file

Thank you for testing it. The above confirms that files with zero blocks but non-zero size may appear on btrfs. Those would be mistakenly detected as sparse files by tar -S.

(In reply to comment #3)
> http://pubs.opengroup.org/onlinepubs/009604599/basedefs/sys/stat.h.html
> blkcnt_t st_blocks Number of blocks allocated for this object.
>
> AFAIK btrfs stores "small" files inside the metadata tree,
> so they take up 0 filesystem data blocks.
> so it is entirely plausible that this patch which you found
> is the reason why these files are mis-detected as entirely sparse files.

Thank you for the pointer and the explanation.

> perhaps it could be fixed by handling files whose size
> is < 1 blocksize as not sparse?
> there aren't big savings in that case anyway...

This would mean reverting the aforementioned optimization patch, if I am not mistaken.

> or perhaps the better solution would be that btrfs stat
> reports 1 block allocated in this case?

This would solve the problem without decreasing the performance, which sounds even better. We should probably notify the btrfs guys about this issue.

Josef, is there a way to address the issue in btrfs such that it does not return a zero block count for files with non-zero data inside?

Created attachment 546359
patch to fix the problem

Yup, sorry about that. We were just doing bytes >> 9 for blocks, which doesn't work out so well if bytes < 512. So this should fix it to always say 1 block for something that's less than 512 bytes. Please verify this fixes the problem for you.
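
The arithmetic described in the comment above can be sketched as follows. This is an illustration in Python, not the attached kernel patch. The function names are made up for the example, and rounding up to the next 512-byte unit is only one assumed way to get "1 block for something that's less than 512 bytes".

```python
SECTOR = 512  # st_blocks is reported in 512-byte units, hence the shift by 9


def blocks_truncating(nbytes):
    """Block count as described for the old behaviour: plain bytes >> 9."""
    return nbytes >> 9


def blocks_rounded_up(nbytes):
    """Round up instead, so any non-empty file reports at least one block."""
    return (nbytes + SECTOR - 1) // SECTOR


for nbytes in (0, 6, 512, 513):
    print(nbytes, blocks_truncating(nbytes), blocks_rounded_up(nbytes))
```

For the 6-byte hello.txt from the earlier test, the truncating version gives 0 blocks and the rounded-up version gives 1. An empty file gives 0 in both cases.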
As this seems to be stalled: Josef, was this already applied to the F16 kernel? Should I reassign it to you and the kernel component? I assume there will be no change in tar required once the btrfs behaviour of stat->blocks is fixed.

This was fixed upstream with fadc0d8be4dfca80f6c568bc5874931893c6709b. I assume it's in the f16 kernel.
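
To check whether a given system still shows the symptom from this report, a script along these lines can list regular files that contain data but report zero blocks. It is a hedged sketch. The function name is hypothetical, and an empty result on one directory tree does not prove the kernel fix is present.

```python
import os
import stat


def zero_block_files(root):
    """Yield (path, size) for regular files under root with data but st_blocks == 0."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is not accessible; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size > 0 and st.st_blocks == 0:
                yield path, st.st_size
```

For example, `for path, size in zero_block_files("/mnt/btrfs-test"): print(size, path)` would print the candidates, where `/mnt/btrfs-test` is a placeholder mount point. Files that really are one large hole are also listed, so small sizes are the interesting entries.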