Bug 656834

Summary:

tar --sparse fails to restore file names longer than 100 chars from PAX header

Product:

Red Hat Enterprise Linux 6

Reporter:

Bernd Schubert <bschubert>

Component:

tar

Assignee:

Pavel Raiskup <praiskup>

Status:

CLOSED ERRATA

QA Contact:

Branislav Blaškovič <bblaskov>

Severity:

medium

Priority:

low

Version:

6.0

CC:

azelinka, bblaskov, bernd.schubert, kdudka, mishu, Team-Lustre.Internal

Keywords:

Patch

Hardware:

Unspecified

OS:

Unspecified

Fixed In Version:

tar-1.23-4.el6

Doc Type:

Bug Fix

Last Closed:

2012-06-20 13:49:13 UTC

Attachments:

Description	Flags
reproducer	none

Description Bernd Schubert 2010-11-24 09:31:46 UTC

Description of problem:

http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00095.html

From my last post:

I thought about the patch overnight and I think there is a combination of
different problems:

- The patch does not ensure in start_header() that only a single block is used,
so large EAs will cause a block overflow.

- pax_dump_header_1() does not check that only a single block is used either.
For complex files with many separate sparse blocks, and therefore a large
sparse map, that should also be a problem.

- Maybe pax_dump_header_1() should pad the block with zeros? dump_regular_file()
does that.

- Extracting the archive fails silently and leaves GNUSparseFile.* files
around. Does it not even detect that those are sparse files?
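
The block arithmetic behind the first two points can be sketched in shell. This is only an illustration, not code taken from tar. It assumes the standard pax extended header record layout, "LENGTH KEYWORD=VALUE\n", where LENGTH counts the whole record including its own digits, and it assumes 512-byte tar blocks. The function names and the keyword in the usage note are made up for this example.

```sh
# Length of one pax extended header record "LEN KEY=VALUE\n".
# LEN includes its own digits, so iterate until the value is stable.
# Assumes single-byte characters (${#var} counts characters, not bytes).
pax_record_len() {
    key=$1
    value=$2
    body=$(( ${#key} + ${#value} + 3 ))   # space, '=', and newline
    len=$(( body + 1 ))                   # first guess: one length digit
    while :; do
        total=$(( body + ${#len} ))
        [ "$total" -eq "$len" ] && break
        len=$total
    done
    echo "$len"
}

# Number of 512-byte blocks needed to hold the given number of bytes.
blocks_needed() {
    echo $(( ($1 + 511) / 512 ))
}
```

For example, with value=$(printf '%600s' '' | tr ' ' x), the call blocks_needed "$(pax_record_len example.keyword "$value")" reports more than one block, because a record carrying a 600-byte value cannot fit into 512 bytes. Whether tar accounts for that correctly is exactly the question raised above.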

Version-Release number of selected component (if applicable):

How reproducible:

We have an ext3 (ldiskfs) formatted block device image. That image is a Lustre MDT filled with sparse files, and almost all files have large EA maps.

Steps to Reproduce:
1. Mount that image
2. tar cfS path/to/mdt.tar --xattrs
3. cd $somewhere; tar xfS path/to/mdt.tar --xattrs
4. find mdt/ -name '*GNUSparseFile*'
mdt/ROOT/tmp/matthew/data-dispenser-test/visit-2/processing/b1/c1/GNUSparseFile.1439064
mdt/ROOT/tmp/matthew/data-dispenser-test/visit-2/processing/b1/c3/GNUSparseFile.1439064
[...]

Actual results:

Expected results:

root@rhel5-nfs@vhost2:/scratch/bernd/diamond/ok_without_xattr# find mdt/ -name '*GNUSparseFile*' | wc -l
0

Additional info:

IMHO, xattr data should NOT be stored in the header; instead, separate blocks should be used, with careful accounting. As that will break the existing .el6 (and .el5?) tar versions, I'm not sure how to proceed.

Comment 2 Kamil Dudka 2010-11-24 09:45:31 UTC

Please provide a reliable reproducer we can run on default RHEL-6 installation, preferably as a self-contained shell script.  Thanks in advance!

Comment 3 Bernd Schubert 2010-11-24 10:14:55 UTC

The only option I can offer you is a 27GB disk image. Sorry, no other way. It is already bz2 compressed, but using lzma does not help either. 

Also, I really do not think it is required to verify this bug on your own when it is already clear from the design that the current --xattr is broken.

Comment 4 Kamil Dudka 2010-11-24 10:47:15 UTC

(In reply to comment #3)
> The only option I can offer you is a 27GB disk image. Sorry, no other way. It
> is already bz2 compressed, but using lzma does not help either.

We can't analyze data images of your customers anyway.

> Also, I really do not think it is required to verify this bug on your own, when

It _is_ required to verify each bug.  We can't make random changes to our packages without confirming we are really fixing a bug and not breaking anything else.

> it is already clear from the design that the current --xattr is broken.

It may be clear to you.  Then show us a _minimal_ instance of that problem.  Even if there was a design flaw, it is not supposed to be fixed in Enterprise Linux unless it breaks something we support.

Comment 5 Bernd Schubert 2010-11-24 11:09:45 UTC

(In reply to comment #4)
> (In reply to comment #3)
> > The only option I can offer you is a 27GB disk image. Sorry, no other way. It
> > is already bz2 compressed, but using lzma does not help either.
> 
> We can't analyze data images of your customers anyway.

This is a mountable disk image.

> 
> > Also, I really do not think it is required to verify this bug on your own, when
> 
> It _is_ required to verify each bug.  We can't make random changes to our
> packages without confirming we are really fixing a bug and not breaking
> anything else.
> 
> > it is already clear from the design that the current --xattr is broken.
> 
> It may be clear to you.  Then show us a _minimal_ instance of that problem. 
> Even if there was a design flaw, it is not supposed to be fixed in Enterprise
> Linux unless it breaks something we support.

So you offer tar with --xattr support, which clearly creates corrupted tar files. Even worse, unless you specifically check for GNUSparseFile.* files, you will not even notice.
I really would not complain here if you had not included an --xattr patch of your own. But you decided to do so, and that new feature causes problems.

As I said, I can offer you an image allowing you to *easily* reproduce the issue. I probably also can create a 10GB script, which also will reproduce the issue.

So what do you expect from me? I already showed you that the current code is broken, as it does not check whether the block size is exceeded. I can offer you a way to reproduce it, although it needs an image.
So honestly, what else do you need?

And yes, I can fix this bug on my own, but that will break existing tar --xattr support, as you decided to implement a bad design.
So I'm here to discuss with you how to proceed and not to argue about your support.


Thanks,
Bernd

Comment 6 Kamil Dudka 2010-11-24 11:37:29 UTC

(In reply to comment #5)
> As I said, I can offer you an image allowing you to *easily* reproduce the
> issue. I probably also can create a 10GB script, which also will reproduce the
> issue.

How did you create the image?

Is there any confidential data?

Sparse files are easy to create on demand.  Please consider using dd(1) or truncate(1).  You can create a bunch of sparse files, then set their attributes by setfattr(1).
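
A minimal sketch of that suggestion follows. It assumes GNU coreutils truncate(1) and, optionally, setfattr(1) from the attr package. The function name, the file naming scheme and the 10M size are made up for illustration, and the user.idx attribute is just an example name. Setting user.* attributes only works on a filesystem mounted with xattr support, so a failure there is reported but not treated as fatal.

```sh
# Create COUNT fully sparse 10 MiB files in DIR and tag each with an xattr.
make_sparse_tree() {
    dir=$1
    count=${2:-8}
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        f="$dir/file$i"
        # truncate extends the file without writing data, so it stays sparse
        truncate -s 10M "$f" || return 1
        if command -v setfattr >/dev/null 2>&1; then
            setfattr -n user.idx -v "$i" "$f" 2>/dev/null ||
                echo "warning: could not set xattr on $f" >&2
        fi
        i=$((i + 1))
    done
}
```

Running make_sparse_tree /tmp/sparse-demo 4 and then comparing du -s with du -s --apparent-size on that directory shows whether the files really are sparse.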

> So what do you expect from me? I already showed you that the current code is
> broken, as it does not check whether the block size is exceeded.

So far you didn't.

Comment 7 Bernd Schubert 2010-11-24 15:31:13 UTC

Created attachment 462670 [details]
reproducer

Extract that archive, then

1) su -
2) cd path/to/reproduce
3) ./reproduce.sh

=> Enjoy the corrupted files

Comment 8 Bernd Schubert 2010-11-24 15:34:53 UTC

(In reply to comment #6)
> (In reply to comment #5)
> > As I said, I can offer you an image allowing you to *easily* reproduce the
> > issue. I probably also can create a 10GB script, which also will reproduce the
> > issue.
> 
> How did you create the image?

I did not create it; our customer did. It is a simple dd of a Lustre MDT device.

> 
> Is there any confidential data?

Those are Lustre metadata + customer ACLs. So entirely empty sparse files + Lustre EAs + customer ACLs.

Anyway, I was able to boil this down to a rather small subset.

> 
> Sparse files are easy to create on demand.  Please consider using dd(1) or
> truncate(1).  You can create a bunch of sparse files, then set their attributes
> by setfattr(1).
> 
> > So what do you expect from me? I already showed you that the current code is
> > broken, as it does not check whether the block size is exceeded.
> 
> So far you didn't.

In comment 1: 

- The patch does not make sure a single block is used only in start_header(), 
  so with large EAs, it will cause a block overflow. 

Do you see a check there, that protects against block overflow? At least I don't.


Thanks,
Bernd

Comment 9 Bernd Schubert 2010-11-24 15:37:17 UTC

(In reply to comment #7)
> Created attachment 462670 [details]
> reproducer
> 
> Extract that archive, then
> 
> 1) su -
> 2) cd path/to/reproduce
> 3) ./reproduce.sh
> 
> => Enjoy the corrupted files

Ah, sorry, you will need to replace 


TAR=~bernd/bin/tar-1.23-3.el6

by 

TAR=/bin/tar

in reproduce.sh

Comment 10 Kamil Dudka 2010-11-24 19:03:05 UTC

Thanks for the reproducer.  I am able to get those GNUSparseFile.* files.  That seems to be, however, completely unrelated to our xattr patch.  You will get the same behavior with the upstream tar, only replace --xattr by --posix.  The only addition in our tar is that we provide --xattr, which implies --posix.  That's it.  I'll have a look what's going on there.

Comment 11 Bernd Schubert 2010-11-24 19:07:20 UTC

Thanks for looking into it. Well, as I wrote in my mail to the gnu-tar list, pax_dump_header_1() does not have a size check either. And "--sparse-version=0.0" seems to be a workaround.

Comment 12 Kamil Dudka 2010-11-24 22:02:12 UTC

What size check are you actually talking about?

The reproducer is as easy as:

$ NAME=`seq 60 | tr -d '\n'`
$ truncate -s 10M $NAME
$ tar -c --sparse --posix $NAME | tar t
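
The three commands above can be wrapped into a self-checking sketch that compares the listed member name with the original one. The function name, the TAR environment variable and the temporary directory handling are assumptions added for illustration. The explicit "-f -" only spells out that the archive goes through the pipe. Any extra arguments are passed to the creating tar, so an option such as --sparse-version=0.0 can be tried the same way.

```sh
# Exit status 0: the listing shows the original long name.
# Exit status 1: it shows something else. Exit status 2: setup failed.
check_long_sparse_name() {
    tar_bin=${TAR:-tar}
    workdir=$(mktemp -d) || return 2
    (
        cd "$workdir" || exit 2
        name=$(seq 60 | tr -d '\n')        # 111 characters, over 100
        truncate -s 10M "$name" || exit 2  # fully sparse file
        listed=$("$tar_bin" -c --sparse --posix "$@" -f - "$name" |
                 "$tar_bin" -t -f -)
        [ "$listed" = "$name" ]
    )
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

A call such as check_long_sparse_name && echo "name preserved" || echo "name mangled" then gives a quick verdict for whichever tar binary TAR points to.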

Comment 13 Kamil Dudka 2010-11-24 23:31:19 UTC

raised upstream:

http://article.gmane.org/gmane.comp.gnu.tar.bugs/4152

Comment 14 Kamil Dudka 2010-12-08 19:22:19 UTC

upstream patch available:

http://article.gmane.org/gmane.comp.gnu.tar.bugs/4160

The patch has not yet reached the upstream repo, but I believe it will appear there soon.

Comment 15 Kamil Dudka 2010-12-08 19:34:50 UTC

a fixed package available for rawhide Fedora:

tar-1.25-3.fc15

Comment 16 Kamil Dudka 2010-12-14 13:56:56 UTC

pushed upstream:

http://git.savannah.gnu.org/cgit/tar.git/commit/?id=bb0af96

Comment 17 RHEL Program Management 2011-01-07 15:31:31 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 18 Bernd Schubert 2011-06-06 15:02:43 UTC

Hello Kamil,

thanks for your help and sorry for my late reply. I opened this bugzilla shortly before leaving DDN and was busy with too many other things at that time.

I'm now working on FhGFS and just wanted to use tar to copy a few million files on our test cluster (rsync is not usable either).
And now comes the IMPORTANT advice for everyone reading this ticket: DO NOT USE tar-1.26-1.fc16.src.rpm; it is *ENTIRELY* broken with respect to xattrs. I have not tested tar-1.25-3.fc15 yet, though. At this point it was easier to apply the patch to the RHEL5 version than to work with the FC releases...
Here are strace excerpts showing what is broken:


newfstatat(4, "702B31-4DEC2D19-fslab2-f", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(4, "2ED83-4DEC00C0-fslab2-f", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
fgetxattr(0, "system.posix_acl_access", 0x7ffff4525790, 132) = -1 EOPNOTSUPP (Operation not supported)
fgetxattr(0, "security.selinux", 0xd4c580, 255) = -1 EOPNOTSUPP (Operation not supported)
flistxattr(0, 0xd33ce0, 1024)           = 0

==> FD=0 is STDIN! I double checked that in /proc/$PID/fd
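
The /proc check mentioned here can be sketched as a small helper. It assumes a Linux /proc filesystem and readlink(1); the function name is made up for illustration.

```sh
# Print what file descriptor FD of process PID currently refers to.
# Usage: fd_target PID FD
fd_target() {
    readlink "/proc/$1/fd/$2"
}
```

For a running tar process, fd_target "$PID" 0 prints the terminal, pipe or file behind descriptor 0, which is how one can tell that the xattr calls above were issued against standard input rather than the archived file.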

Here is what it looks like in the patched and working RHEL6 version:

stat("fhgfs_meta/entries/14A9/1C224E-4DEC065B-fslab2-f", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
lgetxattr("fhgfs_meta/entries/14A9/1C224E-4DEC065B-fslab2-f", "security.selinux", 0xf71070, 255) = -1 ENODATA (No data available)
llistxattr("fhgfs_meta/entries/14A9/1C224E-4DEC065B-fslab2-f", 0xf65220, 1024) = 16
lgetxattr("fhgfs_meta/entries/14A9/1C224E-4DEC065B-fslab2-f", "user.fhgfs_file", "\x01\x00\x00\x00\x00\x00\x00\x00[\x06\xecM\x00\x00\x00\x00[\x06\xecM\x00\x00\x00\x00[\x06\xecM\x00\x00\x00\x00[\x06\xecM\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x06\x16\x00\x00R\x08\x00\x00\x80\x81\x00\x00\x16\x00\x00\x001C224E-4DEC065B-fslab2\x00a\x15\x00\x00\x001043B-4DEC0064-fslab2\x00ab\x06\x00\x00\x00fslab2\x00a`\x00\x00\x00\x01\x00\x00\x00\x00\x00\x08\x00P\x00\x00\x00\x04\x00\x00\x001-4DEBFA85-fslab1\x002-4DEBFA88-fslab1\x003-4DEBFA8B-fslab1\x004-4DEBFA8E-fslab1\x00\x04\x00\x00", 1024) = 232
write(1, "fhgfs_meta/entries/14A9/1C224E-4"..., 49) = 49


I leave it to someone else to open a bug report for the fc16 version.

Comment 19 Kamil Dudka 2011-06-06 15:56:26 UTC

(In reply to comment #18)
> I leave it to someone else to open a bugreport for the fc16 version.

Bernd, we would be happy to open the bug and fix it afterwards.  Could you please at least tell us which arguments you gave to tar in order to trigger the bug?

Comment 20 Bernd Schubert 2011-06-06 16:05:42 UTC

Thanks for your quick reply! Here it is (from strace):


execve("/root/tar", ["/root/tar", "--xattrs", "-cvf", "test.tar", "fhgfs_meta/entries/14A9/"], [/* 25 vars */]) = 0
brk(0)                                  = 0xd2c000

So as command:

/root/tar --xattrs -cvf test.tar fhgfs_meta/entries/14A9/

I had simply copied the FC16 tar version over to /root/tar (note: I compiled and used it under Debian, as my simple-to-boot diskless test environment is based on that, but I made sure xattr support is properly built in, of course).

./fhgfs_meta/entries/14A9/ is a directory with lots of 0-byte files, but with EAs set.

Thanks,
Bernd

Comment 21 Kamil Dudka 2011-06-07 12:06:30 UTC

Bernd, here comes my unsuccessful attempt to reproduce the issue with fc16 tar:

$ rpm -q tar libattr
tar-1.26-1.fc14.x86_64
libattr-2.4.44-6.fc14.x86_64

$ cd /tmp
$ mkdir -p fhgfs_meta/entries/14A9/
$ (cd fhgfs_meta/entries/14A9/ && for i in $(seq 256); do dd if=/dev/zero of=$i bs=1048576 skip=1023 count=1 && setfattr -n user.idx -v $i $i; done)

$ strace -e trace=fgetxattr tar --xattrs -cvf test.tar fhgfs_meta/entries/14A9/
...
fgetxattr(5, "system.posix_acl_access", 0x7fffa5602d60, 132) = -1 ENODATA (No data available)
fgetxattr(5, "security.selinux", "unconfined_u:object_r:user_tmp_t:s0", 255) = 36
fgetxattr(5, "security.selinux", "unconfined_u:object_r:user_tmp_t:s0", 1024) = 36
fgetxattr(5, "user.idx", "20", 1024)    = 2
fhgfs_meta/entries/14A9/20
...

There seems to be nothing entirely broken at first glance.  If you have a reliable reproducer, please open the bug yourself.  For me it is hard to fix unless I am able to reproduce the issue first.

Comment 22 RHEL Program Management 2011-07-05 23:43:34 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 31 errata-xmlrpc 2012-06-20 13:49:13 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0849.html