Bug 1111454 - creating symlinks generates errors on stripe volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: stripe
Version: 3.5.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ravishankar N
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: glusterfs-3.5.2
 
Reported: 2014-06-20 05:39 UTC by Ravishankar N
Modified: 2014-07-31 11:43 UTC (History)

Fixed In Version: glusterfs-3.5.2beta1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-31 11:43:09 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Ravishankar N 2014-06-20 05:39:02 UTC
Description of problem:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=751888

================================================================
From: Matteo Checcucci <matteo.checcucci.unifi.it>
To: Debian Bug Tracking System <submit.org>
Subject: glusterfs-server: creating symlinks generates errors
Date: Tue, 17 Jun 2014 15:29:55 +0200

Package: glusterfs-server
Version: 3.5.0-1
Severity: grave
Justification: causes non-serious data loss

Dear Maintainer,
after upgrading to version 3.5.0-1 and rebooting the systems, I found
out that creating symlinks on a glusterfs partition generates errors.

I have glusterfs partitions (type: stripe, connection: IP over
InfiniBand) built with bricks on host1, host2, host3, ... and mounted on
master, host1, host2, host3, ...

For instance:

Volume Name: scratch1-8
Type: Stripe
Volume ID: 85c89403-0fc5-43cf-8fde-a8fa2322e6e6
Status: Started
Number of Bricks: 1 x 8 = 8
Transport-type: tcp
Bricks:
Brick1: host1:/data/brick1
Brick2: host2:/data/brick1
Brick3: host3:/data/brick1
Brick4: host4:/data/brick1
Brick5: host5:/data/brick1
Brick6: host6:/data/brick1
Brick7: host7:/data/brick1
Brick8: host8:/data/brick1


If I create a symlink inside one glusterfs partition from master
like this:

  $ cd /storage/scratch1-8/directory
  $ touch foo
  $ ln -s foo my_link

then, my_link may be correctly followed with some commands (such as
cat, readlink, ...), but generates errors with other commands (such as
ls -l, cp -a, ...).
An especially troublesome consequence is that if I

  $ cp -a /storage/scratch1-8/directory ~/tmp/

the directory and file `foo' are copied, but not `my_link'.
This causes non-serious data loss, as soon as the original directory
is discarded.

Downgrading to version 3.4.2-1 and rebooting the systems makes
the issue vanish.

Please investigate this bug.

Thanks in advance for any help.
================================================================
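
The symptom in the report above (`ls -l` and `cp -a` fail on a link that `cat` can still follow) can be checked per directory with a small probe. This is only a sketch: the report does not say which system call fails, so the assumption here is that the errors surface from `lstat` or `readlink`, which are the calls `ls -l` makes on a symlink. The function name `probe_entries` is made up for illustration.

```python
import os
import stat


def probe_entries(directory):
    """Return (name, call, errno) for each entry whose lstat or readlink fails.

    Assumption: the errors seen with `ls -l` come from one of these two calls.
    """
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            st = os.lstat(path)  # does not follow symlinks
        except OSError as exc:
            failures.append((name, "lstat", exc.errno))
            continue
        if stat.S_ISLNK(st.st_mode):
            try:
                os.readlink(path)
            except OSError as exc:
                failures.append((name, "readlink", exc.errno))
    return failures
```

On a healthy filesystem the returned list is empty; on an affected mount one would expect `my_link` to show up with an error number such as ESTALE, though that has not been verified here.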

Comment 1 Ravishankar N 2014-06-20 09:07:37 UTC
On a stripe volume, symlinks are created only on the first brick (unlike hard links, which are created on all bricks). Lookups from the mount are sent to all bricks, and the bricks where the symlink is absent send back ESTALE (http://review.gluster.org/#/c/6318/). The stripe translator treats those ESTALE replies as lookup failures.

From mount log:
[2014-06-19 13:59:32.506315] D [stripe.c:193:stripe_lookup_cbk] 0-testvol-stripe-0: testvol-client-1 returned error Stale file handle
[2014-06-19 13:59:32.506666] D [stripe.c:193:stripe_lookup_cbk] 0-testvol-stripe-0: testvol-client-2 returned error Stale file handle
[2014-06-19 13:59:32.506895] D [stripe.c:193:stripe_lookup_cbk] 0-testvol-stripe-0: testvol-client-3 returned error Stale file handle
[2014-06-19 13:59:32.506986] D [dht-layout.c:679:dht_layout_normalize] (-->/usr/local/lib/glusterfs/3.5qa2/xlator/cluster/stripe.so(stripe_lookup_cbk+0x89e) [0x7f1c98e99cda] (-->/usr/local/lib/glusterfs/3.5qa2/xlator/cluster/distribute.so(dht_discover_cbk+0x6a6) [0x7f1c98c4c5f5] (-->/usr/local/lib/glusterfs/3.5qa2/xlator/cluster/distribute.so(dht_discover_complete+0x278) [0x7f1c98c4b929]))) 0-testvol-dht: path=/dir/slink0 err=Stale file handle on subvol=testvol-stripe-0
[2014-06-19 13:59:32.507033] W [fuse-resolve.c:147:fuse_resolve_gfid_cbk] 0-fuse: d9c3dc43-31c6-4797-bbc8-f7af7e187701: failed to resolve (Stale file handle)
[2014-06-19 13:59:32.507059] E [fuse-bridge.c:1393:fuse_readlink_resume] 0-glusterfs-fuse: READLINK 103 (d9c3dc43-31c6-4797-bbc8-f7af7e187701) resolution failed
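
To see at a glance which client subvolumes answered a lookup with ESTALE, the debug lines above can be tallied with a short script. This is a sketch that assumes the log format is exactly the one shown in this comment; the regular expression and the helper name `stale_clients` are illustrative and not part of GlusterFS.

```python
import re
from collections import Counter

# Matches the stripe_lookup_cbk debug lines shown in the mount log above.
LOOKUP_ERR = re.compile(
    r"\[stripe\.c:\d+:stripe_lookup_cbk\] \S+: "
    r"(?P<client>\S+) returned error (?P<msg>.+)$"
)


def stale_clients(log_lines):
    """Count 'Stale file handle' lookup replies per client subvolume."""
    counts = Counter()
    for line in log_lines:
        match = LOOKUP_ERR.search(line.strip())
        if match and match.group("msg") == "Stale file handle":
            counts[match.group("client")] += 1
    return counts
```

Fed the log excerpt above, the counter holds one entry each for testvol-client-1, testvol-client-2 and testvol-client-3, and none for the first brick's client, which matches the explanation that only the first brick has the symlink.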

Comment 2 Anand Avati 2014-06-20 13:02:07 UTC
REVIEW: http://review.gluster.org/8135 (cluster/stripe: don't treat ESTALE as failure in lookup) posted (#1) for review on master by Ravishankar N (ravishankar)

Comment 3 Anand Avati 2014-06-23 07:48:26 UTC
COMMIT: http://review.gluster.org/8135 committed in master by Vijay Bellur (vbellur) 
------
commit 1e4a046828ea11cb4c7738a2a00fb715f84dc1ff
Author: Ravishankar N <root@ravi3.(none)>
Date:   Thu Jun 19 17:41:25 2014 +0000

    cluster/stripe: don't treat ESTALE as failure in lookup
    
    Problem:
    In a stripe volume, symlinks are created only on the first brick via the
    default_symlink() call. During gfid lookup, server sends ESTALE from the other
    bricks, which is treated as error in stripe_lookup_cbk()
    
    Fix:
    Don't treat ESTALE as error in stripe_lookup_cbk()
    
    Change-Id: Ie4ac8f0dfd3e61260161620bdc53665882e7adbd
    BUG: 1111454
    Signed-off-by: Ravishankar N <root@ravi3.(none)>
    Reviewed-on: http://review.gluster.org/8135
    Reviewed-by: Raghavendra Bhat <raghavendra>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
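
The effect of the change described in the commit message can be illustrated with a toy model of how per-brick lookup replies are combined. This is not the GlusterFS code: `stripe_lookup_ok` and the `tolerated` parameter are hypothetical names, and the real `stripe_lookup_cbk()` handles more cases than this sketch does. It only shows why ignoring ESTALE lets a lookup succeed when the symlink exists on the first brick alone.

```python
import errno


def stripe_lookup_ok(replies, tolerated=frozenset()):
    """Combine per-brick lookup replies given as (brick, errno) pairs.

    errno 0 means the brick found the entry. Any errno not listed in
    `tolerated` fails the whole lookup (simplified model, see lead-in).
    """
    found = False
    for _brick, err in replies:
        if err == 0:
            found = True
        elif err not in tolerated:
            return False
    return found


# Symlink present on the first brick only; the others reply ESTALE.
replies = [("brick1", 0)] + [(f"brick{i}", errno.ESTALE) for i in range(2, 5)]

before_fix = stripe_lookup_ok(replies)                            # ESTALE is an error
after_fix = stripe_lookup_ok(replies, tolerated={errno.ESTALE})   # ESTALE is ignored
print(before_fix, after_fix)  # prints "False True"
```

The model still requires at least one brick to return success, so a symlink that exists nowhere is not reported as found.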

Comment 4 Anand Avati 2014-06-23 09:28:45 UTC
REVIEW: http://review.gluster.org/8153 (cluster/stripe: don't treat ESTALE as failure in lookup) posted (#1) for review on release-3.5 by Ravishankar N (ravishankar)

Comment 5 Anand Avati 2014-06-24 16:58:23 UTC
COMMIT: http://review.gluster.org/8153 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit f5b203fbaba2c4179c126f5db82cc89569ae1697
Author: Ravishankar N <root@ravi3.(none)>
Date:   Thu Jun 19 17:41:25 2014 +0000

    cluster/stripe: don't treat ESTALE as failure in lookup
    
    Backport of: http://review.gluster.org/8135
    
    Problem:
    In a stripe volume, symlinks are created only on the first brick via the
    default_symlink() call. During gfid lookup, server sends ESTALE from the other
    bricks, which is treated as error in stripe_lookup_cbk()
    
    Fix:
    Don't treat ESTALE as error in stripe_lookup_cbk()
    
    BUG: 1111454
    Change-Id: I337ef847f007b7c20feb365da329c79c121d20c4
    Signed-off-by: Ravishankar N <ravishankar>
    Reviewed-on: http://review.gluster.org/8153
    Reviewed-by: Raghavendra Bhat <raghavendra>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Niels de Vos <ndevos>

Comment 6 Niels de Vos 2014-07-21 15:41:38 UTC
The first (and last?) Beta for GlusterFS 3.5.2 has been released [1]. Please verify if the release solves this bug report for you. In case the glusterfs-3.5.2beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-July/041636.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 7 Niels de Vos 2014-07-31 11:43:09 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report.

glusterfs-3.5.2 has been announced on the Gluster Users mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

