Bug 1534848

Summary: entries not getting cleared post healing of softlinks (stale entries showing up in heal info)
Product: [Community] GlusterFS
Reporter: Ravishankar N <ravishankar>
Component: disperse
Assignee: Ravishankar N <ravishankar>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.10
CC: aspandey, bugs, nchilaka, pkarampu, ravishankar, rhs-bugs, storage-qa-internal
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.10.10
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1529488
Environment:
Last Closed: 2018-02-02 14:15:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1527309, 1529488
Bug Blocks: 1534842, 1534847

Description Ravishankar N 2018-01-16 05:01:46 UTC
+++ This bug was initially created as a clone of Bug #1529488 +++

+++ This bug was initially created as a clone of Bug #1527309 +++

Description of problem:
======================
On an EC volume, stale heal-info entries for softlinks are not getting cleared even after healing is complete.
[root@dhcp35-192 ecv]# gluster v heal ecv full
Launching heal operation to perform full self heal on volume ecv has been successful 
Use heal info commands to check status
[root@dhcp35-192 ecv]# gluster v heal ecv  info
Brick dhcp35-192.lab.eng.blr.redhat.com:/rhs/brick2/ecv
/var/run 
/var/lock 
/var/mail 
Status: Connected
Number of entries: 3

Brick dhcp35-214.lab.eng.blr.redhat.com:/rhs/brick2/ecv
/var/run 
/var/lock 
/var/mail 
Status: Connected
Number of entries: 3

Brick dhcp35-215.lab.eng.blr.redhat.com:/rhs/brick2/ecv
Status: Connected
Number of entries: 0


[root@dhcp35-214 ecv]# ls /rhs/brick2/ecv/var/ -lh
total 8.0K
drwxr-xr-x.  2 root root    6 Dec 19 12:45 adm
drwxr-xr-x.  5 root root   44 Dec 19 12:46 cache
drwxr-xr-x.  2 root root    6 Dec 19 12:46 crash
drwxr-xr-x.  3 root root   34 Dec 19 12:46 db
drwxr-xr-x.  3 root root   18 Dec 19 12:46 empty
drwxr-xr-x.  2 root root    6 Dec 19 12:46 games
drwxr-xr-x.  2 root root    6 Dec 19 12:46 gopher
drwxr-xr-x.  3 root root   18 Dec 19 12:46 kerberos
drwxr-xr-x. 26 root root 4.0K Dec 19 12:45 lib
drwxr-xr-x.  2 root root    6 Dec 19 12:46 local
lrwxrwxrwx.  2 root root   11 Dec 19 12:45 lock -> ../run/lock
drwxr-xr-x.  9 root root 4.0K Dec 19 12:45 log
lrwxrwxrwx.  2 root root   10 Dec 19 12:46 mail -> spool/mail
drwxr-xr-x.  2 root root    6 Dec 19 12:46 nis
drwxr-xr-x.  2 root root    6 Dec 19 12:46 opt
drwxr-xr-x.  2 root root    6 Dec 19 12:46 preserve
lrwxrwxrwx.  2 root root    6 Dec 19 12:45 run -> ../run
drwxr-xr-x. 10 root root  114 Dec 19 12:46 spool
drwxr-xr-t.  3 root root   85 Dec 19 12:45 tmp
drwxr-xr-x.  2 root root    6 Dec 19 12:46 yp
[root@dhcp35-214 ecv]# 
 

Version-Release number of selected component (if applicable):
[root@dhcp35-78 ~]# rpm -qa|grep gluster
glusterfs-rdma-3.12.2-1.el7rhgs.x86_64
glusterfs-server-3.12.2-1.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-3.12.2-1.el7rhgs.x86_64
glusterfs-libs-3.12.2-1.el7rhgs.x86_64
glusterfs-fuse-3.12.2-1.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-1.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-api-3.12.2-1.el7rhgs.x86_64
python2-gluster-3.12.2-1.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-1.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.9.0-1.el7.x86_64
glusterfs-cli-3.12.2-1.el7rhgs.x86_64
[root@dhcp35-78 ~]# 


How reproducible:
================
2/2

Steps to Reproduce:
1. Create a 4+2 (disperse) EC volume and mount it.
2. Copy /var to the mount point.
3. From the backend, delete the var directory on one of the bricks.
4. Run ls -lRt on the mount.
5. Issue a heal command to heal the files (a reproducer sketch follows the list).
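
A minimal reproducer sketch for the steps above (host names, brick paths, and the mount point are placeholders, not the ones from this report):

# 1. Create and start a 4+2 disperse volume (6 bricks: 4 data + 2 redundancy), then mount it
gluster volume create ecv disperse 6 redundancy 2 \
    host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/b3 \
    host4:/bricks/b4 host5:/bricks/b5 host6:/bricks/b6
gluster volume start ecv
mount -t glusterfs host1:/ecv /mnt/ecv

# 2. Copy /var onto the volume
cp -a /var /mnt/ecv/

# 3. Delete the copied directory directly on one brick (run on that brick's host)
rm -rf /bricks/b1/var

# 4-5. Stat everything from the mount, then trigger heal and check heal info
ls -lRt /mnt/ecv > /dev/null
gluster volume heal ecv
gluster volume heal ecv info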


Actual results:
=============
All files got healed except the three entries below, which kept showing up in heal info irrespective of how many times heal was triggered.
All of the leftover entries were softlinks (an inspection sketch follows the listings below).
[root@dhcp35-192 ecv]# gluster v heal ecv  info
Brick dhcp35-192.lab.eng.blr.redhat.com:/rhs/brick2/ecv
/var/run 
/var/lock 
/var/mail 
Status: Connected
Number of entries: 3

Brick dhcp35-214.lab.eng.blr.redhat.com:/rhs/brick2/ecv
/var/run 
/var/lock 
/var/mail 
Status: Connected
Number of entries: 3

Brick dhcp35-215.lab.eng.blr.redhat.com:/rhs/brick2/ecv
Status: Connected
Number of entries: 0

[root@dhcp35-214 ecv]# ls /rhs/brick2/ecv/var/ -lh
total 8.0K
drwxr-xr-x.  2 root root    6 Dec 19 12:45 adm
drwxr-xr-x.  5 root root   44 Dec 19 12:46 cache
drwxr-xr-x.  2 root root    6 Dec 19 12:46 crash
drwxr-xr-x.  3 root root   34 Dec 19 12:46 db
drwxr-xr-x.  3 root root   18 Dec 19 12:46 empty
drwxr-xr-x.  2 root root    6 Dec 19 12:46 games
drwxr-xr-x.  2 root root    6 Dec 19 12:46 gopher
drwxr-xr-x.  3 root root   18 Dec 19 12:46 kerberos
drwxr-xr-x. 26 root root 4.0K Dec 19 12:45 lib
drwxr-xr-x.  2 root root    6 Dec 19 12:46 local
lrwxrwxrwx.  2 root root   11 Dec 19 12:45 lock -> ../run/lock
drwxr-xr-x.  9 root root 4.0K Dec 19 12:45 log
lrwxrwxrwx.  2 root root   10 Dec 19 12:46 mail -> spool/mail
drwxr-xr-x.  2 root root    6 Dec 19 12:46 nis
drwxr-xr-x.  2 root root    6 Dec 19 12:46 opt
drwxr-xr-x.  2 root root    6 Dec 19 12:46 preserve
lrwxrwxrwx.  2 root root    6 Dec 19 12:45 run -> ../run
drwxr-xr-x. 10 root root  114 Dec 19 12:46 spool
drwxr-xr-t.  3 root root   85 Dec 19 12:45 tmp
drwxr-xr-x.  2 root root    6 Dec 19 12:46 yp
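
To see where these heal-info entries come from on a brick, one can inspect the pending-heal index and resolve its gfid-named entries; the brick path below is taken from the listings above, the gfid is a placeholder, and the layout assumed is the standard .glusterfs one:

# heal info is driven by gfid-named index entries kept on each brick
ls /rhs/brick2/ecv/.glusterfs/indices/xattrop

# a gfid from that listing resolves to its handle under .glusterfs
# (the first two and next two characters of the gfid form the subdirectories)
GFID=<gfid-from-the-listing-above>    # placeholder
ls -l /rhs/brick2/ecv/.glusterfs/${GFID:0:2}/${GFID:2:2}/${GFID}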


--- Additional comment from Ashish Pandey on 2017-12-26 00:55:08 EST ---

upstream patch -
https://review.gluster.org/#/c/19070/

--- Additional comment from Worker Ant on 2017-12-28 06:05:03 EST ---

REVIEW: https://review.gluster.org/19070 (posix: delete stale gfid handles in nameless lookup) posted (#2) for review on master by Ravishankar N

--- Additional comment from Worker Ant on 2018-01-15 22:45:32 EST ---

COMMIT: https://review.gluster.org/19070 committed in master by "Ravishankar N" <ravishankar> with a commit message- posix: delete stale gfid handles in nameless lookup

..in order for self-heal of symlinks to work properly (see BZ for
details).

Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501
BUG: 1529488
Signed-off-by: Ravishankar N <ravishankar>
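
For context, a hedged illustration of the "stale gfid handle" this patch deals with (brick path and gfid are placeholders; the .glusterfs layout assumed is the standard one). In the brick listings above the symlinks show a link count of 2 because their second hard link is the gfid handle under .glusterfs; when the symlink itself is deleted from the brick backend, that handle can be left behind, so a gfid-only (nameless) lookup still succeeds and self-heal cannot recreate the entry. The patch makes the posix xlator delete such stale handles during the nameless lookup:

# on a healthy brick, the symlink and its gfid handle are two links to one inode
stat -c '%h %N' /rhs/brick2/ecv/var/run              # link count 2 expected

# after deleting the symlink directly on the brick, only the handle remains
GFID=<gfid-of-the-symlink>                           # placeholder
stat -c '%h %N' /rhs/brick2/ecv/.glusterfs/${GFID:0:2}/${GFID:2:2}/${GFID}
# a link count of 1 here indicates a stale handle, which the fix cleans up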

Comment 1 Worker Ant 2018-01-16 05:02:52 UTC
REVIEW: https://review.gluster.org/19201 (posix: delete stale gfid handles in nameless lookup) posted (#1) for review on release-3.10 by Ravishankar N

Comment 2 Worker Ant 2018-01-18 18:22:00 UTC
COMMIT: https://review.gluster.org/19201 committed in release-3.10 by "Ravishankar N" <ravishankar> with a commit message- posix: delete stale gfid handles in nameless lookup

..in order for self-heal of symlinks to work properly (see BZ for
details).

Backport of https://review.gluster.org/#/c/19070/
Signed-off-by: Ravishankar N <ravishankar>

Change-Id: I9a011d00b07a690446f7fd3589e96f840e8b7501
BUG: 1534848

Comment 3 Shyamsundar 2018-02-02 14:15:24 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.10, please open a new bug report.

glusterfs-3.10.10 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-February/000090.html
[2] https://www.gluster.org/pipermail/gluster-users/