Bug 1155549

Summary: DHT + rebalance + rename: file is not accessible ("cannot access <filename>: No such file or directory" error) after multiple renames and rebalances
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rachana Patel <racpatel>
Component: distribute
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED DEFERRED
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: high
Docs Contact:
Priority: unspecified
Version: 2.1
CC: mzywusko, nbalacha, rgowdapp, spalai, vagarwal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: dht-file-access
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1286060
Environment:
Last Closed: 2015-11-27 10:27:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1286060    

Description Rachana Patel 2014-10-22 10:57:36 UTC
Description of problem:
=======================
After multiple renames and rebalances, a lookup on the parent directory lists the file, but any access to that file from either mount (FUSE or NFS) fails with a "No such file or directory" error.
This happens for only a few files, and those files are present on the backend.

[root@snapshot09 nfs1]# ls /mnt/nfs1/mb20-83
ls: cannot access /mnt/nfs1/mb20-83: No such file or directory
[root@snapshot09 nfs1]# ls /mnt/man1/mb20-83
ls: cannot access /mnt/man1/mb20-83: No such file or directory


Version-Release number of selected component (if applicable):
=============================================================


How reproducible:
=================
always


Steps to Reproduce:
===================
1. Create, start, and mount a distributed volume.
2. Create a few files inside it.
3. Start renaming files from multiple mounts and also start a rebalance.
4. Repeat step 3 a few times.
5. Verify all files.

For a few files, a lookup on the parent directory lists the file, but the file itself cannot be accessed:

[root@snapshot09 nfs1]# ls  /mnt/man1/
again     mb14-201  mb18-201  mb3-201  mb6-201     zero11-149  zero16-168  zero19-201  zero23-1  zero3-101  zero37-1   zero8-140
mb10-128  mb14-7    mb19-114  mb4-128  mb7-134     zero11-201  zero16-201  zero19-41   zero24-1  zero31-1   zero38-1   zero8-201
mb10-149  mb15-169  mb19-119  mb4-141  mb7-201     zero12-200  zero16-67   zero20-101  zero25-1  zero3-110  zero39-1   zero9-112
mb11-111  mb15-196  mb19-201  mb5-112  mb8-201     zero12-201  zero17-166  zero20-201  zero26-1  zero32-1   zero40-1   zero9-200
mb1-201   mb15-42   mb20-112  mb5-138  mb9-149     zero13-200  zero17-201  zero21-1    zero27-1  zero33-1   zero4-201  zero9-201
mb12-201  mb16-201  mb20-201  mb5-178  zero10-201  zero13-201  zero18-200  zero2-154   zero28-1  zero34-1   zero5-201
mb13-122  mb17-201  mb20-83   mb5-201  zero1-110   zero14-201  zero19-127  zero2-169   zero29-1  zero35-1   zero6-201
mb13-40   mb17-98   mb2-201   mb6-176  zero11-116  zero15-127  zero19-200  zero22-1    zero30-1  zero36-1   zero7-201

Now try to access the file:

[root@snapshot09 nfs1]# stat /mnt/nfs1/mb20-83
stat: cannot stat `/mnt/nfs1/mb20-83': No such file or directory
[root@snapshot09 nfs1]# stat /mnt/man1/mb20-83
stat: cannot stat `/mnt/man1/mb20-83': No such file or directory
[root@snapshot09 nfs1]# ls /mnt/nfs1/zero19-201
ls: cannot access /mnt/nfs1/zero19-201: No such file or directory
[root@snapshot09 nfs1]# ls /mnt/nfs1/mb15-169:
ls: cannot access /mnt/nfs1/mb15-169:: No such file or directory
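
The check in step 5 could be automated along the following lines. This is only a minimal sketch, not the procedure used for this report: it lists a directory and calls stat() on every entry, collecting names that the listing returns but that fail with "No such file or directory". The function name find_inaccessible is hypothetical.

```python
import os


def find_inaccessible(mount_dir):
    """Return names that the directory listing shows but stat() cannot find."""
    broken = []
    for name in sorted(os.listdir(mount_dir)):
        path = os.path.join(mount_dir, name)
        try:
            # Same access that `stat <file>` performs from the shell.
            os.stat(path)
        except FileNotFoundError:
            # Listed by the parent directory, but ENOENT on direct access.
            broken.append(name)
    return broken
```

Calling find_inaccessible("/mnt/man1") and find_inaccessible("/mnt/nfs1") after each rename and rebalance round would report entries such as mb20-83 that are listed but not accessible.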


Backend:
Verified that the file is present on the backend:

[root@snapshot09 ~]#  ls -l /rhs/brick1/m*/mb20-83 -i
33554945 -rw-r--r-- 8 root root 0 Oct 17 14:22 /rhs/brick1/man1/mb20-83



Actual results:
==============
Access to the file fails with a "No such file or directory" error.

Expected results:
===============
All files should be accessible from the mount.


Additional info:
================
Log snippet

NFS:

[2014-10-22 18:36:39.956704] I [dht-layout.c:645:dht_layout_normalize] 0-manhoos-dht: found anomalies in <gfid:3de423ac-8d98-434a-b3fb-93d1ed76ecb0>. holes=1 overlaps=0 missing=5 down=0 misc=0
[2014-10-22 18:36:39.956843] E [nfs3.c:768:nfs3_getattr_resume] 0-nfs-nfsv3: No such file or directory: (10.70.44.62:881) manhoos : 3de423ac-8d98-434a-b3fb-93d1ed76ecb0
[2014-10-22 18:36:39.956879] W [nfs3-helpers.c:3409:nfs3_log_common_res] 0-nfs-nfsv3: XID: 521919a9, GETATTR: NFS: 2(No such file or directory), POSIX: 14(Bad address)
[2014-10-22 18:36:39.959123] I [dht-layout.c:645:dht_layout_normalize] 0-manhoos-dht: found anomalies in <gfid:3de423ac-8d98-434a-b3fb-93d1ed76ecb0>. holes=1 overlaps=0 missing=5 down=0 misc=0
[2014-10-22 18:36:39.959262] E [nfs3.c:768:nfs3_getattr_resume] 0-nfs-nfsv3: No such file or directory: (10.70.44.62:881) manhoos : 3de423ac-8d98-434a-b3fb-93d1ed76ecb0
[2014-10-22 18:36:39.959296] W [nfs3-helpers.c:3409:nfs3_log_common_res] 0-nfs-nfsv3: XID: 531919a9, GETATTR: NFS: 2(No such file or directory), POSIX: 14(Bad address)

Comment 4 Rachana Patel 2014-10-22 11:41:08 UTC
version:-
========
3.4.0.69rhs-1.el6rhs.x86_64

Comment 5 Rachana Patel 2014-10-22 18:27:25 UTC
Log snippet from the FUSE mount:

[2014-10-22 03:21:32.502200] E [fuse-bridge.c:1170:fuse_getattr_resume] 0-glusterfs-fuse: 248540: GETATTR 139972615252928 (ea6ce7e5-75cd-49f8-be86-e8c8916bae4d) resolution failed
[2014-10-22 03:24:33.738853] E [fuse-bridge.c:1170:fuse_getattr_resume] 0-glusterfs-fuse: 248696: GETATTR 139972615252928 (ea6ce7e5-75cd-49f8-be86-e8c8916bae4d) resolution failed



[2014-10-23 02:27:19.894256] E [fuse-bridge.c:1170:fuse_getattr_resume] 0-glusterfs-fuse: 249010: GETATTR 139972615252460 (e6a83a9f-c06d-4f60-93fb-24ec2df9bcfa) resolution failed
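
To see which GFIDs fail most often across the NFS and FUSE logs, the error lines quoted in this report can be tallied with a short script. This is a sketch under stated assumptions: it only recognizes the two functions that appear in the snippets here (nfs3_getattr_resume and fuse_getattr_resume) and takes the first GFID-shaped string on each such line. The helper name failed_gfids is hypothetical.

```python
import re
from collections import Counter

# Functions that log the failed GETATTR in the snippets quoted in this report.
FAILING_FUNCS = ("nfs3_getattr_resume", "fuse_getattr_resume")

# GFIDs are printed as lowercase, hyphenated UUID strings.
GFID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def failed_gfids(lines):
    """Count failed GETATTR log lines per GFID."""
    counts = Counter()
    for line in lines:
        if any(func in line for func in FAILING_FUNCS):
            match = GFID_RE.search(line)
            if match:
                counts[match.group(0)] += 1
    return counts
```

Passing an open log file to failed_gfids (for example, the NFS server log or the FUSE mount log) returns a Counter keyed by GFID. Lines from dht_layout_normalize are informational and are deliberately not counted.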

Comment 6 Raghavendra G 2014-11-06 06:23:25 UTC
Most likely a duplicate of bz 1157680?

Comment 8 Susant Kumar Palai 2015-11-27 10:27:15 UTC
Cloning this bug in 3.1. To be fixed in a future release.