Bug 856459 - DHT- lookup (stat, ls) on directory fails and gives error, 'invalid argument' if hashed sub-volume is down
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Venky Shankar
QA Contact: amainkar
URL:
Whiteboard:
Duplicates: 819444 863114
Depends On:
Blocks: 864269
 
Reported: 2012-09-12 05:11 UTC by Rachana Patel
Modified: 2015-04-20 11:56 UTC (History)

Fixed In Version: glusterfs-3.4.0.4rhs-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-23 22:33:24 UTC
Embargoed:


Attachments:
mount-log (205.46 KB, text/x-log)
2012-09-12 05:27 UTC, Rachana Patel

Description Rachana Patel 2012-09-12 05:11:09 UTC
Description of problem:
DHT: lookup (stat, ls) on a directory fails with the error 'Invalid argument' if the hashed sub-volume is down.

Version-Release number of selected component (if applicable):


How reproducible:
not always

Steps to Reproduce:
1. Create a distributed volume with 3 or more sub-volumes across multiple servers, and start the volume.

2. FUSE-mount the volume from client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>".

3. From the mount point, create some directories and files.

4. Find the hash value of a directory name and identify its hashed sub-volume.

From test-client-1:
[]# getfattr -n trusted.glusterfs.dht -e hex /home/test
getfattr: Removing leading '/' from absolute path names
# file: home/test
trusted.glusterfs.dht=0x00000001000000000000000055555554

The hash value for d10 is 0x4b423224, which falls within the range 0x00000000 to 0x55555554 reported by the xattr above,

so it should hash to sub-volume 'test-client-1'.

5. Bring the hashed sub-volume down.

[root@Rhs3 test]# gluster volume status test
Status of volume: test
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick XX1:/home/test				24009	Y	11463
Brick XX2:/home/test				24009	N	16287
Brick XX3:/home/test				24009	Y	12116
NFS Server on localhost				38467	Y	16293
NFS Server on XXX				38467	Y	16005
NFS Server on xxx				38467	Y	15892

6. Run ls or stat from the mount point.

[root@a mnt]# ls
ls: cannot access d1: Invalid argument
ls: cannot access d3: Invalid argument
ls: cannot access d8: Invalid argument
ls: cannot access d10: Invalid argument
ls: cannot access d11: Invalid argument
ls: cannot access d18: Invalid argument
ls: cannot access d19: Invalid argument
ls: cannot access d20: Invalid argument
ls: cannot access d101: Invalid argument
ls: cannot access d105: Invalid argument
ls: cannot access d106: Invalid argument
d1   d101  d103  d105  d107  d11   d111  d113  d115  d117  d12  d14  d16  d18  d2   d3  d5  d7  d9
d10  d102  d104  d106  d109  d110  d112  d114  d116  d118  d13  d15  d17  d19  d20  d4  d6  d8  data1
[root@a mnt]# stat d1
stat: cannot stat `d1': Invalid argument
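
The range check in step 4 can be sketched in a few lines of Python. This is only an illustration: it assumes the trusted.glusterfs.dht value is four big-endian 32-bit words with the last two holding the start and stop of the hash range, which matches the value shown in step 4 but is not confirmed by this report. The helper names parse_dht_xattr and in_range are hypothetical.

```python
import struct


def parse_dht_xattr(hex_value):
    """Return (start, stop) from a trusted.glusterfs.dht hex string.

    Assumption: the value is four big-endian 32-bit words and the
    last two words are the start and stop of the hash range.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    _word0, _word1, start, stop = struct.unpack(">4I", raw)
    return start, stop


def in_range(hash_value, start, stop):
    """True if hash_value lies in the inclusive range [start, stop]."""
    return start <= hash_value <= stop


# Values copied from step 4 of this report.
start, stop = parse_dht_xattr("0x00000001000000000000000055555554")
print(hex(start), hex(stop), in_range(0x4B423224, start, stop))
```

Running the same check against the xattr from each brick would show which sub-volume, if any, owns a given hash.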

Actual results:
Lookup fails with 'Invalid argument'.

Expected results:
Even though the hashed sub-volume is down, if another sub-volume has the files/directories, the lookup should list them rather than returning 'Invalid argument'.


Additional info:

This error occurs whenever the layout start and stop values of the hashed sub-volume are 0,0, as shown below.

[root@a mnt]# getfattr -m . -n trusted.glusterfs.pathinfo /home/racpatel/mnt
getfattr: Removing leading '/' from absolute path names
# file: home/racpatel/mnt
trusted.glusterfs.pathinfo="((<DISTRIBUTE:test-dht> <POSIX(/home/test):Rhs2:/home/test/>) (test-dht-layout (test-client-2 1431655765 2863311529) (test-client-0 2863311530 4294967295) (test-client-1 0 0)))"
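
The layout in the pathinfo value above can be inspected with a short script. The sketch below is an assumption-laden illustration: it treats every "(name start stop)" triple as one sub-volume range, and treats a 0 0 range as "no range assigned", following the note above. The function names are hypothetical and not part of any Gluster tool.

```python
import re

# pathinfo value copied from the getfattr output in this report.
PATHINFO = (
    "((<DISTRIBUTE:test-dht> <POSIX(/home/test):Rhs2:/home/test/>) "
    "(test-dht-layout (test-client-2 1431655765 2863311529) "
    "(test-client-0 2863311530 4294967295) (test-client-1 0 0)))"
)


def parse_layout(pathinfo):
    """Return {subvol: (start, stop)} for each '(name start stop)' triple."""
    triples = re.findall(r"\(([\w-]+) (\d+) (\d+)\)", pathinfo)
    return {name: (int(start), int(stop)) for name, start, stop in triples}


def empty_subvols(layout):
    """Sub-volumes whose range is 0 0 (assumed to mean 'no range assigned')."""
    return [name for name, rng in layout.items() if rng == (0, 0)]


def owner(layout, hash_value):
    """Sub-volume whose non-empty range contains hash_value, else None."""
    for name, (start, stop) in layout.items():
        if (start, stop) != (0, 0) and start <= hash_value <= stop:
            return name
    return None


layout = parse_layout(PATHINFO)
print(empty_subvols(layout))
print(owner(layout, 0x4B423224))
```

With this layout, the hash 0x4b423224 from the reproduction steps is not covered by any non-empty range, which is consistent with the lookup having no usable hashed sub-volume.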

Comment 2 Rachana Patel 2012-09-12 05:27:51 UTC
Created attachment 611992 [details]
mount-log

Comment 3 shishir gowda 2012-10-09 14:26:12 UTC
*** Bug 860915 has been marked as a duplicate of this bug. ***

Comment 4 Amar Tumballi 2012-10-11 08:34:18 UTC
http://review.gluster.org/4046 should fix the behavior

Comment 5 Rachana Patel 2012-10-11 08:58:42 UTC
The 'Invalid argument' error has also been observed in other operations such as chown, chgrp, chmod, and mkdir (bug 860915) under the same preconditions. All of these operations should be tested when verifying this defect.

Comment 6 Vijay Bellur 2012-10-17 07:16:16 UTC
CHANGE: http://review.gluster.org/4046 (cluster/dht: ignore empty ->hashed_subvol during lookup) merged in master by Anand Avati (avati)
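
The behavior change named in the patch title can be pictured with a toy model. This is not the actual patch and does not use any GlusterFS API: lookup_directory, its arguments, and the ignore_empty_hashed switch are all hypothetical, and the "old" branch only mimics the 'Invalid argument' symptom described in this report.

```python
import errno


def lookup_directory(name, hashed_subvol, subvols, ignore_empty_hashed=True):
    """Toy directory lookup across sub-volumes.

    subvols maps a sub-volume name to the set of entries it holds,
    or to None if that sub-volume is down. hashed_subvol is None when
    no sub-volume owns the hash of `name`.
    """
    if hashed_subvol is None and not ignore_empty_hashed:
        # Models the reported symptom: the lookup gives up immediately.
        raise OSError(errno.EINVAL, "Invalid argument")
    # Models the intended behavior: ask every sub-volume that is up.
    return sorted(
        subvol
        for subvol, entries in subvols.items()
        if entries is not None and name in entries
    )


subvols = {
    "test-client-0": {"d1", "d10"},
    "test-client-1": None,  # hashed sub-volume is down
    "test-client-2": {"d1", "d10"},
}
print(lookup_directory("d1", None, subvols))
```

In this model the directory is still found on the sub-volumes that are up, which corresponds to the expected results stated in the description.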

Comment 7 Venky Shankar 2012-10-17 08:57:45 UTC
*** Bug 863114 has been marked as a duplicate of this bug. ***

Comment 8 shishir gowda 2012-10-17 12:24:42 UTC
*** Bug 819444 has been marked as a duplicate of this bug. ***

Comment 9 Rachana Patel 2013-03-19 10:47:52 UTC
Bug 860915 is reproducible with 3.3.0.6rhs-4.el6rhs.x86_64.

FUSE mount:
[root@rhsauto037 test1]# mkdir d98
mkdir: cannot create directory `d98': Invalid argument


NFS mount:
[root@rhsauto037 test2]# mkdir d98
mkdir: cannot create directory `d98': Input/output error

[root@cutlass ~]# gluster v status 64-fuse
Status of volume: 64-fuse
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick fred.lab.eng.blr.redhat.com:/brick1/6.4-fuse	24025	N	7262
Brick fan.lab.eng.blr.redhat.com:/brick1/6.4-fuse	24020	Y	19383
Brick mia.lab.eng.blr.redhat.com:/brick1/6.4-fuse	24017	Y	27197
NFS Server on localhost					38467	Y	32556
NFS Server on fred.lab.eng.blr.redhat.com		38467	Y	7268
NFS Server on mia.lab.eng.blr.redhat.com		38467	Y	19066
NFS Server on 10.70.34.91				38467	Y	193

Comment 11 Rachana Patel 2013-03-19 14:10:01 UTC
Defect 856459 is reproducible with 3.3.0.6rhs-4.el6rhs.x86_64.

On an NFS mount:
[root@rhsauto037 new1]# stat d6
stat: cannot stat `d6': Invalid argument

Comment 12 Rachana Patel 2013-05-08 07:17:53 UTC
Bug 860915 is reproducible with 3.3.4.0.4rhs-1.el6rhs.x86_64.

On an NFS mount:
For a directory:
mkdir: cannot create directory `11': Input/output error

For a file:
dd: opening `df59': Input/output error

Hence reopening the defect.

Comment 14 Rachana Patel 2013-05-09 09:56:04 UTC
Verified with 3.3.4.0.4rhs-1.el6rhs.x86_64.
Not able to reproduce 819444, 863114, or this defect (856459).

Changing the status to verified. Bug 860915 is always reproducible, so removing it as a duplicate of this bug.

Comment 15 Scott Haines 2013-09-23 22:33:24 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

