Bug 762747 (GLUSTER-1015)

Summary: ls returns "File descriptor in bad state" after server down/up
Product: [Community] GlusterFS
Reporter: Vikas Gorur <vikas>
Component: replicate
Assignee: Pavan Vilas Sondur <pavan>
Status: CLOSED NOTABUG
Severity: medium
Priority: high
Version: 3.0.4
CC: fharshav, garrettwp, gluster-bugs
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Attachments:
 Client TRACE log (flags: none)

Description Vikas Gorur 2010-06-21 19:37:47 UTC
Created attachment 235
XFree86 Xserver output for 1024x768 16bpp

Comment 1 Vikas Gorur 2010-06-21 20:41:41 UTC
Issue can be fixed by specifying

 option strict-readdir on

to the replicate volume.
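
For illustration, a minimal sketch of where the option would go in a client volfile. The volume name "replicate" and the subvolume names "client1" and "client2" are placeholders, not taken from the reporter's setup:

```
# Hypothetical client volfile fragment; volume and subvolume names are assumed.
volume replicate
  type cluster/replicate
  option strict-readdir on
  subvolumes client1 client2
end-volume
```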

Comment 2 Vikas Gorur 2010-06-21 21:15:42 UTC
Behavior is expected. Option "strict-readdir" fixes it. Perhaps strict-readdir should be the default?

Comment 3 Harshavardhana 2010-06-21 21:17:11 UTC
(In reply to comment #2)
> Behavior is expected. Option "strict-readdir" fixes it. Perhaps strict-readdir
> should be the default?

Isn't keeping this on a performance bottleneck?

Comment 4 Vikas Gorur 2010-06-21 22:36:56 UTC
Setup is GlusterFS 3.0.4, replicate with two subvolumes, no performance translators.

The bug was seen when the following sequence of operations was performed:

# cd /mnt/glusterfs/test1

Kill glusterfsd on server1

# mkdir test2
# cd test2
# echo "hi" > file

Start glusterfsd on server1

# ls
.: File descriptor in bad state

Also, self-heal was not triggered on the test2 directory until a "cd .." was done on the client.

TRACE logs for the client are attached.

Comment 5 Garrett Prochnow 2010-06-22 19:19:37 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Behavior is expected. Option "strict-readdir" fixes it. Perhaps strict-readdir
> > should be the default?
> 
> Isn't keeping this on a performance bottleneck?

The whole point of using the replicate translator is redundancy. If clients need to use readdir, strict-readdir should be enabled by default.