Bug 1024181

Summary: Unicode filenames cause directory listing interactions to hang/loop
Product: [Community] GlusterFS
Reporter: Alex Smith <alex>
Component: fuse
Assignee: bugs <bugs>
Status: CLOSED DEFERRED
Severity: unspecified
Priority: unspecified
Version: 3.3.1
CC: bugs, gluster-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2014-12-14 19:40:32 UTC
Type: Bug

Description Alex Smith 2013-10-29 05:18:49 UTC
Unicode filenames on a GlusterFS volume result in unexpected behaviour:
 - Listing a directory that contains such a file makes ls hang (relevant sections of strace below).
 - When copying files via SCP, the remote client receives the contents of the directory repeatedly in a loop[1].
 - Running the ls2[2] sample code, to rule out a problem in ls itself, also repeats the directory listing in a loop.

This leads me to believe that the getdents call never reaches the 'end' of the directory, e.g.:

Affected directory on a GlusterFS volume:

getdents(3, /* 2 entries */, 32768)     = 48
getdents(3, /* 1 entries */, 32768)     = 56
getdents(3, /* 1 entries */, 32768)     = 56
[last line repeats]

Unaffected directory on local disk:

getdents(3, /* 5 entries */, 32768)     = 144
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
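
The difference between the two traces can be checked programmatically. Below is a minimal sketch, not part of the original report: a directory listing should never yield the same name twice, so a repeated name is treated as a sign that the listing has started looping instead of reaching its end. The function names `list_with_loop_check` and `scan_directory` are hypothetical, chosen for illustration.

```python
import os
from typing import Iterable, List, Tuple


def list_with_loop_check(names: Iterable[str]) -> Tuple[List[str], bool]:
    """Collect names until the stream ends or a name repeats.

    Returns (names_seen_in_order, looped). looped is True when a name
    came back a second time, which a healthy directory never does.
    """
    seen = set()
    ordered = []
    for name in names:
        if name in seen:
            # Same entry returned again: stop instead of spinning forever.
            return ordered, True
        seen.add(name)
        ordered.append(name)
    return ordered, False


def scan_directory(path: str) -> Tuple[List[str], bool]:
    """Apply the check to a real directory via os.scandir()."""
    with os.scandir(path) as entries:
        return list_with_loop_check(entry.name for entry in entries)
```

On a healthy directory `scan_directory(path)` should return every name once with `looped` set to False. On a directory behaving like the affected one above, it would be expected to stop at the first repeated entry rather than hang, although that has not been verified on the reporter's system.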


The test system in question is Debian Squeeze, 3.4.55 kernel, libc 2.11.3-4, Gluster 3.3.1-1.

I have not yet ruled out a kernel/glibc/etc. problem, but since I can only reproduce it on a GlusterFS volume, that seems unlikely. I've raised this against the fuse component, which may be the wrong one; apologies if so. I hope I've included enough information about the GlusterFS configuration below[3], and I'm happy to provide more as necessary.

[1] » scp -r gluster1:/ajs/testdir .
スミス                                                                                                                                                      100%    0     0.0KB/s   00:00
スミス                                                                                                                                                      100%    0     0.0KB/s   00:00
スミス                                                                                                                                                      100%    0     0.0KB/s   00:00
[repeats]

[2] http://www.johnloomis.org/ece537/notes/Files/Examples/ls2.html

[3]

Volume Name: shared
Type: Distribute
Volume ID: c89a941a-591c-4e65-88bd-8be7878be4ff
Status: Started
Number of Bricks: 6
Transport-type: tcp
Bricks:
Brick1: ca1.gl:/data0/gluster
Brick2: ca1.gl:/data1/gluster
Brick3: ca1.gl:/data2/gluster
Brick4: ca1.gl:/data3/gluster
Brick5: ca1.gl:/data4/gluster
Brick6: ca1.gl:/data5/gluster
Options Reconfigured:
nfs.disable: on
performance.flush-behind: on
performance.io-thread-count: 32
performance.write-behind-window-size: 128MB
performance.cache-size: 2048MB
performance.cache-refresh-timeout: 60

Comment 1 Alex Smith 2013-12-06 06:45:49 UTC
I've just tried migrating this to XFS and it's working fine. I assume it's something similar to the issue described at http://joejulian.name/blog/glusterfs-bit-by-ext4-structure-change/ -- I guess it's down to a combination of the kernel, Gluster revision, ext4 driver, etc.

I don't know if that's sufficient to close this bug, or whether it's potentially a different Ext4 issue. I'll leave it up to wiser minds. Cheers!
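
For context, the kind of failure the linked post describes can be illustrated with a toy model. This is a sketch under stated assumptions, not GlusterFS's actual code: it assumes a distribute layer packs a brick index into each directory offset as `offset * brick_count + index` using unsigned 64-bit arithmetic, and that the backing filesystem may hand back offsets that use nearly all 64 bits. The names `encode_offset`, `decode_offset` and `round_trips` are hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wrap-around


def encode_offset(brick_offset: int, brick_index: int, brick_count: int) -> int:
    """Pack a per-brick directory offset and a brick index into one value."""
    return (brick_offset * brick_count + brick_index) & MASK64


def decode_offset(packed: int, brick_count: int):
    """Recover (brick_offset, brick_index) from a packed value."""
    return packed // brick_count, packed % brick_count


def round_trips(brick_offset: int, brick_index: int, brick_count: int) -> bool:
    """True if encoding then decoding gives back the original pair."""
    packed = encode_offset(brick_offset, brick_index, brick_count)
    return decode_offset(packed, brick_count) == (brick_offset, brick_index)


# Six bricks, as in the volume above.
small = round_trips(2**31 - 1, 3, 6)      # offset fits in 31 bits
large = round_trips(2**62 + 12345, 3, 6)  # offset needs 63 bits
```

In this model `small` is True and `large` is False: once the multiplication overflows 64 bits, the client resumes the listing from the wrong position. That is one plausible way for a listing to revisit the same entry indefinitely, and it would be consistent with the problem disappearing on a filesystem that returns smaller offsets.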

Comment 2 Niels de Vos 2014-11-27 14:54:36 UTC
The version that this bug was reported against no longer receives updates from the Gluster Community. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.