Bug 1024181 - Unicode filenames cause directory listing interactions to hang/loop
Status: CLOSED DEFERRED
Product: GlusterFS
Classification: Community
Component: fuse
Version: 3.3.1
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Assigned To: bugs@gluster.org
Reported: 2013-10-29 01:18 EDT by Alex Smith
Modified: 2014-12-14 14:40 EST
Doc Type: Bug Fix
Last Closed: 2014-12-14 14:40:32 EST
Type: Bug
Description Alex Smith 2013-10-29 01:18:49 EDT
Unicode filenames on a GlusterFS volume result in unexpected behaviour. 
 - When trying to list in that directory, ls hangs (relevant sections of strace below) 
 - When copying files via SCP, a remote client will receive the contents of a directory repeatedly in a loop[1]
 - Using the ls2[2] sample code to rule out a problem in ls itself, the directory listing also repeats in a loop.

It looks like the getdents call never reaches the 'end' of the directory (it never returns 0), e.g.:

Affected directory on a GlusterFS volume:

getdents(3, /* 2 entries */, 32768)     = 48
getdents(3, /* 1 entries */, 32768)     = 56
getdents(3, /* 1 entries */, 32768)     = 56
[last line repeats]

Unaffected directory on local disk:

getdents(3, /* 5 entries */, 32768)     = 144
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
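
The looping behaviour shown in the strace output above can be detected from user space without hanging. The following is a minimal sketch in Python, not part of the original report: the function name list_with_loop_guard and the max_entries cap are hypothetical, and it assumes that a directory stream which never ends will hand back a name that has already been seen.

```python
import os


def list_with_loop_guard(path, max_entries=100000):
    """List a directory, stopping early if the listing appears to loop.

    Returns (names, looped). looped is True when a name is returned a
    second time or when more than max_entries entries are produced,
    which is what a getdents stream that never returns 0 would look like.
    """
    seen = set()
    names = []
    with os.scandir(path) as entries:
        for entry in entries:
            # A healthy directory never yields the same name twice.
            if entry.name in seen or len(names) >= max_entries:
                return names, True
            seen.add(entry.name)
            names.append(entry.name)
    # The iterator ended normally, i.e. the end of the directory was reached.
    return names, False
```

Calling list_with_loop_guard on the affected directory and on a local directory should give looped as True and False respectively, if the diagnosis above is right.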


The test system in question is Debian Squeeze, 3.4.55 kernel, libc 2.11.3-4, Gluster 3.3.1-1.

I have not yet ruled out this being a kernel/glibc/etc problem, but I'm conscious that since I can only reproduce it on a GlusterFS volume it may be unlikely. I've raised this against the fuse component - but that's possibly a misnomer, apologies if so. I hope I've included enough information about the GlusterFS configuration below - but I'm happy to provide anything more as necessary.

[1] » scp -r gluster1:/ajs/testdir .
スミス                                                                                                                                                      100%    0     0.0KB/s   00:00
スミス                                                                                                                                                      100%    0     0.0KB/s   00:00
スミス                                                                                                                                                      100%    0     0.0KB/s   00:00
[repeats]

[2] http://www.johnloomis.org/ece537/notes/Files/Examples/ls2.html

[3] GlusterFS volume configuration:

Volume Name: shared
Type: Distribute
Volume ID: c89a941a-591c-4e65-88bd-8be7878be4ff
Status: Started
Number of Bricks: 6
Transport-type: tcp
Bricks:
Brick1: ca1.gl:/data0/gluster
Brick2: ca1.gl:/data1/gluster
Brick3: ca1.gl:/data2/gluster
Brick4: ca1.gl:/data3/gluster
Brick5: ca1.gl:/data4/gluster
Brick6: ca1.gl:/data5/gluster
Options Reconfigured:
nfs.disable: on
performance.flush-behind: on
performance.io-thread-count: 32
performance.write-behind-window-size: 128MB
performance.cache-size: 2048MB
performance.cache-refresh-timeout: 60
Comment 1 Alex Smith 2013-12-06 01:45:49 EST
I've just tried migrating this to XFS and it's working fine. I assume it's something similar to the issue described at http://joejulian.name/blog/glusterfs-bit-by-ext4-structure-change/ -- I guess it's a combination of the kernel/gluster revision/ext4 driver/etc.

I don't know if that's sufficient to close this bug, or whether it's potentially a different Ext4 issue. I'll leave it up to wiser minds. Cheers!
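
Since the behaviour appears to depend on the filesystem under the bricks, it can help to check which filesystem each brick path sits on. A rough sketch in Python follows, assuming the text comes from /proc/mounts on the brick server; the helper name fs_type_for is hypothetical, and mount points containing escaped spaces are not handled.

```python
def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    Returns None if no mount point matches.
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Match on whole path components so /data0 does not match /data01.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

For example, fs_type_for("/data0/gluster", open("/proc/mounts").read()) on the brick server would report the filesystem type for the first brick listed above.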
Comment 2 Niels de Vos 2014-11-27 09:54:36 EST
The version that this bug has been reported against no longer receives updates from the Gluster Community. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be closed automatically.