glusterfind does not work with Unicode file names. The following tracebacks are produced when Unicode file names are used:

  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 391, in <module>
    actual_end = changelog_crawl(args.brick, start, end, args)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 342, in changelog_crawl
    return get_changes(brick, working_dir, log_file, start, end, args)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 309, in get_changes
    gfid_to_path_using_pgfid(brick, changelog_data, args)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 164, in gfid_to_path_using_pgfid
    subdirs_crawl=False)
  File "/usr/libexec/glusterfs/glusterfind/utils.py", line 74, in find
    callback_func(full_path, filter_result)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 148, in output_callback
    path = output_path_prepare(path, args.output_prefix)
  File "/usr/libexec/glusterfs/glusterfind/utils.py", line 241, in output_path_prepare
    return urllib.quote_plus(path)
  File "/usr/lib64/python2.6/urllib.py", line 1242, in quote_plus
    s = quote(s, safe + ' ')
  File "/usr/lib64/python2.6/urllib.py", line 1236, in quote
    res = map(safe_map.__getitem__, s)
KeyError: u'\u0422'

  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 400, in <module>
    actual_end = changelog_crawl(args.brick, start, end, args)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 343, in changelog_crawl
    return get_changes(brick, working_dir, log_file, start, end, args)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 310, in get_changes
    gfid_to_path_using_pgfid(brick, changelog_data, args)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 121, in gfid_to_path_using_pgfid
    populate_pgfid_and_inodegfid(brick, changelog_data)
  File "/usr/libexec/glusterfs/glusterfind/changelog.py", line 103, in populate_pgfid_and_inodegfid
    changelog_data.inodegfid_add(os.stat(p).st_ino, gfid)
  File "/usr/libexec/glusterfs/glusterfind/changelogdata.py", line 294, in inodegfid_add
    "converted": converted
  File "/usr/libexec/glusterfs/glusterfind/changelogdata.py", line 195, in _add
    params.append(unicode(value, "utf8"))
TypeError: decoding Unicode is not supported
REVIEW: http://review.gluster.org/13798 (tools/glusterfind: Handling Unicode file names) posted (#1) for review on master by Aravinda VK (avishwan)
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please see mmccune with any questions.
COMMIT: http://review.gluster.org/13798 committed in master by Aravinda VK (avishwan)
------
commit 48a0a38fadf9c5164869a908dcff8a951aa21b4b
Author: Aravinda VK <avishwan>
Date:   Mon Mar 21 16:57:48 2016 +0530

    tools/glusterfind: Handling Unicode file names

    Unicode filenames handled cleanly with this patch.

    Changelog files and output files are opened with utf-8 encoding
    using codecs.open. urllib.quote_plus and unquote_plus will not
    handle Unicode, so encode Unicode to 8-bit string version before
    calling unquote. urllib.quote_plus requires 8-bit string itself,
    so do not decode to Unicode if we need to use quote_plus (when
    --no-encode=false). Decode to Unicode if --no-encode is set.

    BUG: 1319717
    Change-Id: If5561c749ab5529445650d322c831eb4da22b65a
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/13798
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Milind Changire <mchangir>
    Reviewed-by: Kotresh HR <khiremat>
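For readers who want to see the encoding rule from the commit message in isolation, here is a minimal sketch. It is not the code from the patch: the helper names (to_bytes, to_text, prepare_output_path) and the no_encode parameter are hypothetical and only mirror the --no-encode option described above. The import fallback is there so the sketch runs on both Python 2 and Python 3.

```python
# Illustrative sketch only; helper names are hypothetical, not glusterfind's code.
try:
    from urllib import quote_plus          # Python 2
except ImportError:
    from urllib.parse import quote_plus    # Python 3


def to_bytes(value, encoding="utf-8"):
    """Return value as an 8-bit string, encoding it only if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it only if it is an 8-bit string.

    On Python 2, calling unicode(value, "utf8") on a value that is already
    text raises "TypeError: decoding Unicode is not supported", so the type
    is checked first.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def prepare_output_path(path, no_encode=False):
    """Quote path for output, or return it as plain text when no_encode is set."""
    if no_encode:
        return to_text(path)
    # quote_plus is given an 8-bit string; on Python 2, passing unicode with
    # non-ASCII characters here is what raised the KeyError.
    return quote_plus(to_bytes(path))


print(prepare_output_path(u"\u0422 file.txt"))  # %D0%A2+file.txt
```

Checking the type before encoding or decoding keeps the helpers safe to call on input that has already been converted, which is the situation both tracebacks in this report point to.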
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user