Bug 1102800 - Gluster file names with special characters not seen by hadoop fs -ls
Summary: Gluster file names with special characters not seen by hadoop fs -ls
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhs-hadoop
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: Release Candidate
Target Release: ---
Assignee: Bradley Childs
QA Contact: Martin Bukatovic
URL:
Whiteboard:
Depends On:
Blocks: 1159155
Reported: 2014-05-29 14:55 UTC by Hank Jakiela
Modified: 2015-05-18 01:19 UTC

Fixed In Version: glusterfs-hadoop-2.3.2-2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-11-24 11:54:39 UTC
Embargoed:




Links
System: Red Hat Product Errata
ID: RHEA-2014:1275
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage Server 3 Hadoop plug-in enhancement update
Last Updated: 2014-11-24 16:53:36 UTC

Description Hank Jakiela 2014-05-29 14:55:42 UTC
Description of problem: Gluster file names that include special characters are not reported by "hadoop fs -ls". For example, some job history files produced by benchmark runs are missing from the listing.
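
For context, "%2D" is the percent-encoded form of the hyphen character, so any layer that URI-decodes or re-encodes a path will change such a name. This report does not state the root cause; the Python sketch below is only an illustration, under that assumption, of how a literal "%2D" in a file name stops matching the name on disk after one round of decoding or encoding. The short name `foo%2Dbar` is used purely as an example.

```python
from urllib.parse import quote, unquote

# Example file name containing a literal "%2D", as it would exist on disk.
on_disk = "foo%2Dbar"

# If some layer assumes the name is percent-encoded and decodes it,
# it ends up looking for a different file.
decoded = unquote(on_disk)

# If some layer escapes the name again, the '%' itself gets encoded.
re_encoded = quote(on_disk)

print(decoded)      # foo-bar
print(re_encoded)   # foo%252Dbar

# Neither transformed name matches the name on disk any more.
print(decoded == on_disk, re_encoded == on_disk)  # False False
```

Either transformation would make a lookup by the transformed name miss the real file, which is consistent with the symptom described here but is not confirmed as the actual defect.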


Version-Release number of selected component (if applicable):


How reproducible: List a directory that contains files with special characters in their names, first at the Gluster level and then at the Hadoop level. The file whose name contains "%2D" (a percent-encoded hyphen) is missing from the Hadoop listing.




#> ls -l /mnt/glusterfs/mr-history/tmp/hdfs/*6553_0027*

-rwxrwx--- 1 hdfs hadoop 703103 May 29 10:25 /mnt/glusterfs/mr-history/tmp/hdfs/job_1401217096553_0027-1401373490653-hdfs-hadoop%2Dmapreduce%2Dclient%2Djobclient%2D2.2.0.2.0.6.0%2D10-1401373525523-100-1-SUCCEEDED-default.jhist

-rwxrwx--- 1 hdfs hadoop  81829 May 29 10:25 /mnt/glusterfs/mr-history/tmp/hdfs/job_1401217096553_0027_conf.xml

-rwxrwx--- 1 hdfs hadoop    403 May 29 10:25 /mnt/glusterfs/mr-history/tmp/hdfs/job_1401217096553_0027.summary




#> hadoop fs -ls /mr-history/tmp/hdfs/*6553_0027* 

Found 1 items
-rwxrwx---   1 hdfs hadoop        403 2014-05-29 10:25 /mr-history/tmp/hdfs/job_1401217096553_0027.summary

Found 1 items
-rwxrwx---   1 hdfs hadoop      81829 2014-05-29 10:25 /mr-history/tmp/hdfs/job_1401217096553_0027_conf.xml

Expected results: The same list of file names in both cases.
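
The comparison between the two listings can be automated. The following Python sketch parses captured `ls -l` and `hadoop fs -ls` output and reports names present at the Gluster level but absent at the Hadoop level. The helper `listed_names` is hypothetical, and it assumes that file names contain no whitespace and that entry lines begin with a permission string.

```python
import posixpath

def listed_names(listing):
    """Return the base names found in `ls -l` or `hadoop fs -ls` style output."""
    names = set()
    for line in listing.splitlines():
        fields = line.split()
        # Entry lines start with a permission string such as "-rwxrwx---";
        # skip blank lines, prompts, log lines, and "Found N items" headers.
        if not fields or len(fields[0]) < 10 or fields[0][0] not in "-dl":
            continue
        # Assumption: the path is the last whitespace-separated field.
        names.add(posixpath.basename(fields[-1]))
    return names

JHIST = ("job_1401217096553_0027-1401373490653-hdfs-hadoop%2Dmapreduce"
         "%2Dclient%2Djobclient%2D2.2.0.2.0.6.0%2D10-1401373525523-100-1"
         "-SUCCEEDED-default.jhist")

# Listings copied from this report.
gluster_ls = f"""
-rwxrwx--- 1 hdfs hadoop 703103 May 29 10:25 /mnt/glusterfs/mr-history/tmp/hdfs/{JHIST}
-rwxrwx--- 1 hdfs hadoop  81829 May 29 10:25 /mnt/glusterfs/mr-history/tmp/hdfs/job_1401217096553_0027_conf.xml
-rwxrwx--- 1 hdfs hadoop    403 May 29 10:25 /mnt/glusterfs/mr-history/tmp/hdfs/job_1401217096553_0027.summary
"""

hadoop_ls = """
Found 1 items
-rwxrwx---   1 hdfs hadoop        403 2014-05-29 10:25 /mr-history/tmp/hdfs/job_1401217096553_0027.summary
Found 1 items
-rwxrwx---   1 hdfs hadoop      81829 2014-05-29 10:25 /mr-history/tmp/hdfs/job_1401217096553_0027_conf.xml
"""

missing = listed_names(gluster_ls) - listed_names(hadoop_ls)
for name in sorted(missing):
    print("missing from hadoop listing:", name)
```

With the listings from this report, the only name reported as missing is the `.jhist` file whose name contains "%2D".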


Additional info: This was done on cluster G in the phoenix lab.

Comment 2 Bradley Childs 2014-06-03 18:33:38 UTC
Fixed here: https://github.com/gluster/glusterfs-hadoop/pull/99

Comment 3 Martin Bukatovic 2014-07-07 16:29:03 UTC
With the following:

rhs-hadoop-2.3.2-3.el6rhs.noarch

Hortonworks Hadoop distribution:
Ambari: ambari-server-1.4.4.23-1.noarch
Stack : HDP-2.0.6.GlusterFS
hadoop-2.2.0.2.0.6.0-101.el6.x86_64

Red Hat Storage Server 3.0 iso with:
glusterfs-3.6.0.22-1.el6rhs.x86_64

Listing a directory containing files with '%2D' in their names:

~~~
[bigtop@master bz-1102800]$ hadoop fs -ls /tmp/bz-1102800
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Initializing gluster volume..
14/07/07 18:23:24 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
14/07/07 18:23:24 INFO glusterfs.GlusterFileSystem: Initializing GlusterFS,  CRC disabled.
14/07/07 18:23:24 INFO glusterfs.GlusterFileSystem: GIT INFO={git.commit.id.abbrev=fad53a1, git.commit.user.email=bchilds.rdu2.redhat.com, git.commit.message.full=[update RPM spec file/changelog] - 2.3.2
, git.commit.id=fad53a132514e2c8d9de6910f0f86d080047abac, git.commit.message.short=[update RPM spec file/changelog] - 2.3.2, git.commit.user.name=Brad Childs, git.build.user.name=Unknown, git.commit.id.describe=2.3.8-11-gfad53a1, git.build.user.email=Unknown, git.branch=2.3.2, git.commit.time=02.07.2014 @ 17:14:29 EDT, git.build.time=02.07.2014 @ 17:24:31 EDT}
14/07/07 18:23:24 INFO glusterfs.GlusterFileSystem: GIT_TAG=2.3.8
14/07/07 18:23:24 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Initializing gluster volume..
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Gluster volume: HadoopVol1 at : /mnt/glusterfs/HadoopVol1
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Gluster volume: HadoopVol2 at : /mnt/glusterfs/HadoopVol2
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Working directory is : glusterfs:/user/bigtop
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Write buffer size : 131072
14/07/07 18:23:24 INFO glusterfs.GlusterVolume: Default block size : 67108864
Found 4 items
-rw-r--r--   1 bigtop hadoop          0 2014-07-07 18:21 /tmp/bz-1102800/bar
-rw-r--r--   1 bigtop hadoop          0 2014-07-07 18:21 /tmp/bz-1102800/foo
-rw-r--r--   1 bigtop hadoop          0 2014-07-07 18:21 /tmp/bz-1102800/foo%2Dbar
-rw-r--r--   1 bigtop hadoop          0 2014-07-07 18:09 /tmp/bz-1102800/job_1401217096553_0027-1401373490653-hdfs-hadoop%2Dmapreduce%2Dclient%2Djobclient%2D2.2.0.2.0.6.0%2D10-1401373525523-100-1-SUCCEEDED-default.jhis
~~~

All four files are listed, including those with '%2D' in their names, so it works.
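
How the test files were created is not shown in this comment. As a rough sketch only, the short names from the listing could be created like this in Python; the helper `create_test_files` is hypothetical, and a temporary local directory stands in for a directory on the Gluster mount.

```python
import os
import tempfile

# Short names taken from the listing in this comment.
NAMES = ["bar", "foo", "foo%2Dbar"]

def create_test_files(directory, names=NAMES):
    """Create empty files with the given names; return the names the OS reports."""
    for name in names:
        # The '%' is an ordinary character to the local file system.
        open(os.path.join(directory, name), "w").close()
    return sorted(os.listdir(directory))

# Assumption: a temporary directory replaces the real Gluster-backed path.
with tempfile.TemporaryDirectory() as tmp:
    seen = create_test_files(tmp)
    print(seen)  # ['bar', 'foo', 'foo%2Dbar']
```

On a real cluster, the directory would live on the Gluster mount, and the returned names would be compared against the output of "hadoop fs -ls" for the same directory.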

Comment 4 Hank Jakiela 2014-11-04 15:16:58 UTC
Can we get some clarification on this? What exactly does " --- -> rc " mean? Is this fixed in RHS 3.0.x? Which x?

Comment 6 errata-xmlrpc 2014-11-24 11:54:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2014-1275.html

