Version identification

It's important to know which version of the plugin is running when debugging against different Hadoop environments. Currently, the only ways to tell are the file name and the checksum. We need to:
1) Bundle release information into the jar file
2) Add a log statement in file system init that prints the version
3) (maybe) Automate packaging of the version into the shim at release time rather than statically embedding it in the code
Just to be more explicit: the current version can be found by checking the name of the jar you're using under /lib, e.g. glusterfs-hadoop-0.0.4. However, if you are experiencing issues and the plugin logs its own version into each node's tasktracker logs, that gives much higher confidence that the environment is (or is not) set up correctly and that the right version of the plugin is being used on ALL the nodes. My only caveat is that if we take this approach, the version number must be constructed dynamically in the log statement; otherwise it's going to require a manual change to that line of code each time we roll a release.
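As a rough illustration of building the version string dynamically, here is a minimal sketch that reads build metadata from a properties file bundled in the jar and formats one log line from it. The resource name `/version.properties`, the key `plugin.version`, and the class name `PluginVersion` are hypothetical and not taken from the plugin; `git.commit.id.abbrev` is one of the keys that appears in the plugin's later log output. The build would have to generate the properties file at release time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class PluginVersion {
    // Hypothetical resource name: the build would need to write this file into the jar.
    static final String RESOURCE = "/version.properties";

    /** Loads build metadata from the classpath; returns empty Properties if it is missing. */
    static Properties load() {
        Properties props = new Properties();
        try (InputStream in = PluginVersion.class.getResourceAsStream(RESOURCE)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            // Unreadable metadata should not stop file system init; fall back to "unknown".
        }
        return props;
    }

    /** Formats the line to log at init time; no version number is hard-coded here. */
    static String describe(Properties props) {
        // "plugin.version" is an assumed key name for illustration only.
        String version = props.getProperty("plugin.version", "unknown");
        String commit = props.getProperty("git.commit.id.abbrev", "unknown");
        return "glusterfs-hadoop version=" + version + " commit=" + commit;
    }

    public static void main(String[] args) {
        // In the plugin this string would go to the logger during file system init.
        System.out.println(describe(load()));
    }
}
```

Because the values come from a file generated by the build, rolling a release would not require touching the log statement itself.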
fixed here: https://github.com/gluster/hadoop-glusterfs/commit/780445414925f7e8bc2ed1753fe601eb54c80fb2
Output of org.apache.hadoop.fs.glusterfs.Version is OK:

    java -cp /usr/share/java/glusterfs-2.1.5.jar org.apache.hadoop.fs.glusterfs.Version
    {git.commit.id.abbrev=51e5108, git.commit.user.email=bchilds, git.commit.message.full=2.1.5 branch/build , git.commit.id=51e5108fbec0b50d921aeb00ba2489bbdbe3d6ff, git.commit.message.short=2.1.5 branch/build, git.commit.user.name=childsb, git.build.user.name=Unknown, git.commit.id.describe=2.1.4-21-g51e5108, git.build.user.email=Unknown, git.branch=master, git.commit.time=17.01.2014 @ 16:05:54 EST, git.build.time=21.01.2014 @ 02:19:28 EST}

But if I run a Hadoop example, the reported version appears to be wrong:

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-*.jar pi 10 10
    Number of Maps = 10
    Samples per Map = 10
    14/01/23 13:25:46 INFO glusterfs.GlusterVolume: Initializing gluster volume..
    14/01/23 13:25:46 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
    14/01/23 13:25:46 INFO glusterfs.GlusterFileSystem: Initializing GlusterFS, CRC disabled.
    14/01/23 13:25:46 INFO glusterfs.GlusterFileSystem: GIT INFO={git.commit.id.abbrev=51e5108, git.commit.user.email=bchilds, git.commit.message.full=2.1.5 branch/build , git.commit.id=51e5108fbec0b50d921aeb00ba2489bbdbe3d6ff, git.commit.message.short=2.1.5 branch/build, git.commit.user.name=childsb, git.build.user.name=Unknown, git.commit.id.describe=2.1.4-21-g51e5108, git.build.user.email=Unknown, git.branch=master, git.commit.time=17.01.2014 @ 16:05:54 EST, git.build.time=21.01.2014 @ 02:19:28 EST}
    ------------------------
    14/01/23 13:25:46 INFO glusterfs.GlusterFileSystem: GIT_TAG=2.1.4   <--- this may be the bug
    ------------------------
    14/01/23 13:25:46 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
    14/01/23 13:25:46 INFO glusterfs.GlusterVolume: Initializing gluster volume..
    14/01/23 13:25:46 INFO glusterfs.GlusterVolume: Root of Gluster file system is /mnt/glusterfs
    14/01/23 13:25:46 INFO glusterfs.GlusterVolume: mapreduce/superuser daemon : null
    14/01/23 13:25:46 INFO glusterfs.GlusterVolume: Working directory is : glusterfs:/user/root
    14/01/23 13:25:46 INFO glusterfs.GlusterVolume: Write buffer size : 131072

--> Assigned
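One possible reading of the log above: the value git.commit.id.describe=2.1.4-21-g51e5108 follows the `git describe` format `<tag>-<commits since tag>-g<abbreviated hash>`, so it names the most recent tag reachable from the build commit, not necessarily the release being built. If GIT_TAG is derived from that value (an assumption, not confirmed in this thread), it would print 2.1.4 for a build made 21 commits after the 2.1.4 tag. The sketch below splits such a string into its parts; the class name `GitDescribe` is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GitDescribe {
    // <tag>-<commits since tag>-g<abbreviated commit hash>
    private static final Pattern DESCRIBE =
            Pattern.compile("^(.+)-(\\d+)-g([0-9a-f]+)$");

    final String tag;
    final int commitsSinceTag;
    final String abbrevCommit;

    GitDescribe(String tag, int commitsSinceTag, String abbrevCommit) {
        this.tag = tag;
        this.commitsSinceTag = commitsSinceTag;
        this.abbrevCommit = abbrevCommit;
    }

    /** Parses git-describe output; a bare tag name means the build sits exactly on that tag. */
    static GitDescribe parse(String describe) {
        Matcher m = DESCRIBE.matcher(describe);
        if (m.matches()) {
            return new GitDescribe(m.group(1), Integer.parseInt(m.group(2)), m.group(3));
        }
        return new GitDescribe(describe, 0, "");
    }

    public static void main(String[] args) {
        GitDescribe d = parse("2.1.4-21-g51e5108");
        System.out.println("tag=" + d.tag
                + " commitsSinceTag=" + d.commitsSinceTag
                + " commit=" + d.abbrevCommit);
    }
}
```

A non-zero commit count is a hint that the jar was not built from a tagged release commit, which may be worth logging alongside the tag.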
We will need to show that this is still a bug in 2.1.7 and up, because we've changed our build and release process since then. I'll leave it open.
Because of the large number of bugs filed against it, the mainline version is ambiguous and is about to be removed as a choice. If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.