1) The code that creates the mapreduce.jobtracker.system.dir should be removed. It is expected that the directory already exists. 2) Any required hadoop directories should be created with the installer.

jeffvance: Could we move the hadoop directory creation and permissions into its own installation script? That way we could re-run just the dir creation script to re-create the required hadoop directories on the gluster volume. In the same vein, user creation will require additional directory creation steps as well (/user/<username>). We should probably include an adduser script that creates the new user across the cluster and adds their home directory with appropriate permissions.
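For illustration only, a minimal sketch of the per-user directory step being discussed, written against the stock Hadoop FileSystem API and assuming fs.defaultFS points at the gluster volume; this is not the installer's actual code, and the group name and permission bits are assumptions:

~~~
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

/** Hypothetical adduser helper: creates /user/<username> and sets owner/perms. */
public class AddUserDir {
    public static void main(String[] args) throws Exception {
        String user = args[0];                              // e.g. "alice"
        FileSystem fs = FileSystem.get(new Configuration()); // default filesystem (assumed gluster)
        Path home = new Path("/user/" + user);

        fs.mkdirs(home);                                    // create the home directory
        fs.setOwner(home, user, "hadoop");                  // group name is an assumption
        fs.setPermission(home, new FsPermission((short) 0700));
        fs.close();
    }
}
~~~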
Brad, there is an undocumented "--_hadoop-dirs" option that executes only the code that creates all the hadoop dirs and sets up the owners and perms. E.g.:

./install.sh --_hadoop-dirs /dev/vg/lv  # note: the /dev/vg/lv shouldn't be required but the parser still wants it.

Will this work for you?
Okay, that's a good fix. I guess that will make it so that at least "hadoop fs -ls" works on a filesystem that's uninitialized. Meanwhile: should we add mapreduce.jobtracker.system.dir to https://raw.githubusercontent.com/apache/bigtop/master/bigtop-packages/src/common/hadoop/init-hcfs.json? Is this a common directory defined in all hadoop deployments?
Per Apr-02 bug triage meeting, granting both devel and pm acks
jeff: I tried the script and it worked after some massaging. The install script tried to create my existing gluster volume, which caused some issues. Would it be possible not to require the hosts file configuration, and maybe just take the FUSE mount point as a parameter for this option? I realize that for a system exclusively configured with install.sh this shouldn't be an issue, but in a dev setup that's not always the case.

jay: Yes, that config should be updated. I've tried to locate more info on mapreduce.jobtracker.system.dir without much luck. I thought we might be able to pull it out for 2.x, but the docs indicate it's still valid, so it seems like this value is here to stay.
@brad, specifically what massaging did you need to do? If install.sh --_hadoop-dirs attempted to create a vol then that's a bug. Can you copy/paste output from the log file showing that (/var/log/rhs-hadoop-install.log)? Yes, true that it requires the local "hosts" config file which is overkill for what you need, and that should be addressed with a separate BZ against rhs-hadoop-install. thx.
I'm trying to reproduce this one without any success. I downloaded the old plugin 2.1.7 from upstream [1] and checked that mapreduce.jobtracker.system.dir is not defined:

i) 'grep mapreduce.jobtracker.system.dir -R /etc/hadoop/conf/' shows nothing

ii) the following groovy script prints null:

~~~
import org.apache.hadoop.conf.Configuration;
conf = new Configuration();
println conf.get("mapreduce.jobtracker.system.dir")
~~~

but the plugin still works:

~~~
[bigtop@mrg-qe-vm-c2-302 ~]$ hadoop fs -ls /
14/07/01 15:20:16 INFO glusterfs.GlusterVolume: Initializing gluster volume..
14/07/01 15:20:16 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
14/07/01 15:20:16 INFO glusterfs.GlusterFileSystem: Initializing GlusterFS, CRC disabled.
14/07/01 15:20:16 INFO glusterfs.GlusterFileSystem: GIT INFO={git.commit.id.abbrev=938d258, git.commit.user.email=jvyas, git.commit.message.full=[ci-skip] , git.commit.id=938d2585069f0409cf8f81386f0eeeb0715ff329, git.commit.message.short=[ci-skip], git.commit.user.name=rhbdjenkins, git.build.user.name=Unknown, git.commit.id.describe=2.1.7, git.build.user.email=Unknown, git.branch=master, git.commit.time=26.03.2014 @ 19:04:09 UTC, git.build.time=26.03.2014 @ 19:04:33 UTC}
14/07/01 15:20:16 INFO glusterfs.GlusterFileSystem: GIT_TAG=2.1.7
14/07/01 15:20:16 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
14/07/01 15:20:16 INFO glusterfs.GlusterVolume: Initializing gluster volume..
14/07/01 15:20:16 INFO glusterfs.GlusterVolume: Root of Gluster file system is /mnt/glusterfs
14/07/01 15:20:16 INFO glusterfs.GlusterVolume: Working directory is : glusterfs:/user/bigtop
14/07/01 15:20:16 INFO glusterfs.GlusterVolume: Write buffer size : 131072
14/07/01 15:20:16 INFO glusterfs.GlusterVolume: Default block size : 67108864
Found 1 items
drwxrwxr-- - yarn hadoop 123 2014-07-01 13:54 /HadoopVol1
[bigtop@mrg-qe-vm-c2-302 ~]$
~~~

Any idea what I missed?

Moreover, I find it interesting that the nodemanager dump of the hadoop configuration shows that mapreduce.jobtracker.system.dir is defined:

~~~
$ for NODE in node1 node2; do wget http://$NODE:8042/conf -O hadoop-conf.${NODE}.xml; done
$ grep -i mapreduce.jobtracker.system.dir *.xml
hadoop-conf.node1.xml:<property><name>mapreduce.jobtracker.system.dir</name><value>glusterfs:///mapred/system</value><source>mapred-site.xml</source></property>
hadoop-conf.node2.xml:<property><name>mapreduce.jobtracker.system.dir</name><value>glusterfs:///mapred/system</value><source>mapred-site.xml</source></property>
~~~

How is that possible?

[1] http://rhbd.s3.amazonaws.com/maven/repositories/internal/org/apache/hadoop/fs/glusterfs/glusterfs-hadoop/2.1.7/glusterfs-hadoop-2.1.7.jar
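One possible explanation, offered only as a diagnostic hint: a bare Configuration object does not read mapred-site.xml unless it is added as a resource (or JobConf has registered it as a default), while the NodeManager's /conf page reflects a JVM that does load it. Configuration#getPropertySources shows where an effective value was read from; the addResource calls below are assumptions about what is on the client classpath:

~~~
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;

public class ConfSourceCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // A plain Configuration only loads core-default.xml/core-site.xml;
        // pull in the mapred resources explicitly for this check.
        conf.addResource("mapred-default.xml");
        conf.addResource("mapred-site.xml");

        String key = "mapreduce.jobtracker.system.dir";
        System.out.println(key + " = " + conf.get(key));

        // Reports the resources the effective value came from, e.g. a bundled
        // *-default.xml inside a jar vs. mapred-site.xml under /etc/hadoop/conf.
        String[] sources = conf.getPropertySources(key);
        System.out.println("sources = " + (sources == null ? "none" : Arrays.toString(sources)));
    }
}
~~~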
https://github.com/gluster/glusterfs-hadoop/blob/master/src/main/java/org/apache/hadoop/fs/glusterfs/GlusterVolume.java#L97 <-- this is the line that needs to be guarded, IMO.
Tested with versions:
rhs-hadoop-2.3.2-5.el6rhs.noarch
hadoop-2.2.0.2.0.6.0-101.el6.x86_64
glusterfs-3.6.0.27-1.el6rhs.x86_64
ambari-agent-1.4.4.23-1.x86_64
ambari-server-1.4.4.23-1.noarch

I was able to reproduce the issue by deleting the mapred.system.dir property from Ambari in Services/MapReduce2/Configs/Custom mapred-site.xml, restarting all services, and running an fs command such as hadoop fs -ls /. Here is the traceback:

~~~
-ls: Fatal internal error
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.hadoop.fs.glusterfs.GlusterVolume.setConf(GlusterVolume.java:179)
        at org.apache.hadoop.fs.RawLocalFileSystem.initialize(RawLocalFileSystem.java:82)
        at org.apache.hadoop.fs.glusterfs.GlusterVolume.initialize(GlusterVolume.java:80)
        at org.apache.hadoop.fs.FilterFileSystem.initialize(FilterFileSystem.java:86)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
        at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
        at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:224)
        at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:207)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:255)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:305)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.fs.glusterfs.GlusterVolume.sameVolume(GlusterVolume.java:98)
        at org.apache.hadoop.fs.glusterfs.GlusterVolume.setConf(GlusterVolume.java:138)
        ... 20 more
~~~

I also checked the code in the glusterfs-hadoop-2.3.2-6 srpm and the fix is not present there. Please change the state to MODIFIED or ON_QA after building a package with the fix.
Tested with versions:
rhs-hadoop-2.3.3-1.el6rhs.noarch (from https://brewweb.devel.redhat.com//buildinfo?buildID=373729)
hadoop-2.2.0.2.0.6.0-101.el6.x86_64
glusterfs-3.6.0.27-1.el6rhs.x86_64
ambari-agent-1.4.4.23-1.x86_64
ambari-server-1.4.4.23-1.noarch

I tested the issue the same way as described in comment #14, but with a different plugin version. This time the code and fix mentioned in comment #12 were present. Sadly, the behaviour is still the same - there is still an NPE after deleting the property mapred.system.dir (which is an older, possibly deprecated, name for mapreduce.jobtracker.system.dir) from Ambari. The traceback is still the same after running a hadoop fs command. I also tried the exact package Bradley proposed - 2.3.2-5 (https://brewweb.devel.redhat.com//buildinfo?buildID=369089) - and the behaviour was the same. This is a little odd to me - are the steps to reproduce this issue correct? -> ASSIGNED
I have traced the source. A similar bug surfaces when mapreduce.jobtracker.system.dir isn't defined. The bug is slightly different in that there is a default setting in mapred-default.xml (inside a JAR) which points to /tmp/hadoop/... The gluster volume code expects this path to be on gluster and it's not. The URI scheme comparison was failing and throwing an NPE, since there is no scheme on the local file path. I don't believe this is an RHS blocker, especially since it's outside of the supported install path. I will update upstream and then post a link here.
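For illustration only, a null-safe variant of the scheme comparison that the stack trace above points at (GlusterVolume.sameVolume); the class, method, and parameter names here are assumptions inferred from the trace, not the actual upstream patch (see the pull request linked in the next comment):

~~~
import java.net.URI;

/** Illustrative guard only; names are inferred from the stack trace, not the upstream fix. */
final class SchemeGuard {
    static boolean sameScheme(URI pathUri, URI volumeRoot) {
        // The mapred-default.xml fallback resolves to a plain local path with no
        // "glusterfs://" scheme, so getScheme() returns null and an unguarded
        // equals() on it throws the NullPointerException seen above.
        String pathScheme = pathUri.getScheme();
        String rootScheme = volumeRoot.getScheme();
        if (pathScheme == null || rootScheme == null) {
            return false;   // a scheme-less (local) path is not on the gluster volume
        }
        return pathScheme.equalsIgnoreCase(rootScheme);
    }
}
~~~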
Upstream fix: https://github.com/gluster/glusterfs-hadoop/pull/111
Used versions:
rhs-hadoop-2.3.3-3.el6rhs.noarch
hadoop-2.4.0.2.1.7.0-784.el6.x86_64
glusterfs-3.6.0.30-1.el6rhs.x86_64
ambari-agent-1.6.1-98.x86_64
ambari-server-1.6.1-98.noarch

There is a slight change in the location of mapred.system.dir in Ambari - it can now be found in Services/MapReduce2/Configs/Advanced. The property can no longer be deleted or left without a value. Hadoop fs commands work even if the property is set to only a space character. This issue is fixed -> VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2014-1275.html