Bug 1116399

Summary: [doc] Document the dependency that the default volume must be started when running jobs using additional volumes
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Tomas Meszaros <tmeszaro>
Component: rhs-hadoop
Assignee: Bradley Childs <bchilds>
Status: CLOSED CURRENTRELEASE
QA Contact: BigData QE <bigdata-qe-bugs>
Severity: low
Docs Contact: Divya <divya>
Priority: medium
Version: rhgs-3.0
CC: aavati, eboyd, esammons, matt, mkudlej, nlevinki, rhs-bugs, swatt, vbellur
Target Release: RHGS 3.0.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-08-10 07:45:08 UTC
Type: Bug

Description Tomas Meszaros 2014-07-04 13:05:19 UTC
Description of problem:

I think we should document that when a MapReduce job is run with non-default volumes for input and output, the default volume also has to be started; otherwise the job will fail.
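
For illustration, a minimal sketch (assuming the default volume is named HadoopVol1, as in the comment below; the actual name depends on the deployment) of making sure the default volume is started before submitting a job that only references other volumes:

# Pre-flight check before submitting any MapReduce job: the default
# volume must be in the Started state, not merely created.
DEFAULT_VOL=HadoopVol1

if ! gluster volume info "$DEFAULT_VOL" | grep -q '^Status: Started'; then
    # Bring the default volume online; this fails if the volume does not exist.
    gluster volume start "$DEFAULT_VOL"
fi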

Comment 1 Tomas Meszaros 2014-07-04 13:51:54 UTC
Additional information:

I ran:

su - bigtop -c "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar -input glusterfs://HadoopVol2/user/bigtop/in -output glusterfs://HadoopVol3/user/bigtop/out -mapper /bin/cat -reducer /usr/bin/wc"

where HadoopVol2 and HadoopVol3 are non-default volumes. The default volume (HadoopVol1) has to be started (and not full) while the job runs, because the HadoopVol1/user/<username>/.staging directory is created during job submission.
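
For illustration only, the same job submission could be preceded by a check that every volume it touches, including the implicitly used HadoopVol1, is started (names taken from the command above; this is a sketch, not a documented procedure):

# Abort early if any involved volume is not started. HadoopVol1 is only
# used implicitly for the .staging directory, but the job fails without it.
for vol in HadoopVol1 HadoopVol2 HadoopVol3; do
    if ! gluster volume info "$vol" | grep -q '^Status: Started'; then
        echo "Volume $vol is not started, aborting job submission" >&2
        exit 1
    fi
done

su - bigtop -c "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar -input glusterfs://HadoopVol2/user/bigtop/in -output glusterfs://HadoopVol3/user/bigtop/out -mapper /bin/cat -reducer /usr/bin/wc"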

Comment 2 Steve Watt 2014-07-11 21:50:32 UTC
I have updated the installation guide, in the "Using Hadoop" section (near the end), to state that HadoopVol needs to be running at all times when Hadoop is in use.