Bug 1116399

Summary: [doc] Document the dependency that the default volume must be started when running jobs using additional volumes
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Tomas Meszaros <tmeszaro>
Component: rhs-hadoop
Assignee: Bradley Childs <bchilds>
Status: CLOSED CURRENTRELEASE
QA Contact: BigData QE <bigdata-qe-bugs>
Severity: low
Docs Contact: Divya <divya>
Priority: medium
Version: rhgs-3.0
CC: aavati, eboyd, esammons, matt, mkudlej, nlevinki, rhs-bugs, swatt, vbellur
Target Release: RHGS 3.0.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-08-10 07:45:08 UTC
Type: Bug

Description Tomas Meszaros 2014-07-04 13:05:19 UTC
Description of problem:

I think we should document that when a MapReduce job is run with non-default volumes for input and output, the default volume also has to be started; otherwise the job will fail.
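
For illustration, a minimal sketch (assuming the default volume is named HadoopVol1, as in the comment below; the actual name depends on the deployment) of making sure the default volume is started before submitting a job that only references other volumes:

# Pre-flight check before submitting any MapReduce job: the default
# volume must be in the Started state, not merely created.
DEFAULT_VOL=HadoopVol1

if ! gluster volume info "$DEFAULT_VOL" | grep -q '^Status: Started'; then
    # Bring the default volume online; this fails if the volume does not exist.
    gluster volume start "$DEFAULT_VOL"
fi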

Comment 1 Tomas Meszaros 2014-07-04 13:51:54 UTC
Additional information:

I ran:

su - bigtop -c "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar -input glusterfs://HadoopVol2/user/bigtop/in -output glusterfs://HadoopVol3/user/bigtop/out -mapper /bin/cat -reducer /usr/bin/wc"

where HadoopVol2 and HadoopVol3 are non-default volumes. The default volume (HadoopVol1) has to be started (and not full) while the job runs, because the HadoopVol1/user/<username>/.staging directory is created during job submission.
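
For illustration only, the same job submission could be preceded by a check that every volume it touches, including the implicitly used HadoopVol1, is started (names taken from the command above; this is a sketch, not a documented procedure):

# Abort early if any involved volume is not started. HadoopVol1 is only
# used implicitly for the .staging directory, but the job fails without it.
for vol in HadoopVol1 HadoopVol2 HadoopVol3; do
    if ! gluster volume info "$vol" | grep -q '^Status: Started'; then
        echo "Volume $vol is not started, aborting job submission" >&2
        exit 1
    fi
done

su - bigtop -c "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar -input glusterfs://HadoopVol2/user/bigtop/in -output glusterfs://HadoopVol3/user/bigtop/out -mapper /bin/cat -reducer /usr/bin/wc"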

Comment 2 Steve Watt 2014-07-11 21:50:32 UTC
I have updated the installation guide, in the "Using Hadoop" section (near the end), to state that HadoopVol needs to be running at all times when Hadoop is in use.