Bug 1116399 - [doc] Document the dependency that the default volume must be started when running jobs using additional volumes
Summary: [doc] Document the dependency that the default volume must be started when running jobs using additional volumes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhs-hadoop
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: Bradley Childs
QA Contact: BigData QE
Docs Contact: Divya
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-07-04 13:05 UTC by Tomas Meszaros
Modified: 2015-08-10 07:45 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-10 07:45:08 UTC
Embargoed:


Attachments

Description Tomas Meszaros 2014-07-04 13:05:19 UTC
Description of problem:

I think we should document that when a MapReduce job is run with non-default volumes for input and output, the default volume also has to be started; otherwise the MapReduce job will fail.
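
A minimal pre-flight check for this is sketched below (an illustration only; it assumes the default volume is named HadoopVol1, as in the example in comment 1 below, and that the gluster CLI is run on a node that can manage the volume):

  # Sketch: confirm the default volume is started before submitting a job
  # that only references non-default volumes ("HadoopVol1" is an assumption).
  gluster volume info HadoopVol1 | grep -q 'Status: Started' || gluster volume start HadoopVol1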

Comment 1 Tomas Meszaros 2014-07-04 13:51:54 UTC
Additional information:

I ran: su - bigtop -c "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar -input glusterfs://HadoopVol2/user/bigtop/in -output glusterfs://HadoopVol3/user/bigtop/out -mapper /bin/cat -reducer /usr/bin/wc"

where HadoopVol2 and HadoopVol3 are non-default volumes. The default volume (HadoopVol1) has to be started (and not full) during the job, because the HadoopVol1/user/<username>/.staging directory is created during the process.
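
Putting it together, a hedged sketch of a run that satisfies this dependency (volume names as above; the mount point /mnt/glusterfs/HadoopVol1 used for the free-space check is only an assumption):

  # Start the default volume if it is not already started; the job creates
  # its .staging directory on it even though input/output live elsewhere.
  gluster volume info HadoopVol1 | grep -q 'Status: Started' || gluster volume start HadoopVol1

  # Make sure the default volume is not full (mount point is an assumption).
  df -h /mnt/glusterfs/HadoopVol1

  # Then run the streaming job against the non-default volumes.
  su - bigtop -c "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar \
      -input glusterfs://HadoopVol2/user/bigtop/in \
      -output glusterfs://HadoopVol3/user/bigtop/out \
      -mapper /bin/cat -reducer /usr/bin/wc"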

Comment 2 Steve Watt 2014-07-11 21:50:32 UTC
I have updated the installation guide in the "Using Hadoop" section (near the end) to state that HadoopVol, the default volume, needs to be running at all times when using Hadoop.

