Bug 1006044

Summary: Hadoop benchmark WordCount fails with "Job initialization failed: java.lang.OutOfMemoryError: Java heap space at.." on 17th run of 100GB WordCount (10hrs on 4 node cluster)
Product: [Community] GlusterFS
Reporter: Diane Feddema <dfeddema>
Component: gluster-hadoop
Assignee: Bradley Childs <bchilds>
Status: CLOSED EOL
QA Contact: hcfs-gluster-bugs
Severity: medium
Docs Contact:
Priority: low
Version: mainline
CC: bugs, chrisw, eboyd, matt, rhs-bugs, vbellur
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-18 08:41:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Diane Feddema 2013-09-09 21:14:13 UTC
Description of problem:
This is likely a JobTracker memory leak. Filing this problem against the glusterfs-hadoop plug-in for tracking and documentation purposes.

Hadoop benchmark WordCount (100GB input) fails with
Job initialization failed: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3209)
        at java.lang.String.<init>(String.java:215)
        at java.lang.StringBuffer.toString(StringBuffer.java:585)
        at java.net.URI.toString(URI.java:1907)
        at java.net.URI.<init>(URI.java:732)
        at org.apache.hadoop.fs.Path.<init>(Path.java:65)
        at org.apache.hadoop.fs.Path.<init>(Path.java:50)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getJobSplitFile(JobSubmissionFiles.java:51)
        at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:75)
        at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:835)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:725)
        at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3890)
        at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

After 16 consecutive successful runs, the 17th run of WordCount against the 100GB text input file fails, apparently due to a JobTracker memory leak.
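Not yet tried here, but a heap dump from the JobTracker at the moment of failure would show what is accumulating across runs. A possible hadoop-env.sh addition for the next attempt (the dump path is an assumption; point it at a disk with enough free space):

  # hadoop-env.sh -- hypothetical diagnostic settings, not what was in effect during these runs
  # Dump the JobTracker heap on OutOfMemoryError so the retained objects can be inspected later
  export HADOOP_JOBTRACKER_OPTS="${HADOOP_JOBTRACKER_OPTS} -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hadoop/jt-oom.hprof"

The resulting .hprof file can be opened with jhat or Eclipse MAT to see which classes dominate the heap.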


Version-Release number of selected component (if applicable):
Hadoop Version: hadoop-1.2.0.1.3.2.0-110 (part of Hortonworks 1.3)
RHS 2.0.5
glusterfs-hadoop 2.1 

How reproducible:
Takes 10+ hours to reproduce, and is NOT 100% reproducible.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Jay Vyas 2013-11-19 13:11:25 UTC
I vote to close this: it's probably not a Gluster-related bug, but rather
related to the way the JT stores job metadata.

Is this possibly a duplicate of an existing Hadoop JIRA, maybe:
https://issues.apache.org/jira/browse/HADOOP-4018 or
https://issues.apache.org/jira/browse/HADOOP-3670?
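If it does turn out to be HADOOP-4018-style growth of retained job metadata, one possible mitigation (untested on this cluster, and only relevant if that is really the cause) would be to lower the number of completed jobs the JobTracker keeps in memory per user via mapred-site.xml:

  <!-- mapred-site.xml: hypothetical workaround; assumes the heap is filling with completed-job metadata -->
  <property>
    <name>mapred.jobtracker.completeuserjobs.maximum</name>
    <!-- default is 100; a lower value reduces JobTracker heap held for finished jobs -->
    <value>25</value>
  </property>

The JobTracker would need a restart for the change to take effect.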

Comment 3 Scott Haines 2013-11-19 19:55:45 UTC
Per the 11/13 bug triage meeting, re-assigning to bchilds and marking priority low.

Comment 4 Amar Tumballi 2018-09-18 08:41:02 UTC
The gluster-hadoop project is not being actively maintained, and the team is not planning to contribute further to it.