Bug 765347 (GLUSTER-3615) - terasort failed with quick.slave.io on
Summary: terasort failed with quick.slave.io on
Keywords:
Status: CLOSED EOL
Alias: GLUSTER-3615
Product: GlusterFS
Classification: Community
Component: HDFS
Version: pre-release
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steve Watt
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-09-23 06:29 UTC by M S Vishwanath Bhat
Modified: 2016-06-01 01:57 UTC
CC List: 4 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-22 15:40:20 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
complete stack trace of terasort (38.47 KB, text/plain)
2011-09-23 03:35 UTC, M S Vishwanath Bhat

Description M S Vishwanath Bhat 2011-09-23 03:35:19 UTC
Created attachment 670


The gluster configuration was a 2x2 stripe-replicate volume.
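For reference, a 2x2 stripe-replicate volume of this kind is typically created along the following lines (the volume name, hostnames, and brick paths below are placeholders, not the ones from this setup):

    # create a 4-brick volume that stripes across two replica pairs
    gluster volume create terasort-vol stripe 2 replica 2 transport tcp \
        server1:/export/brick1 server2:/export/brick1 \
        server3:/export/brick1 server4:/export/brick1
    gluster volume start terasort-vol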

Comment 1 M S Vishwanath Bhat 2011-09-23 06:29:23 UTC
Created input data for terasort with 10000000 rows. When the terasort job is then started, it fails with the backtrace below.
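For reproduction, the input was generated and sorted with the standard Hadoop example jobs, roughly as follows (the examples jar name and the input/output paths are assumptions, not taken from this run):

    # generate 10000000 rows (100-byte records) of terasort input
    hadoop jar hadoop-*-examples.jar teragen 10000000 /benchmarks/terasort-input
    # sort the generated data; this is the job that fails
    hadoop jar hadoop-*-examples.jar terasort /benchmarks/terasort-input /benchmarks/terasort-output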

It says that space is not available, but there is actually space available on all the backends.

11/09/23 01:57:11 INFO mapred.JobClient: Task Id : attempt_201109220435_0004_m_000142_0, Status : FAILED
java.io.IOException: Spill failed
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
        at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:40)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201109220435_0004/attempt_201109220435_0004_m_000142_0/output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)

attempt_201109220435_0004_m_000142_0: Initializing GlusterFS
11/09/23 01:57:13 INFO mapred.JobClient:  map 95% reduce 14%
11/09/23 01:57:17 INFO mapred.JobClient:  map 96% reduce 14%
11/09/23 01:57:22 INFO mapred.JobClient:  map 97% reduce 14%
11/09/23 01:57:29 INFO mapred.JobClient:  map 98% reduce 14%
11/09/23 01:57:32 INFO mapred.JobClient:  map 99% reduce 14%
11/09/23 01:57:37 INFO mapred.JobClient:  map 100% reduce 14%
11/09/23 01:57:49 INFO mapred.JobClient:  map 100% reduce 15%
11/09/23 01:57:52 INFO mapred.JobClient:  map 100% reduce 0%
11/09/23 01:57:54 INFO mapred.JobClient: Task Id : attempt_201109220435_0004_r_000000_1, Status : FAILED
Error: java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.mapred.IFileOutputStream.write(IFileOutputStream.java:84)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:218)
        at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:157)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2454)

attempt_201109220435_0004_r_000000_1: Initializing GlusterFS
11/09/23 01:57:55 INFO mapred.JobClient:  map 99% reduce 0%
11/09/23 01:57:57 INFO mapred.JobClient: Task Id : attempt_201109220435_0004_m_000133_0, Status : FAILED
attempt_201109220435_0004_m_000133_0: Initializing GlusterFS
11/09/23 01:58:09 INFO mapred.JobClient:  map 100% reduce 0%
11/09/23 01:58:16 INFO mapred.JobClient:  map 100% reduce 1%
11/09/23 01:58:19 INFO mapred.JobClient:  map 100% reduce 2%
11/09/23 01:58:22 INFO mapred.JobClient:  map 100% reduce 3%


I will attach the complete stack trace it threw.
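Both failures surface from writes that go through the local filesystem (LocalDirAllocator and RawLocalFileSystem in the traces), so besides the GlusterFS bricks it is worth checking free space in the directories that mapred.local.dir points at on the failing TaskTracker. A rough check, assuming typical paths (adjust to the actual configuration):

    # find where the TaskTracker writes its spill/merge files
    grep -A1 'mapred.local.dir' /etc/hadoop/conf/mapred-site.xml
    # free space where map spills and reduce-side merges land (assumed path)
    df -h /var/lib/hadoop/mapred/local
    # free space on the GlusterFS backend bricks (assumed paths)
    df -h /export/brick1 /export/brick2
    # confirm the 2x2 stripe-replicate layout
    gluster volume info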

Comment 2 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The "pre-release" version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

