Bug 804053

Summary: randomtextwriter job fails on a distribute replicate volume with "java.io.FileNotFoundException"
Product: [Community] GlusterFS
Reporter: M S Vishwanath Bhat <vbhat>
Component: HDFS
Assignee: Venky Shankar <vshankar>
Status: CLOSED DUPLICATE
Severity: medium
Priority: unspecified
Version: pre-release
CC: gluster-bugs, mzywusko
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2012-04-04 10:26:54 UTC

Description M S Vishwanath Bhat 2012-03-16 12:27:40 UTC
Description of problem:
Ran the randomtextwriter job on a 4*2 distributed-replicated volume, and it failed with "java.io.FileNotFoundException".

Version-Release number of selected component (if applicable):
git master

How reproducible:
Always

Steps to Reproduce:
1. Create a Hadoop setup with 1 JobTracker (JT) and 8 TaskTrackers (TT).
2. Create a 4*2 distributed-replicated gluster volume.
3. Run the randomtextwriter Hadoop job.
  
Actual results:
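The job fails after the map tasks reach 100% and before any reduce progress is reported. The trace shows the failure in the job cleanup task (FileOutputCommitter.cleanupJob), while deleting the _temporary output directory.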

12/03/15 18:11:25 INFO mapred.JobClient:  map 98% reduce 0%
12/03/15 18:11:43 INFO mapred.JobClient:  map 100% reduce 0%
12/03/15 18:11:48 INFO mapred.JobClient: Task Id : attempt_201203150524_0010_m_000080_0, Status : FAILED
java.io.FileNotFoundException: File /mnt/glusterfs/randomtextdata/_temporary/_attempt_201203150524_0010_m_000079_1/part-00079 does not exist.
        at org.apache.hadoop.fs.glusterfs.GlusterFileSystem.getFileStatus(GlusterFileSystem.java:276)
        at org.apache.hadoop.fs.glusterfs.GlusterFileSystem.getFileStatusFromFileString(GlusterFileSystem.java:268)
        at org.apache.hadoop.fs.glusterfs.GlusterFileSystem.listStatus(GlusterFileSystem.java:259)
        at org.apache.hadoop.fs.glusterfs.GlusterFileSystem.delete(GlusterFileSystem.java:381)
        at org.apache.hadoop.fs.glusterfs.GlusterFileSystem.delete(GlusterFileSystem.java:387)
        at org.apache.hadoop.mapred.FileOutputCommitter.cleanupJob(FileOutputCommitter.java:64)
        at org.apache.hadoop.mapred.OutputCommitter.cleanupJob(OutputCommitter.java:135)
        at org.apache.hadoop.mapred.Task.runJobCleanupTask(Task.java:826)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:292)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
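
In the trace above, delete() lists the _temporary directory and then stats each entry. The missing file belongs to attempt m_000079_1, not to the failing cleanup attempt m_000080_0. One possible reading, which this report does not confirm, is that the entry disappeared between the listing and the stat. The sketch below illustrates a recursive delete that tolerates that situation by treating a vanished entry as already deleted. It uses plain java.nio.file rather than the Hadoop FileSystem API, and the names TolerantDelete and deleteRecursively are hypothetical. It is neither the GlusterFileSystem plugin code nor the fix tracked in bug 808009.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical illustration only: not part of the GlusterFS Hadoop plugin.
public class TolerantDelete {

    /**
     * Recursively deletes the given path.
     * Returns true if this call removed the path itself, and false if the
     * path was already gone (for example, removed by another process
     * between the parent's directory listing and this stat).
     */
    public static boolean deleteRecursively(Path path) throws IOException {
        BasicFileAttributes attrs;
        try {
            // Stat the entry; it may have vanished since it was listed.
            attrs = Files.readAttributes(
                    path, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        } catch (NoSuchFileException e) {
            return false; // already deleted: nothing left to do
        }

        if (attrs.isDirectory()) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
                for (Path child : children) {
                    deleteRecursively(child); // a missing child is not an error
                }
            } catch (NoSuchFileException e) {
                return false; // the directory itself vanished mid-walk
            }
        }
        return Files.deleteIfExists(path);
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree that mimics the layout seen in the trace.
        Path root = Files.createTempDirectory("randomtextdata");
        Path attempt = Files.createDirectories(
                root.resolve("_temporary").resolve("_attempt_example"));
        Path part = Files.createFile(attempt.resolve("part-00079"));

        // Simulate another process removing the file first.
        Files.delete(part);

        System.out.println("stale entry removed by us: " + deleteRecursively(part));
        System.out.println("tree removed by us: " + deleteRecursively(root));
    }
}
```

Whether tolerating a missing entry is appropriate depends on why the entry is missing. If the listing and the stat disagree because of an inconsistency in the volume itself, suppressing the exception would only hide the underlying problem.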


Expected results:
The randomtextwriter job should pass.

Comment 1 Venky Shankar 2012-04-04 10:26:54 UTC

*** This bug has been marked as a duplicate of bug 808009 ***