Bug 765438 (GLUSTER-3706)

Summary: 'grep' job fails for N-1 failover tests
Product: [Community] GlusterFS
Reporter: M S Vishwanath Bhat <vbhat>
Component: HDFS
Assignee: Steve Watt <swatt>
Status: CLOSED EOL
Severity: medium
Priority: medium
Version: pre-release
CC: bugs, gluster-bugs, mzywusko, vijay
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2015-10-22 15:40:20 UTC

Description M S Vishwanath Bhat 2011-10-06 18:06:53 UTC
I was running a grep job on a 2*2*2 distributed-striped-replicated volume when I took down one machine in each replica pair. The grep job kept running for a while but then failed with the following backtrace.

11/10/05 19:52:43 INFO mapred.JobClient:     Reduce input records=1
11/10/05 19:52:44 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/10/05 19:52:44 INFO mapred.FileInputFormat: Total input paths to process : 1
java.io.IOException: Cannot get layout
        at org.apache.hadoop.fs.glusterfs.GlusterFSXattr.execGetFattr(GlusterFSXattr.java:208)
        at org.apache.hadoop.fs.glusterfs.GlusterFSXattr.getPathInfo(GlusterFSXattr.java:73)
        at org.apache.hadoop.fs.glusterfs.GlusterFileSystem.getFileBlockLocations(GlusterFileSystem.java:458)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:222)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.hadoop.examples.Grep.run(Grep.java:84)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.Grep.main(Grep.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


In this particular scenario I was running with quick-slave-io OFF, but I have hit this issue with quick-slave-io ON too.

I have archived the logs.
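
For illustration only: in the trace above the IOException ("Cannot get layout") propagates from the layout lookup through getFileBlockLocations into getSplits, so job submission aborts. One possible defensive pattern is to catch the lookup failure and fall back to a default host list. The sketch below is not the plugin's code; LayoutLookup, hostsOrFallback and the "localhost" default are hypothetical names chosen for the example, and whether such a fallback is acceptable for this plugin is an open question.

```java
import java.io.IOException;
import java.util.Arrays;

public class LayoutFallbackSketch {

    /** Hypothetical stand-in for the layout lookup that throws in the trace. */
    interface LayoutLookup {
        String[] hostsFor(String path) throws IOException;
    }

    /**
     * Returns the hosts reported by the lookup, or fallbackHosts when the
     * lookup throws or reports no hosts. This trades data locality for
     * letting the caller proceed instead of failing outright.
     */
    static String[] hostsOrFallback(LayoutLookup lookup, String path, String[] fallbackHosts) {
        try {
            String[] hosts = lookup.hostsFor(path);
            if (hosts != null && hosts.length > 0) {
                return hosts;
            }
        } catch (IOException e) {
            // Record the failure so the degraded mode is visible in the logs.
            System.err.println("layout lookup failed for " + path + ": " + e.getMessage());
        }
        return fallbackHosts;
    }

    public static void main(String[] args) {
        // Simulate the failure seen in the report.
        LayoutLookup failing = path -> {
            throw new IOException("Cannot get layout");
        };
        String[] hosts = hostsOrFallback(failing, "/input/file", new String[] {"localhost"});
        System.out.println(Arrays.toString(hosts));
    }
}
```

With a fallback like this the split calculation would lose locality hints for the affected file but would not abort job submission. It does not address why the layout could not be read after the bricks went down.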

Comment 1 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The "pre-release" version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.