Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1033093

Summary:	listStatus test failure reveals we need to upgrade tests and then : Fork code OR focus on MR2
Product:	[Community] GlusterFS	Reporter:	Jay Vyas <jvyas>
Component:	gluster-hadoop	Assignee:	Jay Vyas <jvyas>
Status:	CLOSED NOTABUG	QA Contact:	hcfs-gluster-bugs
Severity:	low	Docs Contact:
Priority:	low
Version:	pre-release	CC:	eboyd, jvyas, matt, mbukatov, mkudlej, shaines, swatt
Target Milestone:	---	Keywords:	Triaged
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-06-04 14:44:24 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1057253

Description Jay Vyas 2013-11-21 14:34:29 UTC

Problem: 

Our listStatus implementation doesn't return files in sorted order.
This breaks the build by breaking the "testListStatus" unit test, which asserts 
that listStatus returns files in sorted order.

Suggested Fix: 

Sort the files after calling listStatus, e.g. with 
Arrays.sort(file.listFiles(), new Comparator<File>(){...})

Root cause: 

Our underlying listStatus method relies on File.listFiles(), which 
DOES NOT guarantee returning files in sorted order. 

(From the JDK 1.6 docs:

public File[] listFiles()
    Returns an array of abstract pathnames denoting the files in the directory 
denoted by this abstract pathname.
    ...There is no guarantee that the name strings in the resulting array will 
appear in any specific order; they are not, in particular, guaranteed to appear 
in alphabetical order...)
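
For illustration, the suggested fix could look roughly like the sketch below. This is not the plugin's actual code: the class name SortedListing and the helper listFilesSorted are hypothetical, and sorting by file name is an assumption about what the test expects. It uses Arrays.sort because File.listFiles() returns an array rather than a Collection.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Comparator;

public class SortedListing {

    /**
     * Hypothetical helper: lists the entries of dir and sorts them by name,
     * because File.listFiles() makes no ordering guarantee.
     */
    public static File[] listFilesSorted(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null if dir is not a directory or on I/O error.
            throw new IOException("Cannot list directory: " + dir);
        }
        Arrays.sort(files, new Comparator<File>() {
            @Override
            public int compare(File a, File b) {
                return a.getName().compareTo(b.getName());
            }
        });
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Create a scratch directory with files in non-alphabetical creation order.
        File dir = Files.createTempDirectory("liststatus-demo").toFile();
        for (String name : new String[] {"c.txt", "a.txt", "b.txt"}) {
            new File(dir, name).createNewFile();
        }
        for (File f : listFilesSorted(dir)) {
            System.out.println(f.getName());
            f.delete();
        }
        dir.delete();
    }
}
```

Sorting costs O(n log n) per listing, which is one reason a filesystem may prefer not to promise ordered results at all.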

Comment 1 Jay Vyas 2013-11-22 17:42:31 UTC

This bug has now been broadened: the semantics of listFiles have changed in newer Hadoop versions. 

We thus need to do one of the following:

1) Create two test modules in the pom: one using the old "hadoop-test" (1.x) and another using "hadoop-common" (2.x).  

  Pros: two code paths means more flexibility.  
  Cons: It's also more code and build to maintain.  It also might mean we need to produce two jar artifacts :(.

2) Embrace the 2.x tests: update our tests so that all 2.x tests pass, and update our pom file to pull in the newer 2.x tests instead of the 1.x tests. 

  Pros: Simpler to maintain and most likely to be "good enough" for any real world 1.x functionality
  Cons: Not quite as fine grained test coverage for 1.x semantics. 

My vote is to go with approach 2: it's less code, easier to maintain, and guarantees that the plugin test coverage and artifact are tested against the latest expected community standards for the FileSystem interface.

Comment 2 Bradley Childs 2013-11-22 18:04:02 UTC

I don't believe approach #1 is viable.  You'd really need a separate branch for 1.x and 2.x.

My vote is approach #2, modified:

Iff we find 1.x semantics that pose real world problems vs hypotheticals, then branch into a 1.x branch, and revert the semantics. Then continue to maintain the 1.x and 2.x branches independently. The semantic differences between 1.x and 2.x may be enough to fail unit tests, but realistically insignificant to an end developer.

Comment 3 Jay Vyas 2013-11-25 17:25:13 UTC

I agree with Brad that we should go with approach #2... but we certainly won't need to fork.

Why? Because even in the "worst case" scenario outlined above, where we need to support real world MR1 semantic differences, we still won't need to fork: GlusterFileSystem (MR1) is implemented in a different class than GlusterFs (MR2).

MR1:

https://github.com/gluster/glusterfs-hadoop/blob/master/src/main/java/org/apache/hadoop/fs/glusterfs/GlusterFileSystem.java

MR2: 

https://github.com/gluster/glusterfs-hadoop/blob/master/src/main/java/org/apache/hadoop/fs/local/GlusterFs.java

Both depend on FilterFileSystem, which is set to "wrap" GlusterVolume in its implementation, but nevertheless the FileSystem implementations can override functionality and diverge as required.
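
A minimal sketch of that wrap-and-override pattern follows. It deliberately does not use the Hadoop classes: Volume, FilterWrapper, PassThroughFs and SortingFs are hypothetical stand-ins invented for this example, so it only illustrates how two front ends can share one delegate and still diverge in a single method.

```java
import java.util.Arrays;

public class WrapAndDiverge {

    /** Hypothetical stand-in for the shared underlying volume. */
    public static class Volume {
        public String[] list() {
            // Deliberately unsorted, as File.listFiles() results may be.
            return new String[] {"c", "a", "b"};
        }
    }

    /** Hypothetical filter-style wrapper: delegates every call to the volume. */
    public static class FilterWrapper {
        protected final Volume volume;

        public FilterWrapper(Volume volume) {
            this.volume = volume;
        }

        public String[] list() {
            return volume.list();
        }
    }

    /** One front end keeps the delegate's behaviour unchanged. */
    public static class PassThroughFs extends FilterWrapper {
        public PassThroughFs(Volume volume) {
            super(volume);
        }
    }

    /** Another front end overrides only the method whose semantics differ. */
    public static class SortingFs extends FilterWrapper {
        public SortingFs(Volume volume) {
            super(volume);
        }

        @Override
        public String[] list() {
            String[] names = volume.list().clone();
            Arrays.sort(names);
            return names;
        }
    }

    public static void main(String[] args) {
        Volume shared = new Volume();
        System.out.println(Arrays.toString(new PassThroughFs(shared).list()));
        System.out.println(Arrays.toString(new SortingFs(shared).list()));
    }
}
```

Because both front ends share the same volume object, a behavioural difference stays local to the overriding class and no separate branch of the code base is needed.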

Onward and upward with MR2!

Comment 4 Scott Haines 2013-12-04 16:41:11 UTC

Per 2013-12-04 bug triage meeting, re-assigning to jvyas.

Comment 5 Jay Vyas 2014-05-01 17:02:13 UTC

This is fixed in the upstream.  Brew release still pending.

The fix was to run tests against 2.x semantics, and over time maybe we can add in 1.x file tests where they don't conflict with the behaviour of 2.x.

Comment 7 Martin Kudlej 2014-05-20 14:09:30 UTC

Do I understand correctly that the list of files should be in alphabetical order? Is this an example of a test case?
"hadoop fs -ls"

Comment 8 Martin Kudlej 2014-06-04 14:44:24 UTC

This is a bug just for 1.x. --> close