Bug 927415 - Remove the QuickSlave IO implementation from the GlusterFileSystem class
Summary: Remove the QuickSlave IO implementation from the GlusterFileSystem class
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: gluster-hadoop
Version: mainline
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jay Vyas
QA Contact: Martin Bukatovic
URL:
Whiteboard:
Duplicates: 957274 (view as bug list)
Depends On: 957274
Blocks: 927418 959778 959779 961540 1057253
 
Reported: 2013-03-25 20:58 UTC by Steve Watt
Modified: 2016-02-01 16:14 UTC (History)
CC: 10 users

Fixed In Version:
Clone Of:
Clones: 959778 (view as bug list)
Environment:
Last Closed: 2016-02-01 16:14:18 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Steve Watt 2013-03-25 20:58:57 UTC
Description of problem:

The QuickSlave IO behavior needs to be removed from the GlusterFileSystem class, given the issues with accessing bricks directly. One onerous example: the files on the brick could be inconsistent with the volume. The logic needs to be removed from GlusterFileSystem methods such as initialize() and open().

Comment 2 Steve Watt 2013-04-28 18:35:17 UTC
*** Bug 957274 has been marked as a duplicate of this bug. ***

Comment 3 Jay Vyas 2013-04-29 22:39:47 UTC
Okay, this is removed in this pull request:

https://github.com/gluster/hadoop-glusterfs/pull/30

Comment 4 Martin Kudlej 2014-05-13 14:08:29 UTC
Could you give an example of how to test this fix? Do you know how to induce an inconsistency on the volume?

Comment 5 Steve Watt 2014-06-04 14:25:24 UTC
A suggestion: you could add the QuickSlave IO properties to core-site (via Ambari), then run a job and verify (perhaps via strace or something similar) that the NodeManagers DO NOT try to access data from the brick mount directly (instead of going through the FUSE mount).

Comment 6 Jay Vyas 2014-06-09 17:57:37 UTC
Yes - you'd have to do some runtime tests to verify the absence of this feature.

Comment 7 Martin Bukatovic 2014-07-02 08:25:25 UTC
Related documentation
---------------------

See the documentation for RHS 2.0 [1] (from when the feature was in tech preview).

quick.slave.io is a gluster-specific property from core-site.xml (off by default):

Performance tunable option. If this option is set to On, the plugin will try to perform I/O directly from the disk file system (like ext3 or ext4) the file resides on. As a result, read performance improves and jobs run faster. 

[1] https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/ch19s04.html
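
For reference, the tunable was a plain property in core-site.xml; before the removal, enabling it would have looked roughly like this (a sketch based on the documentation linked above, not copied from an actual cluster):

~~~
<property>
  <name>quick.slave.io</name>
  <value>On</value>
</property>
~~~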

Comment 8 Martin Bukatovic 2014-07-02 18:59:15 UTC
Retrying on Red Hat Storage Server 3.0 iso with:

glusterfs-3.6.0.22-1.el6rhs.x86_64

Hortonworks Hadoop distribution:
Ambari: ambari-server-1.4.4.23-1.noarch
Stack : HDP-2.0.6.GlusterFS
hadoop-2.2.0.2.0.6.0-101.el6.x86_64

RHS Hadoop plugin:
rhs-hadoop-2.3.2-6.noarch
rhs-hadoop-install-1_25-1.el6rhs.noarch

Verification
============

Setting 'quick.slave.io' to 'On' via the Ambari REST API; the value was
double-checked via a Groovy script using the Hadoop API.

Then running teragen, terasort and teravalidate (as the bigtop user, generating 1 GB):

~~~
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar teragen 10000000 /tmp/terra.in
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar terasort /tmp/terra.in /tmp/terra.out
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar teravalidate /tmp/terra.out /tmp/terra.report
~~~

While they ran, executing the following strace command for every YarnChild process on the worker nodes:

~~~
strace -p $PID -f -e trace=open > yarnchild.$PID.strace 2>&1 &
~~~
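
The attach step can be scripted to cover all task JVMs at once; a sketch (assumes root on the worker node, and Hadoop 2.x, where task containers run the YarnChild main class):

~~~
#!/bin/sh
# Attach strace to every running YarnChild task JVM, one log per PID.
# Run while the MapReduce job is executing; requires root (ptrace).
for PID in $(pgrep -f 'org.apache.hadoop.mapred.YarnChild'); do
    strace -p "$PID" -f -e trace=open > "yarnchild.$PID.strace" 2>&1 &
done
wait
~~~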

And the shim doesn't access the files on the brick directly:

~~~
$ grep '/mnt/brick1/HadoopVol1' *.strace | wc -l
0
$ grep '/mnt/glusterfs/HadoopVol1' *.strace | wc -l
69
~~~
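
The grep check above can be wrapped in a small script; a sketch using the mount points from this setup (adjust BRICK and FUSE for your volume layout):

~~~
#!/bin/sh
# Count file accesses recorded by strace against the brick path
# (direct access -- should be 0 after the fix) vs. the FUSE mount.
BRICK=/mnt/brick1/HadoopVol1
FUSE=/mnt/glusterfs/HadoopVol1

brick_hits=$(grep -h "$BRICK" ./*.strace 2>/dev/null | wc -l)
fuse_hits=$(grep -h "$FUSE" ./*.strace 2>/dev/null | wc -l)

echo "brick accesses: $brick_hits"
echo "fuse accesses:  $fuse_hits"

# Any direct brick access means the QuickSlave IO path is still active.
if [ "$brick_hits" -eq 0 ]; then echo PASS; else echo FAIL; fi
~~~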

So it works as expected.

Comment 9 Steve Watt 2016-02-01 16:14:18 UTC
This solution is no longer available from Red Hat.

