Bug 802513 - AFR does not distribute reads well enough
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jeff Darcy
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-12 17:31 UTC by Jeff Darcy
Modified: 2013-07-24 17:18 UTC

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:18:44 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Jeff Darcy 2012-03-12 17:31:31 UTC
After Ben England found that reads were all going to a single subvolume in 3.2.5, I looked into this area of the code.  The earlier "first to respond" approach seems to have been abandoned, but the "round robin" approach used in 3.3 does an inadequate job of avoiding hot spots.  Consider an application in which two clients each open the same hundred files in the same sequence.  Because of the way the round-robin works, the two clients have a 50% chance of their read_child_rr values being in sync and *staying in sync* throughout the entire set of opens, in which case both clients read every file from the same subvolume.
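
To illustrate the hot-spot scenario, here is a minimal sketch in Python (not the AFR code itself, which is C). It assumes two replicas, a per-client counter that advances by one on every open, and a starting offset that is equally likely to be 0 or 1. The function names are hypothetical, and the real read_child_rr logic may differ in detail.

```python
def round_robin_children(start, num_files, num_children=2):
    """Read child picked for each successive open by one client's counter."""
    return [(start + i) % num_children for i in range(num_files)]


def collisions(start_a, start_b, num_files=100, num_children=2):
    """Number of files for which both clients pick the same read child."""
    a = round_robin_children(start_a, num_files, num_children)
    b = round_robin_children(start_b, num_files, num_children)
    return sum(x == y for x, y in zip(a, b))


# Enumerate every combination of starting offsets for the two clients.
outcomes = {(sa, sb): collisions(sa, sb) for sa in range(2) for sb in range(2)}
for starts, same in sorted(outcomes.items()):
    print(starts, "->", same, "of 100 files read from the same child")
```

Under these assumptions, two of the four equally likely starting combinations leave the clients in lockstep for all one hundred opens, which is the 50% case described above.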

A better solution would be to have the selection of a read child be random.  Better still, it could be based on a hash of the gfid (to distribute reads among files but ensure consistent reads for a single file) or of the gfid plus some client identifier (to provide full distribution even for a single file).  The patch will be forthcoming as soon as I have a bug number.
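
A minimal sketch of the hash-based selection, written in Python for brevity rather than in the C of the actual translator. The function name pick_read_child, the choice of SHA-256, and the byte-string client identifier are all assumptions made for illustration. The patch under review may use a different hash and different inputs.

```python
import hashlib
import uuid


def pick_read_child(gfid, num_children, client_id=b""):
    """Map a file's gfid (and optionally a client identifier) to a read child.

    With an empty client_id, every client picks the same child for a given
    file. With a per-client client_id, reads of a single file are also spread
    across children.
    """
    digest = hashlib.sha256(gfid.bytes + client_id).digest()
    return int.from_bytes(digest[:8], "big") % num_children


# Usage: the same gfid always maps to the same child for a given client_id.
gfid = uuid.uuid4()
print(pick_read_child(gfid, 2))
print(pick_read_child(gfid, 2, client_id=b"client-a"))
```

Because the choice depends only on the gfid (and client identifier), two clients opening files in the same order no longer move in lockstep the way shared round-robin counters can.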

Comment 1 Jeff Darcy 2012-03-19 13:51:03 UTC
Avati: here's the explanation of option 2, already referenced from http://review.gluster.com/#change,2926,patchset=1 which you rejected without checking.

Comment 2 Amar Tumballi 2012-08-05 18:50:44 UTC
http://review.gluster.com/2926 has been committed to the master branch.

