Bug 1471753 - [disperse] Keep stripe in in-memory cache for the non aligned write
Summary: [disperse] Keep stripe in in-memory cache for the non aligned write
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Ashish Pandey
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1512447
Reported: 2017-07-17 11:56 UTC by Ashish Pandey
Modified: 2018-03-19 15:09 UTC
CC List: 3 users

Fixed In Version: glusterfs-4.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1512447
Environment:
Last Closed: 2018-03-15 11:17:12 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Ashish Pandey 2017-07-17 11:56:33 UTC
Description of problem:

   
Problem:
Consider an EC volume with a 4 + 2 configuration. The stripe size for this
would be 512 * 4 = 2048, i.e. 2048 bytes of user data are stored in one
stripe. Suppose 2048 + 512 = 2560 bytes have already been written to a file
on this volume, so 512 bytes sit in the second stripe. If a sequential write
of 1 byte now arrives at offset 2560, we have to read the whole stripe,
encode it together with that 1 byte, and write it back. The next 1-byte
write, at offset 2561, triggers the same READ-MODIFY-WRITE of the whole
stripe again. This causes poor performance because of the many READ requests
travelling over the network.

Some tools and workloads generate exactly this kind of load without the user
being aware of it, for example fio and zip.

Solution:
One possible way to deal with this issue is to keep the last stripe in
memory. That way we do not need to read it again and we save a READ fop
going over the network. For the example above, this means keeping at most
the last 2048 bytes in memory per file.
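
To make the caching idea concrete, here is a minimal, self-contained C sketch
of a per-file last-stripe cache. It is not the actual cluster/ec xlator code:
the names (stripe_cache, cache_write, the read/write stand-ins) are
hypothetical, and only the sequential small-write pattern from this report is
handled. With the cache, only the first write into a stripe pays for a READ
over the network; the later 1-byte writes modify the cached copy and go
straight to the encode-and-write step.

    /* Minimal sketch (not the real ec xlator code): cache the last written
     * stripe so sequential small tail writes can skip the READ half of
     * READ-MODIFY-WRITE. Stripe size 2048 matches the 4+2 example above. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define STRIPE_SIZE 2048            /* 512 bytes * 4 data bricks */

    struct stripe_cache {
        uint64_t stripe_start;          /* file offset where the cached stripe begins */
        int      loaded;                /* 1 if the cache holds a stripe */
        char     data[STRIPE_SIZE];
    };

    /* Stand-in for the expensive path: fetching the whole stripe back from
     * the bricks over the network before it can be modified and re-encoded. */
    static void read_stripe_from_network(uint64_t stripe_start, char *buf, size_t len)
    {
        printf("READ  stripe @%llu (%zu bytes) over the network\n",
               (unsigned long long)stripe_start, len);
        memset(buf, 0, len);            /* pretend we fetched the existing contents */
    }

    /* Stand-in for encoding the stripe and writing the fragments out. */
    static void write_stripe_to_network(uint64_t stripe_start, size_t len)
    {
        printf("WRITE stripe @%llu (%zu bytes)\n",
               (unsigned long long)stripe_start, len);
    }

    /* Apply a small write at `offset`; only the stripe containing the write
     * is considered, which is enough for the sequential-append pattern
     * described in the bug. */
    static void cache_write(struct stripe_cache *c, uint64_t offset,
                            const char *buf, size_t len)
    {
        uint64_t stripe_start = offset - (offset % STRIPE_SIZE);
        size_t   in_stripe    = (size_t)(offset % STRIPE_SIZE);

        if (in_stripe + len > STRIPE_SIZE)
            len = STRIPE_SIZE - in_stripe;      /* stay within this stripe */

        if (!c->loaded || c->stripe_start != stripe_start) {
            /* Cache miss: the old code path, one READ per write. */
            read_stripe_from_network(stripe_start, c->data, STRIPE_SIZE);
            c->stripe_start = stripe_start;
            c->loaded = 1;
        }
        /* Cache hit (or freshly loaded): modify in memory, no extra READ. */
        memcpy(c->data + in_stripe, buf, len);
        write_stripe_to_network(stripe_start, STRIPE_SIZE);
    }

    int main(void)
    {
        struct stripe_cache c = {0};

        /* Sequential 1-byte writes starting at offset 2560, as in the report:
         * only the first write pays for a READ; the rest hit the cache. */
        for (uint64_t off = 2560; off < 2570; off++)
            cache_write(&c, off, "x", 1);
        return 0;
    }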


Comment 1 Worker Ant 2017-07-17 12:24:29 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#1) for review on master by Ashish Pandey (aspandey)

Comment 2 Worker Ant 2017-07-19 05:39:03 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#2) for review on master by Ashish Pandey (aspandey)

Comment 3 Worker Ant 2017-07-23 08:47:53 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#3) for review on master by Ashish Pandey (aspandey)

Comment 4 Worker Ant 2017-08-21 05:31:04 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#4) for review on master by Ashish Pandey (aspandey)

Comment 5 Worker Ant 2017-09-18 10:45:22 UTC
REVIEW: https://review.gluster.org/17789 (Problem: Consider an EC volume with configuration  4 + 2. The stripe size for this would be 512 * 4 = 2048. That means, 2048 bytes of user data stored in one stripe. Let's say 2048 + 512 = 2560 bytes are already written on this volume. 512 Bytes would be in second stripe. Now, if there are sequential writes with offset 2560 and of size 1 Byte, we have to read the whole stripe, encode it with 1 Byte and then again have to write it back. Next, write with offset 2561 and size of 1 Byte will again READ-MODIFY-WRITE the whole stripe. This is causing bad performance because of lots of READ request travelling over the network.) posted (#5) for review on master by Ashish Pandey (aspandey)

Comment 6 Worker Ant 2017-09-18 10:55:48 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#6) for review on master by Ashish Pandey (aspandey)

Comment 7 Worker Ant 2017-09-22 08:26:54 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#7) for review on master by Ashish Pandey (aspandey)

Comment 8 Worker Ant 2017-10-03 12:35:56 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#8) for review on master by Ashish Pandey (aspandey)

Comment 9 Worker Ant 2017-10-03 17:51:33 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#9) for review on master by Ashish Pandey (aspandey)

Comment 10 Worker Ant 2017-10-04 07:51:44 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#10) for review on master by Ashish Pandey (aspandey)

Comment 11 Worker Ant 2017-10-05 10:30:10 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#11) for review on master by Ashish Pandey (aspandey)

Comment 12 Worker Ant 2017-10-09 17:20:46 UTC
REVIEW: https://review.gluster.org/17789 (cluster/ec: Keep last written strip in in-memory cache) posted (#12) for review on master by Ashish Pandey (aspandey)

Comment 13 Worker Ant 2017-11-10 22:16:13 UTC
COMMIT: https://review.gluster.org/17789 committed in master with a commit message- cluster/ec: Keep last written strip in in-memory cache

Problem:
Consider an EC volume with a 4 + 2 configuration. The stripe size for this
would be 512 * 4 = 2048, i.e. 2048 bytes of user data are stored in one
stripe. Suppose 2048 + 512 = 2560 bytes have already been written to a file
on this volume, so 512 bytes sit in the second stripe. If a sequential write
of 1 byte now arrives at offset 2560, we have to read the whole stripe,
encode it together with that 1 byte, and write it back. The next 1-byte
write, at offset 2561, triggers the same READ-MODIFY-WRITE of the whole
stripe again. This causes poor performance because of the many READ requests
travelling over the network.

Some tools and workloads generate exactly this kind of load without the user
being aware of it, for example fio and zip.

Solution:
One possible way to deal with this issue is to keep the last stripe in
memory. That way we do not need to read it again and we save a READ fop
going over the network. For the example above, this means keeping at most
the last 2048 bytes in memory per file.

Change-Id: I3f95e6fc3ff81953646d374c445a40c6886b0b85
BUG: 1471753
Signed-off-by: Ashish Pandey <aspandey>

Comment 14 Worker Ant 2017-11-29 06:31:11 UTC
REVIEW: https://review.gluster.org/18882 (cluster/ec: Add test cases for stripe-cache option) posted (#1) for review on master by Ashish Pandey

Comment 15 Worker Ant 2017-11-29 09:34:47 UTC
REVIEW: https://review.gluster.org/18888 (cluster/ec: Modify OP_VERSION to 4.0.0 for stripe cache option) posted (#1) for review on master by Ashish Pandey

Comment 16 Worker Ant 2017-11-29 16:15:28 UTC
COMMIT: https://review.gluster.org/18888 committed in master by "Ashish Pandey" <aspandey> with a commit message- cluster/ec: Modify OP_VERSION to 4.0.0 for stripe cache option

Change-Id: I991eaeb979497a1bf056b5871284274f959f36f2
BUG: 1471753
Signed-off-by: Ashish Pandey <aspandey>

Comment 17 Shyamsundar 2018-03-15 11:17:12 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed in glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 18 Worker Ant 2018-03-19 15:09:39 UTC
REVISION POSTED: https://review.gluster.org/18882 (cluster/ec: Add test cases for stripe-cache option) posted (#9) for review on master by Ashish Pandey

