Bug 1192114

Summary: edge-triggered epoll breaks rpc-throttling
Product: [Community] GlusterFS
Component: rpc
Version: mainline
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Xavi Hernandez <jahernan>
Assignee: Shyamsundar <srangana>
CC: annair, bugs, gluster-bugs, ndevos, rcyriac, rgowdapp, srangana, vagarwal
Keywords: Triaged
Fixed In Version: glusterfs-3.7.0
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-05-14 17:29:08 UTC
Bug Blocks: 1196546

Description Xavi Hernandez 2015-02-12 16:10:00 UTC
Description of problem:

On slow or busy processors, when a lot of data is written to a Gluster volume through an NFS mount point, Gluster becomes saturated and the NFS client disconnects with a timeout, causing an I/O error.

Version-Release number of selected component (if applicable): mainline


How reproducible:

Very often on a slow processor (dual-core Atom)


Steps to Reproduce:
1. Create a replica 3 volume (it also happens with a replica 2, but less often)
2. Mount the volume using NFS
3. dd if=/dev/zero of=<nfs mount>/test bs=1024k count=1k (it has to be a big file)


Actual results:

dd takes a long time and returns an I/O error.

Expected results:

It should complete successfully.

Additional info:

System logs (/var/log/messages) show NFS timeouts. nfs.log shows many of these errors:

[2015-02-12 16:04:52.543598] E [rpcsvc.c:1257:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x9672b064, Program: NFS3, ProgVers: 3, Proc: 7) to rpc-transport (socket.nfs-server)
[2015-02-12 16:04:52.543671] E [nfs3.c:565:nfs3svc_submit_reply] 0-nfs-nfsv3: Reply submission failed

It seems that nfs-server is processing a large number of requests without waiting for answers. This saturates Gluster on slow or busy machines.

Comment 1 Anand Avati 2015-02-12 17:15:06 UTC
REVIEW: http://review.gluster.org/9649 (Temporarily remove nfs.t to avoid regression failures) posted (#1) for review on master by Xavier Hernandez (xhernandez)

Comment 2 Xavi Hernandez 2015-02-12 17:16:07 UTC
(In reply to Anand Avati from comment #1)
> REVIEW: http://review.gluster.org/9649 (Temporarily remove nfs.t to avoid
> regression failures) posted (#1) for review on master by Xavier Hernandez
> (xhernandez)

This patch temporarily removes nfs.t to avoid regression test failures. It must be added again once the original problem is solved.

Comment 3 Xavi Hernandez 2015-02-12 17:21:25 UTC
Further investigation with Shyam points to the edge-triggered epoll introduced by the multi-threaded (MT) epoll patch.

It seems that edge-triggered epoll does not enforce the value of the outstanding-rpc-limit option.
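
A toy model of this failure mode, for illustration only. The function names, the limit of 4 and the queue of 10 pending requests are all hypothetical and are not taken from the GlusterFS source. It only contrasts a handler that drains the socket until it is empty with one that reads a single request per wakeup and stops at the limit:

```python
from collections import deque

LIMIT = 4  # hypothetical stand-in for outstanding-rpc-limit


def drain_until_eagain(pending):
    """Edge-triggered style: keep reading until the socket is empty."""
    outstanding = 0
    while pending:              # an empty queue plays the role of EAGAIN
        pending.popleft()
        outstanding += 1        # the limit is never consulted inside the loop
    return outstanding


def one_per_wakeup(pending, limit):
    """Level-triggered style: one request per wakeup, stop at the limit."""
    outstanding = 0
    while pending and outstanding < limit:  # poll-in is only honoured below the limit
        pending.popleft()       # read exactly one request, then return to the wait
        outstanding += 1
    return outstanding


et_outstanding = drain_until_eagain(deque(range(10)))
lt_outstanding = one_per_wakeup(deque(range(10)), LIMIT)
print(et_outstanding, lt_outstanding)
```

In this model the draining handler ends up with all 10 requests in flight, while the one-per-wakeup handler stops at the limit of 4.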

Comment 4 Anand Avati 2015-02-12 18:42:50 UTC
COMMIT: http://review.gluster.org/9649 committed in master by Vijay Bellur (vbellur) 
------
commit 47f6f41f1204d0a1fd3cb699919c757d3238fdf3
Author: Xavier Hernandez <xhernandez>
Date:   Thu Feb 12 18:11:46 2015 +0100

    Temporarily remove nfs.t to avoid regression failures
    
    Test basic/ec/nfs.t is causing many regression failures due to
    a problem related to NFS.
    
    Until the NFS problem is solved, this patch removes the test
    to avoid more regression failures.
    
    Change-Id: I29884c5e06732e427130d1bc82f1b83553916f95
    BUG: 1192114
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/9649
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>

Comment 5 Niels de Vos 2015-02-17 12:33:59 UTC
Hey Shyam, did you continue your investigation on this topic?

If this really breaks the rpc-throttling, should the ET-epoll change get reverted until we have a fix for it?

Comment 6 Shyamsundar 2015-02-17 14:09:48 UTC
@Niels: The MT epoll patch does break the throttling, so we need to get that fixed.

Reverting the MT epoll patch would not help, as that would take us backwards. With MT epoll there are going to be races in edge cases in other parts of the code; the only way to scrub them out and keep things moving is to fix them, so I am not in favour of reverting the epoll patch at all.

On the investigation front, I would say that is complete. We need a fix, and it does not seem trivial or in line with the previous approach of throttling the readers; we may need to throttle the writers.

Comment 7 Anand Avati 2015-02-20 20:17:42 UTC
REVIEW: http://review.gluster.org/9722 (epoll: Fix broken RPC throttling due to MT epoll) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

Comment 8 Anand Avati 2015-02-23 14:28:50 UTC
REVIEW: http://review.gluster.org/9722 (epoll: Fix broken RPC throttling due to MT epoll) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)

Comment 9 Anand Avati 2015-02-23 15:06:24 UTC
REVIEW: http://review.gluster.org/9726 (epoll: Fix broken RPC throttling due to MT epoll) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

Comment 10 Niels de Vos 2015-02-24 11:54:50 UTC
Anoop, Vivek, if you need this fixed in a Red Hat product, please clone the bug and set your keywords and flags there. This is a Gluster Community bug, and ZStream or "blocker?" flags have no meaning here.

Comment 12 Anand Avati 2015-02-24 19:30:38 UTC
REVIEW: http://review.gluster.org/9726 (epoll: Fix broken RPC throttling due to MT epoll) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)

Comment 13 Anand Avati 2015-02-24 19:32:54 UTC
REVIEW: http://review.gluster.org/9726 (epoll: Fix broken RPC throttling due to MT epoll) posted (#3) for review on master by Shyamsundar Ranganathan (srangana)

Comment 14 Anand Avati 2015-03-02 06:50:11 UTC
COMMIT: http://review.gluster.org/9726 committed in master by Raghavendra G (rgowdapp) 
------
commit c48cbccfafbcf71aaad4ed7d868dbac609bc34fe
Author: Shyam <srangana>
Date:   Mon Feb 23 10:00:39 2015 -0500

    epoll: Fix broken RPC throttling due to MT epoll
    
    The RPC throttle which kicks in by setting the poll-in event on a
    socket to false, is broken with the MT epoll commit. This is due
    to the event handler of poll-in attempting to read as much out of
    the socket till it receives an EAGAIN. Which may never happen and
    hence we would be processing far more RPCs than we want to.
    
    This is being fixed by changing the epoll from ET to LT, and
    reading request by request, so that we honor the throttle.
    
    The downside is that we do not drain the socket, but go back to
    epoll_wait before reading the next request, but when kicking in
    throttle, we need to anyway and so a busy connection would degrade
    to LT anyway to maintain the throttle. As a result this change
    should not cause deviation in the performance much for busy
    connections.
    
    Change-Id: I522d284d2d0f40e1812ab4c1a453c8aec666464c
    BUG: 1192114
    Signed-off-by: Shyam <srangana>
    Reviewed-on: http://review.gluster.org/9726
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
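
The ET/LT difference this commit relies on can be observed with a small experiment. This is a sketch under stated assumptions, not the GlusterFS code: it uses Python's `select.epoll` (Linux only), a local `socketpair` in place of the NFS connection, and 1-byte messages standing in for RPC requests. The helper name is hypothetical. The handler reads exactly one request per wakeup and then goes back to waiting:

```python
import select
import socket


def wakeups_reading_one_request_each(edge_triggered, requests=4):
    """Count epoll wakeups when the handler reads one 1-byte request per wakeup."""
    writer, reader = socket.socketpair()
    reader.setblocking(False)
    ep = select.epoll()
    mask = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
    ep.register(reader.fileno(), mask)
    writer.send(b"r" * requests)   # all requests arrive before the first wait
    wakeups = 0
    try:
        while ep.poll(0.2):        # an empty list means no readiness was reported
            reader.recv(1)         # handle exactly one request, then wait again
            wakeups += 1
    finally:
        ep.close()
        writer.close()
        reader.close()
    return wakeups


lt_wakeups = wakeups_reading_one_request_each(edge_triggered=False)
et_wakeups = wakeups_reading_one_request_each(edge_triggered=True)
print(lt_wakeups, et_wakeups)
```

With level-triggered epoll, readiness is reported again for as long as data remains, so reading request by request still consumes everything. With edge-triggered epoll, the remaining data is not reported again until new data arrives, which is why an ET handler has to drain the socket until EAGAIN and therefore cannot stop part-way to honor a throttle.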

Comment 15 Niels de Vos 2015-05-14 17:29:08 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
