Bug 1192114
Summary: | edge-triggered epoll breaks rpc-throttling | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Xavi Hernandez <jahernan> |
Component: | rpc | Assignee: | Shyamsundar <srangana> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | mainline | CC: | annair, bugs, gluster-bugs, ndevos, rcyriac, rgowdapp, srangana, vagarwal |
Target Milestone: | --- | Keywords: | Triaged |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Fixed In Version: | glusterfs-3.7.0 | Doc Type: | Bug Fix |
Last Closed: | 2015-05-14 17:29:08 UTC | Type: | Bug |
Bug Blocks: | 1196546 | | |
Description
Xavi Hernandez
2015-02-12 16:10:00 UTC
REVIEW: http://review.gluster.org/9649 (Temporarily remove nfs.t to avoid regression failures) posted (#1) for review on master by Xavier Hernandez (xhernandez)

(In reply to Anand Avati from comment #1)
> REVIEW: http://review.gluster.org/9649 (Temporarily remove nfs.t to avoid
> regression failures) posted (#1) for review on master by Xavier Hernandez
> (xhernandez)

This patch temporarily removes nfs.t to avoid regression test failures. It must be added again once the original problem is solved.

More investigation with Shyam seems to point to the edge-triggered epoll introduced by the Multi-thread epoll patch. It seems that edge-triggered epoll does not enforce the value of the outstanding-rpc-limit option.

COMMIT: http://review.gluster.org/9649 committed in master by Vijay Bellur (vbellur)
------
commit 47f6f41f1204d0a1fd3cb699919c757d3238fdf3
Author: Xavier Hernandez <xhernandez>
Date: Thu Feb 12 18:11:46 2015 +0100

Temporarily remove nfs.t to avoid regression failures

Test basic/ec/nfs.t is causing many regression failures due to a problem related to NFS. Until the NFS problem is solved, this patch removes the test to avoid more regression failures.

Change-Id: I29884c5e06732e427130d1bc82f1b83553916f95
BUG: 1192114
Signed-off-by: Xavier Hernandez <xhernandez>
Reviewed-on: http://review.gluster.org/9649
Reviewed-by: Shyamsundar Ranganathan <srangana>
Reviewed-by: Niels de Vos <ndevos>
Reviewed-by: Vijay Bellur <vbellur>
Tested-by: Vijay Bellur <vbellur>

Hey Shyam, did you continue your investigation on this topic? If this really breaks the rpc-throttling, should the ET-epoll change get reverted until we have a fix for it?

@Niels: The MT epoll patch does break the throttling, so we need to get that fixed. Reverting the MT epoll would not help, as that would take us backwards.
With MT epoll there are going to be edge cases of races in other parts of the code; the only way to scrub them and get things moving is to fix them, so I am not for reverting the epoll patch at all.

On the investigation front, I would say that is complete. We need a fix, and it does not seem trivial or in line with the previous approach of throttling the readers; we may need to throttle the writers.

REVIEW: http://review.gluster.org/9722 (epoll: Fix broken RPC throttling due to MT epoll) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

REVIEW: http://review.gluster.org/9722 (epoll: Fix broken RPC throttling due to MT epoll) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)

REVIEW: http://review.gluster.org/9726 (epoll: Fix broken RPC throttling due to MT epoll) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

Anoop, Vivkek, if you need this fixed in a Red Hat product, please clone the bug and set your keywords and flags there. This is a Gluster Community bug and ZStream or "blocker?" flags have no meaning here.

REVIEW: http://review.gluster.org/9726 (epoll: Fix broken RPC throttling due to MT epoll) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)

REVIEW: http://review.gluster.org/9726 (epoll: Fix broken RPC throttling due to MT epoll) posted (#3) for review on master by Shyamsundar Ranganathan (srangana)

COMMIT: http://review.gluster.org/9726 committed in master by Raghavendra G (rgowdapp)
------
commit c48cbccfafbcf71aaad4ed7d868dbac609bc34fe
Author: Shyam <srangana>
Date: Mon Feb 23 10:00:39 2015 -0500

epoll: Fix broken RPC throttling due to MT epoll

The RPC throttle, which kicks in by setting the poll-in event on a socket to false, is broken with the MT epoll commit. This is due to the poll-in event handler attempting to read as much as it can out of the socket until it receives an EAGAIN, which may never happen, and hence we would be processing far more RPCs than we want to.

This is being fixed by changing the epoll from ET (edge-triggered) to LT (level-triggered), and reading request by request, so that we honor the throttle.

The downside is that we do not drain the socket, but go back to epoll_wait before reading the next request. However, when the throttle kicks in we need to do that anyway, so a busy connection would degrade to LT in any case to maintain the throttle. As a result, this change should not alter performance much for busy connections.

Change-Id: I522d284d2d0f40e1812ab4c1a453c8aec666464c
BUG: 1192114
Signed-off-by: Shyam <srangana>
Reviewed-on: http://review.gluster.org/9726
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Krishnan Parthasarathi <kparthas>
Reviewed-by: Raghavendra G <rgowdapp>
Tested-by: Raghavendra G <rgowdapp>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
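
For readers who want to see the behaviour described in the commit message for review 9726, here is a minimal sketch. It is not GlusterFS code: it uses Python's `select.epoll` (Linux only) on a local socket pair, and the names `REQ_SIZE`, `peak_outstanding` and `limit` are hypothetical stand-ins (`limit` plays the role of outstanding-rpc-limit). It assumes fixed-size requests and pretends that all outstanding requests are answered as soon as the throttle is hit. The edge-triggered path drains the socket until EAGAIN, so turning poll-in off cannot stop the loop already in progress; the level-triggered path reads one request per wakeup and so stays within the limit.

```python
import select
import socket

REQ_SIZE = 8  # hypothetical fixed-size request, for illustration only


def peak_outstanding(edge_triggered, limit, n_requests):
    """Return (requests handled, peak number of requests outstanding at once)."""
    client, server = socket.socketpair()
    server.setblocking(False)
    # Queue every request up front so the server socket is busy from the start.
    client.sendall(b"r" * (REQ_SIZE * n_requests))

    mask = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
    ep = select.epoll()
    ep.register(server.fileno(), mask)

    handled = outstanding = peak = 0
    try:
        while ep.poll(0.1):  # stops once no event arrives within the timeout
            while True:
                try:
                    server.recv(REQ_SIZE)
                except BlockingIOError:
                    break  # EAGAIN: nothing left to read
                handled += 1
                outstanding += 1
                peak = max(peak, outstanding)
                if outstanding >= limit:
                    # Throttle: stop asking epoll for poll-in on this socket.
                    ep.modify(server.fileno(), 0)
                if not edge_triggered:
                    break  # LT: one request per wakeup, then back to epoll
            if outstanding >= limit:
                # Assumption: replies are sent here, so the throttle is lifted.
                outstanding = 0
                ep.modify(server.fileno(), mask)
    finally:
        ep.close()
        client.close()
        server.close()
    return handled, peak


if __name__ == "__main__":
    print("ET:", peak_outstanding(True, limit=3, n_requests=10))
    print("LT:", peak_outstanding(False, limit=3, n_requests=10))
```

With a limit of 3 and 10 queued requests, the edge-triggered run should report a peak of 10 outstanding requests and the level-triggered run a peak of 3, while both handle all 10 requests.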