1628605 – One client hangs when another client loses communication with bricks during intensive write I/O

Status:	CLOSED CURRENTRELEASE
Product:	GlusterFS
Classification:	Community
Component:	rpc
Version:	mainline
Assignee:	bugs@gluster.org

Reported:	2018-09-13 14:14 UTC by Xavi Hernandez
Modified:	2019-03-25 16:30 UTC
Fixed In Version:	glusterfs-6.0
Last Closed:	2019-03-25 16:30:38 UTC

Description Xavi Hernandez 2018-09-13 14:14:44 UTC

Description of problem:

Bricks do not detect abrupt client disconnections within a reasonable time. If the dead client held locks when it disconnected, other clients that access the locked file are delayed for a very long time.
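
For context, the fix referenced in Comment 2 relies on the TCP user timeout, which bounds how long transmitted data may stay unacknowledged before the kernel closes the connection. The following is a minimal sketch of that socket-level mechanism, not GlusterFS code. It assumes Linux and Python's standard socket module; the helper name set_tcp_user_timeout is hypothetical, and treating the 42 from the commit title as seconds is an assumption (the kernel option itself takes milliseconds).

```python
import socket

# Assumption: the "42" in the commit title is a number of seconds.
TIMEOUT_SECONDS = 42


def set_tcp_user_timeout(sock, seconds):
    """Ask the kernel to close the connection if sent data stays
    unacknowledged for longer than `seconds`.

    Returns True if the option was applied, False if this platform's
    socket module does not expose TCP_USER_TIMEOUT (it is Linux-specific).
    """
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        return False
    # The kernel expects the value in milliseconds.
    sock.setsockopt(socket.IPPROTO_TCP, opt, seconds * 1000)
    return True


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        if set_tcp_user_timeout(sock, TIMEOUT_SECONDS):
            # Read the value back to confirm what the kernel stored.
            print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT))
        else:
            print("TCP_USER_TIMEOUT is not available on this platform")
```

Without such a timeout, a peer that vanishes abruptly can look alive to the remaining endpoint until TCP retransmissions are exhausted, which is consistent with the long delays described above.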

Comment 1 Worker Ant 2018-09-13 14:16:57 UTC

REVIEW: https://review.gluster.org/21170 (socket: set 42 as default tpc-user-timeout) posted (#2) for review on master by Xavi Hernandez

Comment 2 Worker Ant 2018-09-14 05:38:08 UTC

COMMIT: https://review.gluster.org/21170 committed in master by "Raghavendra G" <rgowdapp> with a commit message- socket: set 42 as default tpc-user-timeout

The 'tcp-user-timeout' option is defined in the 'socket' module, but it's
configured in 'protocol/server' and 'protocol/client', which are the
parents of the 'socket' module.

However, current options management logic only takes into consideration
default values specified in the 'socket' module itself, ignoring values
defined in the owner xlator.

This patch simply sets the default value of tcp-user-timeout in the
'socket' module so that server and client use the expected value.

Change-Id: Ib8ad7c4ac6aac725b01a78f8c3d10cf4063d2ee6
fixes: bz#1628605
Signed-off-by: Xavi Hernandez <xhernandez>
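
The option-resolution behaviour described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not GlusterFS code: the dictionaries and the resolve function are hypothetical, and modelling the pre-patch 'socket' module as having no usable default for the option is an assumption made only for illustration.

```python
# Toy model only: these names are hypothetical, not GlusterFS structures.

# Default declared by the owner xlator (protocol/server or protocol/client).
OWNER_DEFAULTS = {"tcp-user-timeout": 42}

# Defaults declared by the 'socket' module itself.
# Assumption: before the patch there was no usable default here.
SOCKET_DEFAULTS_BEFORE = {}
SOCKET_DEFAULTS_AFTER = {"tcp-user-timeout": 42}


def resolve(option, configured, socket_defaults):
    """Return the explicitly configured value if present, otherwise the
    default declared by the 'socket' module.

    OWNER_DEFAULTS is deliberately never consulted, mirroring the
    behaviour the commit message describes.
    """
    if option in configured:
        return configured[option]
    return socket_defaults.get(option)


if __name__ == "__main__":
    # Before the patch: the owner's default of 42 is ignored.
    print(resolve("tcp-user-timeout", {}, SOCKET_DEFAULTS_BEFORE))
    # After the patch: the 'socket' module supplies the default itself.
    print(resolve("tcp-user-timeout", {}, SOCKET_DEFAULTS_AFTER))
    # An explicitly configured value still takes precedence.
    print(resolve("tcp-user-timeout", {"tcp-user-timeout": 10}, SOCKET_DEFAULTS_AFTER))
```

In this model, moving the default into the module that actually reads the option is the smallest change that makes both server and client see the expected value, which matches the approach the patch describes.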

Comment 3 Shyamsundar 2019-03-25 16:30:38 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/
