1430148 – USS is broken when multiplexing is on


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	glusterd
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Jeff Darcy
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1431176

Reported:	2017-03-07 23:45 UTC by Jeff Darcy
Modified:	2017-05-30 18:46 UTC (History)
CC List:	1 user
Fixed In Version:	glusterfs-3.11.0
Clone Of:
Clones:	1431176
Environment:
Last Closed:	2017-05-30 18:46:48 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:


Description Jeff Darcy 2017-03-07 23:45:32 UTC

This manifests as a test failure in uss.t when we first try to access snap1 through USS.  The underlying problem is described in the commit message for the patch I'll submit as soon as I have a bug number.

    This was causing USS tests to fail.  The underlying problem here is
    that if we try to queue the attach request too soon after starting a
    brick process then the socket code will get an error trying to write
    to the still-unconnected socket.  Its response is to shut down the
    socket, which causes the queued attach requests to be force-unwound.
    There's nothing to retry them, so they effectively never happen and
    those bricks (second and succeeding for a snapshot) never become
    available.
    
    We *do* have a retry loop for attach requests, but currently break out
    as soon as a request is queued - not actually sent.  The fix is to
    modify that loop so it will wait some more if the rpc connection isn't
    even complete yet.  Now we break out only when we have a completed
    connection *and* a queued request.
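The changed exit condition can be sketched as follows. This is a minimal, self-contained simulation and not the actual glusterd code: `struct fake_rpc`, `fake_rpc_poll`, `fake_queue_attach`, and `attach_with_retry` are hypothetical names invented for illustration, and the "connection" is just a counter that completes after a few polls. It only shows the shape of the loop: keep waiting while the connection is incomplete, and stop only once the connection is up *and* the request has been queued.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an rpc connection; NOT the real glusterd type. */
struct fake_rpc {
    int  ticks_until_connected; /* polls left before the connect completes */
    bool connected;             /* true once the connection is complete */
    int  queued;                /* attach requests accepted for sending */
};

/* Simulate time passing: the connection completes after a few polls. */
static void fake_rpc_poll(struct fake_rpc *rpc)
{
    if (rpc->ticks_until_connected > 0)
        rpc->ticks_until_connected--;
    if (rpc->ticks_until_connected == 0)
        rpc->connected = true;
}

/* Queue an attach request. On an unconnected socket this fails, which
 * models the write error that led to the queued requests being unwound. */
static int fake_queue_attach(struct fake_rpc *rpc)
{
    if (!rpc->connected)
        return -1;
    rpc->queued++;
    return 0;
}

/* Retry loop with the fixed exit condition: never try to queue before the
 * connection is complete, and leave the loop only when the connection is
 * complete AND the request was queued.
 * Returns the number of attempts used, or -1 if all attempts ran out. */
static int attach_with_retry(struct fake_rpc *rpc, int max_tries)
{
    for (int tries = 0; tries < max_tries; tries++) {
        fake_rpc_poll(rpc);
        if (rpc->connected && fake_queue_attach(rpc) == 0)
            return tries + 1;
        /* A real implementation would sleep or yield here before retrying. */
    }
    return -1;
}
```

With `ticks_until_connected = 3`, calling `attach_with_retry(&rpc, 10)` queues the request on the third attempt. A loop that exited as soon as a request was handed to the queue, as the original one did, would have stopped on the first attempt, before the connection existed.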

Comment 1 Worker Ant 2017-03-07 23:46:46 UTC

REVIEW: https://review.gluster.org/16868 (glusterd: don't queue attach reqs before connecting) posted (#1) for review on master by Jeff Darcy (jdarcy)

Comment 2 Worker Ant 2017-03-08 16:01:42 UTC

COMMIT: https://review.gluster.org/16868 committed in master by Jeff Darcy (jdarcy) 
------
commit 53e2c875cf97df8337f7ddb5124df2fc6dd37bca
Author: Jeff Darcy <jdarcy>
Date:   Tue Mar 7 18:36:58 2017 -0500

    glusterd: don't queue attach reqs before connecting
    
    This was causing USS tests to fail.  The underlying problem here is
    that if we try to queue the attach request too soon after starting a
    brick process then the socket code will get an error trying to write
    to the still-unconnected socket.  Its response is to shut down the
    socket, which causes the queued attach requests to be force-unwound.
    There's nothing to retry them, so they effectively never happen and
    those bricks (second and succeeding for a snapshot) never become
    available.
    
    We *do* have a retry loop for attach requests, but currently break out
    as soon as a request is queued - not actually sent.  The fix is to
    modify that loop so it will wait some more if the rpc connection isn't
    even complete yet.  Now we break out only when we have a completed
    connection *and* a queued request.
    
    Change-Id: Ib6be13646f1fa9072b4a944ab5f13e1b29084841
    BUG: 1430148
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: https://review.gluster.org/16868
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Prashanth Pai <ppai>

Comment 3 Shyamsundar 2017-05-30 18:46:48 UTC

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/
