Bug 1763865 - [GSS] rpc actor failed to complete successfully
Summary: [GSS] rpc actor failed to complete successfully
Keywords:
Status: CLOSED DUPLICATE of bug 1545277
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: protocol
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Pranith Kumar K
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-10-21 19:08 UTC by slenzen
Modified: 2019-11-04 09:00 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-04 09:00:32 UTC
Embargoed:



Comment 13 Pranith Kumar K 2019-11-04 09:00:32 UTC
This bug is fixed in 3.4.x as part of https://bugzilla.redhat.com/show_bug.cgi?id=1545277.

RCA:
We see the following logs in glusterd:
=======================
[2019-10-17 09:35:13.561221] W [rpcsvc.c:265:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 1298437 330) for 10.55.210.131:1005
[2019-10-17 09:35:13.561247] E [rpcsvc.c:557:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
=======================

Looking up program number 1298437 in the source shows that it is the fop program (GLUSTER_FOP_PROGRAM):
=======================
pk@localhost - ~/workspace/rhs-glusterfs ((HEAD detached at v3.8.4-52.5))
14:08:06 :) ⚡ git grep 1298437
rpc/rpc-lib/src/protocol-common.h:#define GLUSTER_FOP_PROGRAM   1298437 /* Completely random */
=======================
In other words, RPC calls intended for the bricks are being sent to glusterd, which does not serve the fop program.
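
For anyone triaging similar logs, here is a minimal sketch of how the glusterd warning could be matched against known program numbers. It assumes the two numbers after "req" are the program number and the program version, and the lookup table contains only the one value confirmed by the grep output above. The function name and the table are illustrative, not part of glusterfs.
=======================
```python
import re

# Only GLUSTER_FOP_PROGRAM is taken from the grep output above; any other
# entries would have to be added from protocol-common.h by the reader.
KNOWN_PROGRAMS = {1298437: "GLUSTER_FOP_PROGRAM"}

# Matches the "RPC program not available" warning. Assumption: the two
# numbers after "req" are the program number and the program version.
PATTERN = re.compile(
    r"RPC program not available \(req (?P<prog>\d+) (?P<ver>\d+)\) "
    r"for (?P<peer>\S+)"
)


def find_misdirected_calls(lines):
    """Yield (peer, program number, program name) for each matching warning."""
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        prog = int(match.group("prog"))
        yield match.group("peer"), prog, KNOWN_PROGRAMS.get(prog, "unknown")


sample = [
    "[2019-10-17 09:35:13.561221] W [rpcsvc.c:265:rpcsvc_program_actor] "
    "0-rpc-service: RPC program not available (req 1298437 330) "
    "for 10.55.210.131:1005",
    "[2019-10-17 09:35:13.561247] E [rpcsvc.c:557:rpcsvc_check_and_reply_error] "
    "0-rpcsvc: rpc actor failed to complete successfully",
]

for peer, prog, name in find_misdirected_calls(sample):
    print(peer, prog, name)
```
=======================
A hit for GLUSTER_FOP_PROGRAM in the glusterd log points at a client sending fops to the wrong endpoint, rather than at a problem in glusterd itself.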

On the brick we see the following errors:
=======================
[2019-10-17 11:26:47.527079] E [server-helpers.c:388:server_alloc_frame] (-->/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x325) [0x7f90a49058c5] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/server.so(+0x2e7bf) [0x7f908fde57bf] -->/usr/lib64/glusterfs/3.8.4/xlator/protocol/server.so(+0xe094) [0x7f908fdc5094] ) 0-server: invalid argument: client [Invalid argument]
[2019-10-17 11:26:47.527107] E [rpcsvc.c:557:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
=======================

These errors appear when the brick receives an RPC call from a client that has not yet completed set-volume on the brick.

Sequence of steps for a client to connect to a brick:
1) Connect to glusterd on the machine where the brick resides.
2) Query glusterd for the port the brick is listening on.
3) Disconnect from glusterd.
4) Connect to the brick.
5) Do 'set-volume', indicating that fops will start coming to the brick.

Fops should be sent over the wire only after step 5. In 3.3.x clients there was a bug where fops could be sent over the wire right after step 1. We would even see crashes if this happened while the brick was still coming up and was not yet initialized, as in bz#1545277.
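
The ordering rule can be illustrated with a small state machine. This is only a sketch of the invariant (no fops before set-volume), not the actual client xlator code or the fix made in bz#1545277; every class and method name here is hypothetical.
=======================
```python
from enum import Enum, auto


class State(Enum):
    DISCONNECTED = auto()
    GLUSTERD_CONNECTED = auto()  # after step 1
    PORT_KNOWN = auto()          # after steps 2 and 3
    BRICK_CONNECTED = auto()     # after step 4
    READY = auto()               # after step 5 (set-volume done)


class BrickClient:
    """Hypothetical client that refuses to send fops before set-volume."""

    def __init__(self):
        self.state = State.DISCONNECTED
        self.sent = []

    def _advance(self, expected, new):
        # Each step is only valid from the state left by the previous step.
        if self.state is not expected:
            raise RuntimeError(f"expected {expected.name}, in {self.state.name}")
        self.state = new

    def connect_glusterd(self):
        self._advance(State.DISCONNECTED, State.GLUSTERD_CONNECTED)

    def query_port_and_disconnect(self):
        self._advance(State.GLUSTERD_CONNECTED, State.PORT_KNOWN)

    def connect_brick(self):
        self._advance(State.PORT_KNOWN, State.BRICK_CONNECTED)

    def set_volume(self):
        self._advance(State.BRICK_CONNECTED, State.READY)

    def send_fop(self, fop):
        # The invariant from the RCA: fops go out only after step 5.
        if self.state is not State.READY:
            raise RuntimeError(f"fop {fop!r} rejected in state {self.state.name}")
        self.sent.append(fop)


client = BrickClient()
client.connect_glusterd()
try:
    client.send_fop("lookup")  # what the buggy 3.3.x client effectively did
except RuntimeError as err:
    print("blocked:", err)

client.query_port_and_disconnect()
client.connect_brick()
client.set_volume()
client.send_fop("lookup")  # allowed once set-volume has completed
print(client.sent)
```
=======================
With such a guard on the client side, a fop issued after step 1 would be rejected locally instead of reaching glusterd or a brick that has not yet seen set-volume.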

Since that bz is fixed in 3.4.x, I am marking this bug as a duplicate of the earlier bz.

*** This bug has been marked as a duplicate of bug 1545277 ***

