Bug 1628651

Summary: rpc marks brick disconnected from glusterd
Product: Red Hat Gluster Storage Reporter: Raghavendra Talur <rtalur>
Component: glusterdAssignee: Mohit Agrawal <moagrawa>
Status: CLOSED WONTFIX QA Contact: Bala Konda Reddy M <bmekala>
Severity: urgent Docs Contact:
Priority: urgent    
Version: rhgs-3.4CC: amukherj, apaladug, dmoessne, ekuric, hongkliu, mchangir, mifiedle, rcyriac, rhs-bugs, sanandpa, sankarshan, sheggodu, storage-qa-internal, vbellur
Target Milestone: ---Keywords: Reopened, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: ocs-dependency-issue
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1631128 1631329 (view as bug list) Environment:
Last Closed: 2018-10-25 08:41:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1631128    
Bug Blocks: 1598890, 1631329    

Description Raghavendra Talur 2018-09-13 15:54:18 UTC
Description of problem:

As per Atin's comment https://bugzilla.redhat.com/show_bug.cgi?id=1573304#c89

Sometimes glusterd does not send brick detach request on volume delete and that's because of the disconnect event received for the brick in glusterd.

snip from logs:

----------------------------------------------------
[2018-08-28 21:30:36.238183] I [glusterd-utils.c:5542:send_attach_req] 0-management: not connected yet
[2018-08-28 21:23:49.759511] C [rpc-clnt.c:449:rpc_clnt_fill_request_info] 0-management: cannot lookup the saved frame corresponding to xid (2170)
[2018-08-28 21:23:49.759554] W [socket.c:1949:__socket_read_reply] 0-management: notify for event MAP_XID failed for /var/run/gluster/bd453b4f46250bdcc1f6683515c46138.socket
[2018-08-28 21:23:49.781489] I [MSGID: 106005] [glusterd-handler.c:6179:__glusterd_brick_rpc_notify] 0-management: Brick 172.31.30.234:/var/lib/heketi/mounts/vg_52ffb245bc8a0c21f0b7dd9add520a0c/                 brick_8765f57d935d670bc28e54eef6ed3f3e/brick has disconnected from glusterd.
----------------------------------------------------------


Why did we receive the disconnect event - that's something which needs debugging.

Reproducer:
Not known. Hit only once in QE setup.
Brick mux was enabled and many volume creates and deletes were being performed.

Comment 12 Hongkai Liu 2018-09-18 14:36:44 UTC
I am redoing the test ...
# tailf /var/log/messages

Did not see the file rolling ...

Any particular lines should I see in that file?

Comment 16 Atin Mukherjee 2018-09-19 13:00:13 UTC
RCA:

Turned out to be a brick multiplexing bug in glusterd.
When a volume is stopped, individual bricks get detached from the brick process through brick op phase in the volume stop transaction and in the brick-op callback the counter called "blockers" is decremented insider sync_lock & sync_unlock () section. Before sending the rpc_submit the same counter is incremented. The goal of this counter is to ensure at any given point of time, before originating a commit phase of a transaction this counter is set to 0. Now because of heketi distributes the volume stop command across different cli & glusterd, the processing of these volume stop requests become parallel and hence because of the additional context switches through sync_lock and sync_unlock for updating the blocker counter, the overall time of wait process goes up. Since RHGS 3.3.1 if any regular transactions take more than 180 seconds the lock timer is forcibly expired. Unfortunately in this situation some of the transactions are kicked out while they are still in progress. This is the reason why we see unlock timer callback log entries in glusterd log. I believe this also had a ripple effect on the saved frames which is why we see critical log like:

[2018-08-28 21:23:49.759511] C [rpc-clnt.c:449:rpc_clnt_fill_request_info] 0-management: cannot lookup the saved frame corresponding to xid (2170)

and this causes a disconnect to the original brick given the rpc connection for all the attached bricks are a copy of the parent brick connection.

The need of updating the blockers counter under synclock is an overkill. Updating it through GF_ATOMIC API should be sufficient for the atomicity guarantee.

upstream patch : https://review.gluster.org/#/c/glusterfs/+/21221/