Bug 1096729
Field | Value
---|---
Summary | Disconnects of peer and brick is logged while snapshot creations were in progress during IO
Product | [Red Hat Storage] Red Hat Gluster Storage
Component | snapshot
Status | CLOSED DEFERRED
Severity | low
Priority | urgent
Version | rhgs-3.0
Reporter | Rahul Hinduja <rhinduja>
Assignee | rjoseph
QA Contact | storage-qa-internal <storage-qa-internal>
Docs Contact |
CC | amukherj, asengupt, asriram, jbyers, nlevinki, nsathyan, rcyriac, rhs-bugs, rjoseph, sasundar, storage-qa-internal, vagarwal, vbellur, vmallika
Target Milestone | ---
Target Release | ---
Hardware | x86_64
OS | Unspecified
Whiteboard |
Fixed In Version | glusterfs-3.6.0.5-1.el7
Doc Type | Bug Fix
Doc Text | When multiple snapshot operations are performed simultaneously from different nodes in a cluster, glusterd peer disconnects can occur due to ping-timer expiry. The workaround is to disable the ping-timer by setting 'option ping-timeout 0' in the file '/etc/glusterfs/glusterd.vol'.
Story Points | ---
Clone Of |
 | 1097224 1098021 1098025 1104459 (view as bug list)
Environment |
Last Closed | 2016-01-29 13:23:10 UTC
Type | Bug
Regression | ---
Mount Type | ---
Documentation | ---
CRM |
Verified Versions |
Category | ---
oVirt Team | ---
RHEL 7.3 requirements from Atomic Host |
Cloudforms Team | ---
Target Upstream Version |
Embargoed |
Bug Depends On |
Bug Blocks | 1088355, 1097224, 1098021, 1098025, 1104459, 1104462, 1109150
Description
Rahul Hinduja
2014-05-12 10:53:11 UTC
The brick process could be busy processing something and did not respond to glusterd within 30 seconds. I think this issue existed earlier as well, but it is now visible after moving the ping-timer to the lower level to make it available to all processes. Hi Rahul, can you please re-try your test with the ping-timer disabled? The ping-timer can be disabled by setting ping-timeout to 0 in the glusterd volume file.

When there is a lot of IO happening on the brick process, the epoll thread will be busy and will not respond to the ping packet within 60 seconds. For now we will disable the ping-timer for the following connections:

- glusterd -> brick
- brick -> glusterd
- cli -> glusterd

Later, when the epoll thread model is changed and made lighter, we need to revert this change. Patch #7753 is posted upstream.

<Summary of e-mail discussion> This issue requires some more analysis. To unblock QE immediately, a temporary patch has been sent to the downstream branch (RHS 3.0): https://code.engineering.redhat.com/gerrit/#/c/25040/ This will be a downstream-only patch until we uncover the root cause of the ping timeouts. In this patch we disable the ping timers on some of the connections so that the problems we are facing go away. This only avoids the problem; it does not fix it correctly. Root-causing the issue and fixing it will be tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1098021

*** Bug 1098021 has been marked as a duplicate of this bug. ***

I was trying to debug the root cause by executing the steps below and found that the epoll thread is blocked by the big-lock:

1) I made a change in the snapshot creation part of the code, adding a sleep of 30 seconds so that it holds the big-lock for 30+ seconds.
2) Executed snapshot creation from two different nodes on two different volumes. After some time the ping-timer on the brick process expired and disconnected the socket.
3) Here is the stack trace from glusterd:

    #0  0x0000003c9620b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
    #1  0x00007fac2b1a71fb in __synclock_lock (lock=0x16a9a58) at syncop.c:764
    #2  0x00007fac2b1a728e in synclock_lock (lock=0x16a9a58) at syncop.c:782
    #3  0x00007fac20e70691 in glusterd_big_locked_cbk (req=0x16d9118, iov=0x16d9158, count=1, myframe=0x7fac29dd7740, fn=0x7fac20ed5680 <gd_mgmt_v3_unlock_cbk_fn>) at glusterd-rpc-ops.c:206
    #4  0x00007fac2af41e25 in rpc_clnt_handle_reply (clnt=0x16c62c0, pollin=0x170ca40) at rpc-clnt.c:767
    #5  0x00007fac2af42cbf in rpc_clnt_notify (trans=<value optimized out>, mydata=0x16c62f0, event=<value optimized out>, data=<value optimized out>) at rpc-clnt.c:895
    #6  0x00007fac2af3e568 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
    #7  0x00007fac20bbce85 in socket_event_poll_in (this=0x16dbb90) at socket.c:2120
    #8  0x00007fac20bbe86d in socket_event_handler (fd=<value optimized out>, idx=<value optimized out>, data=0x16dbb90, poll_in=1, poll_out=0, poll_err=0) at socket.c:2233
    #9  0x00007fac2b1c07e7 in event_dispatch_epoll_handler (event_pool=0x16824d0) at event-epoll.c:384
    #10 event_dispatch_epoll (event_pool=0x16824d0) at event-epoll.c:445
    #11 0x0000000000407ecd in main (argc=2, argv=0x7fffcab7fdd8) at glusterfsd.c:2023

The notification handler is blocked by a big-locked callback. The ping-timer is set for every RPC connection and defaults to 30 seconds. The patch (https://code.engineering.redhat.com/gerrit/#/c/25040/) provides a workaround by disabling the ping-timer on the connection between glusterd and the brick process.
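For reference, a minimal sketch of what the workaround looks like in '/etc/glusterfs/glusterd.vol'. Only the 'option ping-timeout 0' line is the change discussed in this bug; the surrounding options are typical defaults from a glusterfs 3.x installation and may differ on a given system.

```
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    # Workaround from this bug: disable the ping-timer so a busy peer or
    # brick process is not disconnected when the ping reply is delayed.
    option ping-timeout 0
end-volume
```

Restarting the glusterd service (for example, 'service glusterd restart') is typically needed for a change in this file to take effect.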
In case the ping-timer expiry issue still exists, the ping-timer can be disabled on the glusterd-glusterd connection as well by setting the ping-timeout option in the 'glusterd.vol' file. We will keep the bug open until the fix is made for the actual issue.

Patch posted: https://code.engineering.redhat.com/gerrit/25294 With this patch the default ping-timeout value is '0' for all RPC connections except glusterd-glusterd (set to 30 seconds) and client-glusterd (set to 42 seconds).

Closing this bug. We will open a new RFC bug to address the epoll issue.

We cannot set ping-timeout to 0 for the glusterd peer connection, because this would create a regression for one of the Intuit bugs (bug# 1034479). Also, we are seeing the peer disconnect only when multiple snapshot operations are executed simultaneously from different nodes. So we can document this as a known issue, saying to set 'ping-timeout=0' in '/etc/glusterfs/glusterd.vol' when multiple snapshot operations are executed simultaneously.

Version: glusterfs-3.6.0.15-1.el6rhs.x86_64

Retried creating snapshots with ping-timeout set to 30 (the default value) in /etc/glusterfs/glusterd.vol; we saw disconnect messages and hit a glusterd crash.

Steps followed:
==============
Created snapshots in a loop on 4 volumes at the same time, with IO in progress (a sketch of this kind of loop appears at the end of this report):

    for i in {201..400}; do dd if=/dev/urandom of=fuse_vol2"$i" bs=10M count=1; done

    [2014-06-11 12:28:40.783197] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-management: server 10.70.40.169:24007 has not responded in the last 30 seconds, disconnecting.
    [2014-06-11 12:28:56.586750] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
    [2014-06-11 12:28:57.214459] I [socket.c:2246:socket_event_handler] 0-transport: disconnecting now
    [2014-06-11 12:28:57.218328] I [MSGID: 106005] [glusterd-handler.c:4131:__glusterd_brick_rpc_notify] 0-management: Brick snapshot14.lab.eng.blr.redhat.com:/var/run/gluster/snaps/c8a70faa0459443c87b37991a243b405/brick2/b2 has disconnected from glusterd.

Uploaded the sosreports and the core file:
http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/snapshots/1096729/

Doc bug BZ 1109150 raised to track the known issue. Lowering the severity of this bug.

As discussed with the developers, raised another bug, BZ 1110119, to track the glusterd crash mentioned in Comment 13.

Please add doc text for this known issue.

Doc text added.

When this bug was raised, multi-threaded epoll support was not in place, and we had to turn off the ping timer to get rid of this problem. However, the ping timer is still not enabled between glusterds due to issues seen in BZ 1259992. Once we identify the root cause of that and fix it, multi-threaded epoll support with the ping timer enabled can be brought back, and then this bug can be verified.

Moving the bug from ateam to gabbar as the test case coverage belongs to snapshot.

The current Gluster architecture does not support implementation of this feature. Therefore this feature request is deferred until GlusterD 2.0.
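For illustration, a minimal sketch of the kind of reproduction loop described in the QE comment above. The volume and snapshot names are placeholders; in the actual test, snapshot creation loops like this ran concurrently from different cluster nodes against four volumes while the dd loop kept IO running on a FUSE mount of the volume.

```sh
#!/bin/bash
# Run concurrently on several cluster nodes, each against its own volume:
# create snapshots in a loop while client IO is in progress.
VOL=vol1                              # placeholder volume name
for i in $(seq 1 50); do
    gluster snapshot create "snap_${VOL}_${i}" "${VOL}"
done

# Meanwhile, from a client working inside the FUSE mount of one of the
# volumes, keep writes going (same shape as the dd loop quoted above):
# for i in {201..400}; do dd if=/dev/urandom of=fuse_vol2"$i" bs=10M count=1; done
```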