+++ This bug was initially created as a clone of Bug #1225338 +++

Description of problem:
=======================

From a use-case point of view: created a geo-rep session, paused it, and tried to create a snapshot. The snapshot hangs and times out after the 2-minute CLI/barrier timeout.

The problem occurs when changelog.changelog is on. Tried the following on a cleaned-up system:

1. Create a volume
2. Set changelog.changelog to on
3. Create a snapshot; it times out as follows:

[root@georep1 scripts]# gluster snapshot create snapa master
Error : Request timed out
Snapshot command failed
[root@georep1 scripts]#

Brick Log snippet:
===================

[2015-05-26 17:34:59.595211] I [changelog.c:2043:notify] 0-master-changelog: Barrier on notification
[2015-05-26 17:34:59.595394] I [changelog-helpers.c:838:changelog_snap_logging_start] 0-master-changelog: Now starting to log in call path
[2015-05-26 17:34:59.595410] E [changelog.c:2064:notify] 0-master-changelog: Received another barrier on notification when last one is not served yet
[2015-05-26 17:34:59.595434] I [socket.c:3432:socket_submit_reply] 0-socket.glusterfsd: not connected (priv->connected = -1)
[2015-05-26 17:34:59.595464] E [rpcsvc.c:1312:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: Gluster Brick operations, ProgVers: 2, Proc: 10) to rpc-transport (socket.glusterfsd)
[2015-05-26 17:34:59.595480] E [glusterfsd-mgmt.c:149:glusterfs_submit_reply] 0-glusterfs: Reply submission failed
[2015-05-26 17:34:59.595501] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2015-05-26 17:34:59.596373] E [socket.c:3421:socket_submit_reply] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x31c5624fb0] (--> /usr/lib64/glusterfs/3.7.0/rpc-transport/socket.so(+0x6f2f)[0x7fa8cdec7f2f] (--> /usr/lib64/libgfrpc.so.0(rpcsvc_transport_submit+0x76)[0x31c5a089a6] (--> /usr/lib64/libgfrpc.so.0(rpcsvc_submit_generic+0x1c8)[0x31c5a091f8] (--> /usr/lib64/libgfrpc.so.0(rpcsvc_error_reply+0x66)[0x31c5a09726] ))))) 0-socket: invalid argument: this->private
[2015-05-26 17:34:59.596394] E [rpcsvc.c:1312:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: Gluster Brick operations, ProgVers: 2, Proc: 10) to rpc-transport (socket.glusterfsd)
[2015-05-26 17:34:59.596568] C [mem-pool.c:560:mem_put] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x31c5624fb0] (--> /usr/lib64/libglusterfs.so.0(mem_put+0x105)[0x31c5655895] (--> /usr/lib64/libgfrpc.so.0(rpcsvc_submit_generic+0x256)[0x31c5a09286] (--> /usr/lib64/libgfrpc.so.0(rpcsvc_error_reply+0x66)[0x31c5a09726] (--> /usr/lib64/libgfrpc.so.0(rpcsvc_check_and_reply_error+0x6b)[0x31c5a0979b] ))))) 0-mem-pool: mem_put called on freed ptr 0x6d2d84 of mem pool 0x6d1610
[2015-05-26 17:34:59.597962] W [rpcsvc.c:571:rpcsvc_check_and_reply_error] 0-rpcsvc: failed to queue error reply
[2015-05-26 17:34:59.598024] E [barrier.c:522:notify] 0-master-barrier: Already enabled
[2015-05-26 17:34:59.598381] I [changelog.c:1989:notify] 0-master-changelog: Barrier off notification
[2015-05-26 17:34:59.598688] I [changelog-helpers.c:860:changelog_snap_logging_stop] 0-master-changelog: Stopped to log in call path
[2015-05-26 17:34:59.598713] E [changelog.c:2030:notify] 0-master-changelog: Changelog barrier already disabled

Version-Release number of selected component (if applicable):
=============================================================

How reproducible:
=================
always

Steps to Reproduce:

Way1:
=====
1. Create master and slave volumes
2. Create a geo-replication session between them
3. Start and pause the geo-rep session
4. Try to create the snapshot. It fails.

Way2:
=====
1. Create a volume
2. Set the volume option changelog.changelog to on
3. Try to create the snapshot. It fails.

Actual results:
===============
Snapshot creation fails with a timeout.

Expected results:
=================
Snapshot creation should succeed.

Additional info:
================
Upstream Patch Sent: http://review.gluster.org/#/c/10951/
REVIEW: http://review.gluster.org/10951 (featuress/changelog: On snapshot, notify irrespective of failures) posted (#2) for review on master by Kotresh HR (khiremat)
REVIEW: http://review.gluster.org/10951 (features/changelog: On snapshot, notify irrespective of failures) posted (#3) for review on master by Kotresh HR (khiremat)
REVIEW: http://review.gluster.org/10951 (featuress/changelog: On snapshot, notify irrespective of failures) posted (#4) for review on master by Kotresh HR (khiremat)
COMMIT: http://review.gluster.org/10951 committed in master by Venky Shankar (vshankar)
------
commit d76e9b83454786e6845d0cad3c2c0695815fae1b
Author: Kotresh HR <khiremat>
Date: Wed May 27 16:27:25 2015 +0530

    featuress/changelog: On snapshot, notify irrespective of failures

    During snapshot, changelog barrier is enabled and a explicit rollover
    of changelog is initiated. During rollover of changelog, if any error
    or changelog is empty, the notification was not sent to reconfigure
    and hence snapshot was failing because of timeout. This patch addresses
    it by sending notification irrespective of failures and sends error
    if any back to barrier.

    Change-Id: I898af624b44555281a9e43c69066077e0e121c17
    BUG: 1225542
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/10951
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Aravinda VK <avishwan>
    Reviewed-by: Venky Shankar <vshankar>
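For readers who want the shape of the fix without opening the review, here is a minimal sketch in C of the pattern the commit message describes. It is not the actual changelog translator code: `struct snap_waiter`, `do_rollover`, `rollover_before_fix` and `rollover_after_fix` are hypothetical names invented for illustration, and the error codes are assumptions. The point it tries to show is that the waiter must be notified on every path, with the error passed along, rather than only on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for whatever the snapshot barrier waits on. */
struct snap_waiter {
    bool notified; /* true once the rollover path has reported back */
    int  result;   /* 0 on success, negative errno on failure */
};

/* Hypothetical rollover: fails on an I/O error or when the changelog
 * is empty (the two cases named in the commit message). The specific
 * errno values are assumptions made for this sketch. */
static int do_rollover(size_t pending_records, bool io_error)
{
    if (io_error)
        return -EIO;
    if (pending_records == 0)
        return -ENOENT;
    return 0;
}

/* Pattern before the fix: an early return on failure skips the
 * notification, so the waiter only ever sees a timeout. */
static void rollover_before_fix(struct snap_waiter *w,
                                size_t pending_records, bool io_error)
{
    assert(w != NULL);
    int ret = do_rollover(pending_records, io_error);
    if (ret != 0)
        return; /* waiter is never notified */
    w->result = 0;
    w->notified = true;
}

/* Pattern after the fix: always notify, and hand any error back. */
static void rollover_after_fix(struct snap_waiter *w,
                               size_t pending_records, bool io_error)
{
    assert(w != NULL);
    w->result = do_rollover(pending_records, io_error);
    w->notified = true;
}
```

With an empty changelog (`pending_records == 0`), `rollover_before_fix` leaves `notified` false, which corresponds to the timeout seen in this bug, while `rollover_after_fix` sets `notified` and reports the error so the caller can decide how to proceed.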
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user