+++ This bug was initially created as a clone of Bug #1218589 +++

Description of problem:
Seeing failures while trying to snapshot a tiered volume.

Version-Release number of selected component (if applicable):
glusterfs-server-3.7dev-0.994.git0d36d4f.el6.x86_64

How reproducible:

Steps to Reproduce:
1. Create a tiered volume [Distribute (hot) + EC (cold)]
2. Mount the volume on the client and start a Linux kernel untar on the mount
3. While the untar is in progress, take snapshots

Actual results:
Snapshots are failing.

[2015-05-05 14:59:36.250915] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 14:59:36.251808] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 14:59:36.252740] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:03:21.723471] I [socket.c:3432:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2015-05-05 15:03:21.723488] E [rpcsvc.c:1299:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 39) to rpc-transport (socket.management)
[2015-05-05 15:03:21.723499] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
[2015-05-05 15:07:13.759341] W [glusterd-mgmt.c:190:gd_mgmt_v3_brick_op_fn] 0-management: snapshot brickop failed
[2015-05-05 15:07:13.759356] E [glusterd-mgmt.c:943:glusterd_mgmt_v3_brick_op] 0-management: Brick ops failed for operation Snapshot on local node
[2015-05-05 15:07:13.759362] E [glusterd-mgmt.c:2028:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Brick Ops Failed
[2015-05-05 15:08:19.961699] I [socket.c:3432:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2015-05-05 15:08:19.961717] E [rpcsvc.c:1299:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 39) to rpc-transport (socket.management)
[2015-05-05 15:08:19.961727] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
[2015-05-05 15:10:36.465294] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:10:36.466208] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:10:36.467075] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:12:33.198395] E [glusterd-op-sm.c:220:glusterd_get_txn_opinfo] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x3246822140] (--> /usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_get_txn_opinfo+0x197)[0x7fa5e8f4a7b7] (--> /usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_handle_stage_op+0x1f0)[0x7fa5e8f2d9e0] (--> /usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7fa5e8f2ad7f] (--> /usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x3246861c72] ))))) 0-management: Unable to get transaction opinfo for transaction ID : 5cc2bef3-3914-45eb-be68-529fbb2cb8d4

Expected results:
Snapshot create should be successful.

Additional info:
Attaching logs.
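For anyone trying to reproduce, a minimal shell sketch of the steps above. Host names, brick paths, volume name, and mount point are placeholders; the 3 + 1 disperse geometry is an assumption taken from the reproduction attempt later in this bug, and the attach-tier syntax is the GlusterFS 3.7 form:

# Cold tier: a dispersed (EC) volume, assumed 3 + 1 geometry
gluster volume create tiervol disperse 4 redundancy 1 \
    host1:/bricks/b1 host2:/bricks/b2 host1:/bricks/b3 host2:/bricks/b4 force
gluster volume start tiervol

# Hot tier: plain distribute bricks attached on top of the cold tier
gluster volume attach-tier tiervol host1:/bricks/hot1 host2:/bricks/hot2

# Client: mount the volume and start the kernel untar
mount -t glusterfs host1:/tiervol /mnt/tiervol
tar -xf /root/linux-4.1.2.tar.xz -C /mnt/tiervol &

# Server: take snapshots while the untar is running
for i in {1..10}; do gluster snapshot create snap$i tiervol no-timestamp; done

Note that gluster snapshots additionally require each brick to sit on a thinly provisioned LVM volume, which this sketch glosses over.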
I tried to reproduce this issue with the latest master code. I'm able to create snapshots during ongoing I/O on the mount.

My test scenario:

Volume:

Volume Name: patchy
Type: Tier
Volume ID: 358a6e6a-c0a2-4e3d-8260-2b83ac28c4b5
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: 10.70.43.110:/d/backends/3/patchy_snap_mnt
Brick2: 10.70.43.100:/d/backends/3/patchy_snap_mnt
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (3 + 1) = 4
Brick3: 10.70.43.100:/d/backends/1/patchy_snap_mnt
Brick4: 10.70.43.110:/d/backends/1/patchy_snap_mnt
Brick5: 10.70.43.100:/d/backends/2/patchy_snap_mnt
Brick6: 10.70.43.110:/d/backends/2/patchy_snap_mnt
Options Reconfigured:
cluster.tier-promote-frequency: 10
cluster.tier-demote-frequency: 10
cluster.write-freq-threshold: 0
cluster.read-freq-threshold: 0
performance.io-cache: off
performance.quick-read: off
features.ctr-enabled: on
performance.readdir-ahead: on

On the mount point:
tar -xvf /root/linux-4.1.2.tar.xz

On the server (during I/O):
for i in {1..100} ; do gluster snapshot create snap$i patchy no-timestamp; done
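For completeness, the non-default options listed under "Options Reconfigured" above can be applied with plain volume-set commands. A sketch, assuming the tier volume patchy already exists (this reconstructs how the test volume could have been configured, not a record of the exact commands used):

# Promote/demote scans every 10 seconds, with no access-count thresholds,
# so files migrate between tiers quickly during the test
gluster volume set patchy cluster.tier-promote-frequency 10
gluster volume set patchy cluster.tier-demote-frequency 10
gluster volume set patchy cluster.write-freq-threshold 0
gluster volume set patchy cluster.read-freq-threshold 0

# Disable client-side caching so reads/writes always hit the bricks
gluster volume set patchy performance.io-cache off
gluster volume set patchy performance.quick-read off

# Change-time recorder feeds the tiering daemon's heat database
gluster volume set patchy features.ctr-enabled on
gluster volume set patchy performance.readdir-ahead on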
Repeated the same test on an NFS mount as well; both the I/O and the snapshot creation were successful.
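A sketch of the NFS variant, assuming the built-in Gluster NFS server (which speaks NFSv3 only); the server address and mount point below are placeholders:

# Mount the same volume over NFSv3 and rerun the untar there
mount -t nfs -o vers=3 10.70.43.110:/patchy /mnt/patchy-nfs
tar -xvf /root/linux-4.1.2.tar.xz -C /mnt/patchy-nfs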
(In reply to Mohammed Rafi KC from comment #3)
> I tried to reproduce this issue with latest master code. I'm able to create
> snapshots during an ongoing I/O on mount.
>
> My test scenario
>
> volume :>>>>
>
> Volume Name: patchy
> Type: Tier
> Volume ID: 358a6e6a-c0a2-4e3d-8260-2b83ac28c4b5
> Status: Started
> Number of Bricks: 6
> Transport-type: tcp
> Hot Tier :
> Hot Tier Type : Distribute
> Number of Bricks: 2
> Brick1: 10.70.43.110:/d/backends/3/patchy_snap_mnt
> Brick2: 10.70.43.100:/d/backends/3/patchy_snap_mnt
> Cold Tier:
> Cold Tier Type : Disperse
> Number of Bricks: 1 x (3 + 1) = 4
> Brick3: 10.70.43.100:/d/backends/1/patchy_snap_mnt
> Brick4: 10.70.43.110:/d/backends/1/patchy_snap_mnt
> Brick5: 10.70.43.100:/d/backends/2/patchy_snap_mnt
> Brick6: 10.70.43.110:/d/backends/2/patchy_snap_mnt
> Options Reconfigured:
> cluster.tier-promote-frequency: 10
> cluster.tier-demote-frequency: 10
> cluster.write-freq-threshold: 0
> cluster.read-freq-threshold: 0
> performance.io-cache: off
> performance.quick-read: off
> features.ctr-enabled: on
> performance.readdir-ahead: on
>
> on mount point>>
> tar -xvf /root/linux-4.1.2.tar.xz
>
> on server (during I/O)>>
>
> for i in {1..100} ; do gluster snapshot create snap$i patchy
> no-timestamp;done;

Rafi,

Moving the bug to ON_QA would be valid only if there was an issue that was fixed with a patch, and the patch was available in a certain build (as mentioned in FIXED-IN-VERSION).

If this issue is not reproducible, this bug should be closed as CLOSED WORKSFORME.

If there really was an issue and it was fixed, then provide the patch URL, and once the patch is available in a build, update FIXED-IN-VERSION and move this bug to ON_QA.

I am moving this bug to ASSIGNED, as there were no new builds available.
Removing the FailedQA tag, as this case did not actually fail QA.
This seems more like a glusterd/rpc issue than a tiering one. Perhaps we should change the component to rpc or glusterd?

<snip>
[2015-05-05 15:03:21.723499] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
[2015-05-05 15:07:13.759341] W [glusterd-mgmt.c:190:gd_mgmt_v3_brick_op_fn] 0-management: snapshot brickop failed
[2015-05-05 15:07:13.759356] E [glusterd-mgmt.c:943:glusterd_mgmt_v3_brick_op] 0-management: Brick ops failed for operation Snapshot on local node
[2015-05-05 15:07:13.759362] E [glusterd-mgmt.c:2028:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Brick Ops Failed
[2015-05-05 15:08:19.961699] I [socket.c:3432:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2015-05-05 15:08:19.961717] E [rpcsvc.c:1299:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 39) to rpc-transport (socket.management)
[2015-05-05 15:08:19.961727] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
</snip>
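One observation on the log pattern: the "not connected (priv->connected = -1)" and "Reply submission failed" lines mean glusterd tried to answer a CLI request on a socket the CLI had already closed, which is the signature of the CLI timing out while the snapshot brick op was still in flight. If that is the failure mode here, raising the CLI timeout and re-running the snapshot would be one way to confirm it. A sketch, assuming the gluster CLI's --timeout option; 600 is an arbitrary value:

# The CLI gives up and disconnects after its default timeout (120s); a
# snapshot brick op that takes longer leaves glusterd with no peer to reply to.
gluster --timeout=600 snapshot create snap1 patchy no-timestamp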
As the tier feature is no longer actively developed, I'm closing this bug. Feel free to reopen it if necessary.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days