Bug 1229267

Summary: Snapshots failing on tiered volumes (with EC)
Summary: Snapshots failing on tiered volumes (with EC)
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: tier
Assignee: hari gowtham <hgowtham>
Status: CLOSED WONTFIX
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: annair, rgowdapp, rhs-bugs, rkavunga, sankarshan, sasundar
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: tier-interops
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1218589
Environment:
Last Closed: 2018-11-08 18:43:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1218589
Bug Blocks:

Description Nag Pavan Chilakam 2015-06-08 10:44:18 UTC
+++ This bug was initially created as a clone of Bug #1218589 +++

Description of problem:
Snapshot creation fails on a tiered volume.

Version-Release number of selected component (if applicable):
glusterfs-server-3.7dev-0.994.git0d36d4f.el6.x86_64


How reproducible:


Steps to Reproduce:
1. Create a tiered volume [Distribute (hot) + EC (cold)]
2. Mount the volume on the client and start a Linux kernel untar on the mount
3. While the untar is in progress, take snapshots (a command sketch follows below)
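
A minimal shell sketch of these steps. Hostnames and brick paths here are hypothetical, the attach-tier syntax is the gluster 3.7-era CLI, and snapshots additionally require bricks on thin-provisioned LVM:

# Cold tier: a 4-brick disperse (EC) volume -- hosts/paths are placeholders
gluster volume create testvol disperse 4 redundancy 1 \
    node1:/bricks/ec1 node2:/bricks/ec2 node1:/bricks/ec3 node2:/bricks/ec4
gluster volume start testvol

# Hot tier: attach a plain distribute tier on top of the EC volume
gluster volume attach-tier testvol node1:/bricks/hot1 node2:/bricks/hot2

# On the client: mount the volume and start the kernel untar
mount -t glusterfs node1:/testvol /mnt/testvol
tar -xvf /root/linux-4.1.2.tar.xz -C /mnt/testvol &

# While the untar is running, take a snapshot
gluster snapshot create snap1 testvol no-timestamp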

Actual results:
Snapshots are failing. glusterd log excerpt:
[2015-05-05 14:59:36.250915] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 14:59:36.251808] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 14:59:36.252740] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:03:21.723471] I [socket.c:3432:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2015-05-05 15:03:21.723488] E [rpcsvc.c:1299:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 39) to rpc-transport (socket.management)
[2015-05-05 15:03:21.723499] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
[2015-05-05 15:07:13.759341] W [glusterd-mgmt.c:190:gd_mgmt_v3_brick_op_fn] 0-management: snapshot brickop failed
[2015-05-05 15:07:13.759356] E [glusterd-mgmt.c:943:glusterd_mgmt_v3_brick_op] 0-management: Brick ops failed for operation Snapshot on local node
[2015-05-05 15:07:13.759362] E [glusterd-mgmt.c:2028:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Brick Ops Failed
[2015-05-05 15:08:19.961699] I [socket.c:3432:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2015-05-05 15:08:19.961717] E [rpcsvc.c:1299:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 39) to rpc-transport (socket.management)
[2015-05-05 15:08:19.961727] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
[2015-05-05 15:10:36.465294] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:10:36.466208] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:10:36.467075] I [glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-05-05 15:12:33.198395] E [glusterd-op-sm.c:220:glusterd_get_txn_opinfo] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x3246822140] (--> /usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_get_txn_opinfo+0x197)[0x7fa5e8f4a7b7] (--> /usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_handle_stage_op+0x1f0)[0x7fa5e8f2d9e0] (--> /usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7fa5e8f2ad7f] (--> /usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x3246861c72] ))))) 0-management: Unable to get transaction opinfo for transaction ID : 5cc2bef3-3914-45eb-be68-529fbb2cb8d4


Expected results:
Snapshot creation should be successful.

Additional info:

Attaching logs.

Comment 3 Mohammed Rafi KC 2015-08-06 07:29:41 UTC
I tried to reproduce this issue with the latest master code. I was able to create snapshots during ongoing I/O on the mount.

My test scenario


volume :>>>>

Volume Name: patchy
Type: Tier
Volume ID: 358a6e6a-c0a2-4e3d-8260-2b83ac28c4b5
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: 10.70.43.110:/d/backends/3/patchy_snap_mnt
Brick2: 10.70.43.100:/d/backends/3/patchy_snap_mnt
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (3 + 1) = 4
Brick3: 10.70.43.100:/d/backends/1/patchy_snap_mnt
Brick4: 10.70.43.110:/d/backends/1/patchy_snap_mnt
Brick5: 10.70.43.100:/d/backends/2/patchy_snap_mnt
Brick6: 10.70.43.110:/d/backends/2/patchy_snap_mnt
Options Reconfigured:
cluster.tier-promote-frequency: 10
cluster.tier-demote-frequency: 10
cluster.write-freq-threshold: 0
cluster.read-freq-threshold: 0
performance.io-cache: off
performance.quick-read: off
features.ctr-enabled: on
performance.readdir-ahead: on
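
For reference, the reconfigured options above can be applied on another setup with gluster volume set (a sketch; the option names and values are taken verbatim from this output):

gluster volume set patchy cluster.tier-promote-frequency 10
gluster volume set patchy cluster.tier-demote-frequency 10
gluster volume set patchy cluster.write-freq-threshold 0
gluster volume set patchy cluster.read-freq-threshold 0
gluster volume set patchy performance.io-cache off
gluster volume set patchy performance.quick-read off
gluster volume set patchy features.ctr-enabled on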



on mount point>>
tar -xvf /root/linux-4.1.2.tar.xz

on server (during I/O)>>

for i in {1..100} ; do  gluster snapshot create snap$i patchy no-timestamp;done;
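
To verify the result afterwards, the created snapshots can be listed (a sketch; snapshot list/info are standard gluster CLI commands):

gluster snapshot list patchy
gluster snapshot info snap1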

Comment 4 Mohammed Rafi KC 2015-08-06 07:31:41 UTC
Repeated the same test on an NFS mount as well; both the I/O and the snapshots were successful.

Comment 5 SATHEESARAN 2015-08-10 03:14:40 UTC
(In reply to Mohammed Rafi KC from comment #3)
> I tried to reproduce this issue with latest master code. I'm able to create
> snapshots during an ongoing I/O on mount.
> 
> My test scenario
> 
> 
> volume :>>>>
> 
> Volume Name: patchy
> Type: Tier
> Volume ID: 358a6e6a-c0a2-4e3d-8260-2b83ac28c4b5
> Status: Started
> Number of Bricks: 6
> Transport-type: tcp
> Hot Tier :
> Hot Tier Type : Distribute
> Number of Bricks: 2
> Brick1: 10.70.43.110:/d/backends/3/patchy_snap_mnt
> Brick2: 10.70.43.100:/d/backends/3/patchy_snap_mnt
> Cold Tier:
> Cold Tier Type : Disperse
> Number of Bricks: 1 x (3 + 1) = 4
> Brick3: 10.70.43.100:/d/backends/1/patchy_snap_mnt
> Brick4: 10.70.43.110:/d/backends/1/patchy_snap_mnt
> Brick5: 10.70.43.100:/d/backends/2/patchy_snap_mnt
> Brick6: 10.70.43.110:/d/backends/2/patchy_snap_mnt
> Options Reconfigured:
> cluster.tier-promote-frequency: 10
> cluster.tier-demote-frequency: 10
> cluster.write-freq-threshold: 0
> cluster.read-freq-threshold: 0
> performance.io-cache: off
> performance.quick-read: off
> features.ctr-enabled: on
> performance.readdir-ahead: on
> 
> 
> 
> on mount point>>
> tar -xvf /root/linux-4.1.2.tar.xz
> 
> on server (during I/O)>>
> 
> for i in {1..100} ; do  gluster snapshot create snap$i patchy
> no-timestamp;done;


Rafi,

Moving the bug to ON_QA would be valid only if there was an issue that was fixed with a patch, and the patch was available in a particular build (as mentioned in FIXED-IN-VERSION).

If this issue is not reproducible, this bug should be closed as CLOSED - WORKSFORME.

If there really was an issue and it was fixed, then provide the patch URL; once the patch is available in a build, update FIXED-IN-VERSION and move this bug to ON_QA.

I am moving this bug to ASSIGNED, as there are no new builds available.

Comment 6 SATHEESARAN 2015-08-10 03:16:46 UTC
Removing the FailedQA tag, as this case did not actually fail QA.

Comment 7 Raghavendra G 2016-01-27 07:13:16 UTC
This seems more like a glusterd/RPC issue than a tiering one. Perhaps we should change the component to rpc/glusterd?

<snip>

[2015-05-05 15:03:21.723499] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed
[2015-05-05 15:07:13.759341] W [glusterd-mgmt.c:190:gd_mgmt_v3_brick_op_fn] 0-management: snapshot brickop failed
[2015-05-05 15:07:13.759356] E [glusterd-mgmt.c:943:glusterd_mgmt_v3_brick_op] 0-management: Brick ops failed for operation Snapshot on local node
[2015-05-05 15:07:13.759362] E [glusterd-mgmt.c:2028:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Brick Ops Failed
[2015-05-05 15:08:19.961699] I [socket.c:3432:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2015-05-05 15:08:19.961717] E [rpcsvc.c:1299:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 39) to rpc-transport (socket.management)
[2015-05-05 15:08:19.961727] E [glusterd-utils.c:409:glusterd_submit_reply] 0-: Reply submission failed

</snip>

Comment 11 hari gowtham 2018-11-08 18:43:09 UTC
As tier is no longer being actively developed, I'm closing this bug. Feel free to reopen it if necessary.

Comment 12 Red Hat Bugzilla 2023-09-14 03:00:19 UTC
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 1000 days.