Bug 1443051 - Brick Multiplexing: Unable to activate Snapshot
Summary: Brick Multiplexing: Unable to activate Snapshot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
unspecified
urgent
Target Milestone: ---
: RHGS 3.3.0
Assignee: Mohit Agrawal
QA Contact: Anil Shah
URL:
Whiteboard: brick-multiplexing
Depends On:
Blocks: 1417151
TreeView+ depends on / blocked
 
Reported: 2017-04-18 11:51 UTC by Nag Pavan Chilakam
Modified: 2017-09-21 04:37 UTC (History)
5 users (show)

Fixed In Version: glusterfs-3.8.4-25
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-21 04:37:54 UTC
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:2774 0 normal SHIPPED_LIVE glusterfs bug fix and enhancement update 2017-09-21 08:16:29 UTC

Description Nag Pavan Chilakam 2017-04-18 11:51:35 UTC
Description of problem:
=======================
On a brick multiplexing setup, I am unable to activate snapshot

[root@dhcp35-45 rep2_19]# gluster snap activate rep2_19_snap1_GMT-2017.04.18-11.36.34
snapshot activate: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
[root@dhcp35-45 rep2_19]# gluster snap list
cross3_59_snap1_GMT-2017.04.18-11.35.42
rep2_19_snap1_GMT-2017.04.18-11.36.34
[root@dhcp35-45 rep2_19]# gluster snap activate cross3_59_snap1_GMT-2017.04.18-11.35.42
snapshot activate: failed: Commit failed on 10.70.35.122. Please check log file for details.
Commit failed on 10.70.35.23. Please check log file for details.
Commit failed on 10.70.35.112. Please check log file for details.

I retried and below is the new message(one snapshot says it is activated)
[root@dhcp35-45 glusterfs]# gluster snap activate cross3_59_snap1_GMT-2017.04.18-11.35.42
snapshot activate: failed: Snapshot cross3_59_snap1_GMT-2017.04.18-11.35.42 is already activated.
Snapshot command failed
[root@dhcp35-45 glusterfs]# gluster snap activate rep2_19_snap1_GMT-2017.04.18-11.36.34
snapshot activate: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
[root@dhcp35-45 glusterfs]# gluster v status cross3_59_snap1_GMT-2017.04.18-11.35.42
Volume cross3_59_snap1_GMT-2017.04.18-11.35.42 does not exist
[root@dhcp35-45 glusterfs]# gluster v status cross3_59_snap1
Volume cross3_59_snap1 does not exist



[root@dhcp35-45 glusterfs]# gluster snap status cross3_59
Snapshot Status : failed: Snapshot (cross3_59) does not exist
Snapshot command failed
[root@dhcp35-45 glusterfs]# gluster snap status 

Snap Name : cross3_59_snap1_GMT-2017.04.18-11.35.42
Snap UUID : 81ddf3a2-160f-4f45-b496-2a2741679d15

	Brick Path        :   10.70.35.122:/run/gluster/snaps/8130d35eaccb44f2ace7728c463a9a30/brick1/distrep_59
	Volume Group      :   RHS_vg3
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.22
	LV Size           :   5.00g


	Brick Path        :   10.70.35.112:/run/gluster/snaps/8130d35eaccb44f2ace7728c463a9a30/brick2/distrep_59
	Volume Group      :   RHS_vg3
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.22
	LV Size           :   5.00g


	Brick Path        :   10.70.35.23:/run/gluster/snaps/8130d35eaccb44f2ace7728c463a9a30/brick3/distrep_59
	Volume Group      :   RHS_vg3
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.22
	LV Size           :   5.00g


Snap Name : rep2_19_snap1_GMT-2017.04.18-11.36.34
Snap UUID : 23d015e8-4635-4a30-b449-b0bc82f589b8

	Brick Path        :   10.70.35.45:/run/gluster/snaps/aa911a40ce4749288a73aecb98d562d9/brick1/rep2_19
	Volume Group      :   RHS_vg2
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.12
	LV Size           :   10.00g


	Brick Path        :   10.70.35.130:/run/gluster/snaps/aa911a40ce4749288a73aecb98d562d9/brick2/rep2_19
	Volume Group      :   RHS_vg2
	Brick Running     :   No
	Brick PID         :   N/A
	Data Percentage   :   0.12
	LV Size           :   10.00g



Version-Release number of selected component (if applicable):
======
3.8.4-22

How reproducible:
==========
had the setup as in 1443025 - Brick Multiplexing: Volume gets unmounted during uss enable 
Took a snapshot for 2 of the volumes-->passed
tried to activated-->failed as above


[2017-04-18 11:42:21.728342] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:42:21.828454] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:42:22.004366] E [cli-rpc-ops.c:11269:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed
[2017-04-18 11:42:22.004508] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2017-04-18 11:46:05.804310] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:46:05.928075] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:46:05.949291] E [cli-rpc-ops.c:11269:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed
[2017-04-18 11:46:05.949371] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2017-04-18 11:46:24.714775] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:46:24.816249] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:46:24.927729] E [cli-rpc-ops.c:11269:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed
[2017-04-18 11:46:24.927811] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2017-04-18 11:46:43.139642] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:46:43.238321] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:46:43.245428] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2017-04-18 11:46:46.291106] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:46:46.389018] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:46:46.394033] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2017-04-18 11:46:48.748779] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:46:48.846629] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:46:48.847976] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2017-04-18 11:47:46.502347] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:47:46.605922] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:47:46.616373] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:47:46.809372] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:47:46.809506] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:47:46.809558] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:47:46.809585] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:47:47.048943] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:47:47.049012] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:47:47.049040] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:47:47.049159] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2017-04-18 11:48:00.445302] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:48:00.545555] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:48:01.862073] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2017-04-18 11:48:09.068137] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:48:09.166509] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:48:09.167636] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:48:09.167778] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2017-04-18 11:48:44.424826] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:48:44.525676] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:48:44.602699] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:48:44.602769] E [cli-rpc-ops.c:10989:gf_cli_snapshot_cbk] 0-cli: Failed to display snapshot status output.
[2017-04-18 11:48:44.602857] E [cli-rpc-ops.c:11269:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed
[2017-04-18 11:48:44.602957] I [input.c:31:cli_batch] 0-: Exiting with: -1
[2017-04-18 11:48:47.785363] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:48:47.881271] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:48:47.888654] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:48:48.055579] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:48:48.055741] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:48:48.055779] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:48:48.055803] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:48:48.214023] W [cli-rpc-ops.c:10677:gf_cli_generate_snapshot_event] 0-cli: Cannot generate event for unknown type.
[2017-04-18 11:48:48.214126] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:48:48.214164] I [cli-rpc-ops.c:10096:cli_get_snap_volume_status] 0-cli: Unable to get pid
[2017-04-18 11:48:48.214253] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2017-04-18 11:49:00.627469] I [cli.c:757:main] 0-cli: Started running gluster with version 3.8.4
[2017-04-18 11:49:00.725112] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-18 11:49:00.731420] I [input.c:31:cli_batch] 0-: Exiting with: -1

Comment 2 Nag Pavan Chilakam 2017-04-18 12:18:51 UTC
glusterd log

The message "I [MSGID: 106488] [glusterd-handler.c:1539:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 61 times between [2017-04-18 11:53:27.468745] and [2017-04-18 11:53:43.855660]
[2017-04-18 11:55:17.939480] I [MSGID: 106487] [glusterd-handler.c:1476:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-04-18 12:04:04.292755] I [MSGID: 106487] [glusterd-handler.c:1476:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-04-18 12:09:53.291071] E [MSGID: 106024] [glusterd-snapshot.c:5689:glusterd_snapshot_activate_deactivate_prevalidate] 0-management: Snapshot rep2_20_snap1_GMT-2017.04.18-12.08.32 is already deactivated.
[2017-04-18 12:09:53.291183] W [MSGID: 106042] [glusterd-snapshot.c:8794:glusterd_snapshot_prevalidate] 0-management: Snapshot deactivate validation failed
[2017-04-18 12:09:53.291212] W [MSGID: 106122] [glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed
[2017-04-18 12:09:53.291240] E [MSGID: 106122] [glusterd-mgmt.c:916:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node
[2017-04-18 12:09:53.291260] E [MSGID: 106122] [glusterd-mgmt.c:2272:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed
[2017-04-18 12:10:02.840547] E [MSGID: 106005] [glusterd-utils.c:5396:glusterd_brick_start] 0-management: Unable to start brick 10.70.35.45:/run/gluster/snaps/094dceaf071f4425936152795ab2f888/brick1/rep2_20
[2017-04-18 12:10:02.841015] E [MSGID: 106041] [glusterd-snapshot.c:6201:glusterd_snapshot_activate_commit] 0-management: Failed to activate snap volume 094dceaf071f4425936152795ab2f888 of the snap rep2_20_snap1_GMT-2017.04.18-12.08.32
[2017-04-18 12:10:02.841121] W [MSGID: 106041] [glusterd-snapshot.c:8551:glusterd_snapshot] 0-management: Failed to activate snapshot
[2017-04-18 12:10:02.841166] W [MSGID: 106123] [glusterd-mgmt.c:283:gd_mgmt_v3_commit_fn] 0-management: Snapshot Commit Failed
[2017-04-18 12:10:02.841192] E [MSGID: 106123] [glusterd-mgmt.c:1456:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
[2017-04-18 12:10:02.841211] E [MSGID: 106123] [glusterd-mgmt.c:2333:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed
[2017-04-18 12:10:40.401605] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-04-18 12:10:40.401756] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-04-18 12:10:40.404707] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-04-18 12:10:40.404797] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-04-18 12:10:40.460927] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4c82) [0x7ff647f88c82] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4735) [0x7ff647f88735] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7ff6533fc105] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=rep2_15 -o features.uss=enable --gd-workdir=/var/lib/glusterd
[2017-04-18 12:10:40.533349] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4c82) [0x7ff647f88c82] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4735) [0x7ff647f88735] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7ff6533fc105] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh --volname=rep2_15 -o features.uss=enable --gd-workdir=/var/lib/glusterd
[2017-04-18 12:10:44.278068] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick snapd-rep2_15 on port 49158
[2017-04-18 12:10:49.237155] W [MSGID: 101095] [xlator.c:147:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.8.4/xlator/features/ganesha.so: cannot open shared object file: No such file or directory
[2017-04-18 12:16:46.094428] E [MSGID: 106005] [glusterd-utils.c:5396:glusterd_brick_start] 0-management: Unable to start brick 10.70.35.45:/run/gluster/snaps/094dceaf071f4425936152795ab2f888/brick1/rep2_20
[2017-04-18 12:16:46.094808] E [MSGID: 106041] [glusterd-snapshot.c:6201:glusterd_snapshot_activate_commit] 0-management: Failed to activate snap volume 094dceaf071f4425936152795ab2f888 of the snap rep2_20_snap1_GMT-2017.04.18-12.08.32
[2017-04-18 12:16:46.094886] W [MSGID: 106041] [glusterd-snapshot.c:8551:glusterd_snapshot] 0-management: Failed to activate snapshot
[2017-04-18 12:16:46.094951] W [MSGID: 106123] [glusterd-mgmt.c:283:gd_mgmt_v3_commit_fn] 0-management: Snapshot Commit Failed
[2017-04-18 12:16:46.094979] E [MSGID: 106123] [glusterd-mgmt.c:1456:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
[2017-04-18 12:16:46.095037] E [MSGID: 106123] [glusterd-mgmt.c:2333:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed
(END)

Comment 3 Mohammed Rafi KC 2017-04-24 09:50:12 UTC
RCA:

With multiplexing enabled we are creating socket file in different location namely in volume configuration directory,

For example : /var/lib/glusterd/snaps/Scheduled-job1-vol0_GMT-2017.04.22-09.20.01/eeadecc528164cb7880e0f747c6cc710/run/daemon-gluster-arch-srv1.lab.eng.centos.com.socket 


So the issue is that unix socket has a imposed limit of 108 character in path. Most of the time with a decent snap name and hostname this limit will crosss, that is why the snap brick start is failing.

In previous case we were creating socket path in /var/run/gluster/2950de2ce6ccd8c926136c9696496797.socket

Comment 18 Atin Mukherjee 2017-05-04 13:03:48 UTC
upstream patch : https://review.gluster.org/#/c/17101/

Comment 23 Anil Shah 2017-05-31 09:48:58 UTC
Able to activate snapshot when brick multiplexing is enabled
Bug verified on build glusterfs-3.8.4-25.el7rhgs.x86_64

Comment 25 errata-xmlrpc 2017-09-21 04:37:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774


Note You need to log in before you can comment on or make changes to this bug.