Description of problem: creating a distributed snapshot on thin-provisioned LVs fails.

Steps to Reproduce:
1. create bricks on a thin-LV
2. create a distributed volume
3. gluster snapshot ...

Actual results:
% time sudo gluster snapshot create storage_snap storage_www
snapshot create: failed: Commit failed on localhost. Please check log file for details.
Snapshot command failed
sudo gluster snapshot create storage_snap storage_www  0.03s user 0.01s system 0% cpu 7.049 total

for each try i get a LV left over on the node where i issue the above command:

% sudo lvs
  LV                                 VG       Attr      LSize  Pool          Origin Data%  Move Log Copy% Convert
  acc0f3d72e2940d7ba28c9485a16c447_0 vg_crypt Vwi-a-tz- 42.00t brickthinpool data   0.01
  b66cf2b9867b49308b6352ddbe6894d0_0 vg_crypt Vwi-a-tz- 42.00t brickthinpool data   0.01
  brickthinpool                      vg_crypt twi-a-tz- 45.00t                      0.02
  data                               vg_crypt Vwi-aotz- 42.00t brickthinpool        0.01
  e8f75cf4fac74027b59545b72646e1dd_0 vg_crypt Vwi-a-tz- 42.00t brickthinpool data   0.01

Expected results: snapshot is taken.

Additional info: the volume contains one file and both brick hosts are idle. when i run the command on another brick host, i get similar results.
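for completeness, the setup roughly looks like this (device, mount, and host names below are placeholders, not taken from my actual config):

```shell
# hypothetical reproduction sketch -- VG/host/path names are placeholders
# 1. bricks on a thin-provisioned LV
lvcreate --type thin-pool -L 45T -n brickthinpool vg_crypt
lvcreate --type thin -V 42T --thinpool vg_crypt/brickthinpool -n data
mkfs.xfs /dev/vg_crypt/data
mount /dev/vg_crypt/data /bricks/brick1

# 2. distributed volume across two hosts
gluster volume create storage_www host1:/bricks/brick1 host2:/bricks/brick1
gluster volume start storage_www

# 3. take the snapshot (this is the step that fails)
gluster snapshot create storage_snap storage_www
```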
log:
[2015-01-19 17:38:50.439886] D [cli.c:612:cli_rpc_init] 0-cli: Connecting to glusterd using default socket
[2015-01-19 17:38:50.439921] D [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glusterfs: defaulting frame-timeout to 30mins
[2015-01-19 17:38:50.439929] D [rpc-clnt.c:986:rpc_clnt_connection_init] 0-glusterfs: disable ping-timeout
[2015-01-19 17:38:50.439938] D [rpc-transport.c:262:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib/x86_64-linux-gnu/glusterfs/3.6.1/rpc-transport/socket.so
[2015-01-19 17:38:50.440259] D [socket.c:3684:socket_init] 0-glusterfs: disabling nodelay
[2015-01-19 17:38:50.440270] D [socket.c:3799:socket_init] 0-glusterfs: SSL support on the I/O path is NOT enabled
[2015-01-19 17:38:50.440275] D [socket.c:3802:socket_init] 0-glusterfs: SSL support for glusterd is NOT enabled
[2015-01-19 17:38:50.440280] D [socket.c:3819:socket_init] 0-glusterfs: using system polling thread
[2015-01-19 17:38:50.440288] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2015-01-19 17:38:50.440294] T [socket.c:2871:socket_connect] 0-glusterfs: connecting 0x7f5aaff0f5a0, state=0 gen=0 sock=-1
[2015-01-19 17:38:50.440301] T [name.c:290:af_unix_client_get_remote_sockaddr] 0-glusterfs: using connect-path /var/run/glusterd.socket
[2015-01-19 17:38:50.440310] T [name.c:106:af_unix_client_bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to assign default value
[2015-01-19 17:38:50.440359] D [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glusterfs: defaulting frame-timeout to 30mins
[2015-01-19 17:38:50.440367] D [rpc-clnt.c:986:rpc_clnt_connection_init] 0-glusterfs: disable ping-timeout
[2015-01-19 17:38:50.440375] D [rpc-transport.c:262:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib/x86_64-linux-gnu/glusterfs/3.6.1/rpc-transport/socket.so
[2015-01-19 17:38:50.440397] D [socket.c:3799:socket_init] 0-glusterfs: SSL support on the I/O path is NOT enabled
[2015-01-19 17:38:50.440402] D [socket.c:3802:socket_init] 0-glusterfs: SSL support for glusterd is NOT enabled
[2015-01-19 17:38:50.440407] D [socket.c:3819:socket_init] 0-glusterfs: using system polling thread
[2015-01-19 17:38:50.440427] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2015-01-19 17:38:50.440432] T [socket.c:2871:socket_connect] 0-glusterfs: connecting 0x7f5aaff149f0, state=0 gen=0 sock=-1
[2015-01-19 17:38:50.440438] T [name.c:290:af_unix_client_get_remote_sockaddr] 0-glusterfs: using connect-path /tmp/quotad.socket
[2015-01-19 17:38:50.440446] T [name.c:106:af_unix_client_bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to assign default value
[2015-01-19 17:38:50.440463] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[... repeated identical "D [registry.c:408:cli_cmd_register] 0-cli: Returning 0" entries elided ...]
[2015-01-19 17:38:50.474027] D [cli-cmd-volume.c:1799:cli_check_gsync_present] 0-cli: Returning 0
[2015-01-19 17:38:50.474069] D [registry.c:408:cli_cmd_register] 0-cli: Returning 0
[... repeated identical "D [registry.c:408:cli_cmd_register] 0-cli: Returning 0" entries elided ...]
[2015-01-19 17:38:50.474382] T [cli.c:264:cli_rpc_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2015-01-19 17:38:50.474394] T [cli-quotad-client.c:94:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2015-01-19 17:38:50.474400] I [socket.c:2344:socket_event_handler] 0-transport: disconnecting now
[2015-01-19 17:38:50.474414] T [cli-quotad-client.c:100:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2015-01-19 17:38:50.474459] T [rpc-clnt.c:1381:rpc_clnt_record] 0-glusterfs: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2015-01-19 17:38:50.474479] T [rpc-clnt.c:1238:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 264, payload: 200, rpc hdr: 64
[2015-01-19 17:38:50.474704] T [socket.c:2863:socket_connect] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f5aae800c96] (--> /usr/lib/x86_64-linux-gnu/glusterfs/3.6.1/rpc-transport/socket.so(+0x5d79)[0x7f5aab904d79] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_submit+0x3a4)[0x7f5aae17cde4] (--> gluster(cli_submit_request+0x149)[0x7f5aaecb1009] (--> gluster(cli_cmd_submit+0x7f)[0x7f5aaecb2bdf] ))))) 0-glusterfs: connect () called on transport already connected
[2015-01-19 17:38:50.474732] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 39) to rpc-transport (glusterfs)
[2015-01-19 17:38:50.474743] D [rpc-clnt-ping.c:231:rpc_clnt_start_ping] 0-glusterfs: ping timeout is 0, returning
[2015-01-19 17:38:53.440595] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2015-01-19 17:38:53.440840] T [socket.c:2863:socket_connect] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f5aae800c96] (--> /usr/lib/x86_64-linux-gnu/glusterfs/3.6.1/rpc-transport/socket.so(+0x5d79)[0x7f5aab904d79] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_reconnect+0xb9)[0x7f5aae17b339] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_timer_proc+0xd0)[0x7f5aae81ac90] (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f5aadbf3182] ))))) 0-glusterfs: connect () called on transport already connected
[2015-01-19 17:38:53.440855] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2015-01-19 17:38:53.440861] T [socket.c:2871:socket_connect] 0-glusterfs: connecting 0x7f5aaff149f0, state=0 gen=0 sock=-1
[2015-01-19 17:38:53.440871] T [name.c:290:af_unix_client_get_remote_sockaddr] 0-glusterfs: using connect-path /tmp/quotad.socket
[2015-01-19 17:38:53.440884] T [name.c:106:af_unix_client_bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to assign default value
[2015-01-19 17:38:53.440941] T [cli-quotad-client.c:94:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2015-01-19 17:38:53.440959] I [socket.c:2344:socket_event_handler] 0-transport: disconnecting now
[2015-01-19 17:38:53.440978] T [cli-quotad-client.c:100:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2015-01-19 17:38:56.441116] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2015-01-19 17:38:56.441361] T [socket.c:2863:socket_connect] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f5aae800c96] (--> /usr/lib/x86_64-linux-gnu/glusterfs/3.6.1/rpc-transport/socket.so(+0x5d79)[0x7f5aab904d79] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_reconnect+0xb9)[0x7f5aae17b339] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_timer_proc+0xd0)[0x7f5aae81ac90] (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f5aadbf3182] ))))) 0-glusterfs: connect () called on transport already connected
[2015-01-19 17:38:56.441377] T [rpc-clnt.c:418:rpc_clnt_reconnect] 0-glusterfs: attempting reconnect
[2015-01-19 17:38:56.441383] T [socket.c:2871:socket_connect] 0-glusterfs: connecting 0x7f5aaff149f0, state=0 gen=0 sock=-1
[2015-01-19 17:38:56.441392] T [name.c:290:af_unix_client_get_remote_sockaddr] 0-glusterfs: using connect-path /tmp/quotad.socket
[2015-01-19 17:38:56.441403] T [name.c:106:af_unix_client_bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to assign default value
[2015-01-19 17:38:56.441457] T [cli-quotad-client.c:94:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2015-01-19 17:38:56.441475] I [socket.c:2344:socket_event_handler] 0-transport: disconnecting now
[2015-01-19 17:38:56.441493] T [cli-quotad-client.c:100:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2015-01-19 17:38:57.482590] T [rpc-clnt.c:660:rpc_clnt_reply_init] 0-glusterfs: received rpc message (RPC XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 39) from rpc-transport (glusterfs)
[2015-01-19 17:38:57.482696] D [cli-cmd.c:384:cli_cmd_submit] 0-cli: Returning -1
[2015-01-19 17:38:57.482714] E [cli-rpc-ops.c:9189:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed
[2015-01-19 17:38:57.482721] D [cli-rpc-ops.c:9234:gf_cli_snapshot] 0-cli: Returning -1
[2015-01-19 17:38:57.482744] D [cli-cmd-snapshot.c:75:cli_cmd_snapshot_cbk] 0-cli: frame->local is not NULL (0x7f5a9c000b80)
[2015-01-19 17:38:57.482755] I [input.c:36:cli_batch] 0-: Exiting with: -1
Can you also provide glusterd logs from all nodes?
Comment from pille <pille+redhat+bugzilla>: that's a good hint. reading this, i have to add that /run is a tmpfs without xattr.

node 1) [local]
[2015-01-19 17:38:57.269064] E [glusterd-snapshot.c:3997:glusterd_snap_brick_create] 0-management: Failed to set extended attribute trusted.glusterfs.volume-id on /run/gluster/snaps/b66cf2b9867b49308b6352ddbe6894d0/brick2. Reason: No data available, snap: b66cf2b9867b49308b6352ddbe6894d0
[2015-01-19 17:38:57.269093] W [glusterd-snapshot.c:4004:glusterd_snap_brick_create] 0-management: unmounting the snap brick mount /run/gluster/snaps/b66cf2b9867b49308b6352ddbe6894d0/brick2
[2015-01-19 17:38:57.287664] E [glusterd-snapshot.c:4309:glusterd_take_brick_snapshot] 0-management: not able to create the brick for the snap storage_snap, volume b66cf2b9867b49308b6352ddbe6894d0
[2015-01-19 17:38:57.287686] E [glusterd-snapshot.c:5366:glusterd_take_brick_snapshot_task] 0-management: Failed to take backend snapshot for brick 2a.gluster.xxxxx.com:/run/gluster/snaps/b66cf2b9867b49308b6352ddbe6894d0/brick2 volume(b66cf2b9867b49308b6352ddbe6894d0)
[2015-01-19 17:38:57.287719] E [glusterd-snapshot.c:5503:glusterd_schedule_brick_snapshot] 0-management: Failed to create snapshot
[2015-01-19 17:38:57.287739] E [glusterd-snapshot.c:5621:glusterd_snapshot_create_commit] 0-management: Failed to take backend snapshot storage_snap
[2015-01-19 17:38:57.287790] W [glusterd-snapshot.c:2514:glusterd_snap_volume_remove] 0-management: Failed to remove lvm snapshot volume b66cf2b9867b49308b6352ddbe6894d0
[2015-01-19 17:38:57.288194] E [glusterd-snapshot.c:6792:glusterd_snapshot] 0-management: Failed to create snapshot
[2015-01-19 17:38:57.288204] W [glusterd-mgmt.c:223:gd_mgmt_v3_commit_fn] 0-management: Snapshot Commit Failed
[2015-01-19 17:38:57.288210] E [glusterd-mgmt.c:1139:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node
[2015-01-19 17:38:57.288216] E [glusterd-mgmt.c:1906:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed
[2015-01-19 17:38:57.481014] I [glusterd-snapshot.c:5278:glusterd_do_snap_cleanup] 0-management: Snapshot (storage_snap) does not exist

node 2)
[2015-01-19 17:38:57.486189] I [glusterd-snapshot.c:5278:glusterd_do_snap_cleanup] 0-management: Snapshot (storage_snap) does not exist
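the "Reason: No data available" (ENODATA) on trusted.glusterfs.volume-id matches a filesystem that rejects extended attributes. whether a given mount accepts xattrs can be probed directly; a small sketch (the helper name is mine, not part of gluster, and it probes the user.* namespace since trusted.* needs root):

```python
import errno
import os
import tempfile

def supports_xattr(path, name="user.glusterfs.probe"):
    """Return True if the filesystem backing `path` accepts extended
    attributes. glusterd sets trusted.glusterfs.volume-id (trusted.*
    namespace, root only); a filesystem without xattr support fails
    the same way for user.* (ENOTSUP / ENODATA)."""
    try:
        os.setxattr(path, name, b"1")
        os.removexattr(path, name)
        return True
    except OSError as e:
        if e.errno in (errno.ENOTSUP, errno.ENODATA):
            return False
        raise

# probe the filesystem holding the temp directory
with tempfile.NamedTemporaryFile() as f:
    print(supports_xattr(f.name))
```

run against a file under /run (tmpfs) vs. one on an xfs brick, this should show the difference glusterd is tripping over.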
btw: i've created /run/gluster as an xfs filesystem on all nodes. the issue, including the log, stays the same. the snaps subdirectory was created.
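roughly what i did on each node (LV name and size here are placeholders, not my exact setup):

```shell
# hypothetical: back /run/gluster with a small xfs LV instead of tmpfs,
# so trusted.* xattrs can be set on the snap brick mounts
lvcreate -L 1G -n rungluster vg_crypt
mkfs.xfs /dev/vg_crypt/rungluster
mkdir -p /run/gluster
mount /dev/vg_crypt/rungluster /run/gluster
```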
so what's the progress on this? i've had snapshots working in the meantime, but after an issue with the LVM layer (the thinpool's tmeta ran full on 3 of 4 nodes) i see this behavior again on one node. the LVM issue was resolved by lvextending the tmeta, and gluster eventually worked again as if nothing had happened. but i cannot take any new snapshots. i'm on 3.6.2 now.
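for anyone hitting the same tmeta issue: checking the fill level and growing the pool metadata looks roughly like this (pool/VG names assumed from the lvs output above):

```shell
# check thin-pool data and metadata usage
lvs -o lv_name,data_percent,metadata_percent vg_crypt

# grow the pool's metadata LV (tmeta) when it is (nearly) full
lvextend --poolmetadatasize +1G vg_crypt/brickthinpool
```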
This bug is being closed as GlusterFS-3.6 is nearing its End-Of-Life and only important security bugs will be fixed. If you still face this bug with the newer GlusterFS versions, please open a new bug.