Bug 1047502

Summary: [SNAPSHOT]: Restarting glusterd and listing CG resulted in glusterd crash
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: senaik
Component: snapshot
Assignee: Sachin Pandit <spandit>
Status: CLOSED DEFERRED
QA Contact: Sudhir D <sdharane>
Severity: high
Docs Contact: senaik
Priority: high
Version: rhgs-3.0
CC: nsathyan, rhs-bugs, rjoseph, ssamanta, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: SNAPSHOT
Fixed In Version: 3.4.1.snap.jan15.2014
Doc Type: Bug Fix
Last Closed: 2014-05-26 07:23:49 UTC
Type: Bug

Description senaik 2013-12-31 11:53:52 UTC
Description of problem:
======================
Listing a CG with the -d option after restarting glusterd resulted in a glusterd crash.


Version-Release number of selected component (if applicable):
============================================================
glusterfs 3.4.0.snap.dec30.2013git

How reproducible:


Steps to Reproduce:
==================
1. Create two distribute volumes and start them.

2. Mount the volumes and create some files.

3. Create a CG of both volumes.
[root@snapshot-01 brick7]# gluster snapshot create vol1 vol2 -n CG1
snapshot create: CG1: consistency group created successfully

4. Create another CG of both volumes, this time with a description.
[root@snapshot-01 brick7]# gluster snapshot create vol1 vol2 -n CG2 -d "this is the first_CG"
snapshot create: CG2: consistency group created successfully

5. List the CG in detail with the -d option.

[root@snapshot-01 brick7]# gluster snapshot list -c CG2 -d
CG Name : CG2
CG ID : 772ee2e6-2e0a-4503-aa37-a58e3d074983
CG Description : Does not exist
CG Status : Init

Volume Name : vol1
Number of snaps taken : 2
Number of snaps available : 254

	Snap Name : CG1_vol1_snap
	Snap Time : 2013-12-31 06:08:13
	Snap ID : c0c78145-20a5-4036-9b21-e442d6259ad8
	CG Name : CG1
	CG ID : d3a2a7e7-cb07-40ed-a028-2450d16d62c4
	Snap Description : Description not present
	Snap Status : In-use

	Snap Name : CG2_vol1_snap
	Snap Time : 2013-12-31 06:09:12
	Snap ID : 50b78f90-a12c-4d99-b807-d24b2d555bf7
	CG Name : CG2
	CG ID : 772ee2e6-2e0a-4503-aa37-a58e3d074983
	Snap Description : this is the first_CG
	Snap Status : In-use

Volume Name : vol2
Number of snaps taken : 2
Number of snaps available : 254

	Snap Name : CG1_vol2_snap
	Snap Time : 2013-12-31 06:08:14
	Snap ID : c51de252-1c37-4e3a-b537-f3b39a19eca9
	CG Name : CG1
	CG ID : d3a2a7e7-cb07-40ed-a028-2450d16d62c4
	Snap Description : Description not present
	Snap Status : In-use

	Snap Name : CG2_vol2_snap
	Snap Time : 2013-12-31 06:09:12
	Snap ID : ffd861f9-0c08-4964-97e8-012ffc2e7949
	CG Name : CG2
	CG ID : 772ee2e6-2e0a-4503-aa37-a58e3d074983
	Snap Description : this is the first_CG
	Snap Status : In-use

6. List the CGs from different nodes multiple times.

7. On the first node, restart glusterd and list the CG again.

[root@snapshot-01 brick7]# service glusterd restart
Starting glusterd:                                         [  OK  ]
[root@snapshot-01 brick7]# gluster snapshot list -c CG1 -d
Connection failed. Please check if gluster daemon is operational.
Snapshot command failed

8. On the second node, restart glusterd and list the CG again.
[root@snapshot-02 vol2]# gluster snapshot list -c CG2 -d
Snapshot command failed


Actual results:
==============
After restarting glusterd and listing the CG, glusterd crashed.


Expected results:
================
glusterd should not crash; the CG should be listed successfully after the restart.


Additional info:
===============

[2013-12-31 04:33:56.004455] I [glusterd-handler.c:3294:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.43.32 (0), ret: 0
pending frames:
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-12-31 04:34:03
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.snap.dec30.2013git
/lib64/libc.so.6[0x3e330329a0]
/usr/lib64/glusterfs/3.4.0.snap.dec30.2013git/xlator/mgmt/glusterd.so(+0xabc86)[0x7fe90d603c86]
/usr/lib64/glusterfs/3.4.0.snap.dec30.2013git/xlator/mgmt/glusterd.so(+0xac1dc)[0x7fe90d6041dc]
/usr/lib64/glusterfs/3.4.0.snap.dec30.2013git/xlator/mgmt/glusterd.so(glusterd_handle_snapshot_list+0x708)[0x7fe90d604fc8]
/usr/lib64/glusterfs/3.4.0.snap.dec30.2013git/xlator/mgmt/glusterd.so(glusterd_handle_snapshot_fn+0x42f)[0x7fe90d60b7ff]
/usr/lib64/glusterfs/3.4.0.snap.dec30.2013git/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7fe90d5813cf]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x30e7a4cdd2]
/lib64/libc.so.6[0x3e33043bf0]

Comment 3 Sachin Pandit 2014-01-16 06:13:41 UTC
This happened when the snap description contained multiple words.
Until now the (key, value) handling assumed a single-word value (no spaces in between), but a description can contain multiple words, so reading the value back during glusterd restart caused a mismatch in the key/value pairs. This has been fixed by patch http://review.gluster.org/#/c/6543/.
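
To illustrate the class of problem only (this is not the actual glusterd store-handling code; the format string, buffer sizes, and the alternative parse below are assumptions made purely for illustration): a reader that takes whitespace-delimited "key value" tokens captures just the first word of a multi-word description, and the leftover words then throw the subsequent key/value reads out of sync. A minimal standalone sketch:

/* Standalone sketch of the failure mode, illustrative only. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[] = "description this is the first_CG";
    char key[64] = "", value[256] = "";

    /* Single-word assumption: %s stops at the first space, so only
     * "this" is captured and the remaining words are left behind. */
    sscanf(line, "%63s %255s", key, value);
    printf("single-word parse: key=%s value=%s\n", key, value);

    /* Multi-word handling: treat everything after the first space
     * as the value, so the whole description survives. */
    char *sep = strchr(line, ' ');
    if (sep != NULL)
        printf("multi-word parse:  key=%.*s value=%s\n",
               (int)(sep - line), line, sep + 1);
    return 0;
}

The referenced patch fixes the multi-word description handling in glusterd itself; the sketch above only shows why a single-word parse loses part of the value.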

Comment 4 senaik 2014-01-20 12:05:11 UTC
Version :  3.4.1.snap.jan15.2014git
=========

Repeated the steps mentioned in the Description: after creating CGs with multiple volumes, listing them before and after restarting glusterd did not result in a glusterd crash.

Marking the bug as 'Verified'

Comment 6 Nagaprasad Sathyanarayana 2014-04-21 06:18:06 UTC
Marking snapshot BZs to RHS 3.0.

Comment 7 ssamanta 2014-04-21 11:16:15 UTC
Not in scope for Denali.

Comment 8 Nagaprasad Sathyanarayana 2014-05-06 12:06:00 UTC
Fixing RHS 3.0 flags.

Comment 9 Nagaprasad Sathyanarayana 2014-05-06 12:12:53 UTC
Fixing RHS 3.0 flags.

Comment 10 Nagaprasad Sathyanarayana 2014-05-19 10:56:42 UTC
Setting flags required to add BZs to RHS 3.0 Errata

Comment 11 rjoseph 2014-05-26 07:23:49 UTC
CG functionality has been removed from the snapshot code. Therefore, moving the bug to the closed state.