Bug 765027 (GLUSTER-3295)

Summary: [44598a525afadf2602733d1da2dfa767b5b857f2]: glusterd crashed while doing rebalance
Product: [Community] GlusterFS
Reporter: Raghavendra Bhat <rabhat>
Component: glusterd
Assignee: Amar Tumballi <amarts>
Severity: medium
Priority: medium
Version: 3.1.5
CC: gluster-bugs, vraman
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Verified Versions: release-3.2, release-3.1

Description Raghavendra Bhat 2011-08-01 00:44:02 EDT
It's on the release-3.1 branch.
Comment 1 Amar Tumballi 2011-08-01 01:36:43 EDT
Seems to be a race. We need to change the set of variables we consider when deciding whether volume rebalance is in progress. This crash can happen on all the branches (3.1/3.2/master).
Comment 2 Raghavendra Bhat 2011-08-01 03:43:33 EDT
glusterd crashed while doing rebalance. Operations performed:

1) Created a replicate volume with replica count 2.

2) Mounted it through the fuse client and started untarring a linux tarball on the mount point.

3) Added 2 more bricks to the volume, making it a 2x2 distributed replicate volume.

4) Gave rebalance fix-layout start.

5) After fix-layout completed, gave migrate-data start.

6) While migrate-data was in progress, gave rebalance stop (gluster volume rebalance <volname> stop).

7) Then gave rebalance start (gluster volume rebalance <volname> start).

8) Started running rebalance status repeatedly; after many status calls glusterd crashed.

This is the backtrace.

Core was generated by `glusterd'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f051224387f in gf_glusterd_rebalance_fix_layout (volinfo=0x1116ff0, 
    dir=0x7f0510c3cf20 "/etc/glusterd/mount/mirror/linux-")
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-rebalance.c:344
344	                        volinfo->defrag->total_files += 1;
(gdb) bt
#0  0x00007f051224387f in gf_glusterd_rebalance_fix_layout (volinfo=0x1116ff0, 
    dir=0x7f0510c3cf20 "/etc/glusterd/mount/mirror/linux-")
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-rebalance.c:344
#1  0x00007f05122438a2 in gf_glusterd_rebalance_fix_layout (volinfo=0x1116ff0, 
    dir=0x7f0510c3d490 "/etc/glusterd/mount/mirror/linux-") at ../../../../../xlators/mgmt/glusterd/src/glusterd-rebalance.c:347
#2  0x00007f05122438a2 in gf_glusterd_rebalance_fix_layout (volinfo=0x1116ff0, dir=0x112aaf8 "/etc/glusterd/mount/mirror")
    at ../../../../../xlators/mgmt/glusterd/src/glusterd-rebalance.c:347
#3  0x00007f0512243b50 in glusterd_defrag_start (data=0x1116ff0) at ../../../../../xlators/mgmt/glusterd/src/glusterd-rebalance.c:407
#4  0x00007f0513de99ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#5  0x00007f0513b466fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()
(gdb) p volinfo->defrag
$1 = (glusterd_defrag_info_t *) 0x0
(gdb) l
339	                if (S_ISDIR (stbuf.st_mode)) {
340	                        /* Fix the layout of the directory */
341	                        sys_lgetxattr (full_path, "trusted.distribute.fix.layout",
342	                                       &value, 128);
344	                        volinfo->defrag->total_files += 1;
346	                        /* Traverse into subdirectory */
347	                        ret = gf_glusterd_rebalance_fix_layout (volinfo,
348	                                                                full_path);

Possible issue:

l glusterd_is_defrag_on
2256	        return ret;
2257	}
2259	int
2260	glusterd_is_defrag_on (glusterd_volinfo_t *volinfo)
2261	{
2262	        return ((volinfo->defrag_status == GF_DEFRAG_STATUS_LAYOUT_FIX_STARTED) ||
2263	                (volinfo->defrag_status == GF_DEFRAG_STATUS_MIGRATE_DATA_STARTED));
2264	}

glusterd_is_defrag_on reports that defrag is on only when the status is layout-fix started or migrate-data started. If the status is fix-layout completed or migrate-data completed, it is not considered, and glusterd treats defrag as not running.

In this case I stopped rebalance and started it again. If the status was already migrate-data completed by the time I issued the rebalance start, glusterd_is_defrag_on does not consider it and says defrag is off. glusterd_handle_defrag_start, which calls glusterd_is_defrag_on and gets the wrong answer that defrag is not on, creates one more thread for glusterd_defrag_start (overwriting the defrag->th value, which held the thread id of the previous rebalance thread) and starts rebalance. It reuses the previous volinfo->defrag instead of allocating a new one, because volinfo->defrag is allocated only if it is NULL. Since the previous rebalance invocation already allocated volinfo->defrag, both threads refer to the same object.

When the previous rebalance thread (whose handle is lost, since creating the second thread overwrote defrag->th) is about to complete, it sets volinfo->defrag to NULL and exits. The current rebalance thread is unaware that the previous thread was still running and has set volinfo->defrag to NULL, so the next time it accesses volinfo->defrag it crashes.
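
A minimal sketch of the gap described above, for illustration only. The sk_* types and the two *_COMPLETE state names are hypothetical stand-ins, not the real glusterd definitions; only the shape of the status-only check is taken from the listing of glusterd_is_defrag_on. It shows that once the status reaches a completed state the check says "off", even though the defrag pointer is still owned by a live thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the glusterd types. */
typedef enum {
    SK_NOT_STARTED,
    SK_LAYOUT_FIX_STARTED,
    SK_LAYOUT_FIX_COMPLETE,     /* assumed name */
    SK_MIGRATE_DATA_STARTED,
    SK_MIGRATE_DATA_COMPLETE    /* assumed name */
} sk_defrag_status_t;

typedef struct {
    unsigned long total_files;
} sk_defrag_info_t;

typedef struct {
    sk_defrag_status_t defrag_status;
    sk_defrag_info_t  *defrag;  /* non-NULL while a rebalance thread owns it */
} sk_volinfo_t;

/* Same shape as the status-only check: only the two "started"
 * states count as running. */
static int sk_is_defrag_on_old(const sk_volinfo_t *vol)
{
    assert(vol != NULL);
    return (vol->defrag_status == SK_LAYOUT_FIX_STARTED) ||
           (vol->defrag_status == SK_MIGRATE_DATA_STARTED);
}

/* Returns 1 when a start request would be accepted although the
 * previous thread has not yet released vol->defrag, i.e. the window
 * in which two threads end up sharing one defrag object. */
static int sk_would_double_start(const sk_volinfo_t *vol)
{
    return !sk_is_defrag_on_old(vol) && vol->defrag != NULL;
}
```

With the status at a completed state and defrag still non-NULL, sk_would_double_start returns 1; that is the situation in which the second thread later dereferences a pointer the first thread has set to NULL.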
Comment 3 Anand Avati 2011-08-03 22:23:55 EDT
CHANGE: http://review.gluster.com/138 (due to the race, there was a possibility of having two (or more) threads doing) merged in master by Anand Avati (avati@gluster.com)
Comment 4 Anand Avati 2011-08-03 22:32:41 EDT
CHANGE: http://review.gluster.com/140 (due to the race, there was a possibility of having two (or more) threads doing) merged in release-3.1 by Anand Avati (avati@gluster.com)
Comment 5 Anand Avati 2011-08-03 22:32:57 EDT
CHANGE: http://review.gluster.com/139 (due to the race, there was a possibility of having two (or more) threads doing) merged in release-3.2 by Anand Avati (avati@gluster.com)
Comment 6 Amar Tumballi 2011-08-03 22:38:52 EDT
The race condition which could have caused this crash is fixed now. The ultimate fix for all rebalance-related issues should come once we make rebalance cluster-friendly (i.e., like other CLI ops).
Comment 7 Raghavendra Bhat 2011-08-19 05:34:21 EDT
This is a very racy condition: it requires the rebalance status to be something other than layout-fix started or migrate-data started (for example one of the completed states, which were not considered when deciding whether rebalance is running).

In that situation one more rebalance start command concludes that rebalance is not running and creates one more thread. That thread uses the same defrag structure as the first thread, because the defrag structure is allocated only if it is NULL, so when both threads access the same structure the process crashes.

That racy code has now been removed. To decide whether rebalance is already running, glusterd no longer looks at whether fix-layout or migrate-data has started. Instead it checks volinfo->defrag directly and, if it is not NULL, considers rebalance to be running.
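
A rough sketch of the pointer-based check described in this comment, not the actual patch. All sk_* names are hypothetical, and the sketch is single-threaded: it ignores whatever locking the real code needs around the check and the allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the glusterd types. */
typedef struct {
    unsigned long total_files;
} sk_defrag_info_t;

typedef struct {
    sk_defrag_info_t *defrag;   /* non-NULL exactly while rebalance runs */
} sk_volinfo_t;

/* Rebalance counts as running whenever the defrag object exists,
 * regardless of which status it has reached. */
static int sk_is_defrag_on(const sk_volinfo_t *vol)
{
    assert(vol != NULL);
    return vol->defrag != NULL;
}

/* Hypothetical start path: returns -1 if a rebalance already owns
 * the defrag object or allocation fails, 0 on success. */
static int sk_defrag_start(sk_volinfo_t *vol)
{
    if (sk_is_defrag_on(vol))
        return -1;
    vol->defrag = calloc(1, sizeof(*vol->defrag));
    return vol->defrag ? 0 : -1;
}

/* Hypothetical completion path: release the object, then clear the
 * pointer so a later start is allowed again. */
static void sk_defrag_finish(sk_volinfo_t *vol)
{
    free(vol->defrag);
    vol->defrag = NULL;
}
```

Because the same pointer both gates the start and is cleared on completion, a second start is refused for as long as the first run still holds the structure.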