Bug 762646 (GLUSTER-914) - [3.0.4] Crash in afr_opendir_cbk
Summary: [3.0.4] Crash in afr_opendir_cbk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-914
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.0.3
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Pavan Vilas Sondur
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-05-10 06:08 UTC by Anush Shetty
Modified: 2015-12-01 16:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTNR
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Pavan Vilas Sondur 2010-05-10 03:15:27 UTC
<snip>

	LOCK (&frame->lock);
	{
		local = frame->local;

		if (op_ret >= 0)
			local->op_ret = op_ret;

		local->op_errno = op_errno;
	}
	UNLOCK (&frame->lock);

	call_count = afr_frame_return (frame);

	if (call_count == 0) {
                if (local->op_ret == 0) {
                        ret = afr_fd_ctx_set (this, local->fd); <<<

<snip>

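The bug is in how local->op_ret is set. If the last opendir reply to arrive (the one that brings call_count to 0) is a failure while earlier replies succeeded, local->op_ret still holds the earlier success, so the success path runs from a callback whose own fd argument is NULL, and the process crashes.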
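The failure mode described above can be replayed in a small standalone sketch. It is not the actual patch (the patch contents are not shown in this report), and every name in it (sketch_fd, sketch_local, sketch_opendir_cbk, sketch_replay) is a hypothetical stand-in rather than a real GlusterFS structure or function. It assumes the fix amounts to using only the fd saved from the original request and never touching the fd argument of the callback. The frame lock is omitted because the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real AFR types. */
struct sketch_fd {
        int ctx_set;              /* 1 once the fd context has been set */
};

struct sketch_local {
        int               op_ret;     /* aggregated result: 0 once any child succeeds */
        int               op_errno;   /* errno from the most recent reply */
        int               call_count; /* replies still outstanding */
        struct sketch_fd *fd;         /* fd saved from the original request */
};

/*
 * Simplified opendir callback. Returns 1 if this was the last reply and
 * the fd context was set, 0 otherwise.
 */
static int
sketch_opendir_cbk (struct sketch_local *local, int op_ret, int op_errno,
                    struct sketch_fd *fd)
{
        /* The fd argument is NULL when this child failed, so it is
         * deliberately never dereferenced. */
        (void) fd;

        assert (local != NULL && local->fd != NULL);

        if (op_ret >= 0)
                local->op_ret = op_ret;
        local->op_errno = op_errno;

        if (--local->call_count != 0)
                return 0;

        if (local->op_ret == 0) {
                /* Use the fd saved at request time; it stays valid even
                 * when the final reply is a failure. */
                local->fd->ctx_set = 1;
                return 1;
        }

        return 0;
}

/* Replays the reported scenario: two successes, then two failures. */
static int
sketch_replay (void)
{
        struct sketch_fd    request_fd = { 0 };
        struct sketch_local local      = { -1, 0, 4, &request_fd };

        sketch_opendir_cbk (&local, 0, 0, &request_fd);
        sketch_opendir_cbk (&local, 0, 0, &request_fd);
        sketch_opendir_cbk (&local, -1, 19, NULL); /* op_errno as in the backtrace */
        sketch_opendir_cbk (&local, -1, 19, NULL); /* last reply fails, fd is NULL */

        return request_fd.ctx_set;
}
```

Calling sketch_replay () runs the aggregated success path on the final, failed reply without reading the NULL fd argument. Dereferencing that argument at the same point would reproduce the kind of crash reported here.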

Comment 1 Anush Shetty 2010-05-10 06:08:54 UTC
Found this crash using error-gen on a replicate volume with 4 subvolumes, 2 of which were error-gen subvolumes.

volume replicate
  type cluster/replicate
  subvolumes client1 client2 client3-error-gen client4-error-gen
end-volume


(gdb) bt
#0  0x00007f410694c18e in afr_opendir_cbk (frame=0x143a3d0, cookie=0x143d680, this=0x142bed0, op_ret=-1, op_errno=19, fd=0x0) at afr-dir-read.c:242
#1  0x00007f4106b9854d in error_gen_opendir (frame=0x143d680, this=0x142bbd0, loc=0x14350b8, fd=0x14426c0) at error-gen.c:1272
#2  0x00007f410694c909 in afr_opendir (frame=0x143a3d0, this=0x142bed0, loc=0x14350b8, fd=0x14426c0) at afr-dir-read.c:314
#3  0x00007f4107f8078b in default_opendir (frame=0x143a370, this=0x142c790, loc=0x14350b8, fd=0x14426c0) at defaults.c:701
#4  0x00007f4106b986d3 in error_gen_opendir (frame=0x143af40, this=0x142d0b0, loc=0x14350b8, fd=0x14426c0) at error-gen.c:1276
#5  0x00007f4107f8078b in default_opendir (frame=0x1441040, this=0x142d340, loc=0x14350b8, fd=0x14426c0) at defaults.c:701
#6  0x00007f410630f6e2 in fuse_opendir (this=0x1425550, finh=0x143aba0, msg=0x143abc8) at fuse-bridge.c:2131
#7  0x00007f410631433e in fuse_thread_proc (data=0x1425550) at fuse-bridge.c:3191
#8  0x00007f4107b46a04 in start_thread (arg=<value optimized out>) at pthread_create.c:300
#9  0x00007f41078b080d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Comment 2 Anand Avati 2010-05-21 04:31:40 UTC
PATCH: http://patches.gluster.com/patch/3271 in master (cluster/afr: Don't dereference fd ptr - it might be NULL due to a failed call.)

Comment 3 Anand Avati 2010-05-21 04:32:22 UTC
PATCH: http://patches.gluster.com/patch/3272 in release-3.0 (cluster/afr: Don't dereference fd ptr - it might be NULL due to a failed call.)

