Bug 762784 (GLUSTER-1052) - Crash in server_lookup_cbk
Summary: Crash in server_lookup_cbk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-1052
Product: GlusterFS
Classification: Community
Component: protocol
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-07-06 09:45 UTC by Anush Shetty
Modified: 2013-12-19 00:04 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Anush Shetty 2010-07-06 09:45:52 UTC
Caught this while running dbench on a dht+afr setup with error-gen loaded below the server protocol translator.

Server vol file:
volume posix4
  type storage/posix 
  option directory /mnt/exportnew1
end-volume

volume locks
  type features/posix-locks
  option mandatory on
  subvolumes posix4
end-volume


volume iot
 type performance/io-threads
 option thread-count 8
 subvolumes locks
end-volume

volume brick4
  type debug/error-gen
  option failure 5
  subvolumes iot
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option listen-port 8181
  subvolumes brick4
  option auth.addr.brick4.allow *              
end-volume

(gdb) bt
#0  0x00007fcab3086106 in server_lookup_cbk (frame=0x1ed6700, cookie=0x1ed8da8, this=0x1eb6158, op_ret=0, op_errno=22, inode=0x1ed67a8, 
    stbuf=0x7fcab1efde60, dict=0x0, postparent=0x7fcab1efddf0) at server3_1-fops.c:163
#1  0x00007fcab32a48c0 in error_gen_lookup_cbk (frame=0x1ed8da8, cookie=0x1ed1418, this=0x1eb4e58, op_ret=0, op_errno=22, inode=0x1ed67a8, 
    buf=0x7fcab1efde60, dict=0x0, postparent=0x7fcab1efddf0) at error-gen.c:381
#2  0x00007fcab34ba693 in iot_lookup_cbk (frame=0x1ed1418, cookie=0x1ed0e48, this=0x1eb3ba8, op_ret=0, op_errno=22, inode=0x1ed67a8, buf=0x7fcab1efde60, 
    xattr=0x0, postparent=0x7fcab1efddf0) at io-threads.c:168
#3  0x00007fcab36d60c8 in pl_lookup_cbk (frame=0x1ed0e48, cookie=0x1ec8908, this=0x1eb2868, op_ret=0, op_errno=22, inode=0x1ed67a8, buf=0x7fcab1efde60, 
    dict=0x0, postparent=0x7fcab1efddf0) at posix.c:1127
#4  0x00007fcab38e64f6 in posix_lookup (frame=0x1ec8908, this=0x1eb1558, loc=0x1ed14e8, xattr_req=0x0) at posix.c:532
#5  0x00007fcab36d650a in pl_lookup (frame=0x1ed0e48, this=0x1eb2868, loc=0x1ed14e8, xattr_req=0x0) at posix.c:1167
#6  0x00007fcab34ba897 in iot_lookup_wrapper (frame=0x1ed1418, this=0x1eb3ba8, loc=0x1ed14e8, xattr_req=0x0) at io-threads.c:178
#7  0x00007fcab52c484b in call_resume_wind (stub=0x1ed14b8) at call-stub.c:2471
#8  0x00007fcab52cb0b5 in call_resume (stub=0x1ed14b8) at call-stub.c:3954
#9  0x00007fcab34ba472 in iot_worker (data=0x1ebb758) at io-threads.c:118
#10 0x00007fcab4e77a04 in start_thread (arg=<value optimized out>) at pthread_create.c:300
#11 0x00007fcab4be0d4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#12 0x0000000000000000 in ?? ()

Comment 1 Amar Tumballi 2010-07-06 11:13:05 UTC
Patch http://patches.gluster.com/patch/3542/ fixes the issue.

The crash happened because 'frame->local' was set to NULL on entry to server_lookup_cbk, while a failure path in the same function issued a STACK_WIND () with the _CBKFN argument set to 'server_lookup_cbk' itself.

When that path was hit and server_lookup_cbk was entered a second time, 'req' was invariably NULL, which caused the crash.

With the above patch, the bug gets fixed.
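
For readers without the source at hand, the re-entry pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the GlusterFS code: the struct layouts, rewind_lookup (a stand-in for the STACK_WIND call), run_lookup and both callbacks are hypothetical names invented for the example. The "safe" variant shows only one possible way to avoid the problem; what the patch actually changed is not shown in this report.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real GlusterFS types. */
struct request { int id; };
struct frame   { void *local; };   /* per-call state, here the pending request */

typedef int (*lookup_cbk_t)(struct frame *frame, int op_ret);

/* Stand-in for a STACK_WIND that re-issues the lookup: the lower layer
 * answers immediately, and successfully, through the given callback. */
static int rewind_lookup(struct frame *frame, lookup_cbk_t cbk)
{
    return cbk(frame, 0);
}

/* Buggy shape: frame->local is cleared on entry, then the failure path
 * re-winds with this same function as the callback. */
static int buggy_lookup_cbk(struct frame *frame, int op_ret)
{
    struct request *req = frame->local;
    frame->local = NULL;                 /* state is gone for any later entry */

    if (req == NULL)
        return -1;                       /* per the report, a NULL req here led to the crash */
    if (op_ret < 0)
        return rewind_lookup(frame, buggy_lookup_cbk);
    return req->id;                      /* stands for "send the reply" */
}

/* One possible safe shape (an assumption, not the actual patch):
 * clear frame->local only on the path that replies. */
static int safe_lookup_cbk(struct frame *frame, int op_ret)
{
    struct request *req = frame->local;

    if (req == NULL)
        return -1;
    if (op_ret < 0)
        return rewind_lookup(frame, safe_lookup_cbk);  /* local is still set */
    frame->local = NULL;
    return req->id;
}

/* Drive one lookup whose first answer is a failure. */
static int run_lookup(lookup_cbk_t cbk)
{
    struct request req = { 42 };
    struct frame frame = { &req };
    return cbk(&frame, -1);
}
```

In this sketch, run_lookup(buggy_lookup_cbk) reaches the NULL 'req' branch on the second entry, while run_lookup(safe_lookup_cbk) still finds the request and returns its id.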

