Bug 762296 (GLUSTER-564) - 3.0.1rc3 server daemon crashes when any of the 2.0.x version client connects
Summary: 3.0.1rc3 server daemon crashes when any of the 2.0.x version client connects
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-564
Product: GlusterFS
Classification: Community
Component: protocol
Version: 3.0.0
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Anand Avati
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-01-22 23:08 UTC by Amar Tumballi
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments
afr logs (46.21 KB, application/x-bzip)
2010-01-24 08:55 UTC, Lakshmipathi G
dht logs (25.89 KB, application/x-bzip)
2010-01-24 08:56 UTC, Lakshmipathi G
stripe logs (41.46 KB, application/x-bzip)
2010-01-24 08:57 UTC, Lakshmipathi G

Description Amar Tumballi 2010-01-22 20:10:43 UTC
http://patches.gluster.com/patch/2603/ was submitted to fix this issue, but it never received a response (reject, accept, on hold, etc.).

Comment 1 Amar Tumballi 2010-01-22 23:08:58 UTC
Core was generated by `/tmp/g3.0/sbin/glusterfsd -f 11.vol -l /dev/stdout'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f58e170f46f in gf_print_trace (signum=11) at common-utils.c:413
413				if ((tmp->root->type == GF_OP_TYPE_FOP_REQUEST) ||
(gdb) bt
#0  0x00007f58e170f46f in gf_print_trace (signum=11) at common-utils.c:413
#1  <signal handler called>
#2  0x00007f58dff0aa14 in server_decode_groups (frame=0x1de9198, 
    hdr=<value optimized out>) at server-protocol.c:6124
#3  0x00007f58dff0c1f3 in get_frame_for_call (trans=<value optimized out>, 
    hdr=0x1de8fd0) at server-protocol.c:6158
#4  0x00007f58dff0c44c in protocol_server_interpret (this=0x1ddfac0, 
    trans=0x1de8870, hdr_p=0x1de8fd0 "", hdrlen=312, iobuf=0x0)
    at server-protocol.c:6288
#5  0x00007f58dff0c62a in protocol_server_pollin (this=0x1ddfac0, 
    trans=0x1de8870) at server-protocol.c:6681
#6  0x00007f58dff0c6b3 in notify (this=0x1ddfac0, event=<value optimized out>, 
    data=0x12) at server-protocol.c:6737
#7  0x00007f58e1705793 in xlator_notify (xl=0x1ddfac0, event=2, data=0x1de8870)
    at xlator.c:923
#8  0x00007f58df4fc578 in socket_event_handler (fd=<value optimized out>, 
    idx=1, data=0x1de8870, poll_in=1, poll_out=0, 
    poll_err=<value optimized out>) at socket.c:829
#9  0x00007f58e171fced in event_dispatch_epoll_handler (event_pool=0x1dd8340)
    at event.c:804
#10 event_dispatch_epoll (event_pool=0x1dd8340) at event.c:867
#11 0x0000000000404466 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at glusterfsd.c:1388
(gdb) 
-----------------------

(I could not find an existing bug report for this; please mark this one as a duplicate if another exists.)

Comment 2 Anand Avati 2010-01-23 03:13:10 UTC
(In reply to comment #1)
> http://patches.gluster.com/patch/2603/ was submitted to solve this issue.. but
> never got any reply on the same.. (reject/accept/on-hold etc)

Patch 2603 breaks compatibility between 3.0.x servers and clients.

Comment 3 Anand Avati 2010-01-23 04:01:51 UTC
PATCH: http://patches.gluster.com/patch/2674 in master (protocol/server: handle group id decoding in a stricter way)

Comment 4 Anand Avati 2010-01-23 15:37:20 UTC
PATCH: http://patches.gluster.com/patch/2682 in master (protocol/client: Look only for op_ret while handling a setvolume response.)

Comment 5 Lakshmipathi G 2010-01-24 08:31:41 UTC
For the following cases:
1) 3.0.1rc5 server and 2.0.x clients
2) 3.0.1rc5 client and 2.0.x servers

3.0.1rc5 did not crash with 2.0.1, 2.0.4, 2.0.6, 2.0.7rc9, 2.0.8, or 2.0.9rc4.

Comment 6 Lakshmipathi G 2010-01-24 08:55:49 UTC
Created attachment 136 [details]
afr logs

Comment 7 Lakshmipathi G 2010-01-24 08:56:49 UTC
Created attachment 137 [details]
dht logs

Comment 8 Lakshmipathi G 2010-01-24 08:57:30 UTC
Created attachment 138 [details]
stripe logs

Comment 9 Lakshmipathi G 2010-01-24 08:59:14 UTC
Logs are attached for the following test cases.

----------------------------------------------
afr (server + client - status)
----------------------------------------------
3.0.1rc5 + 2.0.9rc4 - okay
3.0.1rc5 + 2.0.8 - okay
3.0.1rc5 + 2.0.7rc9 - okay
3.0.1rc5 + 2.0.6 - okay
3.0.1rc5 + 2.0.4 - okay
3.0.1rc5 + 2.0.1 - okay

2.0.9rc4 + 3.0.1rc5 - okay
2.0.8 + 3.0.1rc5 - okay
2.0.7rc9 + 3.0.1rc5 - okay
2.0.6 + 3.0.1rc5 - okay
2.0.4 + 3.0.1rc5 - okay
2.0.1 + 3.0.1rc5 - okay

----------------------------------------------
dht (server + client - status)
----------------------------------------------
3.0.1rc5 + 2.0.9rc4 - okay
3.0.1rc5 + 2.0.8 - okay
3.0.1rc5 + 2.0.7rc9 - okay
3.0.1rc5 + 2.0.6 - okay
3.0.1rc5 + 2.0.4 - okay
3.0.1rc5 + 2.0.1 - okay

2.0.9rc4 + 3.0.1rc5
2.0.8 + 3.0.1rc5 - okay
2.0.7rc9 + 3.0.1rc5 - okay
2.0.6 + 3.0.1rc5 - okay
2.0.4 + 3.0.1rc5 - okay
2.0.1 + 3.0.1rc5 - okay

----------------------------------------------
stripe (server + client - status)
----------------------------------------------
3.0.1rc5 + 2.0.9rc4 - okay
3.0.1rc5 + 2.0.8 - okay
3.0.1rc5 + 2.0.7rc9 - okay
3.0.1rc5 + 2.0.6 - okay
3.0.1rc5 + 2.0.4 - okay
3.0.1rc5 + 2.0.1 - okay

2.0.9rc4 + 3.0.1rc5 - okay
2.0.8 + 3.0.1rc5 - okay
2.0.7rc9 + 3.0.1rc5 - okay
2.0.6 + 3.0.1rc5 - okay
2.0.4 + 3.0.1rc5 - okay
2.0.1 + 3.0.1rc5 - okay

Comment 10 Vijay Bellur 2010-07-16 16:38:39 UTC
3.0.5 servers crash when 2.0.x clients attempt to connect and perform a setvolume, hence re-opening this bug. A check needs to be added so that protocol_server_interpret() returns an error when decoding of groups fails.
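
For illustration only, a minimal sketch of the kind of check described here. It is not the actual GlusterFS code or wire format: every `sketch_*` name, the header layout, and the 16-group cap are hypothetical. It also assumes the decode failure is a group count or header length that does not fit, which this report does not confirm.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>  /* ntohl */

#define SKETCH_MAX_GROUPS 16  /* hypothetical cap, not taken from GlusterFS */

/* Hypothetical request header; NOT the real GlusterFS wire format. */
struct sketch_req_hdr {
        uint32_t ngrps;                      /* group count, network byte order */
        uint32_t groups[SKETCH_MAX_GROUPS];  /* group ids, network byte order */
};

/* Hypothetical per-call state that receives the decoded groups. */
struct sketch_frame {
        uint32_t ngrps;
        uint32_t groups[SKETCH_MAX_GROUPS];
};

/* Decode group ids from the header into the frame.
 * Returns 0 on success, -1 if the header is missing, too short,
 * or claims more groups than the frame can hold. */
static int
sketch_decode_groups (struct sketch_frame *frame,
                      const struct sketch_req_hdr *hdr, size_t hdrlen)
{
        uint32_t i = 0;
        uint32_t n = 0;

        if (frame == NULL || hdr == NULL)
                return -1;
        if (hdrlen < sizeof (hdr->ngrps))
                return -1;

        n = ntohl (hdr->ngrps);
        if (n > SKETCH_MAX_GROUPS)
                return -1;
        if (hdrlen < offsetof (struct sketch_req_hdr, groups)
                     + n * sizeof (uint32_t))
                return -1;

        for (i = 0; i < n; i++)
                frame->groups[i] = ntohl (hdr->groups[i]);
        frame->ngrps = n;

        return 0;
}

/* Stand-in for the interpret step: propagate the decode failure to the
 * caller instead of dispatching a request with half-initialized state. */
static int
sketch_interpret (struct sketch_frame *frame,
                  const struct sketch_req_hdr *hdr, size_t hdrlen)
{
        if (sketch_decode_groups (frame, hdr, hdrlen) != 0)
                return -1;

        /* ... the request would be dispatched here ... */
        return 0;
}
```

The point of the sketch is only that the decode step reports failure and the interpret step refuses to continue, so a request from an incompatible client is rejected rather than dereferenced.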

Comment 11 Vijay Bellur 2010-07-19 02:35:19 UTC
The crash seems to happen only when a 32-bit client/server is involved.

Comment 12 shishir gowda 2010-08-06 08:10:09 UTC
Fixed in bug 762763 for 3.0 and mainline.

