Bug 951903

Summary: Glusterd rejects peers maybe due to little/big endian incompatibility
Product: [Community] GlusterFS
Reporter: Martin Hoefling <martin.hoefling>
Component: glusterd
Assignee: bugs <bugs>
Status: CLOSED EOL
Severity: unspecified
Priority: medium
Version: pre-release
CC: bugs, gluster-bugs, ndevos
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-10-22 15:40:20 UTC
Type: Bug

Description Martin Hoefling 2013-04-14 09:14:08 UTC
Description of problem:

gluster peer status reports the following state between the amd64 and ppc peers:

State: Peer Rejected (Connected)

Version-Release number of selected component (if applicable):

master branch: 6d6205ede5e90a919d082b4413055d684114253a

How reproducible:

always

Steps to Reproduce:
1. Compile glusterd on amd64 (in my case Ubuntu 12.04) and on ppc (Debian Squeeze on a My Book Live).
2. Create a volume. After adding the bricks, the peer connection is accepted.
3. Restart glusterd on one side.
  
Actual results:
Hostname: xxx
Uuid: xxx
State: Peer Rejected (Connected)

on both sides.

Expected results:

Peer is not rejected.

Additional info:

The log indicates that the cksum and/or version comparison fails, possibly because a conversion from host (little-endian) to network (big-endian) byte order, e.g. via htonl, is missing:
[] I [glusterd-rpc-ops.c:322:glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: xxx, host: x, port: 0
[] I [glusterd-handler.c:1743:glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: xxxx
[] E [glusterd-utils.c:2147:glusterd_compare_friend_volume] 0-: Cksums of volume intranet differ. local cksum = 1939930268, remote cksum = -1728160709
[] I [glusterd-handler.c:2713:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to xxxx (0), ret: 0
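
The byte-order hypothesis can be checked directly against the two checksums in the log line above. The following is a minimal sketch, not glusterd code: the helper names (to_u32, bswap32, is_byte_swapped_pair) are hypothetical, and it only tests whether one logged value is the plain 32-bit byte swap of the other.

```python
import struct

# Values copied from the "Cksums of volume intranet differ" log line.
LOCAL_CKSUM = 1939930268
REMOTE_CKSUM = -1728160709


def to_u32(value: int) -> int:
    """Reinterpret a signed 32-bit value (as printed in the log) as unsigned."""
    return value & 0xFFFFFFFF


def bswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return struct.unpack("<I", struct.pack(">I", to_u32(value)))[0]


def is_byte_swapped_pair(a: int, b: int) -> bool:
    """Return True if b is exactly a with its four bytes reversed."""
    return bswap32(a) == to_u32(b)


if __name__ == "__main__":
    print(f"local   = 0x{to_u32(LOCAL_CKSUM):08X}")
    print(f"remote  = 0x{to_u32(REMOTE_CKSUM):08X}")
    print(f"swapped = 0x{bswap32(LOCAL_CKSUM):08X}")
    print("byte-swapped pair:", is_byte_swapped_pair(LOCAL_CKSUM, REMOTE_CKSUM))
```

For the logged values the check comes out False: the local checksum is 0x73A0FC9C, its byte swap is 0x9CFCA073, and the remote checksum is 0x98FE5C3B. That would suggest the mismatch is not just a missing htonl on the final checksum value, although it does not rule out a byte-order dependency earlier, in the data the checksum is computed over or in the checksum routine itself.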


from glusterd.log:

[2013-03-14 21:32:40.887103] I [client-handshake.c:1456:client_setvolume_cbk] 0-intranet-client-1: Connected to xxx:24010, attached to remote volume '/shares/gluster/intranet'.
[2013-03-14 21:32:40.887119] I [client-handshake.c:1468:client_setvolume_cbk] 0-intranet-client-1: Server and Client lk-version numbers are not same, reopening the fds

Comment 1 Niels de Vos 2013-04-14 12:14:48 UTC
Combining little- and big-endian systems in one cluster will probably cause other
issues as well. There was a brief discussion about this while inspecting the
network traffic: not all requests (for example, dictionaries with a 'gfid-req'
attribute) are encoded as big-endian, unlike most of the protocol.

Reference: https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=7310#c20
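
To illustrate why a field that skips the big-endian encoding breaks mixed clusters, here is a minimal sketch of the two ways a 32-bit integer can end up on the wire. The function names are hypothetical and this is not the GlusterFS wire format; it only contrasts explicit network-order packing with host-order packing.

```python
import struct
import sys


def encode_network(value: int) -> bytes:
    """Pack a 32-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack(">I", value)


def encode_native(value: int) -> bytes:
    """Pack the same integer in the byte order of the local host."""
    return struct.pack("=I", value)


def decode_network(data: bytes) -> int:
    """Unpack a 32-bit unsigned integer that was sent in network byte order."""
    return struct.unpack(">I", data)[0]


def host_is_big_endian() -> bool:
    """Report whether the local host stores integers big-endian."""
    return sys.byteorder == "big"


if __name__ == "__main__":
    value = 0x01020304
    print("network order:", encode_network(value).hex())
    print("host order:   ", encode_native(value).hex())
    print("round trip ok:", decode_network(encode_network(value)) == value)
```

The network-order bytes are the same on every host, so a round trip between amd64 and ppc is safe. The host-order bytes match them only on a big-endian machine, which is why a field sent without conversion is read back as a different number on the other architecture.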

Comment 2 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
The "pre-release" version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.