Bug 1090298

Summary: Addition of new server after upgrade from 3.3 results in peer rejected
Product: [Community] GlusterFS
Component: core
Version: 3.4.3
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: unspecified
Reporter: Awktane <bmackie>
Assignee: Ravishankar N <ravishankar>
CC: gluster-bugs, kkeithle, ravishankar
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Regression: ---
Mount Type: ---
Last Closed: 2014-05-21 14:54:37 UTC
Bug Blocks: 1095324

Description Awktane 2014-04-23 05:56:52 UTC
Description of problem:
Upgrading from 3.3 to 3.4 and then adding a new node results in Peer Rejected (Connected), because the info files do not match.

How reproducible:
Appears to happen every time.

Steps to Reproduce:
1. Take a 3.3 (or earlier) installation and upgrade it to 3.4.
2. Set up a brand new 3.4 server and peer probe it from a trusted member.
3. The new server will show as Peer Rejected (Connected); see the command sketch below.
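
For reference, a minimal command sequence matching the steps above; the hostname "new-node" is a placeholder, and the probe is run from an already upgraded, trusted member:

# gluster peer probe new-node
# gluster peer status

The status output for the new node is then expected to end with a line like "State: Peer Rejected (Connected)".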

The volume info file on the new server(s) contains two extra lines:
op-version=2
client-op-version=2
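
One quick way to confirm the mismatch (the volume name "testvol" is a placeholder, and the cksum file location is assumed from the usual glusterd layout) is to compare the info file and its stored checksum on an old server versus the new one:

# grep op-version /var/lib/glusterd/vols/testvol/info
# cat /var/lib/glusterd/vols/testvol/cksum

If the grep output or the checksum differs between peers, glusterd will reject the new peer.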

These extra lines cause a checksum mismatch, because the old servers' info files do not contain them, and so the peer is rejected. The current workaround is to add these lines to each old server's /var/lib/glusterd/vols/{volume}/info file (or perhaps to delete them from the new one?).
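
A rough sketch of that manual workaround, with several assumptions not in the original report: the volume is named "testvol", the new node is "new-node", the added lines must sit in the same position in the info file on every peer (glusterd compares file checksums), and glusterd is restarted afterwards:

# scp new-node:/var/lib/glusterd/vols/testvol/info /tmp/info.new
# diff /tmp/info.new /var/lib/glusterd/vols/testvol/info

Edit /var/lib/glusterd/vols/testvol/info on each old server so the op-version and client-op-version lines appear in the same position as in /tmp/info.new, then:

# service glusterd restart
# gluster peer status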

Comment 1 Anand Avati 2014-05-09 11:27:48 UTC
REVIEW: http://review.gluster.org/7729 (glusterd: update op-version info during upgrades.) posted (#1) for review on release-3.4 by Ravishankar N (ravishankar)

Comment 2 Ravishankar N 2014-05-16 04:26:14 UTC
The patch to fix this is being abandoned for reasons described in the review comments. Proposed solution (sic):

Once all the peers have been upgraded, the user must perform a dummy volume set operation on all volumes. This ensures that the volume information and checksums are updated correctly, which allows new peers to be probed without any problem. For example:

# gluster volume set <name> brick-log-level INFO

(This won't have any effect on the operation of the volume, as the default log level is already INFO, but it will update the volume info and checksums.)
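
For completeness, a sketch of running that dummy set against every volume in one go; it assumes the "gluster volume list" subcommand is available in the installed release (if not, substitute the volume names by hand) and reuses the same brick-log-level key suggested above:

# for v in $(gluster volume list); do gluster volume set $v brick-log-level INFO; done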

Comment 3 Awktane 2014-05-16 09:05:43 UTC
Alright, for the existing folk like me who just added those two lines, are there any ramifications? I think I did do a volume set to remove lookup-unhashed, as it was causing a bunch of files/folders to error out. I assumed this was due to the re-balancing state.

Comment 4 Ravishankar N 2014-05-16 09:18:43 UTC
(In reply to Awktane from comment #3)
> Alright, for the existing folk like me who just added those two lines,
> are there any ramifications? I think I did do a volume set to remove
> lookup-unhashed, as it was causing a bunch of files/folders to error out.
> I assumed this was due to the re-balancing state.

I don't think it should matter. If peer status shows all peers in the Connected state, then we are good.