Bug 1632935 - "peer probe" rejected after a tier detach commit: "Version of Cksums nas-volume differ".
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: tiering
Version: mainline
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact: bugs@gluster.org
 
Reported: 2018-09-25 20:29 UTC by Jeff Byers
Modified: 2018-11-02 08:13 UTC (History)

Last Closed: 2018-11-02 08:13:49 UTC

Description Jeff Byers 2018-09-25 20:29:33 UTC
"peer probe" rejected after a tier detach commit: "Version of Cksums nas-volume differ". Peer state gets stuck in "Peer Rejected (Connected)".

If you tier a volume, then detach the tier and commit the
detach, any later attempt to peer probe another node fails
with "Version of Cksums nas-volume differ".

The problem appears to be that "tier detach commit" leaves
"tier-enabled=1" set in the volume's 'info' file even though
the volume is no longer tiered.

The only way to probe a peer successfully is to stop
'glusterd', hand-edit the volume's 'info' file to set
"tier-enabled=0", and restart 'glusterd'.
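
The manual workaround above could be scripted roughly as follows. This is a minimal sketch, not a tested procedure: the function name clear_stale_tier_flag is hypothetical, the 'info' path is the one shown in the grep later in this report, and the systemctl calls assume a systemd-managed glusterd. It should only be used on a volume that is really no longer tiered.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a stale "tier-enabled=1" line to
# "tier-enabled=0" in a glusterd volume 'info' file, keeping a backup.
clear_stale_tier_flag() {
    info_file=$1
    [ -f "$info_file" ] || { echo "no such file: $info_file" >&2; return 1; }
    # Nothing to do if the flag is not currently set to 1.
    grep -q '^tier-enabled=1$' "$info_file" || return 0
    # Keep a backup copy next to the original.
    cp -p "$info_file" "$info_file.bak" || return 1
    # Rewrite from the backup into the original file, which keeps the
    # original's ownership and mode and avoids non-portable 'sed -i'.
    sed 's/^tier-enabled=1$/tier-enabled=0/' "$info_file.bak" > "$info_file"
}

# Intended use (assumes systemd; glusterd must be stopped while editing):
#   systemctl stop glusterd
#   clear_stale_tier_flag /var/lib/glusterd/vols/nas-volume-0003/info
#   systemctl start glusterd
```

The backup file makes it easy to restore the original 'info' file if glusterd refuses to start after the edit.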

# gluster volume tier nas-volume attach 192.168.101.66:/brick/nas-volume
# gluster volume tier nas-volume detach start
# gluster volume tier nas-volume detach status
# gluster volume tier nas-volume detach commit

# gluster --version
glusterfs 3.12.14

# gluster volume info nas-volume
Volume Name: nas-volume
Type: Distribute
Volume ID: 4a8f5b05-16d8-47db-9494-ce2f19eb58ff
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 192.168.101.66:/brick/nas-volume
Options Reconfigured:
server.allow-insecure: on
performance.quick-read: off
performance.stat-prefetch: off
nfs.addr-namelookup: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
snap-activate-on-create: enable

# gluster peer status
Number of Peers: 0
# gluster peer probe 192.168.101.68
peer probe: success.
# gluster peer status
Number of Peers: 1

Hostname: 192.168.101.68
Uuid: c0efccec-b0a0-a091-e517-000c297b1839
State: Peer Rejected (Connected)

__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: c0efccec-b0a0-a091-e517-000c297b1839
glusterd_compare_friend_volume] 0-management: Version of Cksums nas-volume differ. local cksum = 3949508827, remote cksum = 3950289631 on peer 192.168.101.68
glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 192.168.101.68 (0), ret: 0, op_ret: -1
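
To pull the mismatching checksums out of glusterd log lines like the ones above, something along these lines may help. It is only a sketch: it assumes the message text is exactly as quoted in this report, and the function name cksum_mismatches is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read glusterd log text on stdin and print
# "<volume> <local cksum> <remote cksum> <peer>" for each checksum
# mismatch. Assumes the message format quoted in this report.
cksum_mismatches() {
    sed -n 's/.*Version of Cksums \([^ ]*\) differ\. local cksum = \([0-9]*\), remote cksum = \([0-9]*\) on peer \([^ ]*\).*/\1 \2 \3 \4/p'
}

# Example (the log path is an assumption; adjust to your installation):
#   cksum_mismatches < /var/log/glusterfs/glusterd.log
```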

# grep tier-enabled /var/lib/glusterd/vols/nas-volume-0003/info
tier-enabled=1

Comment 1 Jeff Byers 2018-09-26 23:19:06 UTC
Note that the converse problem happens when you have a GlusterFS node with tiering enabled, so the 'info' file has "tier-enabled=1". When you probe the peer, it fails because the 'info' file it gets has 'tier-enabled=0' instead of 'tier-enabled=1'.
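
Both directions of the mismatch come down to the two nodes disagreeing on "tier-enabled". If both nodes hold a copy of the volume's 'info' file, a rough way to confirm this is sketched below. The function name tier_flag_matches and the /tmp/peer-info location are illustrative only, and the sketch assumes the peer's file has already been copied over (for example with scp).

```shell
#!/bin/sh
# Hypothetical helper: compare the "tier-enabled" line of two copies of a
# volume 'info' file. Returns 0 if they agree, 1 if they differ.
tier_flag_matches() {
    local_flag=$(grep '^tier-enabled=' "$1")
    peer_flag=$(grep '^tier-enabled=' "$2")
    if [ "$local_flag" = "$peer_flag" ]; then
        echo "match: ${local_flag:-<not set>}"
        return 0
    fi
    echo "differ: local '${local_flag:-<not set>}' vs peer '${peer_flag:-<not set>}'"
    return 1
}

# Example:
#   tier_flag_matches /var/lib/glusterd/vols/nas-volume-0003/info /tmp/peer-info
```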

Comment 2 Shyamsundar 2018-10-23 14:55:27 UTC
Release 3.12 has been EOLd and this bug was still found to be in the NEW state, hence moving the version to mainline, to triage the same and take appropriate actions.

Comment 3 Amar Tumballi 2018-11-02 08:13:49 UTC
Patch https://review.gluster.org/#/c/glusterfs/+/21331/ removes tier functionality from GlusterFS. 

https://bugzilla.redhat.com/show_bug.cgi?id=1642807 is used as the tracking bug for this. The recommendation is to convert your tiered volume to a regular volume (replicate, ec, or plain distribute) with the "tier detach" command before upgrading, and to use backend features such as dm-cache to provide caching at the backend for better performance and functionality.
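
Before upgrading to a release without tier support, it may be worth checking whether any volume is still tiered. A minimal sketch follows, assuming tiered volumes are reported as "Type: Tier" in "gluster volume info" output; the function name list_tiered_volumes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `gluster volume info` output on stdin and print
# the name of every volume whose Type is "Tier" (assumed output format).
list_tiered_volumes() {
    awk -F': ' '
        /^Volume Name: / { name = $2 }   # remember the current volume
        /^Type: Tier$/   { print name }  # report it if it is tiered
    '
}

# Example:
#   gluster volume info | list_tiered_volumes
```

Any volume this prints would still need "tier detach start" and "tier detach commit" before the upgrade.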

