"peer probe" rejected after a tier detach commit: "Version of Cksums nas-volume differ". Peer state gets stuck in "Peer Rejected (Connected)". If you had tiered a volume, then detached and committed the detach, any attempt to peer probe another node then fails due to "Version of Cksums nas-volume differ". The problem appears to be that the "tier detach commit" leaves the volume 'info' file setting "tier-enabled=1" even though the volume is no longer tiered. The only way to be able to probe a peer successfully is to stop 'glusterd', hand edit the volume's 'info' file changing "tier-enabled=0", and restarting 'glusterd'. # gluster volume tier nas-volume attach 192.168.101.66:/brick/nas-volume # gluster volume tier nas-volume detach start # gluster volume tier nas-volume detach status # gluster volume tier nas-volume detach commit # gluster --version glusterfs 3.12.14 # gluster volume info nas-volume Volume Name: nas-volume Type: Distribute Volume ID: 4a8f5b05-16d8-47db-9494-ce2f19eb58ff Status: Started Snapshot Count: 0 Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: 192.168.101.66:/brick/nas-volume Options Reconfigured: server.allow-insecure: on performance.quick-read: off performance.stat-prefetch: off nfs.addr-namelookup: off transport.address-family: inet nfs.disable: on performance.client-io-threads: on snap-activate-on-create: enable # gluster peer status Number of Peers: 0 # gluster peer probe 192.168.101.68 peer probe: success. # gluster peer status Number of Peers: 1 Hostname: 192.168.101.68 Uuid: c0efccec-b0a0-a091-e517-000c297b1839 State: Peer Rejected (Connected) __glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: c0efccec-b0a0-a091-e517-000c297b1839 glusterd_compare_friend_volume] 0-management: Version of Cksums nas-volume differ. 
local cksum = 3949508827, remote cksum = 3950289631 on peer 192.168.101.68 glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 192.168.101.68 (0), ret: 0, op_ret: -1 # grep tier-enabled /var/lib/glusterd/vols/nas-volume-0003/info tier-enabled=1
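The hand-edit workaround can be written out as follows. This is only a sketch, assuming a systemd-managed glusterd and the default /var/lib/glusterd layout; the volume directory name is taken from the grep above:

```shell
# Stop glusterd so it does not rewrite the info file behind us
systemctl stop glusterd

# Flip the stale tier-enabled flag from 1 to 0
sed -i 's/^tier-enabled=1$/tier-enabled=0/' \
    /var/lib/glusterd/vols/nas-volume-0003/info

# Restart glusterd and retry the probe
systemctl start glusterd
gluster peer probe 192.168.101.68
```

The edit must happen while glusterd is stopped, otherwise the daemon's in-memory copy of the volume definition can overwrite the file on the next store sync.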
Note that the converse problem happens when a GlusterFS node has tiering enabled, so its 'info' file has "tier-enabled=1". Probing a peer then fails because the 'info' file received from the peer has 'tier-enabled=0' instead of 'tier-enabled=1'.
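A mismatch in either direction can be spotted before probing by comparing the flag on both nodes. This is only an illustrative check, assuming ssh access to the peer and the default /var/lib/glusterd layout; the hostname and volume directory name are taken from the transcript above:

```shell
# Read the tier-enabled line from the local and the remote copy of the
# volume's info file; the volume cksums differ whenever these disagree.
info=/var/lib/glusterd/vols/nas-volume-0003/info
local_flag=$(grep '^tier-enabled=' "$info")
remote_flag=$(ssh 192.168.101.68 grep '^tier-enabled=' "$info")

if [ "$local_flag" != "$remote_flag" ]; then
    echo "tier-enabled mismatch: local='$local_flag' remote='$remote_flag'" >&2
fi
```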
Release 3.12 has reached end of life and this bug was still in the NEW state, so the version is being moved to mainline for triage and appropriate action.
Patch https://review.gluster.org/#/c/glusterfs/+/21331/ removes tier functionality from GlusterFS; https://bugzilla.redhat.com/show_bug.cgi?id=1642807 is the tracking bug for this. The recommendation is to convert your tiered volume to a regular volume (replicate, erasure-coded, or plain distribute) with the "tier detach" command before upgrading, and to use backend features such as dm-cache to provide caching with better performance and functionality.
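Given that recommendation, leftover tiering flags can be checked for in bulk before an upgrade. A hypothetical helper, not part of GlusterFS, assuming the default /var/lib/glusterd/vols layout:

```shell
# Hypothetical pre-upgrade guard: fail if any volume's info file still
# carries tier-enabled=1, i.e. the tier was not detached and committed.
check_no_tiering() {
    # $1: glusterd vols directory (defaults to /var/lib/glusterd/vols)
    vols_dir=${1:-/var/lib/glusterd/vols}
    if grep -l '^tier-enabled=1$' "$vols_dir"/*/info 2>/dev/null; then
        echo "detach and commit the tier on the volumes above first" >&2
        return 1
    fi
    return 0
}
```

Running this before the upgrade, and detaching any volumes it lists, avoids carrying a stale "tier-enabled=1" flag into a release that no longer understands it.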