Description of problem:
When you make configuration changes (adding/removing volumes, adding bricks to a volume) while a node in the trusted storage pool is down (e.g. for maintenance), these changes are not reflected on the "down" node once it rejoins the pool.

Version-Release number of selected component (if applicable):
2.0

How reproducible:
Reproducible

Steps to Reproduce:
1. Shut down one node of the pool (rhs1-3)
2. On one of the remaining nodes, change volume configs (remove a volume, add a volume, add a brick to a volume)
3. Start the node that was down

Actual results:

[root@rhs1-2 ~]# gluster volume list
repvol1
distvol1
testvol2      <-- this volume was added while rhs1-3 was down

[root@rhs1-3 ~]# gluster volume list
distvol1
repvol1
testvol1      <-- this volume was deleted while rhs1-3 was down

[root@rhs1-2 ~]# gluster volume info distvol1
Volume Name: distvol1
Type: Distribute
Volume ID: 186892db-e18e-4c85-9dd3-38f248e26c02
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: rhs1-1:/export2
Brick2: rhs1-2:/export2
Brick3: rhs1-3:/export2
Brick4: rhs1-1:/export5    <-- this brick was added while rhs1-3 was down

[root@rhs1-3 ~]# gluster volume info distvol1
Volume Name: distvol1
Type: Distribute
Volume ID: 186892db-e18e-4c85-9dd3-38f248e26c02
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: rhs1-1:/export2
Brick2: rhs1-2:/export2
Brick3: rhs1-3:/export2

Expected results:
Volume information should be consistent across all nodes and should be synced automatically once a node rejoins the pool.

Additional info:
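For reference, the step-2 changes could look roughly like the following, run on rhs1-2 while rhs1-3 is down. The brick path /export3 for testvol2 is illustrative (it does not appear in the report); the volume names match the listings above.

[root@rhs1-2 ~]# gluster volume stop testvol1
[root@rhs1-2 ~]# gluster volume delete testvol1
[root@rhs1-2 ~]# gluster volume create testvol2 rhs1-1:/export3 rhs1-2:/export3
[root@rhs1-2 ~]# gluster volume start testvol2
[root@rhs1-2 ~]# gluster volume add-brick distvol1 rhs1-1:/export5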
As far as I knew, the syncing should have happened fine. Kaushal, when you get a chance, can you have a look at this?
Btw., /etc/fstab and /etc/samba/smb.conf don't get synced during rejoin either.
A volume deleted in the absence of one of the peers wouldn't be removed from the cluster's list of volumes. This is because the 'import' logic for peers that rejoin the cluster cannot differentiate between volumes deleted and volumes added in the absence of the other (conflicting) peers. For now, we intend to detect this manually, which may involve analysing the CLI command logs to reconstruct the cluster view of the volumes that 'ought' to be present. Once we arrive at this picture, we could use volume-sync to reconcile the skewed view of volumes in the cluster. Bricks added to or removed from a volume while some of the peers were down/unreachable are 'imported' as those peers rejoin the cluster; this works fine on upstream/master.
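As a sketch of the manual reconciliation mentioned above, assuming rhs1-3 ended up with the stale view and rhs1-2 holds the intended configuration (verify this against your own cluster and CLI logs before running anything):

[root@rhs1-3 ~]# gluster volume sync rhs1-2 all        # pull the volume configuration for all volumes from rhs1-2
[root@rhs1-3 ~]# gluster volume sync rhs1-2 distvol1   # or restrict the sync to a single volume
[root@rhs1-3 ~]# gluster volume info                   # verify the local view now matches the rest of the pool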
Marking it for known issues (for the 3.4.0 release?). Patric, let us know if comment #4 has provided you with the right information. In that case we would like to document it and then close it as WORKSFORME.
Patric, could you let us know if you are in agreement with comment #4?
Hi, thank you. Yes, I'm in agreement with comment #4; please document the behaviour for the time being. Best regards, Patric
Btw., the behaviour is the same if you change volume tunables on a volume while a node is down. This leads to quite nasty inconsistencies. Please document this, too. Patric
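To illustrate the tunable case (option name and value are illustrative, not taken from the report): a setting applied on rhs1-2 while rhs1-3 is down, e.g.

[root@rhs1-2 ~]# gluster volume set distvol1 performance.cache-size 256MB

would presumably show up under "Options Reconfigured" in 'gluster volume info distvol1' on rhs1-2 but not on rhs1-3 after it rejoins, matching the divergence described above.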