Description of problem:
While testing a few disk-full scenarios with add-brick/replace-brick, a gluster rebalance completed but reported many failures, and errors were logged in the rebalance log.

# gluster v rebal arbitervol status
                               Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                          ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   dhcp46-55.lab.eng.blr.redhat.com            71393         6.9GB        539238           837         17235            completed           10:30:35
                       10.70.47.184            66826         6.6GB        405381           709         10993            completed            9:44:52
                        10.70.47.67            28714         4.6GB        239012           436         13727            completed            6:32:24
                       10.70.46.169            33532         2.4GB        276708           393         14647            completed            6:53:49
                       10.70.47.122             4159       308.6KB         29717             8             4            completed            2:11:21

Version-Release number of selected component (if applicable):
# rpm -qa | grep gluster
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-rdma-3.12.2-34.el7rhgs.x86_64
glusterfs-server-3.12.2-34.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-34.el7rhgs.x86_64
glusterfs-fuse-3.12.2-34.el7rhgs.x86_64
glusterfs-events-3.12.2-34.el7rhgs.x86_64
glusterfs-3.12.2-34.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64
glusterfs-api-3.12.2-34.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-34.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
1. Create a 2 x (2+1) arbiter volume on a 5-node cluster.
2. Mount it on a client and run I/O: small files, large files, symlinks, hardlinks and renames. At least 10 lakh (1 million) files were created and modified.
3. Bring down one data brick from each replica set and fill those bricks with junk data from the backend, so that the down data bricks are about 95% full while the remaining bricks are close to 70% full from the client I/O.
4. While those bricks are still down, write large files to bring the remaining bricks to about 90% full, and also untar the kernel multiple times into the same directory.
5. Bring the down bricks back up and issue a heal.
6. Heal is triggered and files start getting created on the previously down bricks.
7. Disk usage reaches 100% on the previously down bricks, so heal fails with "no space left" errors, which is expected.
8. Disable shd, peer probe another node, do a replace-brick to distribute bricks uniformly onto the new node, and re-enable shd.
9. Do add-brick twice to convert the 2 x (2+1) volume into a 4 x (2+1) volume.
10. Trigger rebalance (a sketch of the replace-brick/add-brick/rebalance command sequence is included under Additional info below).
11. The newly added bricks start filling up and the bricks that were at ~100% start freeing up space, as expected.
12. Rebalance completes successfully but reports a number of failures, and errors appear in the logs.
13. A few files end up in split-brain.
14. All of the above steps were performed while client I/O was ongoing.

Actual results:
Failures are reported in the rebalance log, and a few files ended up in split-brain.

Expected results:
No failures should be reported in the rebalance log and no files should end up in split-brain.

Additional info:
I checked trusted.glusterfs.dht for all the subvolumes; there was no overlap and no holes in the layout. As part of the testing, nodes were brought down a few times, but at any given time only one of the 6 nodes was offline.
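For reference, the shd-disable / replace-brick / add-brick / rebalance sequence (steps 8-10) was along the following lines. This is only a sketch: the node names and brick paths are placeholders, not the exact bricks used.

# gluster volume set arbitervol cluster.self-heal-daemon off
# gluster peer probe <new-node>
# gluster volume replace-brick arbitervol <old-node>:/bricks/<brick>/arbitervol <new-node>:/bricks/<brick>/arbitervol commit force
# gluster volume set arbitervol cluster.self-heal-daemon on
# gluster volume add-brick arbitervol replica 3 arbiter 1 <node-a>:/bricks/<new-brick> <node-b>:/bricks/<new-brick> <node-c>:/bricks/<new-brick>
# gluster volume rebalance arbitervol start
# gluster volume rebalance arbitervol status

The add-brick was run twice (once per new replica set) to go from 2 x (2+1) to 4 x (2+1).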
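The layout check mentioned above was done by reading the trusted.glusterfs.dht xattr from the brick root of one brick per subvolume and comparing the hash ranges for holes and overlaps, e.g. (the brick path here is just one of the bricks of this volume):

# getfattr -n trusted.glusterfs.dht -e hex /bricks/brick3/replica3/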
# gluster v info arbitervol

Volume Name: arbitervol
Type: Distributed-Replicate
Volume ID: a06d1cf0-a3b2-4dcc-90b5-d0cae8e9d93b
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: 10.70.46.55:/bricks/brick3/replica3
Brick2: 10.70.47.184:/bricks/brick3/replica3
Brick3: 10.70.46.193:/bricks/brick3/replica3 (arbiter)
Brick4: 10.70.47.67:/bricks/brick1/arbitervol
Brick5: 10.70.46.169:/bricks/brick1/arbitervol
Brick6: 10.70.46.193:/bricks/brick4/replica3 (arbiter)
Brick7: 10.70.46.193:/bricks/brick1/arbitervol
Brick8: 10.70.46.55:/bricks/brick1/arb
Brick9: 10.70.46.169:/bricks/brick2/arbb (arbiter)
Brick10: 10.70.46.193:/bricks/brick6/arbiter
Brick11: 10.70.46.55:/bricks/brick6/arbiter
Brick12: 10.70.47.122:/bricks/brick1/arbiter (arbiter)
Options Reconfigured:
cluster.shd-max-threads: 40
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.self-heal-daemon: on

--------------------------------------------------------------

# gluster v heal arbitervol info summary
Brick 10.70.46.55:/bricks/brick3/replica3
Status: Connected
Total Number of entries: 14
Number of entries in heal pending: 0
Number of entries in split-brain: 12
Number of entries possibly healing: 2

Brick 10.70.47.184:/bricks/brick3/replica3
Status: Connected
Total Number of entries: 29
Number of entries in heal pending: 0
Number of entries in split-brain: 27
Number of entries possibly healing: 2

Brick 10.70.46.193:/bricks/brick3/replica3
Status: Connected
Total Number of entries: 29
Number of entries in heal pending: 0
Number of entries in split-brain: 27
Number of entries possibly healing: 2

Brick 10.70.47.67:/bricks/brick1/arbitervol
Status: Connected
Total Number of entries: 746
Number of entries in heal pending: 736
Number of entries in split-brain: 6
Number of entries possibly healing: 4

Brick 10.70.46.169:/bricks/brick1/arbitervol
Status: Connected
Total Number of entries: 391
Number of entries in heal pending: 367
Number of entries in split-brain: 22
Number of entries possibly healing: 2

Brick 10.70.46.193:/bricks/brick4/replica3
Status: Connected
Total Number of entries: 405
Number of entries in heal pending: 381
Number of entries in split-brain: 20
Number of entries possibly healing: 4

Brick 10.70.46.193:/bricks/brick1/arbitervol
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 10.70.46.55:/bricks/brick1/arb
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 10.70.46.169:/bricks/brick2/arbb
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 10.70.46.193:/bricks/brick6/arbiter
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 10.70.46.55:/bricks/brick6/arbiter
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick 10.70.47.122:/bricks/brick1/arbiter
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0
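To identify the exact entries behind the split-brain counts above, the standard per-file listing can be pulled as well:

# gluster volume heal arbitervol info split-brain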
-----------------------------------------------------

DHT logs:

55 ~]# grep 'no subvolume for hash' -C1 /var/log/glusterfs/arbitervol-rebalance.log | tail
[2018-12-20 23:28:32.370547] E [MSGID: 109011] [dht-rebalance.c:2740:gf_defrag_migrate_single_file] 0-arbitervol-dht: Failed to get hashed subvol for /newtest/level01/symlink_to_files/5c1bf228%%C6W78DWP97
--
[2018-12-20 23:28:32.630403] I [dht-rebalance.c:1516:dht_migrate_file] 0-arbitervol-dht: /newtest/level01/symlink_to_files/5c1bf227%%QLIKT3PCJV: attempting to move from arbitervol-replicate-2 to arbitervol-replicate-1
[2018-12-20 23:28:32.879469] W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-arbitervol-dht: no subvolume for hash (value) = 2161412967
[2018-12-20 23:28:33.042548] W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-arbitervol-dht: no subvolume for hash (value) = 2161412967
[2018-12-20 23:28:33.042553] E [MSGID: 109011] [dht-rebalance.c:2740:gf_defrag_migrate_single_file] 0-arbitervol-dht: Failed to get hashed subvol for /newtest/level01/symlink_to_files/5c1bf227%%RHTLGZ00I1
[2018-12-20 23:28:33.048683] I [dht-rebalance.c:1516:dht_migrate_file] 0-arbitervol-dht: /newtest/level01/symlink_to_files/5c1bf227%%X00GE3VSW8: attempting to move from arbitervol-replicate-2 to arbitervol-replicate-1
[2018-12-20 23:28:33.077278] W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-arbitervol-dht: no subvolume for hash (value) = 2277217227
[2018-12-20 23:28:33.154179] W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-arbitervol-dht: no subvolume for hash (value) = 2277217227
[2018-12-20 23:28:33.154184] E [MSGID: 109011] [dht-rebalance.c:2740:gf_defrag_migrate_single_file] 0-arbitervol-dht: Failed to get hashed subvol for /newtest/level01/symlink_to_files/5c1bf227%%0OT2UE5BOP
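To roughly quantify how many of the rebalance failures reported on this node are of this "Failed to get hashed subvol" type, a simple count over the same log can be used (counts not captured here):

# grep -c 'Failed to get hashed subvol' /var/log/glusterfs/arbitervol-rebalance.log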
Status?