Created attachment 1427611: node info and device info

Description of problem:
I have 3 nodes, each in a different zone. All devices on node1 and node2 are tagged as disabled. All but one of the devices on node3 are also tagged as disabled; the remaining device on node3 is tagged as supported. I attempted to create an arbiter volume 10 times. Volume creation succeeded only 3 times out of 10.

Version-Release number of selected component (if applicable):
6.0.0-11

How reproducible:
7/10

Steps to Reproduce:
1. Mark all devices of node1 as disabled.
2. Mark all devices of node2 as disabled.
3. Mark all devices of node3 as disabled, except one device.
4. Mark the one remaining device on node3 as supported.
5. Create an arbiter volume.

Actual results:
The volume is not created 7 out of 10 times.

Expected results:
The volume should be created 10 out of 10 times.
Please list the exact commands that you used.
See this scenario:

N1         N2         N3
#100gb-d   #100gb-d   #100gb-d
#100gb-d   #100gb-d   #100gb-d
#100gb-d   #100gb-d   #100gb-s

N stands for node
# stands for device
d stands for arbiter disabled
s stands for arbiter supported

Commands that I used:

1. Give the required tag to the drives of the nodes.
   # heketi-cli device settags device_id arbiter:disabled
2. Give the supported tag to the last device of the third node.
   # heketi-cli device settags device_id arbiter:supported
3. Create a volume.
   # heketi-cli volume create --size=10 --gluster-volume-options='user.heketi.arbiter true'
4. Try to create at least 10 volumes.
5. Check that placement actually follows the tags.
   # heketi-cli topology info
   # gluster v info

You will see that sometimes the volume is created and sometimes it is not.
In these two scenarios, volumes are also created 4 times out of 10:

N1         N2         N3
#100gb-d   #100gb-d   #100gb-d
#100gb-d   #100gb-d   #100gb-d
#100gb-d   #100gb-d   #100gb

-------------------------------------

N1         N2         N3
#100gb-d   #100gb-d   #100gb-d
#100gb-d   #100gb-d   #100gb-d
#100gb-d   #100gb-d   #100gb-r
I've looked into this a bit and think I understand what you're running into. There's a random element to how devices are selected for bricks, and this can lead to placements where the constraints are impossible to satisfy. This is true of regular volumes as well as arbiter volumes. However, it gets a bit worse with arbiter once you start tagging devices, because the tags add further restrictions on where bricks can go.

The problem is triggered when devices are picked in a way that cannot satisfy both the constraint that no two bricks in a brick set share a node and the free size and tag constraints.

This can't be made 100% reliable in the current version of Heketi, as this behavior is baked in at the moment. However, I will look into making successful placement more likely.
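To illustrate how random selection without backtracking can produce this kind of intermittent failure, here is a toy simulation of the reported topology. It is only a sketch and is not Heketi's allocator: the greedy pick order (two data bricks, then the arbiter brick), the function names, and the tag semantics encoded in the comments are all assumptions made for illustration, and free size is ignored.

```python
import random

# Toy topology from the report: three nodes, three devices each.
# Tag semantics assumed by this sketch (not taken from Heketi's code):
#   "disabled"  -> device may hold data bricks only
#   "supported" -> device may hold data or arbiter bricks
#   "required"  -> device may hold arbiter bricks only
TOPOLOGY = {
    "N1": ["disabled", "disabled", "disabled"],
    "N2": ["disabled", "disabled", "disabled"],
    "N3": ["disabled", "disabled", "supported"],
}


def candidates(topology, used_nodes, want_arbiter):
    """List (node, device_index) pairs that may host the next brick."""
    out = []
    for node, tags in topology.items():
        if node in used_nodes:
            continue  # no two bricks of a brick set may share a node
        for idx, tag in enumerate(tags):
            if want_arbiter and tag == "disabled":
                continue  # arbiter brick not allowed on this device
            if not want_arbiter and tag == "required":
                continue  # data brick not allowed on this device
            out.append((node, idx))
    return out


def try_place(topology, rng):
    """Greedily place 2 data bricks, then 1 arbiter brick, with no backtracking."""
    used_nodes = set()
    placement = []
    for want_arbiter in (False, False, True):
        options = candidates(topology, used_nodes, want_arbiter)
        if not options:
            return None  # earlier random picks left no valid device
        node, idx = rng.choice(options)
        used_nodes.add(node)
        placement.append((node, idx))
    return placement


def success_rate(topology, trials=10000, seed=0):
    """Fraction of independent placement attempts that succeed."""
    rng = random.Random(seed)
    ok = sum(try_place(topology, rng) is not None for _ in range(trials))
    return ok / trials


if __name__ == "__main__":
    print(f"success rate: {success_rate(TOPOLOGY):.2f}")
```

In this toy model an attempt succeeds only if both data bricks avoid N3, which happens with probability 2/3 × 1/2 = 1/3. A data brick landing on any N3 device uses up the only node that can host the arbiter brick. The real allocator may behave differently, but the failure mode is the same in kind: a valid placement exists, yet a random choice made early can rule it out.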
Updated doc text in the Doc Text field. Please review for technical accuracy.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2686