Bug 1697965 - [gluster-ansible] Order of the hosts are not preserved, when creating the volume
Summary: [gluster-ansible] Order of the hosts are not preserved, when creating the volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: cockpit-ovirt
Classification: oVirt
Component: gluster-ansible
Version: 0.12.6
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ovirt-4.3.3-1
Target Release: ---
Assignee: Gobinda Das
QA Contact: SATHEESARAN
URL:
Whiteboard:
Duplicates: 1686909
Depends On:
Blocks: 1696868 1700727
Reported: 2019-04-09 11:16 UTC by SATHEESARAN
Modified: 2019-04-29 13:57 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1696868
Environment:
Last Closed: 2019-04-29 13:57:45 UTC
oVirt Team: Gluster
Embargoed:
sasundar: ovirt-4.3?
sasundar: blocker?
sbonazzo: planning_ack?
godas: devel_ack+
sasundar: testing_ack+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 99295 0 master MERGED Removed extra vars from playbook to preserve brick creation order 2019-04-11 05:19:50 UTC
oVirt gerrit 99348 0 ovirt-4.3 MERGED Removed extra vars from playbook to preserve brick creation order 2019-04-11 05:39:24 UTC
oVirt gerrit 99499 0 cockpit-ovirt-0.12.7.z MERGED Removed extra vars from playbook to preserve brick creation order 2019-04-17 09:37:14 UTC

Description SATHEESARAN 2019-04-09 11:16:36 UTC
Description of problem:
------------------------
The order of the hosts is not preserved. This is not a problem for a replicated volume, but for an arbitrated volume the gluster-ansible role may place the arbiter brick on any of the hosts, against the user's expectation.
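
A minimal sketch of why the host order matters for an arbitrated volume. It assumes the bricks are passed to `gluster volume create ... replica 3 arbiter 1` in the same order as the host list, in which case the last brick of each replica set becomes the arbiter. The helper names `build_bricks` and `arbiter_host` are hypothetical and are not part of gluster-ansible.

```python
from typing import List


def build_bricks(hosts: List[str], brick_path: str) -> List[str]:
    """Return host:/path brick specs in exactly the order the hosts are given."""
    return [f"{host}:{brick_path}" for host in hosts]


def arbiter_host(hosts: List[str], replica: int = 3) -> str:
    """Return the host that receives the arbiter brick of the first replica set.

    Assumption: with `replica 3 arbiter 1` the last brick of each set is
    the arbiter, so only the position in the list decides the placement.
    """
    if len(hosts) < replica:
        raise ValueError("need at least `replica` hosts")
    return hosts[replica - 1]


expected = ["host1", "host2", "host3"]   # order written in the vars file
reordered = ["host2", "host3", "host1"]  # order reported in this bug

print(build_bricks(expected, "/gluster_bricks/data/data"))
print(arbiter_host(expected))   # host3
print(arbiter_host(reordered))  # host1
```

With the reordered list the arbiter lands on host1 instead of host3, which is the mismatch described above.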

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
cockpit-ovirt-dashboard-0.12.7

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Create the vars file with hostnames in order host1, host2, host3
2. Create volumes

Actual results:
---------------
Volumes are created with a different brick order, for example host2, host3, host1

Expected results:
-----------------
Volume should be created in the same order as listed in the vars file


Additional info:

--- Additional comment from Sachidananda Urs on 2019-04-07 06:44:33 UTC ---

When I tested I see the bricks are created in the order provided:

Vars:
     gluster_features_hci_volumes:
        - { volname: 'data', brick: '/data-1/data' }
        - { volname: 'engine', brick: '/data-1/engine' }
        - { volname: 'store', brick: '/data-1/store' }
     gluster_features_hci_cluster:
        - host2
        - host3
        - host1


Result:

Volume Name: store
Type: Replicate
Volume ID: 7b03f1d3-8dac-4235-a6a6-fb2e650a6f57
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host2:/data-1/store
Brick2: host3:/data-1/store
Brick3: host1:/data-1/store
Options Reconfigured:

Volume Name: engine                                                                                                               
Type: Replicate                                                                                                                   
Volume ID: acc4b7a9-2eec-4116-819b-ae8f51d19ed3                                                                                   
Status: Started                                                                                                                   
Snapshot Count: 0                                                                                                                 
Number of Bricks: 1 x 3 = 3                                                                                                       
Transport-type: tcp                                                                                                               
Bricks:                                                                                                                           
Brick1: host2:/data-1/engine                                                                                                      
Brick2: host3:/data-1/engine                                                                                                      
Brick3: host1:/data-1/engine                                                                                                      
Options Reconfigured:                                    

Volume Name: data                                                                                                                 
Type: Replicate                                                                                                                   
Volume ID: aeeb2cd5-6c53-4365-baaa-016ab07059d8                                                                                   
Status: Started                                                                                                                   
Snapshot Count: 0                                                                                                                 
Number of Bricks: 1 x 3 = 3                                                                                                       
Transport-type: tcp                                                                                                               
Bricks:                                                                                                                           
Brick1: host2:/data-1/data                                                                                                        
Brick2: host3:/data-1/data                                                                                                        
Brick3: host1:/data-1/data                                                                                                        
Options Reconfigured:                                                                                                             
cluster.granular-entry-heal: enable       

======================================================

There is no sorting happening.

--- Additional comment from Sachidananda Urs on 2019-04-07 06:44:53 UTC ---

The same bug was raised against an earlier version; please see my comment there:

https://bugzilla.redhat.com/show_bug.cgi?id=1636427#c3

If this is still happening, it could be a regression in the cockpit plugin.
Please attach the generated variable file as well.


The cause is the use of groups['hosts'], which we should not use until
Ansible fixes this bug: https://github.com/ansible/ansible/issues/34861

--- Additional comment from SATHEESARAN on 2019-04-08 08:51:45 UTC ---

(In reply to Sachidananda Urs from comment #1)
> When I tested I see the bricks are created in the order provided:
> 
> Vars:
>      gluster_features_hci_volumes:
>         - { volname: 'data', brick: '/data-1/data' }
>         - { volname: 'engine', brick: '/data-1/engine' }
>         - { volname: 'store', brick: '/data-1/store' }
>      gluster_features_hci_cluster:
>         - host2
>         - host3
>         - host1

Sac,

vars file is generated as:

<snip>
    cluster_nodes:
      - host10.lab.eng.blr.redhat.com
      - host11.lab.eng.blr.redhat.com
      - host12.lab.eng.blr.redhat.com
    gluster_features_hci_cluster: '{{ cluster_nodes }}'
    gluster_features_hci_volumes:
      - volname: engine
        brick: /gluster_bricks/engine/engine
        arbiter: 0
      - volname: data
        brick: /gluster_bricks/data/data
        arbiter: 1
      - volname: vmstore
        brick: /gluster_bricks/vmstore/vmstore
        arbiter: 0
      - volname: extravol1
        brick: /gluster_bricks/extravol1/extravol1
        arbiter: false
</snip>

In this case the order of the hosts is not preserved.

Sac, could you help find the actual problem?

--- Additional comment from SATHEESARAN on 2019-04-08 08:56:31 UTC ---

The generated vars file is located here - http://mango.lab.eng.blr.redhat.com/hc_inventory.txt

The vars file lists the hosts as grafton{10,11,12},
but the volume is created in the order grafton{12,11,10}.

<snip>
    cluster_nodes:
      - host10.lab.eng.blr.redhat.com
      - host11.lab.eng.blr.redhat.com
      - host12.lab.eng.blr.redhat.com
    gluster_features_hci_cluster: '{{ cluster_nodes }}'
    gluster_features_hci_volumes:
      - volname: engine
        brick: /gluster_bricks/engine/engine
        arbiter: 0
      - volname: data
        brick: /gluster_bricks/data/data
        arbiter: 1
      - volname: vmstore
        brick: /gluster_bricks/vmstore/vmstore
        arbiter: 0
      - volname: extravol1
        brick: /gluster_bricks/extravol1/extravol1
        arbiter: false
</snip>



Here is the engine volume that is created:

[root@rhsqa- ~]# gluster volume info engine
 
Volume Name: engine
Type: Replicate
Volume ID: 7b626052-79c9-4d3f-8936-e72de3a6bcf7
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: rhsqa-grafton12.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
Brick2: rhsqa-grafton11.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
Brick3: rhsqa-grafton10.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
Options Reconfigured:
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.strict-o-direct: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
cluster.choose-local: off
client.event-threads: 4
server.event-threads: 4
network.ping-timeout: 30
storage.owner-uid: 36
storage.owner-gid: 36
cluster.granular-entry-heal: enable
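
The comparison above can be checked mechanically. The following is a minimal sketch, not part of gluster-ansible or cockpit-ovirt: the hypothetical helper `brick_hosts` extracts the brick hosts from `gluster volume info` output and compares them with the order expected from the vars file. The expected list is assumed to use the same fully qualified names as the volume output.

```python
import re
from typing import List

# Matches lines such as "Brick1: host:/path"; the "Bricks:" header does not match.
BRICK_LINE = re.compile(r"^Brick\d+:\s*([^:\s]+):(\S+)")


def brick_hosts(volume_info: str) -> List[str]:
    """Return the brick hosts in the order gluster reports them."""
    hosts = []
    for line in volume_info.splitlines():
        match = BRICK_LINE.match(line.strip())
        if match:
            hosts.append(match.group(1))
    return hosts


sample = """\
Volume Name: engine
Bricks:
Brick1: rhsqa-grafton12.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
Brick2: rhsqa-grafton11.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
Brick3: rhsqa-grafton10.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
"""

# Assumed expected order, taken from the host order in the vars file.
expected = [
    "rhsqa-grafton10.lab.eng.blr.redhat.com",
    "rhsqa-grafton11.lab.eng.blr.redhat.com",
    "rhsqa-grafton12.lab.eng.blr.redhat.com",
]

actual = brick_hosts(sample)
order_matches = actual == expected
print(actual)
print(order_matches)  # False for the output pasted in this comment
```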

--- Additional comment from Sachidananda Urs on 2019-04-08 09:58:41 UTC ---

(In reply to SATHEESARAN from comment #4)
> The generated vars file is located here -
> http://mango.lab.eng.blr.redhat.com/hc_inventory.txt
> 

I see that volumes are created in the order specified with the same inventory file:

gluster vol info                                                                                             
                                                                                                                                 
Volume Name: data                                                                                                                
Type: Replicate                                                                                                                  
Volume ID: 561372be-c3b9-4ecd-b21a-98ae03d80edd                                                                                  
Status: Started                                                                                                                  
Snapshot Count: 0                                                                                                                
Number of Bricks: 1 x (2 + 1) = 3                                                                                                
Transport-type: tcp                                                                                                              
Bricks:                                                                                                                          
Brick1: host10.lab.eng.blr.redhat.com:/gluster_bricks/data/data                                                                  
Brick2: host11.lab.eng.blr.redhat.com:/gluster_bricks/data/data                                                                  
Brick3: host12.lab.eng.blr.redhat.com:/gluster_bricks/data/data (arbiter)                                                        
Options Reconfigured:                                                                            

Volume Name: engine                                                                                                              
Type: Replicate                                                                                                                  
Volume ID: cb3cbd2a-4601-4a1b-99ef-0cffb98c0332                                                                                  
Status: Started                                                                                                                  
Snapshot Count: 0                                                                                                                
Number of Bricks: 1 x 3 = 3                                                                                                      
Transport-type: tcp                                                                                                              
Bricks:                                                                                                                          
Brick1: host10.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine                                                              
Brick2: host11.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine                                                              
Brick3: host12.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine                                                              
Options Reconfigured:                                                                                                            
cluster.granular-entry-heal: enable         

Volume Name: vmstore
Type: Replicate
Volume ID: c3a2b72a-216b-49d9-90ef-e82643130c9e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host10.lab.eng.blr.redhat.com:/gluster_bricks/vmstore/vmstore                                                           
Brick2: host11.lab.eng.blr.redhat.com:/gluster_bricks/vmstore/vmstore                                                           
Brick3: host12.lab.eng.blr.redhat.com:/gluster_bricks/vmstore/vmstore                                                           
Options Reconfigured:

Can you give access to the machines so that I can recreate the issue?
I do not see any issues in my setup. And provide the path to the inventory
file and the playbook you used.

--- Additional comment from SATHEESARAN on 2019-04-09 11:13:05 UTC ---

(In reply to Sachidananda Urs from comment #5)
> (In reply to SATHEESARAN from comment #4)
> > The generated vars file is located here -
> > http://mango.lab.eng.blr.redhat.com/hc_inventory.txt
> > 
> 
> I see that volumes are created in the order specified with the same
> inventory file:
> 
> gluster vol info
> 
> Volume Name: data
> Type: Replicate
> Volume ID: 561372be-c3b9-4ecd-b21a-98ae03d80edd
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: host10.lab.eng.blr.redhat.com:/gluster_bricks/data/data
> Brick2: host11.lab.eng.blr.redhat.com:/gluster_bricks/data/data
> Brick3: host12.lab.eng.blr.redhat.com:/gluster_bricks/data/data (arbiter)
> Options Reconfigured:
> 
> Volume Name: engine
> Type: Replicate
> Volume ID: cb3cbd2a-4601-4a1b-99ef-0cffb98c0332
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: host10.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
> Brick2: host11.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
> Brick3: host12.lab.eng.blr.redhat.com:/gluster_bricks/engine/engine
> Options Reconfigured:
> cluster.granular-entry-heal: enable
> 
> Volume Name: vmstore
> Type: Replicate
> Volume ID: c3a2b72a-216b-49d9-90ef-e82643130c9e
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: host10.lab.eng.blr.redhat.com:/gluster_bricks/vmstore/vmstore
> Brick2: host11.lab.eng.blr.redhat.com:/gluster_bricks/vmstore/vmstore
> Brick3: host12.lab.eng.blr.redhat.com:/gluster_bricks/vmstore/vmstore
> Options Reconfigured:
> 
> Can you give access to the machines so that I can recreate the issue?
> I do not see any issues in my setup. And provide the path to the inventory
> file and the playbook you used.

Sac,

The setup has been provided. Thanks, Sac, for helping to resolve the issue.

Pasting the RCA from the chats:
<chat>
Sac: It is using groups['hc_nodes'] ...
And what you see is expected behaviour, since you are sorting the hosts by using groups['hc_nodes'].
</chat>

Comment 1 Gobinda Das 2019-04-09 12:18:00 UTC
*** Bug 1686909 has been marked as a duplicate of this bug. ***

Comment 2 SATHEESARAN 2019-04-16 06:17:41 UTC
This issue has no functional impact such as data loss.
But it is a bad experience for users, because the arbiter brick is not placed
on the node they expect.

Because this issue leads to a bad user experience, I propose including the fix
in the latest cockpit-ovirt-dashboard release.

Comment 3 SATHEESARAN 2019-04-19 11:05:27 UTC
Tested with cockpit-ovirt-dashboard 0.12.8

The bricks are created on the hosts in the given order.

Volume Name: engine
Type: Replicate
Volume ID: 8b81d419-6994-48d3-993b-95a99e9a07f2
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host1:/gluster_bricks/engine/engine
Brick2: host2:/gluster_bricks/engine/engine
Brick3: host3:/gluster_bricks/engine/engine

