Bug 1286428

Summary: Duplicate gluster servers appear in backup-volfile-servers
Product: [oVirt] vdsm Reporter: Ala Hino <ahino>
Component: Gluster Assignee: Ala Hino <ahino>
Status: CLOSED CURRENTRELEASE QA Contact: Natalie Gavrielov <ngavrilo>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.17.11 CC: ahino, amureini, bugs, gklein, ngavrilo, ogofen, tnisan, ylavi
Target Milestone: ovirt-3.6.1 Flags: rule-engine: ovirt-3.6.z+
ylavi: planning_ack+
amureini: devel_ack+
gklein: testing_ack+
Target Release: 4.17.14   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-01-13 14:38:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
vdsm.log (aqua-vds4, aqua-vds5), engine.log none

Description Ala Hino 2015-11-29 11:29:19 UTC
Description of problem:

Gluster volinfo could return duplicate servers. For example, if
the volume is a distributed replica volume (2 X 3) with 6 bricks,
the bricks could be spread over 4 servers, so at least one server
hosts more than one brick and is listed more than once.



How reproducible:
100% when there are duplicates

Steps to Reproduce:
1. Create a distributed-replicated volume that places several bricks on the same servers
2. Create a gluster storage domain (SD) using that volume

Actual results:
Duplicate servers appear in backup-volfile-servers

Expected results:
No duplicate servers appear
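
As a rough illustration of the description above (a sketch only, not vdsm code; the host names and brick paths are hypothetical): if each brick is written as server:/path, taking the server part of every brick of a 2 x 3 volume whose 6 bricks sit on 4 servers necessarily yields repeated names.

```python
from collections import Counter

# Hypothetical brick list for a 2 x 3 distributed-replicate volume:
# 6 bricks spread over 4 servers, so some servers host two bricks.
BRICKS = [
    "host01:/bricks/b1",
    "host01:/bricks/b2",
    "host02:/bricks/b3",
    "host03:/bricks/b4",
    "host04:/bricks/b5",
    "host04:/bricks/b6",
]


def brick_servers(bricks):
    """Return the server part of each "server:/path" brick, in order."""
    return [brick.split(":", 1)[0] for brick in bricks]


def duplicated_servers(bricks):
    """Return the sorted names of servers that host more than one brick."""
    counts = Counter(brick_servers(bricks))
    return sorted(server for server, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(brick_servers(BRICKS))
    print(duplicated_servers(BRICKS))
```

Any server reported by duplicated_servers would show up more than once in a server list built naively from the bricks.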

Comment 1 Natalie Gavrielov 2015-12-21 18:37:58 UTC
Hi Ala,

I'm sorry, but I don't really get the scenario described in comment #0. 
What are duplicated servers?
How many servers were defined? 3?
2 is the number of replicas?


Thanks,

Comment 2 Ala Hino 2015-12-21 18:57:19 UTC
Hi Natalie,

Apologies for the unclear comment.

Basically, we want to verify that no duplicate servers appear in the backup-volfile-servers option.

To verify this, you can create a replica N volume with 1 replica on host1 and N-1 replicas using (N-1) bricks on host2. In this case we have N replicas but only two real servers. Then, in the log file, you need to verify that backup-volfile-servers contains a single server (host2).

Hope this helps. Let me know if anything else is needed.
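
One possible way to automate the log check described above is sketched below. It is only a sketch under assumptions: the helper names are hypothetical, the option value is assumed to be a colon-separated list of host names (so it would not handle IPv6 literals), and the sample line is made up rather than taken from a real vdsm.log.

```python
import re

# Matches the value of the backup-volfile-servers mount option; the value
# ends at whitespace or at a comma (mount options are comma separated).
_OPTION = re.compile(r"backup-volfile-servers=([^\s,]+)")


def backup_servers(line):
    """Return the servers listed in backup-volfile-servers, or [] if absent."""
    match = _OPTION.search(line)
    if match is None:
        return []
    return match.group(1).split(":")


def has_duplicates(servers):
    """Return True if any server name appears more than once."""
    return len(servers) != len(set(servers))


if __name__ == "__main__":
    # Made-up mount command showing the pre-fix behaviour.
    sample = ("/usr/bin/mount -t glusterfs "
              "-o backup-volfile-servers=host02:host02 host01:/vol /mnt/vol")
    servers = backup_servers(sample)
    print(servers, has_duplicates(servers))
```

Verification would then amount to checking that has_duplicates is false for every mount command line in the log.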

Comment 3 Natalie Gavrielov 2015-12-22 17:30:18 UTC
Using: vdsm-4.17.13-1.el7ev.noarch
       rhevm-3.6.1.3-0.1.el6.noarch

Scenario tested:
----------------

1. Create gluster volume
gluster volume create natalie replica 3 gluster-server01.qa.lab.tlv.redhat.com:/gluster_volumes/natalie_brick_1 gluster-server01.qa.lab.tlv.redhat.com:/gluster_volumes/natalie_brick_2 gluster-server02.qa.lab.tlv.redhat.com:/gluster_volumes/natalie_brick_3 gluster-server01.qa.lab.tlv.redhat.com:/gluster_volumes/natalie_brick_5 gluster-server01.qa.lab.tlv.redhat.com:/gluster_volumes/natalie_brick_4 gluster-server01.qa.lab.tlv.redhat.com:/gluster_volumes/natalie_brick_6 force

2. Create a storage domain (first round using ip address, second round using hostname):

Path: 10.35.65.25:natalie

vdsm.log (aqua-vds4):

jsonrpc.Executor/4::WARNING::2015-12-22 16:51:09,402::storageServer::348::Storage.StorageServer.MountConnection::(_get_backup_servers_option) gluster server u'10.35.65.25' is not in bricks ['gluster-server01.qa.lab.tlv.redhat.com', 'gluster-server02.qa.lab.tlv.redhat.com'], possibly mounting duplicate servers
jsonrpc.Executor/4::WARNING::2015-12-22 16:51:09,402::storageServer::348::Storage.StorageServer.MountConnection::(_get_backup_servers_option) gluster server u'10.35.65.25' is not in bricks ['gluster-server01.qa.lab.tlv.redhat.com', 'gluster-server02.qa.lab.tlv.redhat.com'], possibly mounting duplicate server
jsonrpc.Executor/4::DEBUG::2015-12-22 16:51:09,403::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o backup-volfile-servers=gluster-server01.qa.lab.tlv.redhat.com:gluster-server02.qa.lab.tlv.redhat.com 10.35.65.25:/natalie /rhev/data-center/mnt/glusterSD/10.35.65.25:_natalie (cwd None)


Path: gluster-server01.qa.lab.tlv.redhat.com:natalie
vdsm.log (aqua-vds5):
jsonrpc.Executor/1::DEBUG::2015-12-22 18:00:38,774::storageServer::342::Storage.StorageServer.MountConnection::(_get_backup_servers_option) Using bricks: ['gluster-server01.qa.lab.tlv.redhat.com', 'gluster-server02.qa.lab.tlv.redhat.com']
jsonrpc.Executor/1::DEBUG::2015-12-22 18:00:38,774::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o backup-volfile-servers=gluster-server02.qa.lab.tlv.redhat.com gluster-server01.qa.lab.tlv.redhat.com:natalie /rhev/data-center/mnt/glusterSD/gluster-server01.qa.lab.tlv.redhat.com:natalie (cwd None)


When using an IP address we get duplicate servers.

Many thanks to Ori (ogofen) for helping me out here.

(Attaching logs in a few)

Comment 4 Red Hat Bugzilla Rules Engine 2015-12-22 17:30:20 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 5 Natalie Gavrielov 2015-12-22 17:32:02 UTC
Created attachment 1108676
vdsm.log (aqua-vds4, aqua-vds5), engine.log

Comment 6 Ala Hino 2015-12-22 19:14:28 UTC
Hi Natalie,

Please note that this bug specifically handles duplicate servers in the backup-volfile-servers property.

In both use cases you tried, there are no duplicate servers in the backup-volfile-servers property.

The patch doesn't handle use cases where we create gluster volumes using host name(s) and create domains using ip address(es). If we think this is an issue, we need to open a separate bug for it.

Verification of this bug would fail if "server01.qa.lab.tlv.redhat.com" appeared multiple times in the backup-volfile-servers property.

Comment 7 Natalie Gavrielov 2015-12-23 10:00:40 UTC
Hi Ala,

Comment #2:
> you need to verify that backup-volfile-servers contains a single server (host2).

I had 2 hosts, so I would expect only one of them to appear as a property.

Comment #3:
> backup-volfile-servers=gluster-server01.qa.lab.tlv.redhat.com:gluster-server02.qa.lab.tlv.redhat.com

There are 2 servers as a property.

Comment #6:
> The patch doesn't handle use cases where we create gluster volumes using host
> name(s) and create domains using ip address(es). If we think this is an issue, we
> need to open a separate bug for it.

We didn't think it's the issue, we just thought it might be a relevant piece of information.

> Verification of this bug would fail if "server01.qa.lab.tlv.redhat.com"
> appeared multiple times in the backup-volfile-servers property.

It's different than what it says in comment #2.

Do you have a relevant log file (maybe from when you encountered the issue) for me to take a look and try to have a better understanding of this issue?

Comment 8 Ala Hino 2015-12-23 10:27:16 UTC
Hi Natalie,

Unfortunately I don't have a log file showing the issue.
I will try to explain the issue.
Let's assume that we have a replica 3 volume where 1 brick is on host01 and two bricks are on host02.
Before this fix, the value of the backup-volfile-servers property would be:
backup-volfile-servers=host02:host02 (host02 appears twice)
With the fix:
backup-volfile-servers=host02 (host02 appears only one time)

Please note that backup-volfile-servers could contain any number of servers, depending on the created replica. So, if there were N replicas on N different hosts, there will be (N-1) hosts in backup-volfile-servers (and 1 server will be the "primary"). There could be N *different* servers if we mix ip addresses and host names.

In the tests you performed, 2 hosts appear in the property when you used the ip to connect. Again, in this case, you created the gluster replicated volume using host names and, in the engine, created the domain using the ip address. In this case, when we run gluster volume info, we get 2 host names: host01 and host02. We try to remove the admin-provided server (the provided ip address) from the list, but it is not in the list; hence, we have two servers: host01 and host02. Yet again, host01 and host02 in the list are *not* duplicates. host01 or host02 may be a duplicate of the provided ip address, but that is a different issue and the ip address doesn't appear in backup-volfile-servers.
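
The behaviour explained above can be sketched roughly as follows. This is an illustration under assumptions, not the actual vdsm patch: the function name and brick paths are hypothetical, and the option is spelled backup-volfile-servers as in the mount command logged in comment 3.

```python
def backup_servers_option(volfile_server, bricks):
    """Build the backup servers mount option from a volume's bricks.

    volfile_server is the server given by the admin in the domain path;
    bricks is a list of "server:/path" strings as reported for the volume.
    """
    # Collect each brick's server once, preserving first-seen order.
    servers = []
    for brick in bricks:
        server = brick.split(":", 1)[0]
        if server not in servers:
            servers.append(server)

    # The admin-provided server is the primary, so it is not a backup.
    # If it was given as an ip address while the bricks use host names,
    # it is not found and every brick server stays in the list.
    if volfile_server in servers:
        servers.remove(volfile_server)

    if not servers:
        return ""
    return "backup-volfile-servers=" + ":".join(servers)


if __name__ == "__main__":
    bricks = ["host01:/bricks/b1", "host02:/bricks/b2", "host02:/bricks/b3"]
    # Domain created with a host name that matches a brick server.
    print(backup_servers_option("host01", bricks))
    # Domain created with an ip address that matches no brick server.
    print(backup_servers_option("10.35.65.25", bricks))
```

With the host name, only host02 remains and it appears once. With the ip address, both host01 and host02 remain, which matches the case from comment 3: two distinct servers, neither repeated.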

Comment 9 Natalie Gavrielov 2015-12-23 14:56:11 UTC
Verified using the following versions:
 vdsm-4.17.13-1.el7ev.noarch
 rhevm-3.6.1.3-0.1.el6.noarch

Scenario tested is the one described in comment #3. 
Test result: backup-volfile-servers property does not show duplicate hosts.

Will open another issue regarding the warning ("possibly mounting duplicate server") when using an IP address.

Comment 10 Sandro Bonazzola 2016-01-13 14:38:28 UTC
oVirt 3.6.1 has been released, closing current release