Bug 1038988 - Gluster brick sync does not work when host has multiple interfaces
Summary: Gluster brick sync does not work when host has multiple interfaces
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-webadmin
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.4.0
Assignee: Ramesh N
QA Contact: bugs@ovirt.org
URL:
Whiteboard: gluster
Depends On:
Blocks: 1024889
 
Reported: 2013-12-06 09:57 UTC by Sahina Bose
Modified: 2014-03-31 12:32 UTC
CC List: 13 users

Fixed In Version: ovirt-3.4.0-beta2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-31 12:32:33 UTC
oVirt Team: ---
Embargoed:


Attachments: none


Links
oVirt gerrit 22693
oVirt gerrit 23432
oVirt gerrit 23438

Description Sahina Bose 2013-12-06 09:57:14 UTC
Description of problem:

Syncing of gluster volume info does not work when a host has multiple networks, as described in the user scenario below:

Because of a few issues I had with keepalived, I moved my storage network to its own VLAN, but it seems to have broken part of the oVirt gluster management.

The scenario:
2 Hosts

1x Engine, VDSM, Gluster
1x VDSM, Gluster

So to properly split the gluster data and ovirtmgmt networks, I simply assigned each host two hostnames and two IPs.

172.16.0.1 (ovirtmgmt) hvx.melb.example.net
172.16.1.1 (gluster) gsx.melb.example.net

However, the oVirt engine does not seem to like this; it would not pick up the gluster volume as "running" until I restarted it through the UI.

2013-12-06 13:15:08,940 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-75) START, GlusterVolumesListVDSCommand(HostName = HV01, HostId = 91c776e4-8454-4b2a-90b2-8700b6f58d9d), log id: 6efbe3fe
2013-12-06 13:15:08,973 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-75) Could not find server gs01.melb.example.net in cluster 99408929-82cf-4dc7-a532-9d998063fa95
2013-12-06 13:15:08,976 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-75) FINISH, GlusterVolumesListVDSCommand, return: {a285e87a-d191-4b55-98f5-a4e0bcb85517=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@9a3ec542}, log id: 6efbe3fe
2013-12-06 13:15:08,989 ERROR [org.ovirt.engine.core.bll.gluster.GlusterSyncJob] (DefaultQuartzScheduler_Worker-75) Error while updating Volume DATA!: java.lang.NullPointerException
        at org.ovirt.engine.core.common.utils.gluster.GlusterCoreUtil.findBrick(GlusterCoreUtil.java:65) [common.jar:]
        at org.ovirt.engine.core.common.utils.gluster.GlusterCoreUtil.findBrick(GlusterCoreUtil.java:51) [common.jar:]
        at org.ovirt.engine.core.common.utils.gluster.GlusterCoreUtil.containsBrick(GlusterCoreUtil.java:39) [common.jar:]
        at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.removeDeletedBricks(GlusterSyncJob.java:518) [bll.jar:]
        at org.ovirt.engine.core.bll.gluster.GlusterSyncJob.updateBricks(GlusterSyncJob.java:510) [bll.jar:]


Volume information isn't being pulled because the engine thinks gs01.melb.example.net is not in the cluster, when in fact it is, registered under hv01.melb.example.net.
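
To illustrate the failure mode, here is a minimal sketch (the class and method names are hypothetical, not the actual ovirt-engine code): matching bricks by server hostname finds nothing when the brick was created against the gluster-network FQDN, and downstream code then dereferences the null result, producing the NullPointerException above. Keying on the gluster host UUID identifies the peer regardless of which interface the brick path uses.

// Hypothetical, self-contained sketch; Brick and BrickMatcher are
// illustrative stand-ins, not the real ovirt-engine entities.
import java.util.List;
import java.util.Optional;
import java.util.UUID;

class Brick {
    final UUID hostUuid;     // gluster host UUID: stable across interfaces
    final String serverName; // FQDN the brick was defined with
    final String brickDir;   // e.g. /gluster/data/brick1

    Brick(UUID hostUuid, String serverName, String brickDir) {
        this.hostUuid = hostUuid;
        this.serverName = serverName;
        this.brickDir = brickDir;
    }
}

class BrickMatcher {
    // Fragile: the gluster CLI reports gs01.melb.example.net, but the
    // engine registered the host as hv01.melb.example.net, so no brick
    // matches and the lookup returns an empty result.
    static Optional<Brick> findByName(List<Brick> known, Brick fetched) {
        return known.stream()
                .filter(b -> b.serverName.equals(fetched.serverName)
                        && b.brickDir.equals(fetched.brickDir))
                .findFirst();
    }

    // Robust: the gluster host UUID identifies the peer no matter which
    // interface/FQDN the brick path was created with.
    static Optional<Brick> findByUuid(List<Brick> known, Brick fetched) {
        return known.stream()
                .filter(b -> b.hostUuid.equals(fetched.hostUuid)
                        && b.brickDir.equals(fetched.brickDir))
                .findFirst();
    }
}

Keying brick identity on host UUID plus brick directory, rather than on hostname, is the behavior requested under "Expected results" below.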



Version-Release number of selected component (if applicable):
3.3

How reproducible:
Always

Steps to Reproduce:
As above

Expected results:

Gluster brick sync should use the gluster host UUID.


Additional info:

Comment 1 Sahina Bose 2013-12-06 09:58:38 UTC
Tim,
Can you enhance the vdsm verb glusterVolumesList to return the host UUID as well?
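
For illustration only (the hostUuid field name is an assumption, not the actual vdsm wire format): the idea is that each brick entry in the glusterVolumesList response would carry the gluster host UUID alongside the hostname, letting the engine resolve the owning server without depending on the interface-specific FQDN. A sketch of the engine-side consumption:

// Hypothetical consumption of a per-brick host UUID in the volumes-list
// response; "hostUuid" is an assumed field name, not the real wire format.
import java.util.Map;
import java.util.UUID;

class BrickInfo {
    static UUID hostUuidOf(Map<String, Object> brickEntry) {
        // Prefer the stable gluster host UUID over the interface-specific
        // hostname when resolving which cluster server owns the brick.
        return UUID.fromString((String) brickEntry.get("hostUuid"));
    }
}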

Comment 2 Sahina Bose 2013-12-06 10:02:44 UTC
Changing target release to 3.4, as there's a change required in glusterfs as well.

Comment 3 Sahina Bose 2014-02-05 06:08:42 UTC
The issue with gluster sync when a host has multiple interfaces has been fixed.

However, please note that if the engine and gluster operate on the same host with different IP addresses, operations from the engine like remove-brick and add-brick will fail, as gluster is not aware of the IP address used in these commands.

The safest approach in these cases is to work with FQDNs.

Comment 4 Sandro Bonazzola 2014-03-31 12:32:33 UTC
This is an automated message: moving to CLOSED CURRENTRELEASE since oVirt 3.4.0 has been released.

