Bug 764633 (GLUSTER-2901)

Summary: Cannot start geo-replication
Product: [Community] GlusterFS Reporter: Vikas Gorur <vikas>
Component: geo-replication Assignee: Lakshmipathi G <lakshmipathi>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: high    
Version: 3.2.0 CC: aavati, bala, dave, eco, gluster-bugs, jacob, platform, renee, vijay, vs
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM: http://support.gluster.com/rt//Ticket/Display.html?id=3197
Verified Versions: Category: ---

Description Vikas Gorur 2011-05-13 19:35:50 UTC
After manually creating the directories:

# gluster volume geo-replication rep ip-10-86-198-172::rep start
Starting geo-replication session between rep & ip-10-86-198-172::rep has been successful

Comment 1 Vikas Gorur 2011-05-13 22:26:38 UTC
Steps taken to start geo-replication:

Spawned two Gluster AMIs.

On each:

# yum update
# gluster-app-migrate
# gluster-repo-switch 3.2

Created a volume called "rep" on both servers. The two volumes are independent. Each server's volume contains only a single brick, which is local.

# gluster volume geo-replication rep ip-10-124-129-140::rep start

internal error, cannot startthe geo-replication session
geo-replication command failed

After doing some debugging with gdb, it looks like the command works if I manually create the following directories:

/etc/glusterd/geo-replication
/var/log/glusterfs/geo-replication
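
The manual step can be sketched as a small shell function. This is only an illustration: the function name `ensure_georep_dirs` and its optional root-prefix argument are hypothetical (the prefix exists so the sketch can be tried against a scratch directory), and the two paths are the ones listed above.

```shell
# Hypothetical helper: create the two directories that were missing.
# $1 is an optional root prefix (an assumption, for safe testing);
# leave it empty to operate on the real paths.
ensure_georep_dirs() {
    prefix="${1:-}"
    for dir in /etc/glusterd/geo-replication /var/log/glusterfs/geo-replication; do
        # -p creates missing parents and is a no-op if the directory exists
        mkdir -p "${prefix}${dir}" || return 1
    done
}
```

Run as root with no argument, `ensure_georep_dirs` creates the directories in place; calling it again is harmless.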

Comment 2 Renee 2011-05-16 13:37:42 UTC
I do not believe we have this on the bare-metal installation, only on the AMI. Eco, can you please confirm?

Comment 3 Renee 2011-05-16 13:47:04 UTC
Raising to P1 - BLOCKER - we cannot get geo-rep to work in the AMI. This needs an immediate fix.

Comment 4 Anand Avati 2011-05-16 19:57:24 UTC
(In reply to comment #3)
> Raising to P1- BLOCKER - we cannot get geo-rep to work in the AMI.  This needs
> an immediate fix

Geo-replication works on the AMI. I personally confirmed geo-replication syncing data across AWS regions just a few minutes ago. The issue seen here is essentially an upgrade issue, not a geo-replication issue as such. Here is the dissection of the problem:

During a yum upgrade to 3.2 (performed as part of gluster-app-migrate), yum/rpm executes the post-install script of the glusterfs-core RPM _before_ it installs the glusterfs-geo-replication RPM. This prevents the creation of the /etc/glusterd/geo-replication directory: glusterd concludes that geo-replication was not selected for installation in this environment and does not internally "activate" the geo-sync features. That same glusterd keeps running after the upgrade, still believing geo-replication is not installed on the system. By the time geo-replication commands are executed, geo-replication is in fact installed (the gsyncd.py binary is present), which leaves glusterd in a quasi-upgraded state.

Workaround: run "service glusterd restart" right after gluster-app-migrate. This is a one-time command that fixes the problem for good.
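
For scripted upgrades, the workaround could be wrapped roughly as follows. This is a sketch under assumptions: the function name `restart_if_georep_stale`, its arguments, and the use of a missing /etc/glusterd/geo-replication directory as the signal for a stale glusterd are illustrative choices, not something glusterd or the RPMs provide.

```shell
# Hypothetical wrapper around the one-time workaround.
# $1: directory whose absence indicated the problem in this bug
#     (defaults to /etc/glusterd/geo-replication)
# remaining args: command used to restart glusterd
#     (defaults to the workaround command given above)
restart_if_georep_stale() {
    georep_dir="${1:-/etc/glusterd/geo-replication}"
    [ "$#" -gt 0 ] && shift
    # Nothing to do when the directory already exists
    if [ -d "$georep_dir" ]; then
        return 0
    fi
    # Fall back to the documented workaround if no command was supplied
    if [ "$#" -eq 0 ]; then
        set -- service glusterd restart
    fi
    "$@"
}
```

Called with no arguments right after gluster-app-migrate, it runs "service glusterd restart" only when the directory is absent.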

Next step - we will work on restarting glusterd in the post-install of geo-replication RPM as well.

This is _NOT_ a problem in the new 3.2 AMI, as no upgrade is involved.

Avati

Comment 5 Anand Avati 2011-05-16 20:03:56 UTC
*** Bug 2900 has been marked as a duplicate of this bug. ***

Comment 6 Anand Avati 2011-06-08 13:57:49 UTC
PATCH: http://patches.gluster.com/patch/7378 in master (rpmbuild : restart glusterd after installing geo-replication rpm)

Comment 7 Anand Avati 2011-06-08 13:58:15 UTC
PATCH: http://patches.gluster.com/patch/7379 in release-3.2 (restart glusterd after installing geo-replication rpm)

Comment 8 Lakshmipathi G 2011-06-10 08:02:26 UTC
Tested with 3.2.1qa4. Installing geo-replication now restarts glusterd.