Bug 1034326

Summary: [RFE] Proposal for Georeplication hooks
Product: Red Hat Gluster Storage Reporter: Marcel Hergaarden <mhergaar>
Component: geo-replication Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WONTFIX QA Contact: storage-qa-internal <storage-qa-internal>
Severity: low Docs Contact:
Priority: low    
Version: 2.1 CC: avishwan, chrisw, csaba, khiremat
Target Milestone: --- Keywords: FutureFeature, ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-19 04:37:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Marcel Hergaarden 2013-11-25 15:41:39 UTC
Description of problem:

[RFE]
We already have volume lifecycle hooks where the customer can inject its own scripts, both before and after the different volume actions (add, remove, start, stop, etc.). I would like to propose the same mechanism for georeplication.

Version-Release number of selected component (if applicable):
RHS 2.1 Big Bend

Additional info:
Proposal for Georeplication hooks

We already have volume lifecycle hooks where the customer can inject its own scripts, both before and after the different volume actions (add, remove, start, stop, etc.).

I would like to propose the same mechanism for georeplication. Here, I would like to have a hook script called (when present) when a file is selected for actual replication. The script should be called with the full file path (or inode number) as the parameter.

There should be 4 hooks per volume:
- Pre Replication on Source
- Post Replication on Source
- Pre Replication on Destination
- Post Replication on Destination

When a Pre Replication hook returns a certain error level (exit status), the replication of that file should be aborted and the reason logged.

The pre-replication hooks are synchronous; the post-replication hooks could be implemented as asynchronous.
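The pre- and post-replication behaviour described above could be sketched roughly as follows. This is only an illustration of the proposal, not existing geo-replication code: the names `run_pre_hook`, `run_post_hook`, `replicate` and the `ABORT_STATUS` value are all hypothetical, and the proposal itself does not say which exit status should abort replication.

```python
import logging
import subprocess
from pathlib import Path

log = logging.getLogger("georep-hooks")

# Hypothetical exit status meaning "do not replicate this file".
# The proposal only says "a certain value".
ABORT_STATUS = 1


def run_pre_hook(script, file_path):
    """Run a pre-replication hook synchronously.

    Returns True when replication of file_path may proceed.
    """
    script = Path(script)
    if not script.is_file():
        return True  # hooks are optional: no script present, nothing to do
    result = subprocess.run([str(script), str(file_path)])
    if result.returncode == ABORT_STATUS:
        log.warning("replication of %s aborted by hook %s (exit status %d)",
                    file_path, script, result.returncode)
        return False
    return True


def run_post_hook(script, file_path):
    """Start a post-replication hook without waiting for it to finish.

    Returns the Popen handle (so the caller can reap it later) or None.
    """
    script = Path(script)
    if not script.is_file():
        return None
    return subprocess.Popen([str(script), str(file_path)])


def replicate(file_path, pre_hook, post_hook, sync):
    """Wrap one file transfer (the sync callable) with both hooks."""
    if not run_pre_hook(pre_hook, file_path):
        return False
    sync(file_path)
    run_post_hook(post_hook, file_path)
    return True
```

In this sketch the hook scripts must be executable, and any exit status other than `ABORT_STATUS` lets replication continue; whether other non-zero values should also be logged is left open.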

With this mechanism in place, customers can build their own logic around replication. In our use case, it could be used to save the previous version of a file to another location before the replication is done, so we would implement this on the target as a pre-script. We could, for example, append a datetime to the file extension. It is also possible to implement a notification mechanism: if file X is being replicated, notify Y.
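As an illustration of the use case above, a pre-replication hook on the target might look something like the sketch below. It assumes the hook receives the full file path as its only argument, as proposed; the archive directory and the function name `archive_previous_version` are made up for the example. For simplicity it keys the saved copy on the file name only, so files with the same name in different directories would need extra handling.

```python
#!/usr/bin/env python3
import shutil
import sys
from datetime import datetime
from pathlib import Path

# Hypothetical location for saved versions; not part of the proposal.
ARCHIVE_DIR = Path("/var/backups/georep-versions")


def archive_previous_version(file_path, archive_dir=ARCHIVE_DIR, now=None):
    """Copy the existing file to archive_dir with a datetime suffix.

    Returns the path of the copy, or None when no previous version exists.
    """
    src = Path(file_path)
    if not src.is_file():
        return None  # first replication of this file: nothing to save
    stamp = (now or datetime.now()).strftime("%Y%m%dT%H%M%S")
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / f"{src.name}.{stamp}"
    shutil.copy2(src, dest)  # copy2 also preserves timestamps and mode
    return dest


if __name__ == "__main__" and len(sys.argv) == 2:
    # Invoked as a hook: the single argument is the file about to be replaced.
    # Falling off the end exits with status 0, so replication proceeds.
    archive_previous_version(sys.argv[1])
```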

Of course this has (potentially serious) consequences for replication performance, but that is the customer's choice. If their script somehow corrupts the target, it can be corrected by doing a full resync without the hook scripts.

This mechanism also works with the new georepl engine, which works per brick. It would be nice, though, to have all the scripts synced across all nodes involved in the volume.

As georepl is written in Python with rsync as its engine, I suspect that the basic implementation could be rather easy. If, because of the nature of the geo-repl implementation, it is not possible to have target hooks, source hooks could suffice. They would then perform actions on the target remotely from the source, for example over ssh. It's up to the customer.

I have added it as a feature proposal on the GlusterFS website:
http://www.gluster.org/community/documentation/index.php/Features/Geo_Replication_Hooks

Regards,

Fred van Zwieten

--------
On behalf of Fred;
Marcel Hergaarden as the local SA involved.