Red Hat Bugzilla – Bug 988204
RFE: Efficiently push and pull recently changed documents
Last modified: 2015-07-28 22:46:31 EDT
Currently the only way to sync documents with Zanata is to do a full sync of a project. There is no way to search for documents that have been changed within a specified period. As more content is included in Zanata, performing a full sync is going to become less practical.
It would be great if we could sync a subset of documents based on whether or not they have been edited, either by searching for documents based on their last edited time, or by receiving notifications through a queue.
After talking to Carlos, we could implement ETags to save some bandwidth.
Assigning to Damian for triage.
This is something we need sooner rather than later.
Sounds like a great idea.
I think this bug might be misnamed, because Zanata already pushes and pulls individual documents. The problem is knowing *which* individual documents to push or pull.
And is it really about push and pull, or just about making pull more efficient? (Working out what to push is mainly the client's problem, although storing ETags returned from Zanata on PUT could help here too.)
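To illustrate the ETag idea, here is a minimal sketch of a client that remembers the ETag it last saw for each document and sends it back in If-None-Match, so an unchanged document costs only a 304 Not Modified. EtagCache, the fetch callable and make_fake_server are hypothetical names made up for this sketch; they are not part of Zanata's client or REST API, and the real HTTP layer is left out.

```python
from typing import Callable, Dict, Optional, Tuple

# (status, etag, body) as returned by a hypothetical HTTP layer.
Response = Tuple[int, Optional[str], Optional[bytes]]
Fetch = Callable[[str, Dict[str, str]], Response]


class EtagCache:
    """Remembers the last ETag and body seen for each document URL."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def remember(self, url: str, etag: str, body: bytes) -> None:
        """Record an ETag handed back by the server, e.g. in reply to a PUT."""
        self._entries[url] = (etag, body)

    def pull(self, url: str) -> Tuple[bytes, bool]:
        """Return (body, changed); sends If-None-Match when an ETag is known."""
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]
        status, etag, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            # Server confirmed our copy is current; nothing was transferred.
            return cached[1], False
        if status == 200 and body is not None:
            if etag is not None:
                self._entries[url] = (etag, body)
            return body, True
        raise RuntimeError(f"unexpected status {status} for {url}")


def make_fake_server(documents: Dict[str, Tuple[str, bytes]]) -> Fetch:
    """In-memory stand-in for the server, used only to exercise the cache."""

    def fetch(url: str, headers: Dict[str, str]) -> Response:
        etag, body = documents[url]
        if headers.get("If-None-Match") == etag:
            return 304, etag, None
        return 200, etag, body

    return fetch
```

Note that this only saves bandwidth per document; the client still has to ask about every document, which is why knowing *which* documents changed remains the main problem.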
A queue sounds like it might be a good solution. We should see if it would be feasible to expose a HornetQ queue for updated documents/locales within a project version.
After discussing with Sean and a bit of research, HornetQ offers a REST API for interacting with queues, which would avoid any dependence on JMS APIs. It also allows for subscribing consumers (basically URLs that HornetQ will push to upon receipt of a message). We might need to restrict the number of consumers per queue, but all in all it sounds like a good solution.
I don't think we need to avoid dependence on JMS APIs, but being able to expose the queues over REST would give us another option, and may help with firewalls.
We don't need to avoid that dependence, but I think we definitely want to. Exposing this solely as a JMS API would make it difficult for non-Java clients to make use of these 'notifications'. Even if we decided (for some reason) not to go with HornetQ in the end, we should take a cue from their API and try to implement it as a RESTful endpoint.
Here are the docs for HornetQ's REST API:
Incidentally, the HornetQ docs mention the "Accept-Wait" header, which could be useful in a few other places I can think of.
But perhaps to be more RESTful (and for integration with other tools like RSS readers or Yahoo Pipes) we should think about publishing something like an Atom feed of changed documents: http://answers.oreilly.com/topic/2153-rest-in-practice-how-to-use-atom-for-event-driven-systems/
Based on our discussion today:
1. Zanata should expose a REST query resource which returns a list of documents changed since a specified date.
Perhaps something like this:
where the three parameters are optional, but version may only be specified if project is specified. Suggestions welcome for the actual URL.
The result could be returned as an Atom feed, where each entry includes a link to a REST resource for a translated document, plus some metadata to identify the project slug, version slug, docName and locale, plus perhaps a link to let humans view the document in Zanata's editor or download it.
2. As an extension, Zanata might publish a HornetQ topic which pushes newly changed document IDs. (A subscriber of this topic would still need the above REST query resource for initial synchronisation or synchronisation after an extended disconnection.) The advantage of the topic is that changes would be visible immediately to the subscriber, and the synchronisation load could be spread throughout the day.
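As a rough sketch of how a client might consume the Atom feed proposed in item 1, the code below parses a feed and extracts the document link and metadata from each entry. The feed layout is an assumption made up for illustration: it puts the project, version and locale in category elements and uses rel="related" for the REST resource and rel="alternate" for the human-facing link. The example.org URLs are placeholders, and none of this reflects an agreed Zanata format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List

ATOM = "{http://www.w3.org/2005/Atom}"

# Made-up sample feed; element choices and URLs are assumptions.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Changed documents</title>
  <updated>2013-08-01T10:00:00Z</updated>
  <entry>
    <id>urn:example:my-project:1.0:docs/intro:de</id>
    <title>docs/intro</title>
    <updated>2013-08-01T09:30:00Z</updated>
    <link rel="related" href="http://example.org/rest/doc/intro?locale=de"/>
    <link rel="alternate" href="http://example.org/editor/intro?locale=de"/>
    <category scheme="project" term="my-project"/>
    <category scheme="version" term="1.0"/>
    <category scheme="locale" term="de"/>
  </entry>
</feed>"""


@dataclass
class ChangedDocument:
    doc_name: str
    updated: str
    resource_url: str  # REST resource for the translated document
    editor_url: str    # link for humans, empty if the entry has none
    metadata: Dict[str, str]


def parse_changed_documents(feed_xml: str) -> List[ChangedDocument]:
    """Extract one ChangedDocument per Atom entry in the feed."""
    root = ET.fromstring(feed_xml)
    changed: List[ChangedDocument] = []
    for entry in root.findall(f"{ATOM}entry"):
        # Index links by their rel attribute (Atom defaults to "alternate").
        links = {
            link.get("rel", "alternate"): link.get("href", "")
            for link in entry.findall(f"{ATOM}link")
        }
        # Index metadata by the (hypothetical) category scheme name.
        metadata = {
            cat.get("scheme", ""): cat.get("term", "")
            for cat in entry.findall(f"{ATOM}category")
        }
        changed.append(
            ChangedDocument(
                doc_name=entry.findtext(f"{ATOM}title", default=""),
                updated=entry.findtext(f"{ATOM}updated", default=""),
                resource_url=links.get("related", ""),
                editor_url=links.get("alternate", ""),
                metadata=metadata,
            )
        )
    return changed
```

A client would then pull only the documents listed, using each entry's REST link, instead of walking the whole project.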
- estimate assumes that we have a queue provider working
- add items to a queue when there are any changes to translations (just starting with translations in the initial implementation to keep it simple).
- interested parties poll an endpoint that will provide an atom feed based on the queue.
- items in the feed should be available to consumers for at least a month
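The points above could be sketched roughly as follows: an in-memory log that records translation changes, keeps them for about a month, and answers "what changed since X" queries with optional project and version filters. ChangeLog, Change and the 31-day retention value are assumptions for illustration only; a real implementation would sit on top of whichever queue provider is chosen and would need persistent storage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# "At least a month", taken here as 31 days (an assumption).
RETENTION = timedelta(days=31)


@dataclass(frozen=True)
class Change:
    project: str
    version: str
    doc_name: str
    locale: str
    changed_at: datetime


class ChangeLog:
    """Hypothetical store of translation changes backing the feed."""

    def __init__(self, retention: timedelta = RETENTION) -> None:
        self._retention = retention
        self._changes: List[Change] = []

    def record(self, change: Change) -> None:
        """Called whenever a translation changes."""
        self._changes.append(change)

    def prune(self, now: datetime) -> None:
        """Drop changes older than the retention window."""
        cutoff = now - self._retention
        self._changes = [c for c in self._changes if c.changed_at >= cutoff]

    def changed_since(
        self,
        since: datetime,
        now: datetime,
        project: Optional[str] = None,
        version: Optional[str] = None,
    ) -> List[Change]:
        """Return retained changes newer than `since`, optionally filtered."""
        if version is not None and project is None:
            raise ValueError("version may only be specified with project")
        self.prune(now)
        return [
            c
            for c in self._changes
            if c.changed_at > since
            and (project is None or c.project == project)
            and (version is None or c.version == version)
        ]
```

A consumer that has been disconnected for longer than the retention window cannot rely on this log alone and would have to fall back to a full sync.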
Reassigned to PM
Migrated; check JIRA for bug status: http://zanata.atlassian.net/browse/ZNTA-185