Red Hat Bugzilla – Bug 986145
Erratic behaviour during concurrent or cancelled TMX exports
Last modified: 2013-11-26 22:35:46 EST
Description of problem:
If an admin starts five or more concurrent TMX exports, Zanata will stop responding, and the exports will not complete.
Version-Release number of selected component (if applicable):
How reproducible:
Easily, appears to be always
Steps to Reproduce:
1. Log in as an administrator
2. Press Projects button
3. Middle click the Export All to TMX button five times
Actual results:
- Zanata does not respond to browser actions (connecting indefinitely)
- Download streams will not complete
Expected results:
Five downloaded TMX files with no system performance degradation.
Additional info:
Four concurrent exports seemed to work OK, if a bit heavy on system resources.
Serious enough to schedule for release
Created attachment 775626 [details]
Thread dump from jboss
Due to an Okapi bug / API flaw (https://code.google.com/p/okapi/issues/detail?id=352), write errors caused by the cancelled download are not being detected. After the socket is closed, the server keeps reading from the database and trying to write data until it has worked through the entire export. This takes much longer than usual because of all the exceptions being thrown and caught.
The major issue with this bug is that the IOExceptions cause a long delay before the process finishes.
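For illustration only, here is a minimal sketch of how a swallowed write error can be detected early. It is not the code from the Zanata fix or from Okapi. It assumes the export is written through a java.io.PrintWriter, which is one well-known Java API that swallows IOException and only reports it through checkError(). The class names, the ClosedSocketStream stand-in and the `<tu>` line format are all hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

class CancelledDownloadSketch {

    /** Hypothetical stand-in for a socket whose client disconnects after a few bytes. */
    static class ClosedSocketStream extends OutputStream {
        private final int bytesBeforeClose;
        private int written = 0;

        ClosedSocketStream(int bytesBeforeClose) {
            this.bytesBeforeClose = bytesBeforeClose;
        }

        @Override
        public void write(int b) throws IOException {
            if (written >= bytesBeforeClose) {
                throw new IOException("Broken pipe (simulated)");
            }
            written++;
        }
    }

    /**
     * Writes up to totalUnits entries and stops as soon as a write failure
     * is detected. Returns the number of entries that were attempted.
     */
    static int export(PrintWriter out, int totalUnits) {
        int attempted = 0;
        for (int i = 0; i < totalUnits; i++) {
            // println never throws IOException; failures are recorded internally.
            out.println("<tu tuid=\"" + i + "\"/>");
            attempted++;
            // checkError() flushes the stream and returns true if any write has failed.
            if (out.checkError()) {
                break;
            }
        }
        return attempted;
    }

    public static void main(String[] args) {
        PrintWriter out = new PrintWriter(new ClosedSocketStream(50));
        int attempted = export(out, 100000);
        System.out.println("Stopped after " + attempted + " of 100000 units");
    }
}
```

Calling checkError() after every entry forces a flush each time, so a real implementation would probably check every N entries instead. Without the check, the loop would run through all 100000 entries even though nothing reaches the client.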
The need for "semaphoring" of downloads is another issue (https://bugzilla.redhat.com/show_bug.cgi?id=986741), which is somewhat compounded by this bug.
The reason that Zanata stopped responding is that the JDBC connection pool (default size 10) was exhausted. This happened because the cancelled TMX downloads continued to use connections for quite a long time.
https://github.com/zanata/zanata-server/pull/61 should ensure that cancelled downloads are detected promptly. A cancelled download should now finish just as quickly as if the download were not cancelled, hopefully a little faster.
However, due to http://bugs.mysql.com/bug.php?id=42929 the JDBC driver still has to consume the entire result set. We could split up the queries so that each result set is smaller, but this would complicate the implementation quite a bit, especially with regard to transactions, and I believe it would really hurt the performance of TMX downloads.
The implemented speed improvement should be enough to reduce the likelihood of, say, 10 concurrent cancelled TMX exports being in progress at once, which would cause Zanata to become non-responsive. A fix for bug 986741 should be able to prevent it entirely.
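As a rough sketch of the "semaphoring" idea tracked in bug 986741, the following shows one way to cap the number of exports running at once so they cannot use up every connection in the pool. This is an assumption about how such a limit might look, not the approach chosen for that bug. The class name ExportLimiterSketch and the limit passed to it are hypothetical.

```java
import java.util.concurrent.Semaphore;

class ExportLimiterSketch {
    private final Semaphore permits;

    ExportLimiterSketch(int maxConcurrentExports) {
        this.permits = new Semaphore(maxConcurrentExports);
    }

    /**
     * Runs the export if a permit is free. Returns false immediately,
     * without running it, when the limit has already been reached.
     */
    boolean tryExport(Runnable export) {
        if (!permits.tryAcquire()) {
            return false; // caller could answer with HTTP 503 or similar
        }
        try {
            export.run();
            return true;
        } finally {
            // Always hand the permit back, even if the export throws.
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        // Hypothetical limit, kept well below the pool size of 10.
        ExportLimiterSketch limiter = new ExportLimiterSketch(3);
        boolean ran = limiter.tryExport(() -> System.out.println("exporting..."));
        System.out.println("ran = " + ran);
    }
}
```

Rejecting the request with tryAcquire(), rather than blocking in acquire(), keeps request threads from piling up behind long-running exports.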
Major improvement. mysqld still tops out for about 15 seconds, but Zanata stops the process almost immediately.
Tested at 81583983d079566b5f6b0c18c7c028e4aef59c77
Closing VERIFIED bugs for Zanata versions <= 3.1.