Bug 986145
Summary: | Erratic behaviour during concurrent or cancelled TMX exports | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Zanata | Reporter: | Damian Jansen <djansen> | ||||
Component: | Performance | Assignee: | Sean Flanigan <sflaniga> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Zanata-QA Mailling List <zanata-qa> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | development | CC: | zanata-bugs | ||||
Target Milestone: | --- | ||||||
Target Release: | 3.0 | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2013-11-27 03:24:56 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 960786 | ||||||
Attachments: |
|
Description
Damian Jansen
2013-07-19 04:43:18 UTC
Serious enough to schedule for release.

Created attachment 775626: thread dump from JBoss.
Due to an Okapi bug / API flaw (https://code.google.com/p/okapi/issues/detail?id=352), write errors caused by the cancelled download are not being detected. After the socket is closed, the server keeps reading from the database and trying to write data until it has written the entire export. This takes much longer than usual because of all the exceptions being thrown and caught.

The major issue with this bug is the IOExceptions causing a long delay in finishing the process. The need for "semaphoring" of downloads is another issue (https://bugzilla.redhat.com/show_bug.cgi?id=986741), which is somewhat compounded by this bug.

The reason that Zanata stopped responding is that the JDBC connection pool (default size 10) was exhausted. This happened because the cancelled TMX downloads continued to hold connections for quite a long time.

https://github.com/zanata/zanata-server/pull/61 should ensure that cancelled downloads are detected promptly. A cancelled download should now finish at least as quickly as one that was not cancelled, hopefully a little faster. However, due to http://bugs.mysql.com/bug.php?id=42929 the JDBC driver still has to consume the entire result set.

We could split up the queries so that the result set is smaller, but this would complicate the implementation quite a bit, especially with regard to transactions, and I believe it would really hurt the performance of TMX downloads.

The implemented speed improvement should be enough to reduce the likelihood of, say, 10 concurrent cancelled TMX exports happening, which would cause Zanata to become non-responsive. A fix for bug 986741 should be able to prevent it entirely.

Major improvement. Mysqld still tops out for about 15 seconds, but Zanata stops the process almost immediately. Tested at 81583983d079566b5f6b0c18c7c028e4aef59c77

Closing VERIFIED bugs for Zanata versions <= 3.1.
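The undetected write errors described in this report can be illustrated with plain java.io classes. This is only a sketch: it does not use the Okapi API and is not the change made in the pull request. It relies on the documented behaviour that java.io.PrintWriter swallows IOExceptions and reports them only through checkError(). The names ClosingStream and CancellableExport are hypothetical, and the idea that the export loop can poll the writer is an assumption.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

/** Hypothetical stand-in for a client socket that closes after a byte limit. */
class ClosingStream extends OutputStream {
    private final int limit;
    private int written = 0;

    ClosingStream(int limit) {
        this.limit = limit;
    }

    @Override
    public void write(int b) throws IOException {
        if (written >= limit) {
            throw new IOException("client closed the connection");
        }
        written++;
    }
}

/** Hypothetical export loop that stops as soon as a write failure is visible. */
class CancellableExport {
    /** Returns how many units were attempted before a failure was noticed. */
    static int export(OutputStream out, int totalUnits) {
        // PrintWriter never throws IOException from println(); it only
        // records that an error happened.
        PrintWriter writer = new PrintWriter(out);
        int attempted = 0;
        for (int i = 0; i < totalUnits; i++) {
            writer.println("<tu>unit " + i + "</tu>");
            attempted++;
            // checkError() flushes the writer and returns true if any
            // write so far has failed, so the loop can stop early.
            if (writer.checkError()) {
                break;
            }
        }
        return attempted;
    }
}
```

Calling checkError() after every unit flushes each time, which defeats buffering. A real export would more likely check every few hundred units, or wrap the stream so that the first IOException is rethrown.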
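The "semaphoring" of downloads mentioned in the comments could look roughly like the sketch below. It is an illustration, not the fix tracked in bug 986741. It uses java.util.concurrent.Semaphore to cap the number of simultaneous exports below the JDBC pool size, so that exports alone cannot exhaust the pool. The class name ExportLimiter and the choice to reject extra requests instead of queueing them are assumptions.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical limiter that caps the number of exports running at once. */
class ExportLimiter {
    private final Semaphore permits;

    ExportLimiter(int maxConcurrentExports) {
        this.permits = new Semaphore(maxConcurrentExports);
    }

    /**
     * Runs the export if a permit is free and returns true.
     * Returns false immediately, without running it, when the limit is reached.
     */
    boolean tryRun(Runnable export) {
        if (!permits.tryAcquire()) {
            return false;
        }
        try {
            export.run();
            return true;
        } finally {
            // Always give the permit back, even if the export fails.
            permits.release();
        }
    }
}
```

With a pool of 10 connections, a limit such as new ExportLimiter(5) would leave connections free for ordinary requests even while cancelled exports are still draining their result sets.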