Bug 1102964
| Summary: | CopyTrans takes excessively long hours to complete copying translations | | |
|---|---|---|---|
| Product: | [Retired] Zanata | Reporter: | Yuko Katabami <ykatabam> |
| Component: | Component-CopyTrans | Assignee: | Patrick Huang <pahuang> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Zanata-QA Mailling List <zanata-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 3.3 | CC: | aeng, camunoz, dchen, pnemade, tagoh, zanata-bugs |
| Target Milestone: | --- | | |
| Target Release: | 3.4 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 3.4.2-SNAPSHOT (git-server-3.4.1-47-g88e8fe3) | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-07-17 06:39:32 UTC | Type: | Bug |
Description
Yuko Katabami
2014-05-29 23:24:49 UTC
Number of words: 5480
Number of strings: 1080
Number of locales: 7 (zh-TW is not active)

(In reply to Yuko Katabami from comment #1)
Correction:
Number of words: 5480
Number of strings: 1084
Number of locales: 7 (zh-TW is not active)

We are currently investigating this issue. Patrick has identified a degradation point where the process gets progressively slower as it runs.

On a local machine with 2000 messages, the fix has reduced copyTrans time from 30 min to 12 min. It is not yet clear how well it will do in production.

*** Bug 1104469 has been marked as a duplicate of this bug. ***

Tested with Zanata 3.4.2-SNAPSHOT (git-server-3.4.1-6-g206676f)
with the gettext-type project GCC (gcc-4.8.3), gcc.pot (9657 messages).

The server log shows:

12:46:33,012 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-2) copyTrans: 0 zh-CN translations for document "gcc/po/gcc" - duration: 2555 s
13:28:47,419 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-2) copyTrans: 0 zh-TW translations for document "gcc/po/gcc" - duration: 2534 s
14:10:42,116 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-2) copyTrans: 0 de-DE translations for document "gcc/po/gcc" - duration: 2515 s
.....

In other words, the copyTrans speed is 9657/2555 = 3.78 messages per second for one locale.

I will test this on another server that does not have this fix applied.

Tested with Zanata 3.5.0-SNAPSHOT (git-server-3.4.1-62-g6551e0d), which does not include the fix:

16:39:12,427 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-2) copyTrans: 0 zh-CN translations for document "gcc/po/gcc" - duration: 2423 s

Note that this test is on 3.5.0-SNAPSHOT, a different branch.

So, essentially there's no difference?
(In reply to Ding-Yi Chen from comment #8)
> Tested with Zanata 3.5.0-SNAPSHOT (git-server-3.4.1-62-g6551e0d) which does
> not include the fix:
>
> 16:39:12,427 INFO [org.zanata.service.impl.CopyTransServiceImpl]
> (DefaultQuartzScheduler_Worker-2) copyTrans: 0 zh-CN translations for
> document "gcc/po/gcc" - duration: 2423 s
>
>
> Note that the test on 3.5.0-SNAPSHOT, different branch.

With a large enough data set, yes, the current fix makes no difference. Our latest findings lead us to believe the cache is more likely to be the culprit.

(In reply to Carlos Munoz from comment #9)
> So, essentially there's no difference?
>
> (In reply to Ding-Yi Chen from comment #8)
> > Tested with Zanata 3.5.0-SNAPSHOT (git-server-3.4.1-62-g6551e0d) which does
> > not include the fix:
> >
> > 16:39:12,427 INFO [org.zanata.service.impl.CopyTransServiceImpl]
> > (DefaultQuartzScheduler_Worker-2) copyTrans: 0 zh-CN translations for
> > document "gcc/po/gcc" - duration: 2423 s
> >
> >
> > Note that the test on 3.5.0-SNAPSHOT, different branch.

11:48:34,096 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-2) copyTrans start: document "gcc/po/gcc"
12:07:42,845 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-2) copyTrans: 0 fr translations for document "gcc/po/gcc" - duration: 1149 s

Now we have an improvement of over 50%.
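
For anyone repeating these measurements, the throughput and improvement figures can be derived directly from the per-locale summary lines in the server log. The following is a minimal Python sketch, not part of Zanata: the regular expression is inferred from the log lines quoted in this report, and the function names `parse_durations` and `throughput` are made up for illustration.

```python
import re

# Pattern inferred from the summary lines quoted in this bug, e.g.
#   copyTrans: 0 zh-CN translations for document "gcc/po/gcc" - duration: 2555 s
# It is an assumption that all summary lines follow this shape.
LINE_RE = re.compile(
    r'copyTrans: (?P<copied>\d+) (?P<locale>\S+) translations '
    r'for document "(?P<doc>[^"]+)" - duration: (?P<seconds>\d+) s'
)


def parse_durations(log_text):
    """Return (locale, seconds) for every copyTrans summary line found."""
    return [
        (m.group("locale"), int(m.group("seconds")))
        for m in LINE_RE.finditer(log_text)
    ]


def throughput(messages, seconds):
    """Messages processed per second for one locale."""
    return messages / seconds


MESSAGES = 9657  # size of gcc.pot as reported in this bug

# One line from the slow run and one from the faster run, as quoted above.
before_log = 'copyTrans: 0 zh-CN translations for document "gcc/po/gcc" - duration: 2555 s'
after_log = 'copyTrans: 0 fr translations for document "gcc/po/gcc" - duration: 1149 s'

t_before = parse_durations(before_log)[0][1]
t_after = parse_durations(after_log)[0][1]

rate_before = throughput(MESSAGES, t_before)
rate_after = throughput(MESSAGES, t_after)
reduction_pct = 100 * (t_before - t_after) / t_before

print(f"before: {rate_before:.2f} msg/s")
print(f"after:  {rate_after:.2f} msg/s")
print(f"duration reduced by {reduction_pct:.1f}%")
```

Comparing durations for the same document keeps the message count constant, so the percentage reduction in duration is the figure behind the "over 50%" remark.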
16:04:21,394 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans start: document "gcc/po/gcc"
16:23:30,190 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 6 de translations for document "gcc/po/gcc" - duration: 1149 s
16:42:08,828 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 zh translations for document "gcc/po/gcc" - duration: 1119 s
17:00:50,631 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 2 ja translations for document "gcc/po/gcc" - duration: 1122 s
17:19:03,680 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 2 pl translations for document "gcc/po/gcc" - duration: 1093 s
17:37:30,717 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 en-US translations for document "gcc/po/gcc" - duration: 1107 s
17:55:56,921 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 1 de-DE translations for document "gcc/po/gcc" - duration: 1106 s
18:14:01,691 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 it translations for document "gcc/po/gcc" - duration: 1085 s
18:32:14,236 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 es translations for document "gcc/po/gcc" - duration: 1093 s
18:50:41,244 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 zh-Hant-TW translations for document "gcc/po/gcc" - duration: 1107 s
19:08:44,455 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 uk translations for document "gcc/po/gcc" - duration: 1083 s
19:26:54,246 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans: 0 en translations for document "gcc/po/gcc" - duration: 1090 s
19:26:54,250 INFO [org.zanata.service.impl.CopyTransServiceImpl] (DefaultQuartzScheduler_Worker-1) copyTrans finished: document "gcc/po/gcc"

Tested with Zanata 3.4.2-SNAPSHOT (git-server-3.4.1-43-g2f664d4)

The first is very fast: 9657/1124 = 8.59 msg/s
The second and later are slow, about 4 msg/s

Perhaps some heuristic logic can be applied here. When only one version exists and all the copyTrans options are set to "Don't Copy", there is nothing to copy from, so it should not take thousands of seconds to process. This will definitely help for the first version push.

The above suggestion is implemented. Now, if project mismatch or docId mismatch is set to reject, copyTrans is skipped when there is only one version. It also skips locales that don't have any translations.

VERIFIED with Zanata 3.4.2-SNAPSHOT (git-server-3.4.1-47-g88e8fe3), as it won't spend needless time on checking unrelated TextFlow.

*** Bug 1120034 has been marked as a duplicate of this bug. ***
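
The skip rules described in the fix can be illustrated with a short sketch. This is Python written only for illustration, not the actual Zanata (Java) implementation: the `CopyTransOptions` class, its field names, the `"reject"` string value, and both helper functions are hypothetical. It also assumes that "locales that don't have any translation" means locales with zero existing translations available to copy from.

```python
from dataclasses import dataclass


@dataclass
class CopyTransOptions:
    # Hypothetical field names standing in for the copyTrans settings;
    # "reject" represents the "Don't Copy" choice mentioned in this bug.
    project_mismatch: str
    doc_id_mismatch: str


def should_skip_copy_trans(options, version_count):
    """Skip the whole run when there is only one version and either
    the project-mismatch or the docId-mismatch rule is set to reject."""
    rejects = (
        options.project_mismatch == "reject"
        or options.doc_id_mismatch == "reject"
    )
    return version_count <= 1 and rejects


def locales_to_process(options, version_count, translation_counts):
    """Return the locales worth running copyTrans for.

    translation_counts maps locale -> number of existing translations
    (an assumed input; how the real service obtains this is not shown
    in this report).
    """
    if should_skip_copy_trans(options, version_count):
        return []
    return [loc for loc, count in translation_counts.items() if count > 0]


opts = CopyTransOptions(project_mismatch="reject", doc_id_mismatch="ignore")
counts = {"de": 6, "zh": 0, "ja": 2}

print(locales_to_process(opts, 1, counts))  # single version: nothing to do
print(locales_to_process(opts, 2, counts))  # only locales with translations
```

The point of both checks is to return early before any per-message matching, which is where the thousands of seconds per locale were being spent.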