Bug 999729
Summary: | RFE: Support TM Export as TXT file | |
---|---|---|---
Product: | [Retired] Zanata | Reporter: | Isaac Rooskov <irooskov>
Component: | Usability | Assignee: | Isaac Rooskov <irooskov>
Status: | CLOSED NOTABUG | QA Contact: | Zanata-QA Mailling List <zanata-qa>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 3.0 | CC: | asasaki, dchen, sflaniga, yshao, zanata-bugs
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2014-06-12 04:33:00 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Isaac Rooskov
2013-08-22 00:16:42 UTC
Hi Aiko,

What is the use case you have in mind? What would you do with the txt file? If it's just for reading in a text editor, would it help if the TMX were formatted more nicely?

Do you have a sample of the expected layout? Wordfast and Trados are two incompatible TXT formats for translation memories that I know of, and I'm sure there are more. Or would any simple txt layout be okay? (TMX is a form of text... :-)

Also, have you looked for a separate tool that can convert the translation memory from TMX to your preferred format?

So the main thing is to have source and target in a single line? Like CSV?

Created attachment 798089
memory sample

Thanks. Here's an excerpt:

    <Segment>0000013719
    <Control>
    00011800000001122533351English(U.S.)JAPANESEXXXXXX_.000XXX_.dita
    </Control>
    <Source>Click <uicontrol outputclass="XXXguicontrol">OK</uicontrol>.</Source>
    <Target><uicontrol outputclass="XXXguicontrol">?OK?</uicontrol>?????????</Target>
    </Segment>
    ...

It looks a bit like XML, except without the top-level element, and with random nested tags (like "uicontrol"). I think its main virtue is that each Source and Target segment is on a line by itself, which should help with grep.

We could look at making sure our TMX is exported in a neatly formatted way, with only one string per line. It would look something like this:

    <tu srclang="en-US" tuid="myproject:1.0:myproject:edbc3dc4ac083b40418f0dee7f552177">
      <tuv xml:lang="en-US">
        <seg>Disk Usage Analyzer</seg>
      </tuv>
      <tuv xml:lang="ja">
        <seg>??????????</seg>
      </tuv>
    </tu>
    ...

In the meantime, you could run Zanata's exported TMX through an XML pretty printer. If you have XMLStarlet installed, you can format TMX like this:

    $ xmlstarlet fo zanata-myproject-1.0-allLocales.tmx

As root, you can install xmlstarlet from Fedora or EPEL with:

    # yum install xmlstarlet

But I'm sure there are other XML formatters too. Would something like that work?

Aiko,

Does Sean's method work for you?

Ding-Yi,

So sorry not to respond to your confirmation email. Can you help me with how to install XMLStarlet? After running # yum install xmlstarlet, the message "No package xmlstarlet available." appears.

Thank you for your help.
Aiko

For Fedora, yum -y install xmlstarlet should work.
For RHEL/CentOS you need to have EPEL installed:

+ For RHEL/CentOS 7, run the following:
    yum -y localinstall http://mirror.aarnet.edu.au/pub/epel/beta/7/x86_64/epel-release-7-0.1.noarch.rpm
+ For RHEL/CentOS 6, run the following:
    yum -y localinstall http://mirror.aarnet.edu.au/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
+ For RHEL/CentOS 5, run the following:
    yum -y localinstall http://mirror.aarnet.edu.au/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm

After you install EPEL, you can install XMLStarlet with:

    yum -y install xmlstarlet

Correction: xmlstarlet is not in EPEL6 or EPEL5. However, my epel6-collection has it. To get the epel6-collection:

    wget http://repos.fedorapeople.org/repos/dchen/epel6-collection/epel-epel6-collection.repo
    sudo mv epel-epel6-collection.repo /etc/yum.repos.d/

Then you can:

    yum -y install xmlstarlet

Aiko,

If I understand correctly, what you would like to do is:

1. Search existing translations for your locale.
2. Share translations among translators.

In this case, you don't actually need to export TM. You can either:

1. Use TM in Zanata. At the bottom of the translation editor, you can search TM.
2. Use Glossary in Zanata. The Japanese translation team can upload its standard terms as a glossary, so every Japanese translation team member can search, read, and use it.

TMX, on the other hand, is meant to be used by system admins who need to copy TM from one Zanata server to another Zanata server. Do I understand your need correctly?

Hi Ding-Yi,

I initially asked about a global search or grep function for memory in text format. So part of my question was whether we can export TM or not. If possible, please let me know how we can do it.

Thanks,
Aiko

Yes, Zanata can export TM, and the output file can be read by any plain text editor. You can also use grep on the .tmx file.
However, global TMX export (all projects, all locales) is currently available only to administrators, because this action consumes a lot of system resources and takes a long time. Perhaps there should be another RFE for exporting a single locale for all projects, but that would probably be offered only to the language consolidator.

For individual translators, Zanata 3.3.x allows exporting all locales for a given project or project version. With Zanata 3.4.x, translators can also export one locale in a given project or project version.

After talking with Aiko, her requirement can be addressed in Bug 1108444. Thus, I hereby close this bug, as you can already use any plain text utilities, including grep, on TMX files.
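
For anyone who still wants source and target on a single line, as asked earlier in this thread, here is a minimal sketch in Python (standard library only) that flattens a TMX export into tab-separated lines suitable for grep. It is not part of Zanata. The function names `seg_text` and `tmx_to_lines` are hypothetical, the sample reuses the `<tu>` excerpt quoted above inside a bare `<tmx><body>` wrapper, and it assumes the export marks locales with `xml:lang`, as in that excerpt.

```python
import io
import re
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared "xml:" prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def seg_text(tuv):
    """Return the <seg> text of a <tuv> on one line (inline elements flattened to text)."""
    seg = tuv.find("seg")
    if seg is None:
        return ""
    text = "".join(seg.itertext())
    # Collapse newlines and runs of whitespace so each pair stays on one line.
    return re.sub(r"\s+", " ", text).strip()


def tmx_to_lines(tmx_source, src_lang, tgt_lang):
    """Yield 'source<TAB>target' for every <tu> that has both locales.

    tmx_source may be a file name or a binary file object.
    """
    root = ET.parse(tmx_source).getroot()
    for tu in root.iter("tu"):
        by_lang = {tuv.get(XML_LANG): seg_text(tuv) for tuv in tu.iter("tuv")}
        if src_lang in by_lang and tgt_lang in by_lang:
            yield f"{by_lang[src_lang]}\t{by_lang[tgt_lang]}"


# Sample built from the excerpt in this thread (target text is garbled there too).
SAMPLE = b"""<tmx version="1.4"><body>
<tu srclang="en-US" tuid="myproject:1.0:myproject:edbc3dc4ac083b40418f0dee7f552177">
  <tuv xml:lang="en-US"><seg>Disk Usage Analyzer</seg></tuv>
  <tuv xml:lang="ja"><seg>??????????</seg></tuv>
</tu>
</body></tmx>"""

for line in tmx_to_lines(io.BytesIO(SAMPLE), "en-US", "ja"):
    print(line)
```

To use it on a real export, pass the file name instead of the sample, for example `tmx_to_lines("zanata-myproject-1.0-allLocales.tmx", "en-US", "ja")`, write the lines to a text file, and grep that file. Segments containing a literal tab would need escaping, which this sketch does not handle.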