A word of advice here. If the plugin is enabled and set as preferred in your client, MT results show up alongside Translation Memory (TM) hits in the Translation results pane. So if you spend a lot of time there clicking through segments, it’ll cost you some money. Make sure to uncheck "Offer machine-translated results" while working in the translation grid, or adjust it to your actual needs.
Tags in memoQ, as in other localization tools, serve as placeholders for source file elements that wrap, or sit between, the text to be translated. If you translate Word documents, your tags will mainly be formatting. So if these tags are placed at the end of the translation, the formatting will be broken. That’s maybe not a big deal if you’re using MT as support for human translation, because your translator will put the tags where they belong, although he probably won’t be very happy about it. But I wanted to use MT as a kind of self-service for teams that don’t require quality translations and just want to understand the text.
Luckily, memoQ offers export/import of bilingual files, and its native format is mqxliff, which is XML. Each segment is wrapped in a trans-unit tag. A source segment that contains tags looks like this:
<source xml:space="preserve" mq:segpart="8">You can <bpt id="1" ctype="underlined">{}</bpt>
<bpt id="2"><hlnk id="rId8" history="1" fileName="document.xml" href="@07c1b597-2ac9-446f-b816-78a8a151172a"></bpt>
<bpt id="3"><rpr id="0"></bpt>download it here for free<ept id="1">{}</ept>
<ept id="3"></rpr id="0" transform="close"></ept><ept id="2"></hlnk></ept>.</source>
We need to work at the XML level, because we want these tags encoded back exactly as they are now. First we need to get rid of the source tags. I don’t like regex in my code, but this time it’s necessary: </?source(.+?)?>
Now we have this:
You can <bpt id="1" ctype="underlined">{}</bpt>
<bpt id="2"><hlnk id="rId8" history="1" fileName="document.xml" href="@07c1b597-2ac9-446f-b816-78a8a151172a"></bpt>
<bpt id="3"><rpr id="0"></bpt>download it here for free<ept id="1">{}</ept>
<ept id="3"></rpr id="0" transform="close"></ept><ept id="2"></hlnk></ept>.
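As a rough illustration of that first step, here is a minimal Python sketch. The post doesn’t say which language the original code was written in, so Python is an assumption, and the function name strip_source_wrapper is made up for the example. It applies the regex quoted above to a shortened, single-line version of the sample segment.

```python
import re

# Regex quoted in the post: matches the opening <source ...> tag and the closing </source> tag.
SOURCE_TAG = re.compile(r"</?source(.+?)?>")


def strip_source_wrapper(source_xml: str) -> str:
    """Remove the <source> wrapper and keep everything inside it.

    Assumption: the string holds exactly one <source> element and ends with
    </source>. With more markup after the closing tag on the same line, the
    lazy group could match further than intended.
    """
    return SOURCE_TAG.sub("", source_xml)


# Shortened version of the sample segment (hypothetical, for illustration only).
sample = (
    '<source xml:space="preserve" mq:segpart="8">You can '
    '<bpt id="1" ctype="underlined">{}</bpt>download it here for free'
    '<ept id="1">{}</ept>.</source>'
)

inner = strip_source_wrapper(sample)
print(inner)
```

The inline bpt/ept tags are left untouched; only the wrapper disappears.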
Not bad, but you still don’t want to translate the tags, so we’ll split this with another regex, (<.+?>.+?</.+?>), to get a nice array of strings. Then, if an item in the array starts with <, we add it as-is to the target tag; otherwise we run it through MT first.
Of course, we could pass the whole thing as-is to Google Translate and receive back a translation with tags intact, but then, when it is added to the original mqxliff, all tags will be escaped, like &lt;bpt id="1" ctype="underlined"&gt;.
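The split-and-translate step can be sketched like this in Python (again an assumed language). The function translate_keeping_tags and the stand-in fake_mt are hypothetical names; in real use fake_mt would be replaced by a call to whatever MT service you use. The sample string is a shortened version of the segment shown earlier.

```python
import re
from typing import Callable, List

# Regex quoted in the post. The capturing group makes re.split keep the
# matched tag pairs in the resulting list instead of dropping them.
TAG_PAIR = re.compile(r"(<.+?>.+?</.+?>)")


def translate_keeping_tags(inner: str, translate: Callable[[str], str]) -> str:
    """Translate only the text pieces; copy tag pairs through unchanged."""
    pieces: List[str] = TAG_PAIR.split(inner)
    out: List[str] = []
    for piece in pieces:
        if piece.startswith("<") or not piece.strip():
            out.append(piece)  # tag pair, empty string or whitespace: keep as-is
        else:
            out.append(translate(piece))  # plain text: send to MT
    return "".join(out)


def fake_mt(text: str) -> str:
    """Placeholder for a real MT call (assumption); it only upper-cases the text."""
    return text.upper()


inner = (
    'You can <bpt id="1" ctype="underlined">{}</bpt>'
    'download it here for free<ept id="1">{}</ept>.'
)
result = translate_keeping_tags(inner, fake_mt)
print(result)
```

One trade-off of this approach is that each text fragment is translated on its own, so the MT engine sees less context than it would for the full sentence.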
After import, such escaped tags show up as regular text in memoQ and are exported to DOCX (or whatever the original format was) as such. So not only will the formatting be broken, you’ll also have garbage text within your translated text.
Last, but very important: we need to set the mq:status attribute of each changed trans-unit tag to "MachineTranslated".
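A minimal Python sketch of that status change, working on the opening trans-unit tag as a string. The function name mark_machine_translated, the id values and the "NotStarted" value in the usage lines are illustrative assumptions; only the attribute name mq:status and the value "MachineTranslated" come from the post.

```python
import re

# Matches an existing mq:status attribute together with its current value.
STATUS_ATTR = re.compile(r'\bmq:status="[^"]*"')
NEW_STATUS = 'mq:status="MachineTranslated"'


def mark_machine_translated(opening_tag: str) -> str:
    """Return the trans-unit opening tag with mq:status set to MachineTranslated."""
    if STATUS_ATTR.search(opening_tag):
        # Attribute already present: overwrite its value.
        return STATUS_ATTR.sub(NEW_STATUS, opening_tag)
    # Attribute missing: add it just before the closing bracket.
    return opening_tag[:-1] + " " + NEW_STATUS + ">"


# Hypothetical opening tags, for illustration only.
with_status = mark_machine_translated('<trans-unit id="8" mq:status="NotStarted">')
without_status = mark_machine_translated('<trans-unit id="9">')
print(with_status)
print(without_status)
```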
Without that status, even though the translation is there, memoQ treats the segment as empty, and on export it’ll either not be exported or be reverted to the source text, depending on your settings.
As I was digging through mqxliff, I found another interesting thing. If a segment is locked, it has the following attributes: translate="no" mq:locked="locked".
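In case you want to script it, stripping these two attributes from a trans-unit opening tag could look like the Python sketch below. The function name unlock and the sample tag are hypothetical; the attribute names and values are the ones quoted above.

```python
import re

# Matches either lock-related attribute together with the whitespace before it.
LOCK_ATTRS = re.compile(r'\s+(?:translate="no"|mq:locked="locked")')


def unlock(opening_tag: str) -> str:
    """Drop translate="no" and mq:locked="locked" from a trans-unit opening tag."""
    return LOCK_ATTRS.sub("", opening_tag)


# Hypothetical opening tag, for illustration only.
locked = '<trans-unit id="8" translate="no" mq:locked="locked" mq:status="NotStarted">'
unlocked = unlock(locked)
print(unlocked)
```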
Apparently, it’s enough to remove these two attributes from the trans-unit tag, and after import the segment will be unlocked in the memoQ project. That’s very useful, as currently you need a PM license to be able to lock/unlock segments, which is expensive. Plus, it’s a tedious task if you need to do it manually.
Source: https://blog.liox.eu