Sunday, November 7, 2010

Translating WordFast TXML in memoQ

WordFast as a Word add-on (currently the Classic version) was always quite popular: it was inexpensive, simple and fully compatible with Trados. However, last year the company released a completely new version. WordFast Pro is written from the ground up in Java, which makes it possible to use it on different operating systems (Windows, Linux, Mac OS). The software uses the table layout typical of all new CAT tools and is… quite specific. It does have a lot of fans, but unfortunately I don’t like it (especially the terrible way it handles tag insertion). Luckily, we don’t have to translate .txml files in WordFast; it can be done in memoQ. Below you’ll find the step-by-step procedure.

Recently I’ve been getting a lot of translations as WordFast Pro files. I did the first one using the WF demo and didn’t like it, so when the next ones arrived I looked for a way to process them in a much more comfortable environment, that is, memoQ. As it turned out, it isn’t that hard.
What we need:
  • WordFast Pro demo (not necessary, but recommended)
  • Excel/Calc
  • memoQ
  • Word/Writer (optional)
I have divided the procedure into two parts: preparing the TXML file and preparing the WordFast TM. Let’s start with the file. If you don’t have WordFast, it can be downloaded from here.
  1. Open the file to translate (.txml) in WordFast.
  2. Use Ctrl-Alt-Ins to copy the contents of all source segments into the target segments.
    WordFast Pro window. Use Ctrl-Alt-Ins to copy all source language segments to target segments.
  3. Save the file and close WordFast (repeat the procedure for additional files if necessary). If, for some reason, you don’t want to or can’t use the WordFast demo, you can use the search/replace procedure described here.
  4. Download this filter file and save it to your disk.
  5. Start memoQ.
  6. From the Tools menu choose Resource console, then Filter configurations and Import new from the left pane. Select the file saved in step 4.
  7. Create a new project. In the Add document window select Add document as. You’ll have to switch the file type filter in the lower part of the window from All supported files to All files (*.*). Select the txml file(s) saved in step 3.
  8. In the Document import settings click the yellow folder icon (Load filter configuration) and select the WordFast filter imported in step 6. The filter works by hiding the original source segments (<source>), displaying the target segments (<target>) for editing and defining how WordFast tags are handled. See the picture below.
    TXML file imported into memoQ. The source segments (<source>) you can see in the preview pane below are hidden; you can translate only the <target> segments.
    Tip: WordFast tags, displayed natively as consecutive numbers {1}, {2}, {3}, are shown in memoQ as “inline” tags. By default you can insert them using the F9 key, but in my opinion there are more convenient key combinations, for example Ctrl-Alt-Down, as in Trados. To modify the keyboard settings choose Tools > Resource console > Keyboard shortcuts > Clone (for the Default) > Edit.
  9. After translating, export the finished file using the Export command. memoQ’s excellent QA features won’t let you export files with mismatched tags and will show you the affected segments (if any).
  10. Just to be extra sure, you can open the translated file in WordFast to check that everything is OK.
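For anyone curious what a tag check like the one memoQ runs on export boils down to, here is a minimal sketch in Python. It is not memoQ’s actual QA logic, just an illustration under one assumption: that tags show up in the text as numbered placeholders such as {1}, {2}, {3}, the way WordFast displays them. The function name tag_mismatches is made up for this example.

```python
import re
from collections import Counter

# Assumption: tags appear as numbered placeholders such as {1}, {2}, {3},
# which is how the article says WordFast displays them.
PLACEHOLDER = re.compile(r"\{\d+\}")


def tag_mismatches(source: str, target: str) -> dict:
    """Return placeholders whose counts differ, as {tag: (in_source, in_target)}."""
    src = Counter(PLACEHOLDER.findall(source))
    tgt = Counter(PLACEHOLDER.findall(target))
    return {
        tag: (src[tag], tgt[tag])
        for tag in sorted(set(src) | set(tgt))
        if src[tag] != tgt[tag]
    }


if __name__ == "__main__":
    pairs = [
        ("Press {1}Save{2} to continue.", "Naciśnij {1}Zapisz{2}, aby kontynuować."),
        ("See {1}chapter 3{2}.", "Patrz {1}rozdział 3."),
    ]
    for number, (source, target) in enumerate(pairs, start=1):
        problems = tag_mismatches(source, target)
        print(number, "OK" if not problems else problems)
```

The check only compares how many times each placeholder occurs; it does not look at tag order or nesting, which a real QA module may also verify.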
Finished translation in WordFast. The percentage of TM substitutions is shown.
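If the WordFast demo is not an option, the source-to-target copy done at the start of the procedure above could in principle be scripted. The sketch below is not the search/replace procedure linked earlier and is not based on any official TXML specification. It only assumes what this article shows: that each segment has a <source> element and, possibly, a <target> element next to it. Everything else (the function name, the sample element names segment and ut) is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def copy_source_to_target(xml_text: str) -> str:
    """Fill every missing or empty <target> with a copy of its sibling <source>."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of the elements first, because the tree is modified below.
    for parent in list(root.iter()):
        source = parent.find("source")
        if source is None:
            continue
        target = parent.find("target")
        has_content = target is not None and (
            (target.text or "").strip() or len(target) > 0
        )
        if has_content:
            continue  # never overwrite an existing translation
        new_target = copy.deepcopy(source)  # keeps inline tag elements too
        new_target.tag = "target"
        children = list(parent)
        if target is not None:
            new_target.tail = target.tail
            position = children.index(target)
            parent.remove(target)
        else:
            position = children.index(source) + 1
        parent.insert(position, new_target)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample; real TXML files contain more structure than this.
    sample = '<txml><segment><source>Hello <ut x="1">{1}</ut> world</source></segment></txml>'
    print(copy_source_to_target(sample))
```

Note that ElementTree drops the XML declaration and comments when parsing, so this is an illustration of the idea rather than a safe round-trip tool. Always work on a copy of the client’s file and open the result in WordFast to check it.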
The procedure may seem complicated, especially at first, but once the software is configured, preparing and opening a file won’t take much more time than with any other file format. The situation becomes a bit more complicated when, along with the TXML files, the client sends us a WordFast TM file with the .txt extension. To use this TM in memoQ we have to prepare it by removing unnecessary data and tags.
  1. (Optional) Open the TM file in Word (or OpenOffice Writer). In the WF TM, tags are encoded as &tA;, &tB;, &tC;, etc. It is possible to remove them all at once using Word. Open the Search & Replace window (Ctrl-H), click the More >> button and select Use wildcards. Enter \&t[A-Z]\; in the Find field. Leave the Replace field empty. Click the Replace all button. Now deselect the Use wildcards checkbox and replace &’A9; with ©, and &’AE; with ® (if there are such strings in your TM, of course). Save as a text file using UTF-8 encoding.
  2. Start Excel (or OpenOffice Calc) and import the TM file by selecting Data > External data > From text. Choose Tab as the column separator and start the import at the first row.
    Where to find a text import command.
  3. Remove all columns except the ones with the source and target texts. In this example we have to remove columns A, B, C, D, F, H and I, leaving only E and G. Alternatively, we can also keep column B, with the translator’s ID.
    WordFast TM file correctly imported into Excel.
  4. Enter the language identifiers into the first row of the source and target columns. If necessary, we can add columns with additional information, such as the translation author, domain, client, etc.
    File with unnecessary data removed and additional column added (in this case - translation agency name).
  5. If step 1 was skipped, you have to remove the tags now. Unfortunately, Excel requires removing them one by one, i.e. you have to manually enter every possible letter combination (see step 1). Save the file: select the round Office button > Save as > Other formats > Unicode text. Close Excel.
  6. In memoQ select an existing translation memory or create a new one. Choose the Import from TMX/CSV command. Select the file saved in step 5.
  7. In the Translation memory CSV import settings window select the proper settings:
    • File encoding – UTF-8
    • Delimiter – Tab
    • First row is header
    Select the proper fields for Import as source segment (in this case, EN-US) and Import as target segment (in this case, PL). If there are additional fields (columns) to import, select them and choose the proper option in the Import as other field field.
    Proper settings for CSV file import into TM in memoQ.
  8. Voilà.
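If you would rather not click through Word and Excel, the cleanup in steps 1 to 5 can be sketched as a short Python script. Treat it as a rough illustration, not a tested converter. The assumptions: the TM is tab-delimited, its first line is a header that gets replaced with the language identifiers, the source and target texts sit in columns E and G (indices 4 and 6, as in the example above), and the copyright and registered-trademark codes may be written with either a straight or a typographic apostrophe. The function names are made up for this example.

```python
import re

# Tags are encoded as &tA;, &tB;, &tC;, etc. (same pattern as the Word wildcard search).
TAG = re.compile(r"&t[A-Z];")

# Assumption: both apostrophe variants are accepted, since the TM may contain either.
ENTITIES = {"&'A9;": "©", "&’A9;": "©", "&'AE;": "®", "&’AE;": "®"}


def clean(text: str) -> str:
    """Strip WordFast tag codes and decode the two symbol codes."""
    text = TAG.sub("", text)
    for code, symbol in ENTITIES.items():
        text = text.replace(code, symbol)
    return text


def convert_tm(tm_text: str, source_lang: str, target_lang: str,
               source_col: int = 4, target_col: int = 6) -> str:
    """Return a two-column, tab-delimited table with a language header row."""
    lines = tm_text.splitlines()[1:]  # assumption: the first line is a header
    out = [f"{source_lang}\t{target_lang}"]
    for line in lines:
        cols = line.split("\t")
        if len(cols) <= max(source_col, target_col):
            continue  # skip empty or malformed rows
        out.append(f"{clean(cols[source_col])}\t{clean(cols[target_col])}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "h0\th1\th2\th3\th4\th5\th6\n"
        "20100101\tAB\t0\tEN-US\tClick &tA;Save&tB;\tPL\tKliknij &tA;Zapisz&tB;\n"
    )
    print(convert_tm(sample, "EN-US", "PL"), end="")
```

To use it on a real file, read the TM with open(path, encoding="utf-8"), pass the text to convert_tm and write the result back with the same encoding. That matches the UTF-8, Tab and First row is header settings chosen in step 7.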
If there are a lot of segments with tags in the WordFast TM, you’ll get matches several percent lower in memoQ than in WordFast, but you’ll still gain by working in a much more comfortable environment without having to buy yet another CAT tool. A very similar procedure can be used to translate WordFast TXML files in Trados Studio 2009: you can define an XML import filter there as well.
Source: http://hell.pl