Friday, March 12, 2010

Translating WordFast TXML files in MemoQ

We sometimes receive files that have been processed using the latest version of WordFast Pro. These are recognizable from the .txml extension.
This format is just a specific XML structure, and as such it should be possible to translate the files using MemoQ after formatting them properly.
Open the files in WordFast and check the total number of segments
Open the TXML file in WordFast and go to the last line, taking note of the number of segments contained in the file. In the picture below note the presence of WordFast’s tag {ut1}.
[Screenshot: the TXML file open in WordFast]
Also check whether the target column contains any text. Existing target text will cause problems, so you will need to delete it. Likewise, < > signs in the source text may create problems, so you will need to replace them with different placeholders.
Once you have done these checks, close the file.
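If you prefer to double-check outside WordFast, the two checks above (segment count and leftover target text) can be roughly scripted. The sketch below is only an illustration: it assumes the layout implied by the search pattern used later in this post (a `<segment>` element containing a `<source>` and, optionally, a `<target>`), and the `segmentId` attribute and `<txml>` root in the sample are hypothetical.

```python
import re

# Assumed layout, inferred from the search pattern used later in this post:
# <segment ...> ... <source>...</source> ... [<target>...</target>] ... </segment>
SEGMENT_RE = re.compile(r"<segment\b.*?>(.*?)</segment>", re.DOTALL)
TARGET_RE = re.compile(r"<target\b[^>]*>(.*?)</target>", re.DOTALL)


def inspect_txml(text):
    """Return (segment_count, numbers_of_segments_that_already_have_target_text)."""
    segments = SEGMENT_RE.findall(text)
    with_target = []
    for number, body in enumerate(segments, start=1):
        match = TARGET_RE.search(body)
        if match and match.group(1).strip():
            with_target.append(number)
    return len(segments), with_target


# Hypothetical two-segment sample; the attribute name is made up for illustration.
sample = (
    '<txml><segment segmentId="1"><source>Hello</source></segment>'
    '<segment segmentId="2"><source>Bye</source><target>Ciao</target></segment></txml>'
)
count, filled = inspect_txml(sample)
print(count, filled)  # prints: 2 [2]
```

On a real file you would read the contents first (for example with `open(path, encoding="utf-8").read()`, where the encoding is an assumption) and compare the count with the number shown in WordFast.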
Open the .txml file with an editor and replace some strings
  1. Open the .txml file using the jEdit text editor (available from the jEdit download page). It is important to use this editor because it allows for a very simple search/replace syntax that takes care of “greedy” wildcards. You can obtain the same results using a different editor, but the syntax to use might be different.
  2. After opening the file in jEdit, place the cursor at the top and choose Search > Find…
  3. In the Search for field, insert the string below (be careful not to add superfluous spaces if copying from this page):
    <segment(.*?)>(.*?)<source>(.*?)</source>(.*?)</segment>
  4. In the Replace with field, insert the following string:
    <segment$1>$2<source>$3</source>$4<target>$3</target></segment>
  5. Check that the search options are configured as in the screenshot below:
    [Screenshot: jEdit search options]
  6. Click on Replace All, save the file and quit jEdit.
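For readers without jEdit, the same search/replace can be approximated in Python. This is a sketch, not a tested replacement for the jEdit procedure: the pattern is the one from step 3, the replacement is the one from step 4 rewritten with Python's `\1` back-references instead of `$1`, and the `re.DOTALL` flag is my own assumption so that segments spanning several lines are matched too. The `id` attribute and the `<ut>` tag in the sample are hypothetical.

```python
import re

# Same pattern as in step 3; the non-greedy (.*?) groups stop at the first match.
PATTERN = re.compile(
    r"<segment(.*?)>(.*?)<source>(.*?)</source>(.*?)</segment>",
    re.DOTALL,  # assumption: let "." cross line breaks inside a segment
)
# Same replacement as in step 4, using Python's \1 syntax instead of $1.
REPLACEMENT = r"<segment\1>\2<source>\3</source>\4<target>\3</target></segment>"


def copy_source_to_target(text):
    """Return (new_text, number_of_segments_changed)."""
    return PATTERN.subn(REPLACEMENT, text)


# Hypothetical one-segment sample with an inline tag inside the source.
sample = '<segment id="1"><source>Hello <ut>b</ut> world</source></segment>'
new_text, changed = copy_source_to_target(sample)
print(changed)
print(new_text)
```

The number of replacements returned by `subn` should equal the segment count you noted in WordFast; if it does not, the file probably deviates from the assumed layout and is better handled in jEdit as described above.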
Open the modified file in MemoQ
  1. First perform a quick check by opening the file you just saved in WordFast. The target column should now be identical to the source column, tags included, and the total number of segments should match the value you saw when you first opened the file in WordFast. After checking this, you can open the file in MemoQ.
  2. Add the .xml extension to the file name (e.g. filename.txml.xml), since MemoQ likes this better.
  3. Open MemoQ and create a new project. Call it for instance “Wordfast”, so you can re-use it easily for subsequent projects that involve translating WordFast files.
  4. Go to the Settings > Source segmentation rules pane. (Warning: NEVER modify these settings using the Tools > Options > Segmentation rules menu because this will affect the global segmentation rules. We only want to change local rules for this project.)
  5. Select the various segmentation rules in the Rules list on the left and delete all of them. We need to do this (only once if you re-use the same “WordFast” MemoQ project) so that MemoQ’s segmentation does not interfere with WordFast’s.
  6. Go to Translations > Add document as…
  7. Select the file with .XML extension and open it.
  8. The Document import settings window is displayed. Download this MemoQ XML definition file, then click the button at the top of the window to import it.
  9. Click OK at the bottom of the window. The window closes and the file is imported.
  10. Open the file in MemoQ and check that the total number of segments is identical to the number you noted when you first opened the file in WordFast. MemoQ should also have inserted any tags in the positions corresponding to the tags shown in WordFast.
  11. Translate the file normally.
  12. When ready, export it with Export (dialog).
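Once the file has been exported, the segment count can also be compared against the original with a short script before reopening anything in WordFast. The sketch below is an assumption-laden illustration: it counts `<source>` elements as a stand-in for segments (one per segment, as the search pattern above implies) and also flags any source text that changed during the round trip. The function names are mine, not part of either tool.

```python
import re

SOURCE_RE = re.compile(r"<source>(.*?)</source>", re.DOTALL)


def sources(text):
    """Return the source text of every segment, in document order."""
    return SOURCE_RE.findall(text)


def compare_exports(original, translated):
    """Return a list of human-readable problems; an empty list means no difference found."""
    before, after = sources(original), sources(translated)
    problems = []
    if len(before) != len(after):
        problems.append(f"segment count changed: {len(before)} -> {len(after)}")
    for number, (old, new) in enumerate(zip(before, after), start=1):
        if old != new:
            problems.append(f"source text of segment {number} changed")
    return problems


# Hypothetical samples: the original, a clean export, and a damaged export.
original = "<segment><source>One</source></segment><segment><source>Two</source></segment>"
good = (
    "<segment><source>One</source><target>Uno</target></segment>"
    "<segment><source>Two</source><target>Due</target></segment>"
)
bad = "<segment><source>One!</source><target>Uno</target></segment>"

print(compare_exports(original, good))
print(compare_exports(original, bad))
```

This does not replace the visual check of tag positions in WordFast described below; it only catches gross damage such as lost or altered segments.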
Check the translated file in WordFast
  1. Restore the .txml extension and open the translated file in WordFast. You should get no error messages. Check that the total number of segments is still the same, check the tag position, etc.
  2. Confirm all segments in WordFast (the only way I know is to press Alt+Down Arrow in each segment).
  3. Save and deliver the file.
Source: http://blog.albatrossolutions.com
This procedure works for MemoQ version 3.X.X.
For version 4.X there is a workaround that replaces step 8:
  1. Go to Tools > Options > Resource console > Filter configurations and click Clone.
  2. Rename the cloned filter to Wordfast (under Properties).
  3. Copy the contents of the XML filter (the MemoQ XML definition file) into the file named XMLConverter#Wordfast, located in C:\Documents and Settings\All Users\Application Data\MemoQ\Resources\Local\FilterConfigs. Save and close.
  4. Back in MemoQ, when the Document import settings window is displayed, select Wordfast from the XML filter list to import the file.
It should now work; continue with step 9 of the procedure above.