Monday, June 13, 2011

Bing Translator API

There are 3 primary ways to interact with the service: HTTP, SOAP, and AJAX.
They all seem to expose the same methods; only the way you call them differs.  For example, the sample code published for the HTTP method looks like:
    string appId = "myAppId";
    string text = "Translate this for me";
    string from = "en";
    string to = "fr";

    string detectUri = "http://api.microsofttranslator.com/v2/Http.svc/Translate?appId=" + appId +
        "&text=" + text + "&from=" + from + "&to=" + to;
    HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(detectUri);
    WebResponse resp = httpWebRequest.GetResponse();
    Stream strm = resp.GetResponseStream();
    StreamReader reader = new System.IO.StreamReader(strm);
    string translation = reader.ReadToEnd();

    Response.Write("The translated text is: '" + translation + "'.");
Then, for the SOAP method:
    string result;
    TranslatorService.LanguageServiceClient client =
                        new TranslatorService.LanguageServiceClient();
    result = client.Translate("myAppId",
                              "Translate this text into German",
                              "en", "de");
    Console.WriteLine(result);
And lastly for the AJAX method:
    var languageFrom = "en";
    var languageTo = "es";
    var text = "translate this.";

    function translate() {
        window.mycallback = function(response) { alert(response); }

        var s = document.createElement("script");
        s.src = "http://api.microsofttranslator.com/V2/Ajax.svc/Translate?oncomplete=mycallback&appId=myAppId&from="
                    + languageFrom + "&to=" + languageTo + "&text=" + text;
        document.getElementsByTagName("head")[0].appendChild(s);
    }
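One thing to watch with the HTTP and AJAX samples is that they paste the text straight into the query string, so text containing an ampersand, a plus sign, or similar characters would likely get mangled on the way to the service. Here is a minimal sketch of building the URL with each value encoded first. buildTranslateUrl is a hypothetical helper (it is not part of the Bing API), and the endpoint and parameter names are simply the ones that appear in the samples above.

```javascript
// buildTranslateUrl is a hypothetical helper, not something the service provides.
// It encodes every key and value so reserved characters survive the query string.
function buildTranslateUrl(endpoint, params) {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(params[key]))
    .join("&");
  return endpoint + "?" + query;
}

// Endpoint copied from the HTTP sample above; "myAppId" is a placeholder.
const httpUrl = buildTranslateUrl(
  "http://api.microsofttranslator.com/v2/Http.svc/Translate",
  { appId: "myAppId", text: "Translate this & that", from: "en", to: "fr" }
);

console.log(httpUrl);
```

The same helper should work for the AJAX endpoint by passing the Ajax.svc address and adding an oncomplete entry to the parameters.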
Fortunately, it all works as you’d expect – cleanly and simply.  The really nice thing about this (and the Google Translator) is that when faced with straight-up HTML like:
    <p class="style">Hello World!</p>
They will both return the following:
    <p class="style">¡Hola mundo!</p>
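If you want to check that behaviour across a large batch rather than by eye, one rough approach is to compare the sequence of tags before and after translation. The sketch below assumes simple markup (no ">" characters inside attribute values) and ignores case and self-closing slashes; tagSequence and sameMarkup are hypothetical helper names.

```javascript
// Lists the tag names found in an HTML fragment, lowercased,
// with a leading "/" for closing tags.
function tagSequence(html) {
  const tags = [];
  const pattern = /<\s*(\/?)\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    tags.push(match[1] + match[2].toLowerCase());
  }
  return tags;
}

// True when both fragments contain the same tags in the same order.
function sameMarkup(original, translated) {
  return tagSequence(original).join(",") === tagSequence(translated).join(",");
}

const unchanged = sameMarkup(
  '<p class="style">Hello World!</p>',
  '<p class="style">¡Hola mundo!</p>'
);
console.log(unchanged); // true
```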
Both translators will keep the HTML tags intact and only translate the actual text.  This undoubtedly comes in handy if you do any large bulk translations.  For example, I’m working with another couple of guys here on an internal (one day external) tool that has a lot of data in XML files with markup.  Essentially we need to translate something like the following:
    <Article Id="this does not get translated"
             Title="Title of the article"
             Category="Category for the article"
             >
      <Content><![CDATA[<P>description for the article<BR/>another line </p>]]></Content>
    </Article>
The cool thing is that if I just deserialize the above into an object and send the value of the Content member to the service like:
    string value = client.Translate(APPID_TOKEN,
                                    content, "en", "es");
I get only the content of the HTML translated:
    <p>Descripción del artículo<br>otra línea</p>
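The same idea can be sketched without the deserialization step: pull the HTML out of the Content element, send only that to the translator, and put the result back. This is a rough illustration and not the code from our tool. translateContent is a hypothetical helper, it uses a regular expression rather than a real XML parser, and fakeTranslate is a canned stand-in for the actual service call so the example runs offline.

```javascript
// Hypothetical helper: passes only the CDATA body of <Content> to `translate`
// and leaves the rest of the XML untouched.
function translateContent(articleXml, translate) {
  const pattern = /(<Content><!\[CDATA\[)([\s\S]*?)(\]\]><\/Content>)/;
  return articleXml.replace(
    pattern,
    (whole, open, html, close) => open + translate(html) + close
  );
}

// Canned stand-in for the service; the pair comes from the example above.
const canned = {
  "<P>description for the article<BR/>another line </p>":
    "<p>Descripción del artículo<br>otra línea</p>",
};
const fakeTranslate = (html) => canned[html] || html;

const article =
  '<Article Id="this does not get translated" Title="Title of the article">' +
  "<Content><![CDATA[<P>description for the article<BR/>another line </p>]]></Content>" +
  "</Article>";

const translatedArticle = translateContent(article, fakeTranslate);
console.log(translatedArticle);
```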
Pretty nice and easy.  The one thing all of the translator services have trouble with is translating the entire XML element above in one shot.  Bing returns:
    <article id="this does not get translated"
             title="Title of the article"
             category="Category for the article">
    </article>
        <content><![CDATA[<P>Descripción del artículo<br>otra línea]]</content> >
And Google returns:
    <= Id artículo "esto no se traduce"
    Título = "Título del artículo"
    Categoría = "Categoría para el artículo">

    <Content> <! [CDATA [descripción <P> para el artículo <BR/> otra línea </ p >]]>
    </ contenido>
    </> Artículo
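Given results like these, the safer route seems to be to never hand the services raw XML and to translate the individual values yourself. As a rough sketch under assumptions: translateAttributes below is a hypothetical helper that translates only the attributes you name (Title and Category here) and leaves Id, the tag names, and everything else alone. It relies on a regular expression instead of an XML parser, handles only the basic named entities, and uses a canned stand-in for the real service call.

```javascript
// Minimal entity handling for attribute values (named entities only).
const decode = (s) =>
  s.replace(/&quot;/g, '"').replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
const encode = (s) =>
  s.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");

// Hypothetical helper: translates only the attributes listed in `names`
// and returns every other part of the input unchanged.
function translateAttributes(xml, names, translate) {
  const pattern = /([A-Za-z_][\w.-]*)="([^"]*)"/g;
  return xml.replace(pattern, (whole, name, value) =>
    names.includes(name) ? name + '="' + encode(translate(decode(value))) + '"' : whole
  );
}

// Canned stand-in for the real service call, so the sketch runs offline.
const cannedAttributes = {
  "Title of the article": "Título del artículo",
  "Category for the article": "Categoría para el artículo",
};
const fakeTranslateText = (text) => cannedAttributes[text] || text;

const openingTag =
  '<Article Id="this does not get translated" ' +
  'Title="Title of the article" ' +
  'Category="Category for the article">';

const translatedTag = translateAttributes(openingTag, ["Title", "Category"], fakeTranslateText);
console.log(translatedTag);
```

Note that the pattern would also match attribute-like text inside the CDATA content, so in practice it is probably best applied to the opening tag only, as in the example.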
Source: http://www.samuraiprogrammer.com