Wednesday, November 13, 2019

Accessing the IATE API programmatically using Python

The IATE terminology database can be searched at https://iate.europa.eu and can also be accessed programmatically through its API. The endpoints and procedures are described at https://iate.europa.eu/developers
Here is a Python script that queries this database from the command line, or as a program entry in GoldenDict (always use the latest RC version, not the stable 1.0.1 version).
To use it through the Programs function in GoldenDict, add this line with the type set to HTML:
python IATEPOST.py de ro %GDWORD%
If you encounter Unicode problems, try this instead:
cmd.exe /c chcp 65001 > nul & python IATEPOST.py de ro %GDWORD%
Here "de" and "ro" are the language codes for the source and the target language.
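The Unicode problems mentioned above typically come from the Windows console code page, which is what `chcp 65001` (the UTF-8 code page) addresses. On the Python side, a complementary approach is to write the output as UTF-8 bytes so that it does not depend on the console encoding; the script in this post does this through `sys.stdout.buffer`. Here is a minimal sketch of that idea. The helper name `write_utf8` is hypothetical and is not part of the script.

```python
import io
import sys


def write_utf8(text, stream=None):
    """Write text to a binary stream as UTF-8 bytes.

    Hypothetical helper: defaults to sys.stdout.buffer, the binary
    layer underneath sys.stdout, so the console encoding is bypassed.
    """
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(text.encode("utf-8"))
    stream.flush()


# Demonstration against an in-memory stream instead of the real console.
demo = io.BytesIO()
write_utf8("Straße: stradă<br>", demo)
captured = demo.getvalue()
```

Calling `write_utf8(line)` without a stream writes to standard output, which is what GoldenDict reads.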

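#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Query the IATE terminology database from the command line.

To use, just run the Python script:
python IATEPOST.py de ro %GDWORD%
where "de" and "ro" are the language codes for the source and the target language.
"""
# Environment variables that can help with Unicode problems on Windows
# (set them in the shell, not in this file):
# PYTHONIOENCODING='utf-8'
# PYTHONLEGACYWINDOWSSTDIO='utf-8'
import json
import requests
import sys
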
sourcelanguage = sys.argv[1]
targetlanguage = sys.argv[2]
word2search = sys.argv[3]
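# Search payload: the query term, the source language and the target languages
# ('mul' and 'la' are always requested in addition to the chosen target).
datapost = {
    'query': word2search,
    'source': sourcelanguage,
    'targets': [targetlanguage, 'mul', 'la'],
    'search_in_fields': [0],
    'search_in_term_types': [0, 1, 2, 3, 4, 5],
    'query_operator': 5,
    'mediaType': 'application/json',
    'authType': 'No Authorization',
    'Content-Type': 'application/json',
}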
url = 'https://iate.europa.eu/em-api/entries/_search?expand=true&offset=0&limit=100'

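# Serialize the payload and POST it as a JSON body.
data_json = json.dumps(datapost)
response = requests.post(url, data=data_json, headers={'Content-Type': 'application/json'}, timeout=30)
response.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
jsontext = response.json()  # the result is a Python dictionary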
# Flatten the dictionary to text and start a new line before each key of interest,
# so that the loops below can pick out terms and contexts line by line.
jsontext = str(jsontext).replace("'term_value'", "\r\nterm_value'")
jsontext = jsontext.replace("'term_references'", "\r\nterm_references'")
jsontext = jsontext.replace("'highlighted_term_value'", "\r\n'")
jsontext = jsontext.replace("'context'", "\r\n''context'")
jsontext = jsontext.replace("'tooltip_context'", "\r\n''tooltip_context'")
if word2search not in jsontext:
    print('Not found!')
    sys.exit()
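# First pass: print each term value as an HTML line.
for line in jsontext.split("\r\n"):
    if 'term_value' in line and 'metadata' not in line:
        line = line.replace('\',', '')
        line = line + '<br>'
        # Skip the first 14 characters (the "term_value': '" prefix) and write UTF-8 bytes directly.
        sys.stdout.buffer.write(line[14:].encode('utf-8'))
sys.stdout.buffer.write('<br><b>Contexts:</b><br>'.encode('utf-8'))
# Second pass: print the context lines.
for line in jsontext.split("\r\n"):
    if 'context' in line and 'metadata' not in line:
        line = line.replace('\',', '')
        line = line.replace('\\xad', '')  # soft hyphen (U+00AD), which str() renders as the escape \xad
        line = line + '<br>'
        sys.stdout.buffer.write(line[14:].encode('utf-8'))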
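
The script above extracts terms by turning the response dictionary into text and slicing lines. An alternative, sketched here under assumptions, is to walk the parsed JSON and collect every value stored under a given key such as 'term_value'. The function name `collect_values` is hypothetical, and the `sample` dictionary is an invented, heavily simplified stand-in; it is not the actual layout of an IATE response, which should be checked against the developer documentation.

```python
def collect_values(node, key):
    """Return every value stored under `key` anywhere in nested dicts and lists."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending: the same key may appear deeper in the structure.
            found.extend(collect_values(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_values(item, key))
    return found


# Invented sample data (assumption): only the key name 'term_value'
# is taken from the script above, the nesting is made up.
sample = {
    "items": [
        {"language": {"ro": {"term_entries": [
            {"term_value": "stradă", "metadata": {"id": 1}},
            {"term_value": "drum"},
        ]}}},
    ],
}

terms = collect_values(sample, "term_value")
html = "<br>".join(terms)
```

In the script, the same function could be applied to the dictionary returned by the request before it is converted to text, which avoids depending on fixed character offsets.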