Tuesday, September 10, 2024

Language detection API

curl 'https://platform.text.com/api/detect_language' -X POST \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:129.0) Gecko/20100101 Firefox/129.0' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en-US,en;q=0.7,ro;q=0.3' \
  -H 'Accept-Encoding: gzip, deflate, br, zstd' \
  -H 'Referer: https://platform.text.com/tools/language-detector' \
  -H 'Content-Type: text/plain;charset=UTF-8' \
  -H 'Origin: https://platform.text.com' \
  -H 'DNT: 1' \
  -H 'Sec-GPC: 1' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: metrics_session=true' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Priority: u=0' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache' \
  -H 'TE: trailers' \
  --data-raw $'{"text":"Das ist ein Text.","threshold":0.7,"clean_up_whitespaces":true}'

curl 'https://api.openl.io/tools/detect-language' -X POST \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:129.0) Gecko/20100101 Firefox/129.0' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Accept-Language: en-US,en;q=0.7,ro;q=0.3' \
  -H 'Accept-Encoding: gzip, deflate, br, zstd' \
  -H 'Content-Type: application/json' \
  -H 'Origin: https://openl.io' \
  -H 'DNT: 1' \
  -H 'Sec-GPC: 1' \
  -H 'Connection: keep-alive' \
  -H 'Referer: https://openl.io/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-site' \
  -H 'Priority: u=0' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache' \
  -H 'TE: trailers' \
  --data-raw $'{"role":"free","text":"Das ist ein Text."}' # with daily limit
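
The same two requests can be sketched in Python with only the standard library. The URLs, payload fields, and Content-Type values are copied from the curl commands above. Everything else is an assumption: the function names are made up for illustration, it is untested whether either server accepts requests without the browser headers (Origin, Referer, cookie), and the response format is not documented here, so the raw body is returned unparsed.

```python
import json
import urllib.request

# Endpoints taken from the curl commands above.
TEXT_COM_URL = "https://platform.text.com/api/detect_language"
OPENL_URL = "https://api.openl.io/tools/detect-language"


def build_request(url: str, payload: dict, content_type: str) -> urllib.request.Request:
    """Build a POST request carrying the payload as UTF-8 encoded JSON."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def text_com_request(text: str, threshold: float = 0.7) -> urllib.request.Request:
    """Request for the text.com detector (hypothetical helper name)."""
    payload = {"text": text, "threshold": threshold, "clean_up_whitespaces": True}
    # The browser sends the JSON body as text/plain, so the sketch does the same.
    return build_request(TEXT_COM_URL, payload, "text/plain;charset=UTF-8")


def openl_request(text: str) -> urllib.request.Request:
    """Request for the OpenL detector, free role (hypothetical helper name)."""
    payload = {"role": "free", "text": text}
    return build_request(OPENL_URL, payload, "application/json")


def send(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling send(openl_request("Das ist ein Text.")) would perform the actual network request; building the request separately makes it possible to inspect the payload first.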

1. fastText

fastText is an open-source library for efficient learning of word representations and text classification. It is aimed at professionals, students, and non-experts alike.

It was designed so that people can iterate quickly on different methods without specialized hardware: fastText can train on more than one billion words within minutes on a standard multicore CPU. It also provides pre-trained word vectors for 157 languages, trained on Common Crawl and Wikipedia.
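
For language detection specifically, fastText publishes pre-trained language identification models (lid.176.bin and the smaller lid.176.ftz). A minimal sketch, assuming the fasttext Python package is installed and the model file has been downloaded to the path given in model_path; the function names are made up for illustration.

```python
from typing import Tuple

LABEL_PREFIX = "__label__"


def strip_label_prefix(label: str) -> str:
    """Turn a fastText label such as '__label__de' into 'de'."""
    return label[len(LABEL_PREFIX):] if label.startswith(LABEL_PREFIX) else label


def detect_with_fasttext(text: str, model_path: str = "lid.176.ftz") -> Tuple[str, float]:
    """Return (language code, probability) for the most likely language."""
    import fasttext  # third-party; imported lazily so the helper above works without it

    # Assumption: model_path points to a downloaded language identification model.
    model = fasttext.load_model(model_path)
    # predict() rejects newlines, so collapse the input to a single line first.
    labels, probabilities = model.predict(text.replace("\n", " "), k=1)
    return strip_label_prefix(labels[0]), float(probabilities[0])
```

The sketch reloads the model on every call to stay short; for repeated use it would make sense to load the model once and reuse it.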

2. lingua-py

lingua-py is an open-source Python library for language detection. It can detect 75 languages.
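
A minimal usage sketch, assuming the package is installed from PyPI as lingua-language-detector. The function name and the choice of four candidate languages are illustrative only.

```python
from typing import Optional


def detect_with_lingua(text: str) -> Optional[str]:
    """Return the ISO 639-1 code of the detected language, or None if undecided."""
    from lingua import Language, LanguageDetectorBuilder  # third-party, lazy import

    # Restricting the candidate set is optional; the four languages here
    # are an arbitrary example, not a recommendation.
    detector = LanguageDetectorBuilder.from_languages(
        Language.ENGLISH, Language.GERMAN, Language.FRENCH, Language.ROMANIAN
    ).build()
    language = detector.detect_language_of(text)
    return language.iso_code_639_1.name.lower() if language is not None else None
```

To consider every supported language instead, the builder also offers LanguageDetectorBuilder.from_all_languages().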

3. LangDetect

This library is a Python port of the Java language-detection library that was hosted on Google Code and originally developed by Nakatani Shuyo at Cybozu Labs, Inc. It can identify more than 50 languages.
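
A minimal usage sketch, assuming the langdetect package is installed; the function name is made up for illustration.

```python
from typing import List, Tuple


def detect_with_langdetect(text: str) -> List[Tuple[str, float]]:
    """Return (language code, probability) pairs, most likely language first."""
    from langdetect import DetectorFactory, detect_langs  # third-party, lazy import

    # langdetect is non-deterministic by default; fixing the seed makes
    # repeated runs on the same text return the same result.
    DetectorFactory.seed = 0
    return [(candidate.lang, candidate.prob) for candidate in detect_langs(text)]
```

Very short or feature-less inputs (for example an empty string) make langdetect raise an exception, so real code would probably wrap the call in a try/except.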

4. langid.py

langid.py is a standalone language identification tool written in Python. It ships with a pre-trained model, so it works without any additional training data.
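
A minimal usage sketch, assuming the langid package is installed. The function name and the optional candidates parameter are made up for illustration.

```python
from typing import Iterable, Optional, Tuple


def detect_with_langid(
    text: str, candidates: Optional[Iterable[str]] = None
) -> Tuple[str, float]:
    """Return (language code, score) for the best guess."""
    import langid  # third-party, lazy import

    if candidates:
        # set_languages() changes the module-wide model, so the restriction
        # stays in effect for later calls as well.
        langid.set_languages(list(candidates))
    language, score = langid.classify(text)
    # By default the score is an unnormalized log-probability, not a 0..1 value.
    return language, float(score)
```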

5. polyglot

Polyglot is a natural language pipeline that supports massive multilingual applications. It includes language identification as one of its components.
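
A minimal sketch of the language identification component, assuming polyglot and its native dependencies (PyICU and pycld2) are installed; the function name is made up for illustration.

```python
from typing import Tuple


def detect_with_polyglot(text: str) -> Tuple[str, float]:
    """Return (language code, confidence) for the most likely language."""
    from polyglot.detect import Detector  # third-party, lazy import

    # Detector.language is the top guess; Detector.languages would list all guesses.
    language = Detector(text).language
    return language.code, float(language.confidence)
```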

6. CLD2 (Compact Language Detector 2)

CLD2 is a library for language detection, optimized for speed and accuracy. It's developed by Google and used in various Google products.
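
CLD2 itself is a C++ library. A minimal sketch from Python, assuming the pycld2 binding is installed (the choice of binding is an assumption, other wrappers exist); the function name is made up for illustration.

```python
from typing import Tuple


def detect_with_cld2(text: str) -> Tuple[str, int, bool]:
    """Return (language code, percent of text, reliability flag) for the top guess."""
    import pycld2  # third-party Python binding for CLD2, lazy import

    is_reliable, _bytes_found, details = pycld2.detect(text)
    # details holds (name, code, percent, score) tuples, best guess first.
    _name, code, percent, _score = details[0]
    return code, percent, is_reliable
```

The reliability flag is worth checking before trusting the result, especially for short inputs.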