Sunday, December 27, 2020

Open Source Machine Translation Systems

 

Neural Machine Translation (NMT)

| System | Team | Description | Link | Framework |
| --- | --- | --- | --- | --- |
| Tensor2Tensor | Google Brain | Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research | https://github.com/tensorflow/tensor2tensor | TensorFlow |
| Fairseq | Facebook Research | Facebook AI Research Sequence-to-Sequence Toolkit written in Python | https://github.com/pytorch/fairseq | PyTorch |
| facebookresearch/fairseq | Facebook Research | Facebook AI Research Sequence-to-Sequence Toolkit | https://github.com/facebookresearch/fairseq | Lua |
| tensorflow/nmt | Google Brain | TensorFlow Neural Machine Translation Tutorial | https://github.com/tensorflow/nmt | TensorFlow |
| OpenNMT-tf | OpenNMT | Neural machine translation and sequence learning using TensorFlow | https://github.com/OpenNMT/OpenNMT-tf | TensorFlow |
| OpenNMT-py | OpenNMT | Open Source Neural Machine Translation in PyTorch | https://github.com/OpenNMT/OpenNMT-py | PyTorch |
| THUMT | Tsinghua Natural Language Processing Group | Transformer; multi-GPU training and decoding; distributed training | https://github.com/THUNLP-MT/THUMT | TensorFlow/Theano |
| NiuTrans.NMT | NiuTrans | Transformer and FFN-LM based on NiuTrans.Tensor by the NiuTrans Team | https://github.com/NiuTrans/NiuTrans.Tensor | C/C++ |
| Marian NMT | Adam Mickiewicz University | Pure C++ with minimal dependencies; one engine for GPU/CPU training and decoding | https://marian-nmt.github.io/ | C++ |
| Seq2Seq | Denny Britz, Anna Goldie, Thang Luong, and Quoc Le | A general-purpose encoder-decoder framework for TensorFlow | https://github.com/google/seq2seq | TensorFlow |
| Nematus | The Natural Language Processing Group at the University of Edinburgh | Support for RNN and Transformer architectures, multi-GPU support, server mode | https://github.com/EdinburghNLP/nematus | TensorFlow |
| Sockeye | AWS Labs | A sequence-to-sequence framework for Neural Machine Translation | https://awslabs.github.io/sockeye/ | MXNet |
| CytonMT | Xiaolin Wang, Masao Utiyama, and Eiichiro Sumita | An efficient open-source Neural Machine Translation toolkit implemented in C++ | https://github.com/arthurxlw/cytonMt | C++ |
| OpenSeq2Seq | NVIDIA | Modular architecture, support for mixed-precision training, fast Horovod-based distributed training | https://nvidia.github.io/OpenSeq2Seq/html/index.html | TensorFlow |
| nmtpytorch | The Language and Speech Team of Le Mans University | Various end-to-end neural architectures | https://github.com/lium-lst/nmtpytorch | PyTorch |
| DL4MT | Cho Lab at NYU CS and CDS | A multi-encoder, multi-decoder, or multi-way NMT model | https://github.com/nyu-dl/dl4mt-multi | Theano |
| ModernMT | Marco Trombetti, Davide Caroselli, and Nicola Bertoldi | A context-aware, incremental, and distributed general-purpose Neural Machine Translation technology based on the Fairseq Transformer model | https://github.com/ModernMT/MMT | PyTorch |
| UnsupervisedMT | Facebook Research | Seq2seq, biLSTM + attention, Transformer; ability to share an arbitrary number of parameters; denoising auto-encoder training | https://github.com/facebookresearch/UnsupervisedMT | PyTorch |

Statistical Machine Translation (SMT)

| System | Team | Description | Link | Framework |
| --- | --- | --- | --- | --- |
| Moses | moses-smt | A free-software statistical machine translation engine for training statistical models that translate text from a source language to a target language | http://www.statmt.org/moses/ | C++ |
| GIZA++ | moses-smt | An SMT toolkit used to train IBM Models 1-5 and an HMM word alignment model | https://github.com/moses-smt/giza-pp | C++ |
| NiuTrans.SMT | NiuTrans | An open-source statistical machine translation system developed jointly by the NLP Lab at Northeastern University and the NiuTrans Team; written entirely in C++ | https://github.com/NiuTrans/NiuTrans.SMT | C/C++ |
| UCAM-SMT | The MT group at the University of Cambridge | The Cambridge Statistical Machine Translation system | http://ucam-smt.github.io/ | C++ |
| Jane | RWTH Aachen University | Supports state-of-the-art techniques for phrase-based and hierarchical phrase-based machine translation | http://www-i6.informatik.rwth-aachen.de/jane/ | C++ |
| Phrasal | Stanford NLP Group | A state-of-the-art statistical phrase-based machine translation system | https://nlp.stanford.edu/phrasal/ | Java |
| cdec | Language Technologies Institute, Carnegie Mellon University | A decoder, aligner, and learning framework for SMT and similar structured prediction models | http://www.cdec-decoder.org/ | C++ |
| Joshua | Juri Ganitkevitch and Matt Post | An SMT decoder for phrase-based, hierarchical, and syntax-based machine translation | https://cwiki.apache.org/confluence/display/JOSHUA/ | Java |

Source: https://github.com/NiuTrans/MT-paper-lists