Sunday, December 27, 2020

Open Source Machine Translation Systems

NMT

System	Team	Description	Link	Framework
Tensor2Tensor	Google Brain	Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.	https://github.com/tensorflow/tensor2tensor	TensorFlow
Fairseq	Facebook Research	Facebook AI Research Sequence-to-Sequence Toolkit written in Python.	https://github.com/pytorch/fairseq	PyTorch
facebookresearch/fairseq	Facebook Research	Original Lua/Torch version of the Facebook AI Research Sequence-to-Sequence Toolkit	https://github.com/facebookresearch/fairseq	Lua
tensorflow/nmt	Google Brain	TensorFlow Neural Machine Translation Tutorial	https://github.com/tensorflow/nmt	TensorFlow
OpenNMT-tf	OpenNMT	Neural machine translation and sequence learning using TensorFlow	https://github.com/OpenNMT/OpenNMT-tf	TensorFlow
OpenNMT-py	OpenNMT	Open Source Neural Machine Translation in PyTorch	https://github.com/OpenNMT/OpenNMT-py	PyTorch
THUMT	Tsinghua Natural Language Processing Group	Transformer, multi-GPU training and decoding, distributed training	https://github.com/THUNLP-MT/THUMT	TensorFlow/Theano
NiuTrans.NMT	NiuTrans	Transformer and FFN-LM models built on NiuTrans.Tensor by the NiuTrans Team.	https://github.com/NiuTrans/NiuTrans.Tensor	C/C++
Marian NMT	Adam Mickiewicz University	Pure C++ with minimal dependencies, one engine for GPU/CPU training and decoding	https://marian-nmt.github.io/	C++
Seq2Seq	Denny Britz, Anna Goldie, Thang Luong, and Quoc Le	A general-purpose encoder-decoder framework for TensorFlow	https://github.com/google/seq2seq	TensorFlow
NEMATUS	The Natural Language Processing Group at the University of Edinburgh	Support for RNN and Transformer architectures, multi-GPU support, server mode	https://github.com/EdinburghNLP/nematus	TensorFlow
Sockeye	AWS Labs	A sequence-to-sequence framework for Neural Machine Translation	https://awslabs.github.io/sockeye/	MXNet
CytonMT	Xiaolin Wang, Masao Utiyama, and Eiichiro Sumita	An Efficient Neural Machine Translation Open-source Toolkit Implemented in C++	https://github.com/arthurxlw/cytonMt	C++
OpenSeq2Seq	NVIDIA	Modular architecture, support for mixed-precision training, fast Horovod-based distributed training	https://nvidia.github.io/OpenSeq2Seq/html/index.html	TensorFlow
nmtpytorch	The Language and Speech Team of Le Mans University	Various end-to-end neural architectures	https://github.com/lium-lst/nmtpytorch	PyTorch
DL4MT	Cho Lab at NYU CS and CDS	A multi-encoder, multi-decoder, or multi-way NMT model	https://github.com/nyu-dl/dl4mt-multi	Theano
ModernMT	Marco Trombetti, Davide Caroselli, and Nicola Bertoldi	A context-aware, incremental, and distributed general-purpose Neural Machine Translation technology based on the Fairseq Transformer model	https://github.com/ModernMT/MMT	PyTorch
UnsupervisedMT	Facebook Research	Seq2seq, biLSTM + attention, Transformer. Ability to share an arbitrary number of parameters. Denoising auto-encoder training.	https://github.com/facebookresearch/UnsupervisedMT	PyTorch

SMT

System	Team	Description	Link	Framework
Moses	moses-smt	A free-software statistical machine translation engine for training statistical models that translate text from a source language to a target language	http://www.statmt.org/moses/	C++
GIZA++	moses-smt	An SMT toolkit used to train IBM Models 1-5 and an HMM word alignment model	https://github.com/moses-smt/giza-pp	C++
NiuTrans.SMT	NiuTrans	An open-source statistical machine translation system developed jointly by the NLP Lab at Northeastern University and the NiuTrans Team. The system is written entirely in C++.	https://github.com/NiuTrans/NiuTrans.SMT	C/C++
UCAM-SMT	The MT group in Cambridge	The Cambridge Statistical Machine Translation system	http://ucam-smt.github.io/	C++
Jane	RWTH Aachen University	Supports state-of-the-art techniques for phrase-based and hierarchical phrase-based machine translation	http://www-i6.informatik.rwth-aachen.de/jane/	C++
Phrasal	Stanford NLP Group	A state-of-the-art statistical phrase-based machine translation system	https://nlp.stanford.edu/phrasal/	Java
cdec	The Language Technologies Institute at Carnegie Mellon University	A decoder, aligner, and learning framework for SMT and similar structured prediction models	http://www.cdec-decoder.org/	C++
JOSHUA	Juri Ganitkevitch and Matt Post	An SMT decoder for phrase-based, hierarchical, and syntax-based machine translation	https://cwiki.apache.org/confluence/display/JOSHUA/	Java

Source: https://github.com/NiuTrans/MT-paper-lists