Running bergamot-translator on Linux
$ git clone git@github.com:browsermt/bergamot-translator.git
$ cd bergamot-translator
$ mkdir build
$ cd build
$ sudo apt install libpcre2-dev libopenblas-dev
$ cmake ..
$ make -j
YAML file (config.yml):
bergamot-mode: native
models:
- firefox-translations-models/models/prod/esen/model.esen.intgemm.alphas.bin
vocabs:
- firefox-translations-models/models/prod/esen/vocab.esen.spm
- firefox-translations-models/models/prod/esen/vocab.esen.spm
shortlist:
- firefox-translations-models/models/prod/esen/lex.50.50.esen.s2t.bin
- false
beam-size: 1
normalize: 1.0
word-penalty: 0
max-length-break: 128
mini-batch-words: 1024
workspace: 128
max-length-factor: 2.0
skip-cost: true
cpu-threads: 0
quiet: false
quiet-translation: false
gemm-precision: int8shiftAlphaAll
alignment: soft
Here, esen is the language pair for the translation, in this case es→en (Spanish to English).
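Since only the language pair changes between configs, the file can be generated instead of edited by hand. The following is a minimal sketch, not part of bergamot-translator: the function name build_config and its parameters are hypothetical, and it assumes the pair's directory uses the same file naming as esen (model.<pair>.intgemm.alphas.bin, one shared vocab.<pair>.spm, lex.50.50.<pair>.s2t.bin). Pairs whose files are named differently will need the paths adjusted.

```python
# Marian options copied unchanged from the config above.
MARIAN_OPTIONS = """\
beam-size: 1
normalize: 1.0
word-penalty: 0
max-length-break: 128
mini-batch-words: 1024
workspace: 128
max-length-factor: 2.0
skip-cost: true
cpu-threads: 0
quiet: false
quiet-translation: false
gemm-precision: int8shiftAlphaAll
alignment: soft
"""


def build_config(pair, models_root="firefox-translations-models/models/prod",
                 mode="native"):
    """Return config text for a language pair such as "esen".

    Hypothetical helper: it assumes the file naming used by the esen
    model directory and does not check that the files exist.
    """
    base = f"{models_root}/{pair}"
    header = [
        f"bergamot-mode: {mode}",
        "models:",
        f"  - {base}/model.{pair}.intgemm.alphas.bin",
        "vocabs:",
        # The same vocabulary file is listed for source and target.
        f"  - {base}/vocab.{pair}.spm",
        f"  - {base}/vocab.{pair}.spm",
        "shortlist:",
        f"  - {base}/lex.50.50.{pair}.s2t.bin",
        "  - false",
    ]
    return "\n".join(header) + "\n" + MARIAN_OPTIONS
```

Calling build_config("esen") yields the same paths and options as the config above; writing the returned text to config.yml is left to the caller.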
The model, vocab, and shortlist files should be fetched from the firefox-translations-models repository using git-lfs. Some docs still point to Google
Cloud Storage for downloads, but those links are stale.
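If the repository is cloned without git-lfs, the model files are small text pointers rather than the real binaries, and loading them will fail. A minimal sketch of a check for this, assuming the standard git-lfs pointer format, whose first line is "version https://git-lfs.github.com/spec/v1" (the helper names are hypothetical):

```python
# First bytes of every git-lfs pointer file (spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a git-lfs pointer stub."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(paths):
    """Return the subset of `paths` that are still pointer stubs."""
    return [p for p in paths if is_lfs_pointer(p)]
```

Any path returned by find_lfs_pointers has not been downloaded yet; running git lfs pull inside the firefox-translations-models checkout should replace the stubs with the actual files.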
Pipe some data through bergamot-translator:
echo "Hola mundo" | ./bergamot-translator --model-config-paths config.yml
---
Requirement: Python <= 3.10 (wheels are not available for newer versions)
pip install bergamot
import bergamot
config = bergamot.ServiceConfig(numWorkers=4)
service = bergamot.Service(config)
model = service.modelFromConfigPath("bergamot.config.yml")
options = bergamot.ResponseOptions(
    alignment=False, qualityScores=False, HTML=False
)
response = service.translate(model, bergamot.VectorString([
    "In the last 3 months, over 80 arrestees were released from the Central Booking facility without being formally charged.",
    "Since its inception, The Onion has become a veritable news parody empire.",
    "The hostel’s guests were mostly citizens of the United Arab Emirates.",
]), options)
for r in response:
    print(r.target.text)
bergamot.config.yml:
# To imitate production setting, these Marian options are set according to
# https://github.com/mozilla/firefox-translations/blob/main/extension/controller/translation/translationWorker.js
# For reference, see https://github.com/mozilla/firefox-translations-models/blob/main/evals/translators/bergamot.sh
bergamot-mode: wasm
models:
- ./model.enro.intgemm.alphas.bin
vocabs:
- ./vocab.enro.spm
- ./vocab.enro.spm
shortlist:
- ./lex.50.50.enro.s2t.bin
- false
beam-size: 1
normalize: 1.0
word-penalty: 0
max-length-break: 128
mini-batch-words: 1024
workspace: 128
max-length-factor: 2.0
skip-cost: true
cpu-threads: 4
quiet: false
quiet-translation: false
gemm-precision: int8shiftAlphaAll
alignment: soft
TranslateLocally-compatible models:
https://translatelocally.com/models.json
Firefox models:
https://github.com/mozilla/firefox-translations-models/tree/main/models
pip install bergamot
import bergamot
config = bergamot.ServiceConfig(numWorkers=4)
service = bergamot.Service(config)
model = service.modelFromConfigPath("bergamot.config.yml")
options = bergamot.ResponseOptions(
    alignment=False, qualityScores=False, HTML=False
)
response = service.translate(model, bergamot.VectorString([
    "Ovechkin’s first assist of the night was on the game-winning goal by rookie Nicklas Backstrom",
    "In the last 3 months, over 80 arrestees were released from the Central Booking facility without being formally charged.",
    "Since its inception, The Onion has become a veritable news parody empire, with a print edition, a website that drew 5,000,000 unique visitors in the month of October, personal ads, a 24 hour news network, podcasts, and a recently launched world atlas called Our Dumb World.",
    "The hostel’s guests were mostly citizens of the United Arab Emirates.",
    "It was developed by John Smith in the 1970s to help inexperienced folders or those with limited motor skills.",
    "When people don’t see moose as potentially dangerous, they may approach too closely and put themselves at risk.",
]), options)
for r in response:
    print(r.target.text)
bergamot.config.yml:
# These Marian options are set according to
# https://github.com/mozilla/firefox-translations/blob/main/extension/controller/translation/translationWorker.js
# to imitate production setting
# For reference, see https://github.com/mozilla/firefox-translations-models/blob/main/evals/translators/bergamot.sh
bergamot-mode: wasm
models:
- ./model.enit.intgemm.alphas.bin
vocabs:
- ./vocab.enit.spm
- ./vocab.enit.spm
shortlist:
- ./lex.50.50.enit.s2t.bin
- false
beam-size: 1
normalize: 1.0
word-penalty: 0
max-length-break: 128
mini-batch-words: 1024
workspace: 128
max-length-factor: 2.0
skip-cost: true
cpu-threads: 4
quiet: false
quiet-translation: false
gemm-precision: int8shiftAlphaAll
alignment: soft

