Note: The instructions below are meant for packagers.
Prerequisites
- Make sure you have git-lfs installed
git lfs install
Clone the repository
git clone https://github.com/mozilla/firefox-translations-models
Note: This will take a long time as git-lfs needs to pull all the models (~1.5 GB) from remote.
Once the repository has been cloned, you'll find the models in firefox-translations-models/models like so:
firefox-translations-models/models/
├── tiny
│   ├── bsen
│   │   ├── lex.50.50.bsen.s2t.bin.gz
│   │   ├── model.bsen.intgemm.alphas.bin.gz
│   │   └── vocab.bsen.spm.gz
│   <... Omitted for brevity ...>
│   └── nnen
│       ├── lex.50.50.nnen.s2t.bin.gz
│       ├── model.nnen.intgemm.alphas.bin.gz
│       └── vocab.nnen.spm.gz
└── base
    ├── bgen
    │   ├── lex.50.50.bgen.s2t.bin.gz
    │   ├── model.bgen.intgemm.alphas.bin.gz
    │   └── vocab.bgen.spm.gz
    <... Omitted for brevity ...>
    └── zhen
        ├── lex.50.50.zhen.s2t.bin.gz
        ├── model.zhen.intgemm.alphas.bin.gz
        └── vocab.zhen.spm.gz
71 directories, 206 files
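If the clone was interrupted, or git-lfs was not set up beforehand, some files may be small text pointers instead of the real models. The sketch below is one way to spot them. The function names is_lfs_pointer and find_lfs_pointers are made up for this example, and it assumes the standard Git LFS pointer format (a text file whose first line starts with "version https://git-lfs.github.com/spec/v1").

```shell
# Hypothetical helpers, not part of the repository.

# Succeeds if the given file looks like a Git LFS pointer rather than real data.
is_lfs_pointer() {
    head -c 200 "$1" 2>/dev/null |
        grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Prints every .gz file under the given directory that is still an LFS pointer.
find_lfs_pointers() {
    find "$1" -type f -name '*.gz' | while IFS= read -r f; do
        if is_lfs_pointer "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `find_lfs_pointers firefox-translations-models/models` prints nothing when every model was downloaded. If it lists files, running `git lfs pull` inside the repository should fetch them.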
Flatten the directory structure
We need all the models, vocabs and shortlists in a single directory.
We also need a registry.json associating models, vocabs and shortlists with language pairs.
Use the scripts/process-firefox-translations-models.py script provided with the source code to flatten the folder structure and generate the registry.json file.
Copy that script into firefox-translations-models/process-firefox-translation-models.py
cd firefox-translations-models
chmod +x process-firefox-translation-models.py
./process-firefox-translation-models.py
This should create a new directory called flattened-models containing all the models, vocabs and shortlists, a registry.json file, and a copy of the LICENSE file.
The script also provides other options, such as --base-first and --silent. Run it with the --help flag for more information.
Verify your directory and registry.json
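Go through the directory and registry.json and check that everything looks valid, for example that each language pair in the registry refers to model, vocab and shortlist files that exist in the directory.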
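Part of this check can be automated. The following is a minimal sketch, not something shipped with the source code: check_flattened is a hypothetical helper, and it assumes python3 is on the PATH (used only to confirm that registry.json parses as JSON). It does not inspect the contents of the registry, so a manual look is still worthwhile.

```shell
# Hypothetical helper: basic sanity checks on the flattened directory.
check_flattened() {
    flat_dir=$1

    [ -f "$flat_dir/registry.json" ] || { echo "missing registry.json" >&2; return 1; }
    [ -f "$flat_dir/LICENSE" ] || { echo "missing LICENSE" >&2; return 1; }

    # json.tool exits non-zero when the file is not well-formed JSON.
    python3 -m json.tool "$flat_dir/registry.json" > /dev/null ||
        { echo "registry.json is not valid JSON" >&2; return 1; }

    # Count everything except the two metadata files.
    data_count=$(find "$flat_dir" -type f ! -name registry.json ! -name LICENSE |
        wc -l | tr -d ' ')
    echo "OK: $data_count data files"
}
```

Run it as `check_flattened flattened-models` from inside firefox-translations-models, before the directory is renamed.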
Create the archive
Archiving with these settings will take a long time. Change the tar step if you want quicker compression.
mv flattened-models/ firefox/ # rename the directory
time tar -Jcvf firefox-models.tar.xz firefox/
The tar step took about 30 minutes on my ThinkPad X270, but it should take less than 5 minutes on a faster machine.
You'll find the archive ready in firefox-models.tar.xz.
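For a quicker compress, one option is to pipe tar into xz with multiple threads and a lower preset. This is only a sketch: quick_archive is a hypothetical helper, the -T0 flag assumes xz 5.2 or newer, and the -2 preset is an arbitrary choice that trades a larger archive for speed.

```shell
# Hypothetical helper: faster, less thorough xz compression.
# Usage: quick_archive <directory> <output.tar.xz>
quick_archive() {
    src_dir=$1
    out_file=$2

    # -C keeps only the directory's own name in the archive paths.
    # xz -T0 uses all available cores; -2 is a fast, low-ratio preset.
    tar -cf - -C "$(dirname "$src_dir")" "$(basename "$src_dir")" |
        xz -T0 -2 > "$out_file"
}
```

For example, `quick_archive firefox firefox-models.tar.xz`. Afterwards, `xz -t firefox-models.tar.xz` tests the archive's integrity and `tar -tJf firefox-models.tar.xz` lists its contents.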