@qvac/translation-nmtcpp

Migration Note (v1.0.0+): Opus/Marian model support has been removed. Only IndicTrans2 and Bergamot backends are supported. If you were using Opus models, migrate to Bergamot for European language pairs.

Overview

Bare module that adds support for translation in QVAC using either qvac-fabric-llm.cpp or Bergamot as the inference engine.

Models

You should load a model compatible with your chosen inference engine:

qvac-fabric-llm.cpp (default): IndicTrans2, converted to GGML. Model file format: *.bin.
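Bergamot: Bergamot model bundle. Required files: model weights (*.bin) plus vocabulary files (*.spm), either one shared vocab*.spm or a separate srcvocab*.spm and trgvocab*.spm for some CJK pairs.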

Installation

npm i @qvac/translation-nmtcpp

Quickstart

If you don't have Bare runtime, install it:

npm i -g bare

Create a new project:

mkdir qvac-translation-quickstart
cd qvac-translation-quickstart
npm init -y

Install dependencies:

npm i @qvac/translation-nmtcpp

Create example.js:

/**
 * Quickstart Example — Bergamot Backend
 *
 * This example demonstrates translation using the Bergamot backend
 * with local model files or auto-download via Firefox CDN (English to Italian).
 *
 * Usage:
 *   bare example.js
 *   BERGAMOT_MODEL_PATH=/path/to/bergamot/model bare example.js
 *
 * Enable verbose C++ logging:
 *   VERBOSE=1 bare example.js
 */

const TranslationNmtcpp = require('@qvac/translation-nmtcpp')
const {
  ensureBergamotModelFiles,
  getBergamotFileNames
} = require('@qvac/translation-nmtcpp/lib/bergamot-model-fetcher')
const path = require('bare-path')
const process = require('bare-process')

// ============================================================
// LOGGING CONFIGURATION
// Set VERBOSE=1 environment variable to enable C++ debug logs
// ============================================================
const VERBOSE = process.env.VERBOSE === '1' || process.env.VERBOSE === 'true'

const logger = VERBOSE
  ? {
      info: (msg) => console.log('[C++ INFO]', msg),
      warn: (msg) => console.warn('[C++ WARN]', msg),
      error: (msg) => console.error('[C++ ERROR]', msg),
      debug: (msg) => console.log('[C++ DEBUG]', msg)
    }
  : null // null = suppress all C++ logs

const text = 'Machine translation has revolutionized how we communicate across language barriers in the modern digital world.'

async function testBergamot () {
  console.log('\n=== Testing Bergamot Backend ===\n')

  const srcLang = 'en'
  const dstLang = 'it'

  // Use local model path if provided, otherwise auto-download
  const bergamotPath = process.env.BERGAMOT_MODEL_PATH || './model/bergamot/enit'

  // Ensure model files are present (downloads from Firefox CDN if not)
  const modelDir = await ensureBergamotModelFiles(srcLang, dstLang, bergamotPath)
  console.log('Model directory:', modelDir)

  const fileNames = getBergamotFileNames(srcLang, dstLang)

  console.log('Loading model...')

  // Create the model with resolved file paths
  const model = new TranslationNmtcpp({
    files: {
      model: path.join(modelDir, fileNames.modelName),
      srcVocab: path.join(modelDir, fileNames.srcVocabName),
      dstVocab: path.join(modelDir, fileNames.dstVocabName)
    },
    params: { mode: 'full', dstLang, srcLang },
    config: {
      modelType: TranslationNmtcpp.ModelTypes.Bergamot
    },
    logger // Pass the logger
  })

  // Load model
  await model.load()
  console.log('Model loaded successfully!')

  try {
    console.log('Running translation...')
    console.log('Input text:', text)

    // Run the Model
    const response = await model.run(text)

    await response
      .onUpdate(data => {
        console.log('Translation output:', data)
      })
      .await()

    console.log('Bergamot translation finished!')
  } finally {
    console.log('Unloading model...')
    await model.unload()
    console.log('Done!')
  }
}

async function main () {
  try {
    await testBergamot()

    console.log('\n=== All Tests Completed Successfully! ===\n')
  } catch (error) {
    console.error('Test failed:', error)
    throw error
  }
}

main()

Run example.js:

bare example.js

Usage

The workflow is the same regardless of the backend you choose:

1. Obtain the model files

The model class is files-based: you pass it the resolved on-disk paths of the model weights (and, for Bergamot, the vocab files). The package ships fetcher helpers that download the files on demand and return the directory they were written to:

IndicTrans2 — ensureIndicTransModelFile() downloads the GGML model from the QVAC model registry (via @qvac/registry-client).
Bergamot — ensureBergamotModelFiles() downloads the model + vocab bundle from Mozilla's Firefox Remote Settings CDN.

Both helpers are idempotent: if a valid file already exists at the destination, they return immediately without re-downloading. You can also point the model class at files you have downloaded yourself — the fetchers are a convenience, not a requirement.

// IndicTrans2 — fetch from the QVAC model registry
const {
  ensureIndicTransModelFile,
  getIndicTransFileName
} = require('@qvac/translation-nmtcpp/lib/indictrans-model-fetcher')

const path = require('bare-path')

const modelPath = path.join('./model/indictrans', getIndicTransFileName())
await ensureIndicTransModelFile(modelPath) // downloads if not already present

// Bergamot — fetch from the Firefox Remote Settings CDN
const {
  ensureBergamotModelFiles,
  getBergamotFileNames
} = require('@qvac/translation-nmtcpp/lib/bergamot-model-fetcher')

const modelDir = await ensureBergamotModelFiles('en', 'it', './model/bergamot/enit')
const fileNames = getBergamotFileNames('en', 'it') // { modelName, srcVocabName, dstVocabName }
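
The idempotent behaviour described in this step can be pictured with a small sketch. This is not the package's implementation: ensureFile, exists, and download are hypothetical names, the check only tests for presence (the real helpers also check that the file is valid), and the I/O is injected so the snippet runs without touching the network or disk.

```javascript
// Hypothetical sketch of an idempotent "ensure" helper (not the package's code).
// `exists` and `download` are injected so the logic can be exercised in isolation.
async function ensureFile (destPath, { exists, download }) {
  if (await exists(destPath)) {
    return { path: destPath, downloaded: false } // already present: do nothing
  }
  await download(destPath)
  return { path: destPath, downloaded: true }
}

// In-memory stand-ins for a file system and a downloader
const present = new Set()
const io = {
  exists: async (p) => present.has(p),
  download: async (p) => { present.add(p) }
}

// Calls ensureFile twice for the same path and reports what each call did
async function demo (destPath) {
  const first = await ensureFile(destPath, io)
  const second = await ensureFile(destPath, io)
  console.log(first.downloaded, second.downloaded)
  return [first, second]
}

demo('./model/demo/model.bin')
```

Only the first call performs a download; the second returns immediately because the destination is already populated.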

2. Create the `args` object

The model is constructed from a single options object: files (resolved paths from Step 1), params (languages and mode), config (model type and decoding options), and an optional logger.

The shape of files varies slightly depending on which backend you're using.

IndicTrans2

For Indic language translations (English ↔ Hindi, Bengali, Tamil, etc.), IndicTrans2 needs only the model weights file:

const args = {
  files: {
    model: modelPath // resolved path from Step 1
  },
  params: {
    mode: 'full',
    srcLang: 'eng_Latn',   // Source language (ISO 639-3 language code + ISO 15924 script code)
    dstLang: 'hin_Deva'    // Target language (ISO 639-3 language code + ISO 15924 script code)
  },
  config: {
    modelType: TranslationNmtcpp.ModelTypes.IndicTrans
  }
}

Key Parameters:

Parameter	Description	Example
`srcLang`	Source language (ISO 639-3 language + ISO 15924 script)	`'eng_Latn'`, `'hin_Deva'`, `'ben_Beng'`
`dstLang`	Target language (ISO 639-3 language + ISO 15924 script)	`'eng_Latn'`, `'hin_Deva'`, `'tam_Taml'`
`files.model`	Path to the model weights file	`'./model/indictrans/ggml-indictrans2-en-indic-dist-200M-q4_0.bin'`
`modelType`	Required in config: `TranslationNmtcpp.ModelTypes.IndicTrans`	-

IndicTrans2 model naming pattern:

ggml-indictrans2-{direction}-{size}.bin for q0f32 quantization
ggml-indictrans2-{direction}-{size}-q0f16.bin for q0f16 quantization
ggml-indictrans2-{direction}-{size}-q4_0.bin for q4_0 quantization

Where direction is en-indic, indic-en, or indic-indic, and size is dist-200M, dist-320M, or 1B.
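
As a rough illustration of the naming pattern, the sketch below assembles a file name from its three parts. indicTransFileName is a hypothetical helper, not part of the package (which exposes getIndicTransFileName() for this purpose), and the pattern does not guarantee that every combination is published in the registry.

```javascript
// Hypothetical helper: builds a file name from the naming pattern above.
const DIRECTIONS = ['en-indic', 'indic-en', 'indic-indic']
const SIZES = ['dist-200M', 'dist-320M', '1B']
const QUANTS = ['q0f32', 'q0f16', 'q4_0']

function indicTransFileName (direction, size, quant = 'q0f32') {
  if (!DIRECTIONS.includes(direction)) throw new Error(`Unknown direction: ${direction}`)
  if (!SIZES.includes(size)) throw new Error(`Unknown size: ${size}`)
  if (!QUANTS.includes(quant)) throw new Error(`Unknown quantization: ${quant}`)
  // q0f32 files carry no quantization suffix
  const suffix = quant === 'q0f32' ? '' : `-${quant}`
  return `ggml-indictrans2-${direction}-${size}${suffix}.bin`
}

console.log(indicTransFileName('en-indic', 'dist-200M', 'q4_0'))
// prints "ggml-indictrans2-en-indic-dist-200M-q4_0.bin"
```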

Bergamot

Bergamot needs the model weights plus a source and target vocabulary file. Use the paths returned by ensureBergamotModelFiles() / getBergamotFileNames(), or point at files you downloaded yourself:

const path = require('bare-path')

const args = {
  files: {
    model: path.join(modelDir, fileNames.modelName),
    srcVocab: path.join(modelDir, fileNames.srcVocabName),
    dstVocab: path.join(modelDir, fileNames.dstVocabName)
  },
  params: {
    mode: 'full',
    srcLang: 'en',    // Source language (ISO 639-1 code)
    dstLang: 'it'     // Target language (ISO 639-1 code)
  },
  config: {
    modelType: TranslationNmtcpp.ModelTypes.Bergamot
  }
}

Bergamot Model Files by Language Pair:

Language Pair	Model File	Vocab File(s)
en→it	`model.enit.intgemm.alphas.bin`	`vocab.enit.spm`
it→en	`model.iten.intgemm.alphas.bin`	`vocab.iten.spm`
en→es	`model.enes.intgemm.alphas.bin`	`vocab.enes.spm`
es→en	`model.esen.intgemm.alphas.bin`	`vocab.esen.spm`
en→fr	`model.enfr.intgemm.alphas.bin`	`vocab.enfr.spm`
fr→en	`model.fren.intgemm.alphas.bin`	(see Firefox Translations models)
en→de	`model.ende.intgemm.alphas.bin`	`vocab.ende.spm`
en→ru	`model.enru.intgemm.alphas.bin`	`vocab.enru.spm`
ru→en	`model.ruen.intgemm.alphas.bin`	`vocab.ruen.spm`
en→zh	`model.enzh.intgemm.alphas.bin`	`srcvocab.enzh.spm`, `trgvocab.enzh.spm`
zh→en	`model.zhen.intgemm.alphas.bin`	`vocab.zhen.spm`
en→ja	`model.enja.intgemm.alphas.bin`	`srcvocab.enja.spm`, `trgvocab.enja.spm`
ja→en	`model.jaen.intgemm.alphas.bin`	`vocab.jaen.spm`

getBergamotFileNames(srcLang, dstLang) returns the correct modelName, srcVocabName, and dstVocabName for each pair (including the separate source/target vocabs used by CJK languages), so you normally don't need to hard-code these.

Key Parameters:

Parameter	Description	Example
`srcLang`	Source language (ISO 639-1)	`'en'`, `'es'`, `'de'`
`dstLang`	Target language (ISO 639-1)	`'it'`, `'fr'`, `'de'`
`files.model`	Path to the model weights file	`'./model/bergamot/enit/model.enit.intgemm.alphas.bin'`
`files.srcVocab`	Path to the source vocabulary file	`'./model/bergamot/enit/vocab.enit.spm'`
`files.dstVocab`	Path to the target vocabulary file	`'./model/bergamot/enit/vocab.enit.spm'`
`modelType`	Required in config: `TranslationNmtcpp.ModelTypes.Bergamot`	-

Bergamot model file naming convention:

model.{srctgt}.intgemm.alphas.bin - Model weights (e.g., model.enit.intgemm.alphas.bin)
vocab.{srctgt}.spm - Shared vocabulary for most language pairs
srcvocab.{srctgt}.spm + trgvocab.{srctgt}.spm - Separate vocabs for CJK languages (zh, ja)
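
The convention can be written out as a few lines of code. This is an illustrative sketch only: bergamotFileNamesSketch is a hypothetical name, and the rule for when vocabularies are split (target language zh or ja) is an assumption read off the table in this section. In real code, prefer getBergamotFileNames(), which knows the actual per-pair layout.

```javascript
// Hypothetical re-implementation of the naming convention, for illustration only.
// Assumption (from the table above): separate vocab files are used when the
// target language is zh or ja; every other pair shares a single vocab file.
const SPLIT_VOCAB_TARGETS = ['zh', 'ja']

function bergamotFileNamesSketch (srcLang, dstLang) {
  const pair = `${srcLang}${dstLang}`
  const modelName = `model.${pair}.intgemm.alphas.bin`
  if (SPLIT_VOCAB_TARGETS.includes(dstLang)) {
    return {
      modelName,
      srcVocabName: `srcvocab.${pair}.spm`,
      dstVocabName: `trgvocab.${pair}.spm`
    }
  }
  // Shared vocabulary: source and target point at the same file
  const shared = `vocab.${pair}.spm`
  return { modelName, srcVocabName: shared, dstVocabName: shared }
}

console.log(bergamotFileNamesSketch('en', 'it'))
console.log(bergamotFileNamesSketch('en', 'zh'))
```

The returned object uses the same three keys as getBergamotFileNames(), so it can be joined with the model directory in the same way as the args example.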

Model directory layout

Use a unique directory per model to avoid file conflicts when using multiple models:

./model/indictrans for IndicTrans English→Hindi
./model/bergamot/enit for Bergamot English→Italian

The supported values for the srcLang and dstLang parameters differ by model type; see Supported Languages.
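
Because the two backends use differently shaped language codes, a quick shape check can catch a mismatched code before the model is constructed. The sketch below is an assumption-laden illustration: hasExpectedCodeShape is a hypothetical helper, and the patterns are inferred from the codes shown in this document rather than taken from the package.

```javascript
// Hypothetical format check; it validates the shape of a code, not whether
// a model for that language actually exists.
const CODE_PATTERNS = {
  IndicTrans: /^[a-z]{3}_[A-Z][a-z]{3}$/, // e.g. eng_Latn, hin_Deva
  Bergamot: /^[a-z]{2}$/ // e.g. en, it
}

function hasExpectedCodeShape (modelType, code) {
  const pattern = CODE_PATTERNS[modelType]
  if (!pattern) throw new Error(`Unknown model type: ${modelType}`)
  return pattern.test(code)
}

console.log(hasExpectedCodeShape('IndicTrans', 'hin_Deva')) // true
console.log(hasExpectedCodeShape('Bergamot', 'hin_Deva')) // false
```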

3. Create the `config` object

The config object contains two types of parameters:

Model-specific parameters (required for some backends)
Generation/decoding parameters (optional, controls output quality)

Model-Specific Parameters

Parameter	IndicTrans2	Bergamot
`config.modelType`	Required	Required
`files.srcVocab`	Not needed	Required (passed via `files`, see Step 2)
`files.dstVocab`	Not needed	Required (passed via `files`, see Step 2)

Generation/Decoding Parameters (IndicTrans Only)

These parameters control how the model generates output. Full parameter support is available only for IndicTrans2 models; Bergamot accepts only beamsize (see Step 4).

// Generation parameters for IndicTrans2
const generationParams = {
  beamsize: 4,            // Beam search width (>=1). 1 disables beam search
  lengthpenalty: 0.6,     // Length normalization strength (>=0)
  maxlength: 128,         // Maximum generated tokens (>0)
  repetitionpenalty: 1.2, // Penalize previously generated tokens (0..2)
  norepeatngramsize: 2,   // Disallow repeating n-grams of this size (0..10)
  temperature: 0.8,       // Sampling temperature [0..2]
  topk: 40,               // Keep top-K logits [0..vocab_size]
  topp: 0.9               // Nucleus sampling threshold (0 < p <= 1)
}
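
The ranges noted in the comments above can be checked before the values reach the native layer. The following is a minimal sketch, not something the package provides: findInvalidParams is a hypothetical helper, it assumes the count-like parameters are integers, and it does not check topk against the vocabulary size because that depends on the loaded model.

```javascript
// Hypothetical validator mirroring the ranges in the comments above.
const RANGES = {
  beamsize: (v) => Number.isInteger(v) && v >= 1,
  lengthpenalty: (v) => v >= 0,
  maxlength: (v) => Number.isInteger(v) && v > 0,
  repetitionpenalty: (v) => v >= 0 && v <= 2,
  norepeatngramsize: (v) => Number.isInteger(v) && v >= 0 && v <= 10,
  temperature: (v) => v >= 0 && v <= 2,
  topk: (v) => Number.isInteger(v) && v >= 0,
  topp: (v) => v > 0 && v <= 1
}

// Returns a list of human-readable problems; an empty list means all checks passed
function findInvalidParams (params) {
  const invalid = []
  for (const [name, value] of Object.entries(params)) {
    const check = RANGES[name]
    if (!check) {
      invalid.push(`${name}: unknown parameter`)
    } else if (typeof value !== 'number' || !check(value)) {
      invalid.push(`${name}: ${value} is out of range`)
    }
  }
  return invalid
}

console.log(findInvalidParams({ beamsize: 4, topp: 0.9 }))
console.log(findInvalidParams({ beamsize: 0, temperature: 3 }))
```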

4. Create Model Instance

Import TranslationNmtcpp and create an instance from the single options object built in Step 2. Decoding options from Step 3 are merged into args.config:

const TranslationNmtcpp = require('@qvac/translation-nmtcpp')

IndicTrans2

// IndicTrans - must specify modelType + generation parameters
const model = new TranslationNmtcpp({
  ...args, // files + params from Step 2
  config: {
    modelType: TranslationNmtcpp.ModelTypes.IndicTrans,
    ...generationParams,  // Spread generation params from Step 3
    maxlength: 256        // Override for longer outputs
  }
})
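
The override in the constructor call relies on standard object spread semantics: when a key appears more than once, the last occurrence wins. The plain-object sketch below shows this without creating a model; the literal 'IndicTrans' stands in for TranslationNmtcpp.ModelTypes.IndicTrans, and the trimmed generationParams object is only for illustration.

```javascript
// Plain-object sketch of the merge used above; no model is created here.
const generationParams = { beamsize: 4, maxlength: 128, topp: 0.9 }

const config = {
  modelType: 'IndicTrans', // value of TranslationNmtcpp.ModelTypes.IndicTrans
  ...generationParams, // defaults from Step 3
  maxlength: 256 // listed after the spread, so it wins
}

const reversed = {
  maxlength: 256,
  ...generationParams // spread last: the Step 3 value (128) wins instead
}

console.log(config.maxlength, reversed.maxlength) // prints "256 128"
```

Keeping overrides after the spread is therefore what makes them take effect.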

Bergamot

// Bergamot - vocab files are passed via `files` (see Step 2); limited generation params support
const model = new TranslationNmtcpp({
  ...args, // files (model + srcVocab + dstVocab) + params from Step 2
  config: {
    modelType: TranslationNmtcpp.ModelTypes.Bergamot,
    beamsize: 4 // Only beamsize supported for Bergamot
  }
})

Available Model Types:

TranslationNmtcpp.ModelTypes = {
  IndicTrans: 'IndicTrans', // Indic language models
  Bergamot: 'Bergamot'      // Firefox Translations models
}

5. Load Model

try {
  // Basic usage
  await model.load()
} catch (error) {
  console.error('Failed to load model:', error)
}

6. Run the Model

Call run() to translate the input text. It returns a QVACResponse object: register a callback with onUpdate() to receive output as it is produced, then call await() to wait for completion.

try {
  // Execute translation on input text
  const response = await model.run('Hello world! Welcome to the internet of peers!')

  // Process streamed output using callback
  await response
    .onUpdate(outputChunk => {
      // Handle each new piece of translated text
      console.log(outputChunk)
    })
    .await() // Wait for translation to complete

  // Access performance statistics (if enabled with opts.stats)
  if (response.stats) {
    console.log('Translation completed in:', response.stats.totalTime, 'ms')
  }
} catch (error) {
  console.error('Translation failed:', error)
}

7. Batch Translation (Bergamot Only)

For translating multiple texts efficiently, use the runBatch() method instead of calling run() multiple times.

runBatch() is only available with the Bergamot backend. IndicTrans2 models should use sequential run() calls.

// Array of texts to translate (English)
const textsToTranslate = [
  'Hello world!',
  'How are you today?',
  'Machine translation has revolutionized communication.'
]

try {
  // Batch translation - returns array of translated strings
  const translations = await model.runBatch(textsToTranslate)

  // Output each translation
  translations.forEach((translatedText, index) => {
    console.log(`Original: ${textsToTranslate[index]}`)
    console.log(`Translated: ${translatedText}\n`)
  })
} catch (error) {
  console.error('Batch translation failed:', error)
}

runBatch() vs run():

Method	Input	Output	Backend Support
`run(text)`	Single string	`QVACResponse` with streaming	All (IndicTrans, Bergamot)
`runBatch(texts)`	Array of strings	Array of strings	Bergamot only

runBatch() processes all texts in a single batch operation, which is typically faster than calling run() once per text.
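
For IndicTrans2, where runBatch() is not available, the sequential run() approach could look like the sketch below. translateSequentially is a hypothetical helper, and it assumes each onUpdate chunk is a string that can be concatenated. A stand-in model (fakeModel) is included so the snippet runs without native bindings; with the real package you would pass a loaded TranslationNmtcpp instance instead.

```javascript
// Hypothetical helper: translate texts one at a time via run().
// Assumes each onUpdate chunk is a string that can be concatenated.
async function translateSequentially (model, texts) {
  const results = []
  for (const text of texts) {
    let output = ''
    const response = await model.run(text)
    await response.onUpdate((chunk) => { output += chunk }).await()
    results.push(output)
  }
  return results
}

// Stand-in response exposing the same onUpdate()/await() shape used in Step 6
function fakeResponse (chunks) {
  let handler = null
  return {
    onUpdate (cb) { handler = cb; return this },
    await () {
      for (const chunk of chunks) if (handler) handler(chunk)
      return Promise.resolve()
    }
  }
}

// Stand-in model: "translates" by upper-casing and emits the result in two chunks
const fakeModel = {
  run (text) {
    const upper = text.toUpperCase()
    const mid = Math.ceil(upper.length / 2)
    return Promise.resolve(fakeResponse([upper.slice(0, mid), upper.slice(mid)]))
  }
}

translateSequentially(fakeModel, ['hello', 'world']).then((out) => console.log(out))
```

The results come back in input order, which gives the same calling shape as runBatch() at the cost of one run() per text.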

8. Unload the Model

// Always unload the model when finished to free memory
try {
  await model.unload()
} catch (error) {
  console.error('Failed to unload model:', error)
}

Supported Languages

IndicTrans2 Models (QVAC registry)

IndicTrans2 supports translation between English and 22 Indic languages. The following directions are available via the QVAC model registry:

Direction	Available	Sizes
English → Indic	Yes	200M, 1B
Indic → English	Yes	200M, 1B
Indic → Indic	Yes	320M, 1B

Supported Indic Languages:

Assamese (asm_Beng)	Kashmiri (Arabic) (kas_Arab)	Punjabi (pan_Guru)
Bengali (ben_Beng)	Kashmiri (Devanagari) (kas_Deva)	Sanskrit (san_Deva)
Bodo (brx_Deva)	Maithili (mai_Deva)	Santali (sat_Olck)
Dogri (doi_Deva)	Malayalam (mal_Mlym)	Sindhi (Arabic) (snd_Arab)
English (eng_Latn)	Marathi (mar_Deva)	Sindhi (Devanagari) (snd_Deva)
Konkani (gom_Deva)	Manipuri (Bengali) (mni_Beng)	Tamil (tam_Taml)
Gujarati (guj_Gujr)	Manipuri (Meitei) (mni_Mtei)	Telugu (tel_Telu)
Hindi (hin_Deva)	Nepali (npi_Deva)	Urdu (urd_Arab)
Kannada (kan_Knda)	Odia (ory_Orya)
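
Since each IndicTrans2 model covers one direction, the direction can be derived from the language pair. The sketch below is illustrative: indicDirection is a hypothetical helper, and the returned strings follow the direction names used in the IndicTrans2 model naming pattern.

```javascript
// Hypothetical helper: pick the model direction for a language pair.
const ENGLISH = 'eng_Latn'

function indicDirection (srcLang, dstLang) {
  if (srcLang === ENGLISH && dstLang === ENGLISH) {
    throw new Error('Source and target are both English')
  }
  if (srcLang === ENGLISH) return 'en-indic'
  if (dstLang === ENGLISH) return 'indic-en'
  return 'indic-indic'
}

console.log(indicDirection('eng_Latn', 'hin_Deva')) // en-indic
console.log(indicDirection('tam_Taml', 'eng_Latn')) // indic-en
console.log(indicDirection('hin_Deva', 'tam_Taml')) // indic-indic
```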

Bergamot Models (Firefox Translations)

Language pairs available via the Firefox Remote Settings CDN:

Language	Code	en→X	X→en
Arabic	ar	Yes	Yes
Czech	cs	Yes	Yes
Spanish	es	Yes	Yes
French	fr	Yes	Yes
Italian	it	Yes	Yes
Japanese	ja	Yes	Yes
Portuguese	pt	Yes	Yes
Russian	ru	Yes	Yes
Chinese	zh	Yes	Yes

The Bergamot backend supports all language pairs available in Firefox Translations. See the Firefox Translations models repository for the complete and up-to-date list of supported language pairs.

ModelClasses and Packages

ModelClass

The main class exported by this library is TranslationNmtcpp, which supports multiple translation backends:

const TranslationNmtcpp = require('@qvac/translation-nmtcpp')

// Available model types
TranslationNmtcpp.ModelTypes = {
  IndicTrans: 'IndicTrans',  // For Indic language translations
  Bergamot: 'Bergamot'       // For Bergamot/Firefox translations
}

Available Packages

Main Package

Package	Description	Backends	Languages
`@qvac/translation-nmtcpp`	Main translation package	Bergamot, IndicTrans	See Supported Languages

The main package supports both backends and all their respective languages. See Supported Languages for the complete list.

Logging

The library supports configurable logging for both JavaScript and C++ (native) components. By default, C++ logs are suppressed for cleaner output.

Enabling C++ Logs

To enable verbose C++ logging, pass a logger object in the options:

// Enable C++ logging
const logger = {
  info: (msg) => console.log('[C++ INFO]', msg),
  warn: (msg) => console.warn('[C++ WARN]', msg),
  error: (msg) => console.error('[C++ ERROR]', msg),
  debug: (msg) => console.log('[C++ DEBUG]', msg)
}

const args = {
  files: { model: modelPath, srcVocab, dstVocab }, // resolved paths (see Step 1)
  params: { mode: 'full', srcLang: 'en', dstLang: 'it' },
  config: { modelType: TranslationNmtcpp.ModelTypes.Bergamot },
  logger  // Pass logger to enable C++ logs
}

Disabling C++ Logs

To suppress all C++ logs, either omit the logger parameter or set it to null:

const args = {
  files: { model: modelPath, srcVocab, dstVocab }, // resolved paths (see Step 1)
  params: { mode: 'full', srcLang: 'en', dstLang: 'it' },
  config: { modelType: TranslationNmtcpp.ModelTypes.Bergamot }
  // No logger = suppress C++ logs
}

Using Environment Variables (Recommended for Examples)

All examples support the VERBOSE environment variable:

# Run with C++ logging disabled (default)
bare examples/quickstart.js

# Run with C++ logging enabled
VERBOSE=1 bare examples/quickstart.js

Log Levels

The C++ backend supports these log levels (mapped from native priority):

Priority	Level	Description
0	`error`	Critical errors
1	`warn`	Warnings
2	`info`	Informational messages
3	`debug`	Debug/trace messages
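
To make the mapping concrete, the sketch below routes a native priority to the matching method on a logger object. It is an illustration of the table, not the package's internal code: dispatchNativeLog is a hypothetical name, and treating a missing logger as "suppress" follows the behaviour described under Disabling C++ Logs.

```javascript
// Hypothetical dispatcher; the priority-to-level mapping comes from the table above.
const LEVEL_BY_PRIORITY = ['error', 'warn', 'info', 'debug']

function dispatchNativeLog (logger, priority, msg) {
  if (!logger) return false // null or omitted logger: suppress the message
  const level = LEVEL_BY_PRIORITY[priority]
  if (!level || typeof logger[level] !== 'function') return false
  logger[level](msg)
  return true
}

// Collecting logger so the routing can be inspected
const seen = []
const collectingLogger = {
  error: (m) => seen.push(['error', m]),
  warn: (m) => seen.push(['warn', m]),
  info: (m) => seen.push(['info', m]),
  debug: (m) => seen.push(['debug', m])
}

dispatchNativeLog(collectingLogger, 0, 'model file missing')
dispatchNativeLog(collectingLogger, 3, 'decoder step')
dispatchNativeLog(null, 2, 'dropped because no logger was passed')
console.log(seen)
```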

More resources

Package at npm
