SDK Release Notes — v0.8.0

📦 NPM: https://www.npmjs.com/package/@qvac/sdk/v/0.8.0

This release adds Parakeet as a new transcription engine alongside Whisper, decouples builtin plugins from the core load path for a cleaner architecture, adds CTC and Sortformer transcription models, introduces a built-in profiler for diagnosing performance, and brings Bergamot pivot translation support. Several addon logging and delegation bugs have also been resolved.

Breaking Changes

@qvac/sdk

Embedding Model Config Uses Structured Fields

The embedding addon no longer accepts the raw tab-delimited config string. Use the structured fields instead.

Before:

await loadModel({
  modelSrc: EMBEDDINGS_NOMIC_EMBED_TEXT_V1_5,
  modelType: "embeddings",
  modelConfig: {
    rawConfig: "-ngl\t99\n-dev\tgpu\n--batch_size\t1024",
  },
});

After:

await loadModel({
  modelSrc: EMBEDDINGS_NOMIC_EMBED_TEXT_V1_5,
  modelType: "embeddings",
  modelConfig: {
    gpuLayers: 99,
    device: "gpu",
    batchSize: 1024,
  },
});

Companion Model Sources Moved Into `modelConfig`

Top-level companion source fields (projectionModelSrc, vadModelSrc, srcVocabSrc, dstVocabSrc) have been removed from loadModel. They now live inside modelConfig. The unused toolFormat field has also been removed.

Before:

await loadModel({
  modelType: "llm",
  modelSrc: ".../model.gguf",
  projectionModelSrc: ".../mmproj.gguf",
});

await loadModel({
  modelType: "whisper",
  modelSrc: ".../model.bin",
  vadModelSrc: ".../vad.bin",
});

await loadModel({
  modelType: "nmt",
  modelSrc: ".../model.intgemm.alphas.bin",
  srcVocabSrc: ".../srcvocab.spm",
  dstVocabSrc: ".../trgvocab.spm",
});

After:

await loadModel({
  modelType: "llm",
  modelSrc: ".../model.gguf",
  modelConfig: {
    projectionModelSrc: ".../mmproj.gguf",
  },
});

await loadModel({
  modelType: "whisper",
  modelSrc: ".../model.bin",
  modelConfig: {
    vadModelSrc: ".../vad.bin",
  },
});

await loadModel({
  modelType: "nmt",
  modelSrc: ".../model.intgemm.alphas.bin",
  modelConfig: {
    srcVocabSrc: ".../srcvocab.spm",
    dstVocabSrc: ".../trgvocab.spm",
  },
});

Additionally, custom plugins must now provide a loadConfigSchema. For plugins that accept any config, use a passthrough schema:

const myPlugin: QvacPlugin = {
  modelType: "my-plugin",
  displayName: "My Plugin",
  addonPackage: "@my/addon",
  loadConfigSchema: z.object({}).catchall(z.unknown()),
  createModel: (params) => ({ model, loader }),
  handlers: {},
};

Parakeet Model Constants Renamed

All existing Parakeet model constants now include a variant prefix (TDT_) to distinguish them from the new CTC and Sortformer variants. This is a find-and-replace migration:

Old Constant	New Constant
`PARAKEET_ENCODER_*`	`PARAKEET_TDT_ENCODER_*`
`PARAKEET_DECODER_*`	`PARAKEET_TDT_DECODER_*`
`PARAKEET_VOCAB`	`PARAKEET_TDT_VOCAB`
`PARAKEET_PREPROCESSOR_*`	`PARAKEET_TDT_PREPROCESSOR_*`

Features

@qvac/sdk

Parakeet Transcription Plugin

NVIDIA Parakeet ONNX models are now supported as an alternative transcription engine alongside Whisper.cpp. Parakeet uses a multi-file model architecture (encoder, encoder-data, decoder, vocab, preprocessor) resolved through the QVAC Registry.

import {
  PARAKEET_TDT_ENCODER_FP32,
  PARAKEET_TDT_ENCODER_DATA_FP32,
  PARAKEET_TDT_DECODER_FP32,
  PARAKEET_TDT_VOCAB,
  PARAKEET_TDT_PREPROCESSOR_FP32,
} from "@qvac/sdk";

const modelId = await loadModel({
  modelType: "parakeet",
  modelSrc: PARAKEET_TDT_ENCODER_FP32,
  modelConfig: {
    parakeetEncoderSrc: PARAKEET_TDT_ENCODER_FP32,
    parakeetEncoderDataSrc: PARAKEET_TDT_ENCODER_DATA_FP32,
    parakeetDecoderSrc: PARAKEET_TDT_DECODER_FP32,
    parakeetVocabSrc: PARAKEET_TDT_VOCAB,
    parakeetPreprocessorSrc: PARAKEET_TDT_PREPROCESSOR_FP32,
  },
});

const stream = transcribeStream({ modelId, audioInput: "./audio.mp3" });
for await (const chunk of stream) {
  process.stdout.write(chunk);
}

Supports MP3, WAV, OGG, M4A, RAW, and FLAC audio formats. The plugin filters [No speech detected] chunks automatically, mirroring Whisper's [BLANK_AUDIO] filtering.

Bergamot Pivot Translation

Translate between language pairs that don't have a direct model by chaining through a pivot language (typically English). Configure the pivot model inside modelConfig:

await loadModel({
  modelSrc: BERGAMOT_ES_EN,
  modelType: "nmt",
  modelConfig: {
    engine: "Bergamot",
    from: "es",
    to: "it",
    pivotModel: {
      modelSrc: BERGAMOT_EN_IT,
      beamsize: 4,
      temperature: 0.3,
      topk: 100,
      normalize: 1,
      lengthpenalty: 1.2,
    },
  },
});

SDK Profiler

A built-in profiler lets you measure SDK operation performance at runtime. Enable it globally or on a per-call basis:

import { profiler } from "@qvac/sdk";

profiler.enable({
  mode: "verbose",
  includeServerBreakdown: true,
});

// Run operations, then export results
console.log(profiler.exportSummary());
console.log(profiler.exportTable());
console.log(profiler.exportJSON({ includeRecentEvents: true }));

profiler.disable();

Per-call profiling control is also available on embed, completion, transcribe, translate, ragSearch, and invokePlugin:

await embed(
  { modelId, text: "hello" },
  { profiling: { enabled: true } },
);

@qvac/sdk

CTC and Sortformer Transcription Models

The SDK now supports two additional Parakeet model variants beyond the existing TDT models:

CTC transcription for fast, streaming-friendly speech-to-text:

const modelId = await loadModel({
  modelSrc: PARAKEET_CTC_FP32_1,
  modelType: "parakeet",
  modelConfig: {
    modelType: "ctc",
    parakeetCtcModelSrc: PARAKEET_CTC_FP32_1,
    parakeetCtcModelDataSrc: PARAKEET_CTC_DATA_FP32_1,
    parakeetTokenizerSrc: PARAKEET_CTC_TOKENIZER,
  },
});
const text = await transcribe({ modelId, audioChunk: "/path/to/audio.wav" });

Sortformer diarization for speaker identification:

const modelId = await loadModel({
  modelSrc: PARAKEET_SORTFORMER_FP32,
  modelType: "parakeet",
  modelConfig: {
    modelType: "sortformer",
    parakeetSortformerSrc: PARAKEET_SORTFORMER_FP32,
  },
});
const diarization = await transcribe({ modelId, audioChunk: "/path/to/audio.wav" });
// Returns: "Speaker 0: 0.00s - 4.24s\nSpeaker 1: 4.65s - 14.32s\n..."

Addon logging fixed across all plugins — Logging callbacks were not being properly attached to some addon plugins, resulting in missing addon-level logs. All plugins now correctly route native addon logs through the SDK logging system.
Parakeet addon logger now uses the correct params.modelId for log routing.
RPC race condition in getRPCInstance resolved — concurrent initialization calls no longer create duplicate worker instances.
Delegated model unload now correctly unloads the model on the provider device instead of only removing the local registry entry.
Stale delegated model entries are replaced on re-registration instead of accumulating.

📦 Model Changes

Added Models

OCR:

OCR_DETECTOR_DB_MOBILENET_V3_LARGE
OCR_DETECTOR_DB_RESNET50
OCR_RECOGNIZER_CRNN_MOBILENET_V3_SMALL
OCR_RECOGNIZER_PARSEQ

Parakeet (CTC):

PARAKEET_CTC_FP32, PARAKEET_CTC_DATA_FP32, PARAKEET_CTC_VOCAB, PARAKEET_CTC_INT8, PARAKEET_CTC_CONFIG
PARAKEET_CTC_FP32_1, PARAKEET_CTC_DATA_FP32_1, PARAKEET_CTC_TOKENIZER

Parakeet (Sortformer / EOU):

PARAKEET_SORTFORMER_FP32
PARAKEET_EOU_ENCODER_FP32, PARAKEET_EOU_DECODER_FP32, PARAKEET_EOU_TOKENIZER

Parakeet (TDT — renamed from unprefixed):

PARAKEET_TDT_ENCODER_FP32, PARAKEET_TDT_ENCODER_DATA_FP32, PARAKEET_TDT_DECODER_FP32, PARAKEET_TDT_VOCAB, PARAKEET_TDT_PREPROCESSOR_FP32
PARAKEET_TDT_ENCODER_INT8, PARAKEET_TDT_DECODER_INT8, PARAKEET_TDT_PREPROCESSOR_INT8

Removed Models

The following unprefixed Parakeet constants were replaced by their TDT_ equivalents (see breaking changes):

PARAKEET_ENCODER_FP32, PARAKEET_ENCODER_DATA_FP32, PARAKEET_DECODER_FP32, PARAKEET_VOCAB, PARAKEET_PREPROCESSOR_FP32
PARAKEET_ENCODER_INT8, PARAKEET_DECODER_INT8, PARAKEET_PREPROCESSOR_INT8

🧹 Other Changes

Consolidated transcription schemas and shared ops into a unified structure.
Updated @qvac/tts-onnx to v0.6.1 and @qvac/transcription-whispercpp addon.
Aligned TTS plugin artifact patterns.
Ported Android E2E test workflow to the monorepo.
Improved changelog aggregation to prefer CHANGELOG_LLM.md when available.
Reorganized monorepo documentation structure.
Ported logging executor to direct assertions and added missing test handlers.
Enabled profiling in test-qvac desktop and mobile consumers.

SDK Release Notes — v0.8.0

Breaking Changes

@qvac/sdk

Embedding Model Config Uses Structured Fields

Companion Model Sources Moved Into `modelConfig`

Parakeet Model Constants Renamed

Features

@qvac/sdk

Parakeet Transcription Plugin

Bergamot Pivot Translation

SDK Profiler

@qvac/sdk

CTC and Sortformer Transcription Models

AfriqueGemma Support

Profiler Operation Transport and Metrics

Bug Fixes

@qvac/sdk

📦 Model Changes

Added Models

Removed Models

🧹 Other Changes

On this page