Logging

Visibility into what's happening during loading, inference, and other operations.

Overview

QVAC provides three complementary logging primitives:

  • subscribeServerLogs(): subscribe to every server-side log (SDK server, all loaded models, RAG) through a single stream, without tracking individual stream IDs. Returns an unsubscribe function.
  • loggingStream(): stream real-time logs for a single source — the SDK server (SDK_LOG_ID) or one model (its model ID). You decide what to do with each log line (print, persist, filter).
  • getLogger(): create a logger for your own application code (namespaced, configurable level, optional transports).
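
As a rough sketch of the "print, persist, filter" idea, the helper below consumes any async iterable of log objects and stores the entries that match a predicate. The names persistLogs and fakeLogStream are illustrative and not part of the SDK. The fake stream stands in for loggingStream() so the snippet runs on its own, and the log fields (level, namespace, message) are the ones the examples on this page read.

```javascript
// Hypothetical helper, not part of @qvac/sdk.
// Consumes any async iterable of log objects, pushes the entries accepted by
// `keep` into `sink`, and stops once `sink` holds `limit` entries.
async function persistLogs(stream, sink, { keep = () => true, limit = Infinity } = {}) {
    if (limit <= 0) return sink;
    for await (const log of stream) {
        if (!keep(log)) continue;
        sink.push(log);
        if (sink.length >= limit) break; // leaving the loop closes the iterator
    }
    return sink;
}

// Stand-in for loggingStream({ id }): yields objects with the same fields
// the examples on this page use.
async function* fakeLogStream() {
    yield { level: "debug", namespace: "llamacpp:llm", message: "Loading model weights..." };
    yield { level: "info", namespace: "llamacpp:llm", message: "Model loaded successfully, vocab_size=32000" };
    yield { level: "debug", namespace: "llamacpp:llm", message: "Starting inference..." };
}

// Keep only "info" entries, as an application might before writing to disk.
persistLogs(fakeLogStream(), [], { keep: (log) => log.level === "info" })
    .then((stored) => console.log(stored.map((log) => log.message)));
```

In an application, the first argument would presumably be the iterable returned by loggingStream({ id }), and the sink could be replaced by a file or database writer.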

Functions

  1. getLogger() — create a logger
  2. loadModel() — pass logger via logger option
  3. loggingStream() — stream real-time logs from a single model or the SDK server
  4. subscribeServerLogs() — stream all server-side logs through one subscription

For how to use each function, see SDK — API reference.

Flow

  1. Pass a logger when loading your model.
  2. When logging is enabled, you'll see real-time logs from the underlying model libraries:
[DEBUG] llamacpp:llm: Loading model weights...
[INFO] llamacpp:llm: Model loaded successfully, vocab_size=32000
[DEBUG] llamacpp:llm: Starting inference...
[DEBUG] llamacpp:llm: Inference completed, tokens=12
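
The lines above follow a "[LEVEL] namespace: message" pattern. A minimal sketch of a formatter that produces that layout from a log object is shown here. formatLogLine is an illustrative name, not an SDK function, and it assumes the log object exposes level, namespace, and message as in the example script on this page.

```javascript
// Illustrative formatter, not an SDK function. Assumes a log object with
// `level`, `namespace`, and `message` string fields.
function formatLogLine(log) {
    return `[${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`;
}

const line = formatLogLine({
    level: "info",
    namespace: "llamacpp:llm",
    message: "Model loaded successfully, vocab_size=32000",
});
console.log(line);
// prints "[INFO] llamacpp:llm: Model loaded successfully, vocab_size=32000"
```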

Features

  • Streaming API (loggingStream) — Consume real-time logs from one source programmatically. Stream either:

    • SDK server logs using SDK_LOG_ID, or
    • per-model addon logs using the model ID returned by loadModel().
  • Global subscription (subscribeServerLogs) — Receive every server-side log (SDK server, all models, RAG) through a single subscription, without knowing the stream IDs ahead of time. The handler is called once per log line, and each log carries its origin in log.id (SDK_LOG_ID, a model ID, or a RAG workspace key) so you can still tell the sources apart. It is built on loggingStream and the reserved SDK_ALL_LOG_ID stream that the worker fans all logs into. The call returns an unsubscribe() function — invoke it to stop receiving logs.

  • Logger API (getLogger) — Create loggers for your application code with custom transports. Console output enabled by default; set enableConsole: false to use only custom transports.

Logging works for all model types (LLM, Whisper, NMT, Embeddings) and shows what the underlying model libraries are doing during loading and inference.
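
Because a global subscription delivers logs from every source to one handler, it can help to group them by log.id. The sketch below is one possible way to do that. createLogRouter is a hypothetical helper, and the ids passed in are placeholders rather than real stream IDs. The handler is called by hand here so the snippet runs without the SDK.

```javascript
// Hypothetical helper, not part of @qvac/sdk: builds a handler that groups
// incoming logs by their origin, read from `log.id`.
function createLogRouter() {
    const bySource = new Map();
    const handler = (log) => {
        const bucket = bySource.get(log.id) ?? [];
        bucket.push(log);
        bySource.set(log.id, bucket);
    };
    return { handler, bySource };
}

const router = createLogRouter();

// Placeholder ids for illustration only; real values would be SDK_LOG_ID,
// a model ID, or a RAG workspace key.
router.handler({ id: "sdk-placeholder", level: "info", message: "first server log" });
router.handler({ id: "model-placeholder", level: "debug", message: "Starting inference..." });
router.handler({ id: "model-placeholder", level: "debug", message: "Inference completed, tokens=12" });

for (const [id, logs] of router.bySource) {
    console.log(`${id}: ${logs.length} log(s)`);
}
```

In an application, router.handler would presumably be passed to subscribeServerLogs() instead of being called directly.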

Configuration

By default, the SDK's own server and client logs are not printed to the console. They are still delivered to loggingStream() and to custom transports, so console output is opt-in.

To configure global logging, use a config file (qvac.config.json, qvac.config.js, or qvac.config.ts) and set:

  • loggerConsoleOutput: boolean — print the SDK's own logs to the console. Defaults to false; set to true to see them.
  • loggerLevel: "error" | "warn" | "info" | "debug" | "off" — minimum level for SDK loggers. Defaults to "info"; use "off" to suppress streams and transports too.
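
To illustrate what a "minimum level" means, the sketch below models the threshold check in plain JavaScript. It is not the SDK's implementation: the severity order is an assumption that mirrors the order in which the option values are listed, and passesLevel is a hypothetical name.

```javascript
// Assumed severity order, most severe first. The SDK's internal ordering is
// not stated on this page; this mirrors the order of the listed values.
const LEVELS = ["error", "warn", "info", "debug"];

// Returns true when a log at `level` would pass a logger set to `loggerLevel`.
function passesLevel(level, loggerLevel) {
    if (loggerLevel === "off") return false; // "off" lets nothing through
    const threshold = LEVELS.indexOf(loggerLevel);
    const rank = LEVELS.indexOf(level);
    if (threshold === -1 || rank === -1) {
        throw new Error(`Unknown level: ${rank === -1 ? level : loggerLevel}`);
    }
    return rank <= threshold;
}

// Same keys as described above; values match the comment in logging.js.
const config = { loggerLevel: "debug", loggerConsoleOutput: false };

console.log(passesLevel("debug", "info"));             // false
console.log(passesLevel("warn", "info"));              // true
console.log(passesLevel("debug", config.loggerLevel)); // true
console.log(passesLevel("error", "off"));              // false
```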

Example

The following script streams logs from the SDK server and from two loaded models:

logging.js
import {
    loadModel,
    completion,
    unloadModel,
    loggingStream,
    SDK_LOG_ID,
    LLAMA_3_2_1B_INST_Q4_0,
    GTE_LARGE_FP16,
    VERBOSITY,
    embed,
} from "@qvac/sdk";
try {
    console.log("▸ Streaming SDK and model logs");
    // Note: To configure logging (level and console output), use config file:
    // { "loggerLevel": "debug", "loggerConsoleOutput": false } in qvac.config.json/js/ts
    // Subscribe to SDK server logs in background
    console.log("▸ Subscribing to SDK server logs");
    (async () => {
        for await (const log of loggingStream({ id: SDK_LOG_ID })) {
            console.log(`[SDK] [${log.level.toUpperCase()}] [${log.namespace}] ${log.message}`);
        }
    })().catch(() => {
        // Stream terminated - normal on shutdown
    });
    // Load models
    console.log("▸ Loading models");
    const llmModelId = await loadModel({
        modelSrc: LLAMA_3_2_1B_INST_Q4_0,
        modelConfig: {
            ctx_size: 2048,
            temp: 0.7,
            verbosity: VERBOSITY.ERROR, // Only log errors, remaining logs are captured by loggingStream
        },
    });
    const embedModelId = await loadModel({
        modelSrc: GTE_LARGE_FP16,
    });
    console.log("▸ Subscribing to model logs");
    (async () => {
        for await (const log of loggingStream({ id: llmModelId })) {
            const timestamp = new Date(log.timestamp).toISOString();
            console.log(`[LLM] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
        }
    })().catch(() => {
        // Stream terminated - this is normal when model unloads
    });
    (async () => {
        for await (const log of loggingStream({ id: embedModelId })) {
            const timestamp = new Date(log.timestamp).toISOString();
            console.log(`[EMBED] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
        }
    })().catch(() => {
        // Stream terminated - this is normal when model unloads
    });
    const messages = [
        { role: "user", content: "Count from 1 to 5 and explain each number." },
    ];
    const result = completion({
        modelId: llmModelId,
        history: messages,
        stream: true,
    });
    const { embedding } = await embed({
        modelId: embedModelId,
        text: messages[0]?.content ?? "Hello, world!",
    });
    console.log("▸ Response");
    for await (const token of result.tokenStream) {
        process.stdout.write(token);
    }
    console.log("Embedding (first 20 elements)", embedding.slice(0, 20));
    console.log("Embeddings length", embedding.length);
    console.log("▸ Three streams running: [SDK] server, [LLM] inference, [EMBED] embedding");
    await unloadModel({ modelId: llmModelId, clearStorage: false });
    await unloadModel({ modelId: embedModelId, clearStorage: false });
}
catch (error) {
    console.error("✖", error);
    process.exit(1);
}

To capture logs from every source (SDK server, all models, RAG) with a single subscription, use subscribeServerLogs:

logging-global.js
// loggerConsoleOutput: false in this config disables the default console
// transport on both the client and the worker, so the only logs printed are
// the ones our handler below receives.
const configDir = import.meta.dirname ?? process.cwd();
process.env["QVAC_CONFIG_PATH"] = `${configDir}/config/logging/logging.config.json`;
const { loadModel, completion, unloadModel, subscribeServerLogs, LLAMA_3_2_1B_INST_Q4_0 } = await import("@qvac/sdk");
try {
    console.log("▸ Starting global logging demo...\n");
    // One subscription captures every server-side log (SDK, models, RAG, …)
    // without having to know any stream IDs ahead of time.
    const unsubscribe = subscribeServerLogs((log) => {
        // log.id identifies the origin (SDK server, a model, or a RAG workspace)
        console.log(`[${log.id}] [${log.level.toUpperCase()}] [${log.namespace}] ${log.message}`);
    });
    const modelId = await loadModel({
        modelSrc: LLAMA_3_2_1B_INST_Q4_0,
        modelConfig: { ctx_size: 2048 },
    });
    const result = completion({
        modelId,
        history: [{ role: "user", content: "Count from 1 to 5." }],
        stream: true,
    });
    console.log("▸ Response:\n");
    for await (const token of result.tokenStream) {
        process.stdout.write(token);
    }
    await unloadModel({ modelId, clearStorage: false });
    unsubscribe();
}
catch (error) {
    console.error("✖", error);
    process.exit(1);
}
export {};
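
In a longer-lived application that does not exit on error, it may be worth making sure the subscription is released even when the work in between throws. The sketch below shows one way to do that with try/finally. withLogSubscription and createFakeSource are hypothetical helpers, not SDK functions. The only thing assumed about the subscribe function is the contract described on this page: it takes a handler and returns an unsubscribe function.

```javascript
// Hypothetical wrapper, not an SDK function. `subscribe` is any function that
// takes a handler and returns an unsubscribe function.
async function withLogSubscription(subscribe, handler, task) {
    const unsubscribe = subscribe(handler);
    try {
        return await task();
    } finally {
        unsubscribe(); // runs whether the task resolves or throws
    }
}

// Stand-in log source so the sketch runs without the SDK.
function createFakeSource() {
    const handlers = new Set();
    return {
        subscribe(handler) {
            handlers.add(handler);
            return () => handlers.delete(handler);
        },
        emit(log) {
            for (const handler of handlers) handler(log);
        },
    };
}

const source = createFakeSource();
const seen = [];
withLogSubscription(source.subscribe, (log) => seen.push(log.message), async () => {
    source.emit({ message: "Starting inference..." });
    throw new Error("simulated failure");
}).catch(() => {
    // The handler was already removed, so this log is not recorded.
    source.emit({ message: "emitted after unsubscribe" });
    console.log(seen);
});
```

With the SDK, the first argument would presumably be subscribeServerLogs itself, and the task would wrap the loadModel, completion, and unloadModel calls.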

Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.
