Logging
Visibility into what's happening during loading, inference, and other operations.
Overview
QVAC provides three complementary logging primitives:
- subscribeServerLogs(): subscribe to every server-side log (SDK server, all loaded models, RAG) through a single stream, without tracking individual stream IDs. Returns an unsubscribe function.
- loggingStream(): stream real-time logs for a single source, either the SDK server (SDK_LOG_ID) or one model (its model ID). You decide what to do with each log line (print, persist, filter).
- getLogger(): create a logger for your own application code (namespaced, configurable level, optional transports).
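The "print, persist, filter" choice for a single log stream can be sketched as below. This is a minimal illustration, not SDK code: the LogEntry shape is an assumption inferred from the fields the examples on this page read (level, namespace, message, timestamp), and fakeLogStream is a hypothetical stand-in for loggingStream() so the snippet runs without loading a model.

```typescript
// Assumed shape of a log entry, inferred from the fields used in this page's examples.
interface LogEntry {
  id: string;
  level: string;
  namespace: string;
  message: string;
  timestamp: number;
}

// Hypothetical stand-in for loggingStream(): yields a fixed set of entries.
// The "simulated failure" entry is made up for illustration.
async function* fakeLogStream(): AsyncGenerator<LogEntry> {
  yield { id: "model-1", level: "debug", namespace: "llamacpp:llm", message: "Loading model weights...", timestamp: 0 };
  yield { id: "model-1", level: "error", namespace: "llamacpp:llm", message: "simulated failure", timestamp: 1 };
  yield { id: "model-1", level: "info", namespace: "llamacpp:llm", message: "Model loaded successfully", timestamp: 2 };
}

// Filter: keep only entries whose level is in `levels`.
// Persist: collect the kept entries in memory (a file or database would also work).
async function collectLevels(
  stream: AsyncIterable<LogEntry>,
  levels: ReadonlySet<string>,
): Promise<LogEntry[]> {
  const kept: LogEntry[] = [];
  for await (const log of stream) {
    if (levels.has(log.level)) kept.push(log);
  }
  return kept;
}
```

Because collectLevels accepts any AsyncIterable, the same function should work with a real stream in place of fakeLogStream(), provided the entries carry a level field as assumed here.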
Functions
- getLogger(): create a logger
- loadModel(): pass a logger via the logger option
- loggingStream(): stream real-time logs from a single model or the SDK server
- subscribeServerLogs(): stream all server-side logs through one subscription
For how to use each function, see SDK — API reference.
Flow
- Pass a logger when loading your model.
- When logging is enabled, you'll see real-time logs from the underlying model libraries:
[DEBUG] llamacpp:llm: Loading model weights...
[INFO] llamacpp:llm: Model loaded successfully, vocab_size=32000
[DEBUG] llamacpp:llm: Starting inference...
[DEBUG] llamacpp:llm: Inference completed, tokens=12
Features
- Streaming API (loggingStream): consume real-time logs from one source programmatically. Stream either:
  - SDK server logs using SDK_LOG_ID, or
  - per-model addon logs using the model ID returned by loadModel().
- Global subscription (subscribeServerLogs): receive every server-side log (SDK server, all models, RAG) through a single subscription, without knowing the stream IDs ahead of time. The handler is called once per log line, and each log carries its origin in log.id (SDK_LOG_ID, a model ID, or a RAG workspace key) so you can still tell the sources apart. It is built on loggingStream and the reserved SDK_ALL_LOG_ID stream that the worker fans all logs into. The call returns an unsubscribe() function; invoke it to stop receiving logs.
- Logger API (getLogger): create loggers for your application code with custom transports. Console output is enabled by default; set enableConsole: false to use only custom transports.
Logging works for all model types (LLM, Whisper, NMT, Embeddings) and gives insight into model performance and behavior.
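Since every log delivered through a global subscription carries its origin in log.id, a handler can label each line by source. The sketch below shows one way to do that. It is an illustration under assumptions: the ServerLog shape is inferred from the fields used in this page's examples, makeRoutingHandler is a hypothetical helper, and the IDs, namespace "server", and message "ready" in the usage lines are made up. In real code, sdkLogId would be SDK_LOG_ID and the handler would be passed to subscribeServerLogs.

```typescript
// Assumed log shape, inferred from the fields this page's examples read.
interface ServerLog {
  id: string;
  level: string;
  namespace: string;
  message: string;
}

// Build a handler that labels each log by origin.
// `sdkLogId` stands in for SDK_LOG_ID; `modelLabels` maps model IDs to display names.
// Anything else (for example a RAG workspace key) is labeled OTHER:<id>.
function makeRoutingHandler(
  sdkLogId: string,
  modelLabels: Map<string, string>,
  sink: (line: string) => void,
): (log: ServerLog) => void {
  return (log) => {
    const origin =
      log.id === sdkLogId ? "SDK" : modelLabels.get(log.id) ?? `OTHER:${log.id}`;
    sink(`[${origin}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
  };
}

// Usage with made-up IDs and an in-memory sink instead of console output.
const lines: string[] = [];
const handler = makeRoutingHandler(
  "sdk-placeholder",
  new Map([["model-abc", "LLM"]]),
  (line) => lines.push(line),
);
handler({ id: "sdk-placeholder", level: "info", namespace: "server", message: "ready" });
handler({ id: "model-abc", level: "debug", namespace: "llamacpp:llm", message: "Starting inference..." });
```

Passing the sink as a parameter keeps the routing logic independent of where lines end up (console, file, or a test buffer).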
Configuration
The SDK's own server and client logs are silent on the console by default. They
are still delivered to loggingStream() and custom transports, so you opt in to
seeing them.
To configure global logging, use a config file (qvac.config.json,
qvac.config.js, or qvac.config.ts) and set:
- loggerConsoleOutput (boolean): print the SDK's own logs to the console. Defaults to false; set to true to see them.
- loggerLevel ("error" | "warn" | "info" | "debug" | "off"): minimum level for SDK loggers. Defaults to "info"; use "off" to suppress streams and transports too.
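To make "minimum level" concrete, here is a minimal sketch of how such a threshold is conventionally interpreted. It is not the SDK's implementation: the severity ordering (error, then warn, then info, then debug) is an assumption based on common logging practice, and passesLevel is a hypothetical helper.

```typescript
type LoggerLevel = "error" | "warn" | "info" | "debug" | "off";
type MessageLevel = Exclude<LoggerLevel, "off">;

// Assumed severity ordering: a lower rank means a more severe message.
const RANK: Record<MessageLevel, number> = {
  error: 0,
  warn: 1,
  info: 2,
  debug: 3,
};

// Returns true when a message at `level` passes the configured minimum level.
// With "off", nothing passes.
function passesLevel(level: MessageLevel, minimum: LoggerLevel): boolean {
  if (minimum === "off") return false;
  return RANK[level] <= RANK[minimum];
}
```

Under this reading, the default "info" would let error, warn, and info messages through and drop debug messages, while "debug" would let everything through.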
Example
The following script shows an example of streaming logs from loaded models:
import {
  loadModel,
  completion,
  unloadModel,
  loggingStream,
  SDK_LOG_ID,
  LLAMA_3_2_1B_INST_Q4_0,
  GTE_LARGE_FP16,
  VERBOSITY,
  embed,
} from "@qvac/sdk";
try {
  console.log("▸ Streaming SDK and model logs");

  // Note: To configure logging (level and console output), use config file:
  // { "loggerLevel": "debug", "loggerConsoleOutput": false } in qvac.config.json/js/ts

  // Subscribe to SDK server logs in background
  console.log("▸ Subscribing to SDK server logs");
  (async () => {
    for await (const log of loggingStream({ id: SDK_LOG_ID })) {
      console.log(`[SDK] [${log.level.toUpperCase()}] [${log.namespace}] ${log.message}`);
    }
  })().catch(() => {
    // Stream terminated - normal on shutdown
  });

  // Load models
  console.log("▸ Loading models");
  const llmModelId = await loadModel({
    modelSrc: LLAMA_3_2_1B_INST_Q4_0,
    modelConfig: {
      ctx_size: 2048,
      temp: 0.7,
      verbosity: VERBOSITY.ERROR, // Only log errors, remaining logs are captured by loggingStream
    },
  });
  const embedModelId = await loadModel({
    modelSrc: GTE_LARGE_FP16,
  });

  console.log("▸ Subscribing to model logs");
  (async () => {
    for await (const log of loggingStream({ id: llmModelId })) {
      const timestamp = new Date(log.timestamp).toISOString();
      console.log(`[LLM] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
    }
  })().catch(() => {
    // Stream terminated - this is normal when model unloads
  });
  (async () => {
    for await (const log of loggingStream({ id: embedModelId })) {
      const timestamp = new Date(log.timestamp).toISOString();
      console.log(`[EMBED] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
    }
  })().catch(() => {
    // Stream terminated - this is normal when model unloads
  });

  const messages = [
    { role: "user", content: "Count from 1 to 5 and explain each number." },
  ];
  const result = completion({
    modelId: llmModelId,
    history: messages,
    stream: true,
  });
  const { embedding } = await embed({
    modelId: embedModelId,
    text: messages[0]?.content ?? "Hello, world!",
  });

  console.log("▸ Response");
  for await (const token of result.tokenStream) {
    process.stdout.write(token);
  }
  console.log("Embedding (first 20 elements)", embedding.slice(0, 20));
  console.log("Embeddings length", embedding.length);
  console.log("▸ Three streams running: [SDK] server, [LLM] inference, [EMBED] embedding");

  await unloadModel({ modelId: llmModelId, clearStorage: false });
  await unloadModel({ modelId: embedModelId, clearStorage: false });
} catch (error) {
  console.error("✖", error);
  process.exit(1);
}
To capture logs from every source (SDK server, all models, RAG) with a single subscription, use subscribeServerLogs:
// loggerConsoleOutput: false in this config disables the default console
// transport on both the client and the worker, so the only logs printed are
// the ones our handler below receives.
const configDir = import.meta.dirname ?? process.cwd();
process.env["QVAC_CONFIG_PATH"] = `${configDir}/config/logging/logging.config.json`;
const {
  loadModel,
  completion,
  unloadModel,
  subscribeServerLogs,
  LLAMA_3_2_1B_INST_Q4_0,
} = await import("@qvac/sdk");

try {
  console.log("▸ Starting global logging demo...\n");

  // One subscription captures every server-side log (SDK, models, RAG, …)
  // without having to know any stream IDs ahead of time.
  const unsubscribe = subscribeServerLogs((log) => {
    console.log(`[SDK] [${log.level.toUpperCase()}] [${log.namespace}] ${log.message}`);
  });

  const modelId = await loadModel({
    modelSrc: LLAMA_3_2_1B_INST_Q4_0,
    modelConfig: { ctx_size: 2048 },
  });
  const result = completion({
    modelId,
    history: [{ role: "user", content: "Count from 1 to 5." }],
    stream: true,
  });

  console.log("▸ Response:\n");
  for await (const token of result.tokenStream) {
    process.stdout.write(token);
  }

  await unloadModel({ modelId, clearStorage: false });
  unsubscribe();
} catch (error) {
  console.error("✖", error);
  process.exit(1);
}
export {};
Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.
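In programs that keep running after a failure, it can help to guarantee that the function returned by a subscription is always called. The sketch below wraps that pattern in try/finally. It is an illustration only: withSubscription and makeFakeSource are hypothetical helpers, and the Subscribe type assumes the "handler in, unsubscribe function out" shape described for subscribeServerLogs.

```typescript
type Unsubscribe = () => void;
type Subscribe<T> = (handler: (event: T) => void) => Unsubscribe;

// Run `work` while subscribed, and always unsubscribe afterwards,
// even if `work` throws or rejects.
async function withSubscription<T, R>(
  subscribe: Subscribe<T>,
  handler: (event: T) => void,
  work: () => Promise<R>,
): Promise<R> {
  const unsubscribe = subscribe(handler);
  try {
    return await work();
  } finally {
    unsubscribe();
  }
}

// Hypothetical in-memory event source, used only to exercise the helper.
function makeFakeSource<T>() {
  const handlers = new Set<(event: T) => void>();
  const subscribe: Subscribe<T> = (handler) => {
    handlers.add(handler);
    return () => {
      handlers.delete(handler);
    };
  };
  const emit = (event: T): void => handlers.forEach((h) => h(event));
  return { subscribe, emit, activeHandlers: () => handlers.size };
}
```

With a real subscription, the first argument would be the SDK's subscribe function and `work` would contain the model loading and inference calls.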