# Profiler (/runtime/profiler)



## Overview

`@qvac/sdk` npm package exposes a `profiler` object that you can import and use to measure and analyze how long SDK operations take in your application. You can enable profiling in two ways:

* [**Global:**](#enable-global) call `profiler.enable()` to profile all subsequent operations until `profiler.disable()` is called.
* [**Per-call:**](#enable-per-call) pass `{ profiling: { enabled: true } }` in the options of an individual function call.

You can also enable profiling globally and use a per-call override to [opt out](#opt-out) of specific calls by passing `{ profiling: { enabled: false } }`. Data collected through both modes is stored in and exported from the same `profiler` singleton. See [Support](#support) for the list of SDK operations that support profiling.

<Callout type="info">
  `profiler` is a process-wide singleton — its state persists until the process exits.
</Callout>

## `profiler`

The `profiler` object provides the following methods:

1. `profiler.enable(options?)` — enable profiling globally
2. `profiler.isEnabled()` — return whether profiling is enabled
3. `profiler.onRecord(callback)` — subscribe to profiling events in real time
4. Export collected data using any of the following:
   * `profiler.exportSummary()` — return a high-level summary string
   * `profiler.exportTable()` — return a detailed table of aggregated metrics
   * `profiler.exportJSON(options?)` — return a full export as structured JSON
5. `profiler.disable()` — disable profiling
6. `profiler.clear()` — clear collected data

See [API — `profiler`](/reference/api#profiler) for the complete reference, including the full metrics catalog for each export function and detailed usage.

## Enable profiling

By default, profiling is disabled. You can enable it [globally](#enable-global) or [per-call](#enable-per-call). If you enable profiling globally, you can [opt out](#opt-out) of individual function calls.

### Global

Enable profiling globally by calling `profiler.enable()`:

```ts
import { profiler } from "@qvac/sdk";

profiler.enable({
  mode: "verbose",                 // "summary" (default) | "verbose"
  includeServerBreakdown: true,    // include server-side timing in responses
  operationFilters: ["completion"], // only profile these operations (empty = all)
});
```

### Per-call

To profile a single call, pass the `profiling` option when invoking the function:

```ts
await embed(
  { modelId, text: "hello" },
  {
    profiling: {
      enabled: true,
      includeServerBreakdown: true,
      mode: "verbose",
    },
  },
);
```

See [Support](#support) for the list of functions that accept the `profiling` option.

### Opt-out

If you enabled profiling [globally](#enable-global), you can pass the `profiling` option to opt out of a specific call:

```ts
await embed(
  { modelId, text: "hello" },
  { profiling: { enabled: false } },
);
```

## Support

The following SDK operations support profiling:

[`completion()`](/reference/api#completion) | [`downloadAsset()`](/reference/api#downloadasset) | [`embed()`](/reference/api#embed) | [`invokePlugin()`](/reference/api#invokeplugin) | [`invokePluginStream()`](/reference/api#invokepluginstream) | [`loadModel()`](/reference/api#loadmodel) | [`ocr()`](/reference/api#ocr) | [`ragChunk()`](/reference/api#ragchunk) | [`ragCloseWorkspace()`](/reference/api#ragcloseworkspace) | [`ragDeleteEmbeddings()`](/reference/api#ragdeleteembeddings) | [`ragDeleteWorkspace()`](/reference/api#ragdeleteworkspace) | [`ragIngest()`](/reference/api#ragingest) | [`ragListWorkspaces()`](/reference/api#raglistworkspaces) | [`ragReindex()`](/reference/api#ragreindex) | [`ragSaveEmbeddings()`](/reference/api#ragsaveembeddings) | [`ragSearch()`](/reference/api#ragsearch) | [`textToSpeech()`](/reference/api#texttospeech) | [`transcribe()`](/reference/api#transcribe) | [`transcribeStream()`](/reference/api#transcribestream) | [`translate()`](/reference/api#translate)

## Examples

### Global

The following script enables profiling globally, loads a model, runs a completion, and exports timing data in all available formats:

<Tabs>
  <Tab value="js" label="JavaScript" default>
    <WrapCode>
      ```js file=<rootDir>/packages/sdk/dist/examples/profiling/basic.js title="profiling-basic.js" lineNumbers
      import { completion, loadModel, unloadModel, LLAMA_3_2_1B_INST_Q4_0, profiler, } from "@qvac/sdk";
      try {
          // Enable profiling globally
          profiler.enable({
              mode: "verbose",
              includeServerBreakdown: true,
          });
          console.log("▸ Profiler enabled:", profiler.isEnabled());
          const modelId = await loadModel({
              modelSrc: LLAMA_3_2_1B_INST_Q4_0,
              onProgress: (p) => {
                  const mb = (n) => (n / 1e6).toFixed(1);
                  const line = `▸ Downloading ${p.percentage.toFixed(0)}% (${mb(p.downloaded)}/${mb(p.total)} MB)`;
                  process.stderr.write(process.stderr.isTTY ? `\r${line}` : `${line}\n`);
                  if (p.percentage >= 100)
                      process.stderr.write("\n");
              },
          });
          console.log("▸ Model loaded:", modelId);
          console.log("\n▸ Running completion...");
          const result = completion({
              modelId,
              history: [{ role: "user", content: "Say hello in one sentence." }],
              stream: true,
          });
          for await (const token of result.tokenStream) {
              process.stdout.write(token);
          }
          console.log();
          await unloadModel({ modelId });
          // Export profiling data
          console.log("\n▸ Profiler Summary");
          console.log(profiler.exportSummary());
          console.log("\n▸ Profiler Table");
          console.log(profiler.exportTable());
          const json = profiler.exportJSON();
          console.log("\n▸ Load Model Metrics");
          // Filter for operation-level event (kind: "handler"), not RPC phase events
          const loadModelEvent = json.recentEvents?.find((e) => e.op === "loadModel" && e.kind === "handler");
          if (loadModelEvent) {
              const tags = loadModelEvent.tags ?? {};
              const gauges = loadModelEvent.gauges ?? {};
              console.log("  sourceType:", tags["sourceType"] ?? "(not set)");
              console.log("  cacheHit:", tags["cacheHit"] ?? "(not set)");
              console.log("  totalLoadTime:", gauges["totalLoadTime"], "ms");
              console.log("  modelInitializationTime:", gauges["modelInitializationTime"], "ms");
              if (tags["cacheHit"] !== "true") {
                  console.log("  downloadTime:", gauges["downloadTime"] ?? "(cached)", "ms");
                  console.log("  totalBytesDownloaded:", gauges["totalBytesDownloaded"] ?? "(cached)");
                  console.log("  downloadSpeedBps:", gauges["downloadSpeedBps"] ?? "(cached)");
              }
              else {
                  console.log("  (download metrics omitted - cache hit)");
              }
              if (gauges["checksumValidationTime"] !== undefined) {
                  console.log("  checksumValidationTime:", gauges["checksumValidationTime"], "ms");
              }
          }
          else {
              console.log("  (no loadModel handler event captured)");
              // Debug: show what ops are available
              const ops = [
                  ...new Set(json.recentEvents?.map((e) => `${e.op}:${e.kind}`) ?? []),
              ];
              console.log("  Available ops:", ops.join(", "));
          }
          console.log("\n▸ Profiler JSON (structure)");
          console.log("  aggregates:", Object.keys(json.aggregates).length, "metrics");
          console.log("  recentEvents:", json.recentEvents?.length ?? 0, "events");
          console.log("  config:", json.config);
          // Disable profiling
          profiler.disable();
          console.log("\n▸ Profiler disabled:", !profiler.isEnabled());
      }
      catch (error) {
          console.error("✖", error);
          process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>

  <Tab value="ts" label="TypeScript">
    <WrapCode>
      ```ts file=<rootDir>/packages/sdk/examples/profiling/basic.ts title="profiling-basic.ts" lineNumbers
      import {
        completion,
        loadModel,
        unloadModel,
        LLAMA_3_2_1B_INST_Q4_0,
        profiler,
      } from "@qvac/sdk";

      try {
        // Enable profiling globally
        profiler.enable({
          mode: "verbose",
          includeServerBreakdown: true,
        });
        console.log("▸ Profiler enabled:", profiler.isEnabled());

        const modelId = await loadModel({
          modelSrc: LLAMA_3_2_1B_INST_Q4_0,
          onProgress: (p) => {
            const mb = (n: number) => (n / 1e6).toFixed(1);
            const line = `▸ Downloading ${p.percentage.toFixed(0)}% (${mb(p.downloaded)}/${mb(p.total)} MB)`;
            process.stderr.write(process.stderr.isTTY ? `\r${line}` : `${line}\n`);
            if (p.percentage >= 100) process.stderr.write("\n");
          },
        });
        console.log("▸ Model loaded:", modelId);

        console.log("\n▸ Running completion...");
        const result = completion({
          modelId,
          history: [{ role: "user", content: "Say hello in one sentence." }],
          stream: true,
        });

        for await (const token of result.tokenStream) {
          process.stdout.write(token);
        }
        console.log();

        await unloadModel({ modelId });

        // Export profiling data
        console.log("\n▸ Profiler Summary");
        console.log(profiler.exportSummary());

        console.log("\n▸ Profiler Table");
        console.log(profiler.exportTable());

        const json = profiler.exportJSON();
        console.log("\n▸ Load Model Metrics");
        // Filter for operation-level event (kind: "handler"), not RPC phase events
        const loadModelEvent = json.recentEvents?.find(
          (e) => e.op === "loadModel" && e.kind === "handler",
        );
        if (loadModelEvent) {
          const tags = loadModelEvent.tags ?? {};
          const gauges = loadModelEvent.gauges ?? {};
          console.log("  sourceType:", tags["sourceType"] ?? "(not set)");
          console.log("  cacheHit:", tags["cacheHit"] ?? "(not set)");
          console.log("  totalLoadTime:", gauges["totalLoadTime"], "ms");
          console.log(
            "  modelInitializationTime:",
            gauges["modelInitializationTime"],
            "ms",
          );
          if (tags["cacheHit"] !== "true") {
            console.log(
              "  downloadTime:",
              gauges["downloadTime"] ?? "(cached)",
              "ms",
            );
            console.log(
              "  totalBytesDownloaded:",
              gauges["totalBytesDownloaded"] ?? "(cached)",
            );
            console.log(
              "  downloadSpeedBps:",
              gauges["downloadSpeedBps"] ?? "(cached)",
            );
          } else {
            console.log("  (download metrics omitted - cache hit)");
          }
          if (gauges["checksumValidationTime"] !== undefined) {
            console.log(
              "  checksumValidationTime:",
              gauges["checksumValidationTime"],
              "ms",
            );
          }
        } else {
          console.log("  (no loadModel handler event captured)");
          // Debug: show what ops are available
          const ops = [
            ...new Set(json.recentEvents?.map((e) => `${e.op}:${e.kind}`) ?? []),
          ];
          console.log("  Available ops:", ops.join(", "));
        }

        console.log("\n▸ Profiler JSON (structure)");
        console.log("  aggregates:", Object.keys(json.aggregates).length, "metrics");
        console.log("  recentEvents:", json.recentEvents?.length ?? 0, "events");
        console.log("  config:", json.config);

        // Disable profiling
        profiler.disable();
        console.log("\n▸ Profiler disabled:", !profiler.isEnabled());
      } catch (error) {
        console.error("✖", error);
        process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>
</Tabs>

### Per-call

The following script keeps the profiler disabled globally and selectively profiles individual `embed()` calls using the per-call option:

<Tabs>
  <Tab value="js" label="JavaScript" default>
    <WrapCode>
      ```js file=<rootDir>/packages/sdk/dist/examples/profiling/per-call.js title="profiling-per-call.js" lineNumbers
      import { embed, loadModel, unloadModel, GTE_LARGE_FP16, profiler, } from "@qvac/sdk";
      try {
          profiler.disable();
          console.log("▸ Profiler globally enabled:", profiler.isEnabled());
          const modelId = await loadModel({
              modelSrc: GTE_LARGE_FP16,
              onProgress: (p) => {
                  const mb = (n) => (n / 1e6).toFixed(1);
                  const line = `▸ Downloading ${p.percentage.toFixed(0)}% (${mb(p.downloaded)}/${mb(p.total)} MB)`;
                  process.stderr.write(process.stderr.isTTY ? `\r${line}` : `${line}\n`);
                  if (p.percentage >= 100)
                      process.stderr.write("\n");
              },
          });
          console.log("▸ Model loaded:", modelId);
          console.log("\n▸ Embed with per-call profiling");
          const { embedding: embedding1 } = await embed({ modelId, text: "Profile this specific call" }, { profiling: { enabled: true, includeServerBreakdown: true } });
          console.log("▸ Embedding dimensions:", embedding1.length);
          console.log("\n▸ Embed without profiling");
          const { embedding: embedding2 } = await embed({
              modelId,
              text: "This call is not profiled",
          });
          console.log("▸ Embedding dimensions:", embedding2.length);
          console.log("\n▸ Embed with profiling explicitly disabled");
          const { embedding: embedding3 } = await embed({ modelId, text: "Profiling explicitly disabled for this call" }, { profiling: { enabled: false } });
          console.log("▸ Embedding dimensions:", embedding3.length);
          await unloadModel({ modelId });
          console.log("\n▸ Profiler Summary (per-call data only)");
          console.log(profiler.exportSummary());
      }
      catch (error) {
          console.error("✖", error);
          process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>

  <Tab value="ts" label="TypeScript">
    <WrapCode>
      ```ts file=<rootDir>/packages/sdk/examples/profiling/per-call.ts title="profiling-per-call.ts" lineNumbers
      import {
        embed,
        loadModel,
        unloadModel,
        GTE_LARGE_FP16,
        profiler,
      } from "@qvac/sdk";

      try {
        profiler.disable();
        console.log("▸ Profiler globally enabled:", profiler.isEnabled());

        const modelId = await loadModel({
          modelSrc: GTE_LARGE_FP16,
          onProgress: (p) => {
            const mb = (n: number) => (n / 1e6).toFixed(1);
            const line = `▸ Downloading ${p.percentage.toFixed(0)}% (${mb(p.downloaded)}/${mb(p.total)} MB)`;
            process.stderr.write(process.stderr.isTTY ? `\r${line}` : `${line}\n`);
            if (p.percentage >= 100) process.stderr.write("\n");
          },
        });
        console.log("▸ Model loaded:", modelId);

        console.log("\n▸ Embed with per-call profiling");
        const { embedding: embedding1 } = await embed(
          { modelId, text: "Profile this specific call" },
          { profiling: { enabled: true, includeServerBreakdown: true } },
        );
        console.log("▸ Embedding dimensions:", embedding1.length);

        console.log("\n▸ Embed without profiling");
        const { embedding: embedding2 } = await embed({
          modelId,
          text: "This call is not profiled",
        });
        console.log("▸ Embedding dimensions:", embedding2.length);

        console.log("\n▸ Embed with profiling explicitly disabled");
        const { embedding: embedding3 } = await embed(
          { modelId, text: "Profiling explicitly disabled for this call" },
          { profiling: { enabled: false } },
        );
        console.log("▸ Embedding dimensions:", embedding3.length);

        await unloadModel({ modelId });

        console.log("\n▸ Profiler Summary (per-call data only)");
        console.log(profiler.exportSummary());
      } catch (error) {
        console.error("✖", error);
        process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>
</Tabs>

<Callout type="success">
  **Tip:** all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see [SDK quickstart](/quickstart).
</Callout>
