# Delegated inference (/p2p-capabilities/delegated-inference)



## Overview

Delegated inference lets a **consumer** hand off inference requests to a remote **provider**, peer to peer, over the [Hyperswarm](https://docs.pears.com/building-blocks/hyperswarm) DHT. Use it when a model needs more resources than the local device can provide.

Connectivity is direct: the consumer opens a `dht.connect(providerPublicKey)` connection straight to the provider — there is no topic or discovery phase. The provider exposes itself on the DHT via its keypair, and consumers connect by public key.

Delegation is configured at model-load time by passing a `delegate` object to [`loadModel()`](/reference/api#loadmodel). The provider is started separately using [`startQVACProvider()`](/reference/api#startqvacprovider).

## Functions

**Provider:**

1. [`startQVACProvider()`](/reference/api#startqvacprovider) — bind the DHT server on the provider's keypair and start accepting delegated requests
2. [`stopQVACProvider()`](/reference/api#stopqvacprovider) — stop accepting requests

**Consumer:**

1. [`loadModel()`](/reference/api#loadmodel) — with `delegate` option
2. [`completion()`](/reference/api#completion) / [`transcribe()`](/reference/api#transcribe) / [`translate()`](/reference/api#translate) / etc. — same as local
3. [`unloadModel()`](/reference/api#unloadmodel)

For how to use each function, see [SDK — API reference](/reference/api/).
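
The consumer sequence above (load, run inference, unload) can be sketched as a small wrapper that always releases the model, even when inference throws. `withModel` is a hypothetical helper, not part of the SDK; it takes caller-supplied functions so it makes no assumptions about SDK call signatures.

```js
// Hypothetical wrapper, not part of @qvac/sdk. `load`, `run`, and `unload`
// are supplied by the caller, so no SDK signatures are assumed here.
async function withModel(load, run, unload) {
  const modelId = await load();
  try {
    // Run any inference call (completion, transcribe, ...) with the model ID.
    return await run(modelId);
  } finally {
    // Always release the model, even if run() throws.
    await unload(modelId);
  }
}
```

With the SDK, `load` would wrap `loadModel({ modelSrc, delegate })`, `run` would call `completion()` or another inference function, and `unload` would wrap `unloadModel()`. Check the API reference for the exact arguments of each call.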

## Provider

The provider binds a DHT server to its keypair and serves delegated requests. It exposes its public key, which consumers use to connect directly.

## Consumer

The consumer creates a delegated model via `loadModel({ delegate: ... })`. The main `delegate` options are:

* `providerPublicKey`: provider public key (required)
* `timeout`: request timeout in ms (optional). Use a generous value (e.g. `60_000`) on the first call — cold-DHT bootstrap can take 15–45s. Subsequent calls reuse the open socket and are sub-second.
* `fallbackToLocal`: if `true`, run locally when delegation fails (optional)
* `forceNewConnection`: if `true`, do not reuse cached connections (optional)
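
As a rough sketch of how these options fit together, the helper below assembles a `delegate` object and picks a longer timeout for the first, cold-DHT call. `buildDelegateOptions` is hypothetical and not part of the SDK, and the 10-second warm timeout is an illustrative assumption, not an SDK default.

```js
// Hypothetical helper, not part of @qvac/sdk: builds the `delegate` object
// passed to loadModel(). Timeout values are illustrative assumptions.
const COLD_TIMEOUT_MS = 60_000; // first call: DHT bootstrap can take 15-45s
const WARM_TIMEOUT_MS = 10_000; // assumed value for later calls in the same process

function buildDelegateOptions(
  providerPublicKey,
  { firstCall = true, fallbackToLocal = false } = {},
) {
  if (typeof providerPublicKey !== "string" || providerPublicKey.length === 0) {
    throw new Error("providerPublicKey is required");
  }
  return {
    providerPublicKey,
    timeout: firstCall ? COLD_TIMEOUT_MS : WARM_TIMEOUT_MS,
    fallbackToLocal,
  };
}

// Usage (requires the SDK and a running provider):
//   const modelId = await loadModel({
//     modelSrc: LLAMA_3_2_1B_INST_Q4_0,
//     delegate: buildDelegateOptions(process.argv[2], { fallbackToLocal: true }),
//   });
```

Keeping the timeout choice in one place makes it harder to ship a short timeout that fails only on a cold start.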

## Examples

### Consumer

The following script shows an example of a consumer that delegates `completion()` requests to a provider:

<Tabs>
  <Tab value="js" label="JavaScript" default>
    <WrapCode>
      ```js file=<rootDir>/packages/sdk/dist/examples/delegated-inference/consumer.js title="delegated-inference-consumer.js" lineNumbers
      import { completion, LLAMA_3_2_1B_INST_Q4_0, loadModel, close, } from "@qvac/sdk";
      const providerPublicKey = process.argv[2];
      if (!providerPublicKey) {
          console.error("✖ Provider public key is required. Usage: node consumer.js <provider-public-key> [consumer-seed]");
          process.exit(1);
      }
      try {
          // Optional: Consumer seed for deterministic consumer identity (for firewall testing)
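          const consumerSeed = process.argv[3];
          // Only set the env var when a seed is given: assigning undefined
          // to process.env stores the string "undefined".
          if (consumerSeed) {
              process.env["QVAC_HYPERSWARM_SEED"] = consumerSeed;
          }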
          console.log(`▸ Testing delegated inference`);
          console.log(`▸ Provider: ${providerPublicKey}`);
          if (consumerSeed) {
              console.log(`▸ Consumer seed: ${consumerSeed.substring(0, 16)}... (deterministic identity)`);
          }
          else {
              console.log(`▸ No consumer seed provided (random identity)`);
          }
          const modelId = await loadModel({
              modelSrc: LLAMA_3_2_1B_INST_Q4_0,
              delegate: {
                  providerPublicKey,
                  // Generous timeout for the first call on a cold DHT: bootstrapping
                  // hyperdht and looking up the provider's key can take 15–45s on the
                  // very first run. Subsequent connections in the same process are
                  // sub-second because the DHT is already warm.
                  timeout: 60_000,
                  fallbackToLocal: true, // Optional: Fall back to local inference if delegation fails
                  // forceNewConnection: true, // Optional: Force a new connection instead of reusing cached one
              },
              onProgress: (p) => {
                  const mb = (n) => (n / 1e6).toFixed(1);
                  const line = `▸ Downloading ${p.percentage.toFixed(0)}% (${mb(p.downloaded)}/${mb(p.total)} MB)`;
                  process.stderr.write(process.stderr.isTTY ? `\r${line}` : `${line}\n`);
                  if (p.percentage >= 100)
                      process.stderr.write("\n");
              },
          });
          console.log(`▸ Delegated model registered: ${modelId}`);
          const response = completion({
              modelId,
              history: [{ role: "user", content: "Hello!" }],
              stream: true,
          });
          for await (const token of response.tokenStream) {
              process.stdout.write(token);
          }
          console.log("\n▸ Stats:", await response.stats);
          console.log("▸ Delegation infrastructure working! Server correctly detected and routed the delegated request.");
          void close();
      }
      catch (error) {
          console.error("✖", error);
          process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>

  <Tab value="ts" label="TypeScript">
    <WrapCode>
      ```ts file=<rootDir>/packages/sdk/examples/delegated-inference/consumer.ts title="delegated-inference-consumer.ts" lineNumbers
      import {
        completion,
        LLAMA_3_2_1B_INST_Q4_0,
        loadModel,
        close,
      } from "@qvac/sdk";

      const providerPublicKey = process.argv[2];
      if (!providerPublicKey) {
        console.error(
          "✖ Provider public key is required. Usage: node consumer.ts <provider-public-key> [consumer-seed]",
        );
        process.exit(1);
      }

      try {
        // Optional: Consumer seed for deterministic consumer identity (for firewall testing)
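        const consumerSeed: string | undefined = process.argv[3];

        // Only set the env var when a seed is given: assigning undefined
        // to process.env stores the string "undefined".
        if (consumerSeed) {
          process.env["QVAC_HYPERSWARM_SEED"] = consumerSeed;
        }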

        console.log(`▸ Testing delegated inference`);
        console.log(`▸ Provider: ${providerPublicKey}`);
        if (consumerSeed) {
          console.log(
            `▸ Consumer seed: ${consumerSeed.substring(0, 16)}... (deterministic identity)`,
          );
        } else {
          console.log(`▸ No consumer seed provided (random identity)`);
        }

        const modelId = await loadModel({
          modelSrc: LLAMA_3_2_1B_INST_Q4_0,
          delegate: {
            providerPublicKey,
            // Generous timeout for the first call on a cold DHT: bootstrapping
            // hyperdht and looking up the provider's key can take 15–45s on the
            // very first run. Subsequent connections in the same process are
            // sub-second because the DHT is already warm.
            timeout: 60_000,
            fallbackToLocal: true, // Optional: Fall back to local inference if delegation fails
            // forceNewConnection: true, // Optional: Force a new connection instead of reusing cached one
          },
          onProgress: (p) => {
            const mb = (n: number) => (n / 1e6).toFixed(1);
            const line = `▸ Downloading ${p.percentage.toFixed(0)}% (${mb(p.downloaded)}/${mb(p.total)} MB)`;
            process.stderr.write(process.stderr.isTTY ? `\r${line}` : `${line}\n`);
            if (p.percentage >= 100) process.stderr.write("\n");
          },
        });

        console.log(`▸ Delegated model registered: ${modelId}`);

        const response = completion({
          modelId,
          history: [{ role: "user", content: "Hello!" }],
          stream: true,
        });

        for await (const token of response.tokenStream) {
          process.stdout.write(token);
        }

        console.log("\n▸ Stats:", await response.stats);

        console.log(
          "▸ Delegation infrastructure working! Server correctly detected and routed the delegated request.",
        );

        void close();
      } catch (error) {
        console.error("✖", error);
        process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>
</Tabs>

### Provider

The following script shows an example of starting a provider and printing its `publicKey` for consumers:

<Tabs>
  <Tab value="js" label="JavaScript" default>
    <WrapCode>
      ```js file=<rootDir>/packages/sdk/dist/examples/delegated-inference/provider.js title="delegated-inference-provider.js" lineNumbers
      import { startQVACProvider } from "@qvac/sdk";
      // Optional: Seed for deterministic provider identity (64-character hex string)
      const seed = process.argv[2];
      if (seed) {
          process.env["QVAC_HYPERSWARM_SEED"] = seed;
      }
      // Optional: Consumer public key for firewall (allow only this consumer)
      const allowedConsumerPublicKey = process.argv[3];
      console.log(`▸ Starting provider service...`);
      try {
          if (allowedConsumerPublicKey) {
              console.log(`▸ Firewall enabled: only allowing consumer ${allowedConsumerPublicKey}`);
          }
          const response = await startQVACProvider({
              firewall: allowedConsumerPublicKey
                  ? {
                      mode: "allow",
                      publicKeys: [allowedConsumerPublicKey],
                  }
                  : undefined,
          });
          console.log("▸ Provider service started successfully!");
          console.log("▸ Provider is now available for delegated inference requests");
          console.log("");
          console.log("▸ Connection Details:");
          console.log(`▸ Provider Public Key: ${response.publicKey}`);
          console.log("");
          console.log("▸ Consumer command:");
          console.log(`   node consumer.js ${response.publicKey}`);
          console.log("");
          console.log("▸ To reproduce this provider identity:");
          console.log(`   node provider.js ${seed || "<random-seed>"}`);
          if (!seed) {
              console.log("   (Note: seed was random this time, set one for reproducible identity)");
          }
          console.log("");
          console.log("▸ For firewall testing:");
          console.log("   1. Generate a consumer seed (64-char hex)");
          console.log("   2. Get consumer public key: getConsumerPublicKey(consumerSeed)");
          console.log("   3. Restart provider with consumer public key as 2nd argument");
          console.log(`   4. Run consumer with: node consumer.js ${response.publicKey} <consumer-seed>`);
          console.log("▸ Provider is running... Press Ctrl+C to stop");
          process.on("SIGINT", () => {
              console.log("\n▸ Provider service stopped");
              process.exit(0);
          });
          process.stdin.resume();
      }
      catch (error) {
          console.error("✖", error);
          process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>

  <Tab value="ts" label="TypeScript">
    <WrapCode>
      ```ts file=<rootDir>/packages/sdk/examples/delegated-inference/provider.ts title="delegated-inference-provider.ts" lineNumbers
      import { startQVACProvider } from "@qvac/sdk";

      // Optional: Seed for deterministic provider identity (64-character hex string)
      const seed: string | undefined = process.argv[2];

      if (seed) {
        process.env["QVAC_HYPERSWARM_SEED"] = seed;
      }

      // Optional: Consumer public key for firewall (allow only this consumer)
      const allowedConsumerPublicKey: string | undefined = process.argv[3];

      console.log(`▸ Starting provider service...`);

      try {
        if (allowedConsumerPublicKey) {
          console.log(
            `▸ Firewall enabled: only allowing consumer ${allowedConsumerPublicKey}`,
          );
        }

        const response = await startQVACProvider({
          firewall: allowedConsumerPublicKey
            ? {
                mode: "allow" as const,
                publicKeys: [allowedConsumerPublicKey],
              }
            : undefined,
        });

        console.log("▸ Provider service started successfully!");
        console.log("▸ Provider is now available for delegated inference requests");
        console.log("");
        console.log("▸ Connection Details:");
        console.log(`▸ Provider Public Key: ${response.publicKey}`);
        console.log("");
        console.log("▸ Consumer command:");
        console.log(`   node consumer.ts ${response.publicKey}`);
        console.log("");
        console.log("▸ To reproduce this provider identity:");
        console.log(`   node provider.ts ${seed || "<random-seed>"}`);
        if (!seed) {
          console.log(
            "   (Note: seed was random this time, set one for reproducible identity)",
          );
        }
        console.log("");
        console.log("▸ For firewall testing:");
        console.log("   1. Generate a consumer seed (64-char hex)");
        console.log(
          "   2. Get consumer public key: getConsumerPublicKey(consumerSeed)",
        );
        console.log(
          "   3. Restart provider with consumer public key as 2nd argument",
        );
        console.log(
          `   4. Run consumer with: node consumer.ts ${response.publicKey} <consumer-seed>`,
        );

        console.log("▸ Provider is running... Press Ctrl+C to stop");
        process.on("SIGINT", () => {
          console.log("\n▸ Provider service stopped");
          process.exit(0);
        });

        process.stdin.resume();
      } catch (error) {
        console.error("✖", error);
        process.exit(1);
      }
      ```
    </WrapCode>
  </Tab>
</Tabs>
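
Both examples accept an optional seed, which the comments describe as a 64-character hex string. A minimal sketch of a pre-flight check is shown below. `isValidSeed` and `seedFromArgv` are hypothetical helpers, not SDK functions, and accepting uppercase hex is an assumption.

```js
// Hypothetical pre-flight check, not part of @qvac/sdk. Assumes the seed
// format stated in the example comments: a 64-character hex string.
const SEED_PATTERN = /^[0-9a-f]{64}$/i; // accepting uppercase is an assumption

function isValidSeed(seed) {
  return typeof seed === "string" && SEED_PATTERN.test(seed);
}

// Reads an optional seed from an argv-style array.
// Returns undefined when no seed is given (random identity).
function seedFromArgv(argv, index) {
  const seed = argv[index];
  if (seed === undefined) return undefined;
  if (!isValidSeed(seed)) {
    throw new Error(`Seed must be 64 hex characters, got ${seed.length} characters`);
  }
  return seed;
}
```

Validating the seed before setting `QVAC_HYPERSWARM_SEED` surfaces a typo immediately instead of as a firewall mismatch later. A fresh seed can be produced by hex-encoding 32 random bytes, for example with Node's `crypto.randomBytes(32).toString("hex")`.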

<Callout type="success">
  **Tip:** all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see [SDK quickstart](/quickstart).
</Callout>

## Notes

* Consumers do not handle reconnection automatically yet. If the provider restarts, restart the consumer.
* To stop a running provider, call [`stopQVACProvider()`](/reference/api#stopqvacprovider).
* When starting the provider, you can optionally set a firewall rule to allow/deny specific consumer public keys.
* Cold-start DHT bootstrap on the first connect can take 15–45s; subsequent connections in the same process are sub-second.
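
For the firewall note, the provider example above passes `{ mode: "allow", publicKeys: [...] }` or `undefined` to `startQVACProvider()`. The sketch below builds that value from a list of consumer keys. `buildAllowFirewall` is a hypothetical helper, not part of the SDK, and it covers only the `"allow"` mode shown in the example.

```js
// Hypothetical helper, not part of @qvac/sdk. Mirrors the `firewall` shape
// used in the provider example: { mode: "allow", publicKeys: [...] }.
function buildAllowFirewall(consumerPublicKeys = []) {
  // Drop empty entries and duplicates while keeping the original order.
  const keys = [...new Set(consumerPublicKeys.filter(Boolean))];
  // No keys means no firewall, matching the example's `undefined` branch.
  return keys.length > 0 ? { mode: "allow", publicKeys: keys } : undefined;
}

// Usage (requires the SDK):
//   await startQVACProvider({ firewall: buildAllowFirewall(process.argv.slice(3)) });
```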
