Image generation

Text-to-image and image-to-image generation using a customized Diffusion engine.

Overview

Image generation runs on a customized Diffusion engine (qvac-ext-stable-diffusion.cpp). Load a supported model using modelType: "diffusion". Then, provide a text prompt describing the image to generate.

For image-to-image, also pass init_image (a Uint8Array of PNG bytes) — the model transforms the input guided by the prompt instead of starting from noise.

diffusion() produces one or more PNG images as Uint8Array buffers, resolved through the outputs promise it returns. Use progressStream to track generation progress step by step.

Functions

Use the following sequence of function calls, sketched end to end below:

  1. loadModel()
  2. diffusion()
  3. unloadModel()

For how to use each function, see SDK — API reference.
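
Put end to end, the flow looks like the following minimal sketch. The model source is a placeholder, and generation parameters are left at their defaults:

import fs from "fs";
import { loadModel, diffusion, unloadModel } from "@qvac/sdk";

// 1. Load a diffusion model.
const modelId = await loadModel({
    modelSrc: "model.gguf", // placeholder
    modelType: "diffusion",
});

// 2. Generate: diffusion() hands back a progress stream plus a promise
//    that resolves to the PNG buffers.
const { progressStream, outputs } = diffusion({ modelId, prompt: "a red bicycle" });
for await (const { step, totalSteps } of progressStream) {
    console.log(`step ${step}/${totalSteps}`);
}
const buffers = await outputs; // Uint8Array[] of PNG bytes
fs.writeFileSync("out.png", buffers[0]);

// 3. Unload when done.
await unloadModel({ modelId, clearStorage: false });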

Models

Supported model families and their file layouts:

  • FLUX.2-klein: split layout — diffusion model *.gguf + LLM text encoder *.gguf (via llmModelSrc) + VAE *.safetensors (via vaeModelSrc).
  • SD1.x, SD2.x: single all-in-one *.gguf file. No companion files needed.
  • SDXL, SD3: may require separate CLIP/T5 text encoder files (clipLModelSrc, clipGModelSrc, t5XxlModelSrc) in modelConfig depending on the model variant; see the sketch below.

For models available as constants, see SDK — Models.
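
As an illustration, loading an SD3-style variant that ships with separate text encoders might look like the sketch below. The file names are placeholders rather than SDK constants, and which encoder files are required depends on the variant:

import { loadModel } from "@qvac/sdk";

const modelId = await loadModel({
    modelSrc: "sd3_medium.gguf", // placeholder path
    modelType: "diffusion",
    modelConfig: {
        clipLModelSrc: "clip_l.safetensors", // CLIP-L text encoder
        clipGModelSrc: "clip_g.safetensors", // CLIP-G text encoder
        t5XxlModelSrc: "t5xxl_q8_0.gguf", // T5-XXL text encoder
    },
});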

Examples

FLUX.2-klein

The following script shows text-to-image generation using FLUX.2-klein with its split-layout model (separate diffusion model, LLM text encoder, and VAE):

diffusion-flux2-klein.js
import { loadModel, unloadModel, diffusion, FLUX_2_KLEIN_4B_Q4_0, FLUX_2_KLEIN_4B_VAE, QWEN3_4B_Q4_K_M } from "@qvac/sdk";
import fs from "fs";
import path from "path";
// FLUX.2 [klein] uses a split-layout: separate diffusion model + LLM text encoder + VAE
const diffusionModelSrc = process.argv[2] || FLUX_2_KLEIN_4B_Q4_0;
const llmModelSrc = process.argv[3] || QWEN3_4B_Q4_K_M;
const vaeModelSrc = process.argv[4] || FLUX_2_KLEIN_4B_VAE;
const prompt = process.argv[5] || "a futuristic city at sunset, photorealistic";
const outputDir = process.argv[6] || ".";
console.log("Loading FLUX.2 [klein] split-layout model...");
const modelId = await loadModel({
    modelSrc: diffusionModelSrc,
    modelType: "diffusion",
    modelConfig: {
        device: "gpu",
        threads: 4,
        llmModelSrc,
        vaeModelSrc,
    },
    onProgress: (p) => console.log(`Loading: ${p.percentage.toFixed(1)}%`),
});
console.log(`Model loaded: ${modelId}`);
console.log(`\nGenerating: "${prompt}"`);
const { progressStream, outputs, stats } = diffusion({
    modelId,
    prompt,
    width: 512,
    height: 512,
    steps: 20,
    guidance: 3.5, // FLUX-style guidance scale
    cfg_scale: 1, // 1 effectively disables classifier-free guidance
    seed: -1, // negative value selects a random seed
});
for await (const { step, totalSteps } of progressStream) {
    process.stdout.write(`\rStep ${step}/${totalSteps}`);
}
console.log();
const buffers = await outputs;
for (let i = 0; i < buffers.length; i++) {
    const outputPath = path.join(outputDir, `flux2_${i}.png`);
    fs.writeFileSync(outputPath, buffers[i]);
    console.log(`Saved: ${outputPath}`);
}
console.log("\nStats:", await stats);
await unloadModel({ modelId, clearStorage: false });
console.log("Done.");
process.exit(0);

Stable Diffusion

The following script shows a minimal text-to-image generation example using a single all-in-one SD 2.1 model:

diffusion-simple.js
import { loadModel, unloadModel, diffusion, SD_V2_1_1B_Q8_0 } from "@qvac/sdk";
import fs from "fs";
// Minimal diffusion example — single GGUF model, no companion files needed.
// Works with SD 1.x / 2.x all-in-one models.
const modelSrc = process.argv[2] || SD_V2_1_1B_Q8_0;
const prompt = process.argv[3] || "a photo of a cat sitting on a windowsill";
const modelId = await loadModel({
    modelSrc,
    modelType: "diffusion",
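    // SD 2.1 checkpoints use v-prediction, so the parameterization is set explicitly.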
    modelConfig: { prediction: "v" },
});
const { outputs } = diffusion({ modelId, prompt });
const buffers = await outputs;
fs.writeFileSync("output.png", buffers[0]);
console.log("Saved: output.png");
await unloadModel({ modelId, clearStorage: false });
process.exit(0);
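
This minimal example relies on diffusion() defaults for image size and sampling parameters; pass width, height, steps, and related options explicitly, as in the FLUX.2-klein example above, to override them.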

Image-to-image

Pass init_image to transform an existing image guided by a text prompt. Behavior depends on the model family:

  • FLUX.2: in-context conditioning. Requires prediction: "flux2_flow" in modelConfig at loadModel() time; strength is ignored on this path.
  • SD / SDXL / SD3: SDEdit-style. Use strength to control how much the source is preserved (0 = keep source, 1 = ignore source); see the sketch after this list.
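
For the SD-family path, a minimal sketch follows. It assumes modelId refers to an SD 1.x/2.x model loaded as in the Stable Diffusion example above; the strength value is only illustrative:

import fs from "fs";
import { diffusion } from "@qvac/sdk";

// Read the source image as PNG bytes.
const init_image = new Uint8Array(fs.readFileSync("input.png"));

const { outputs } = diffusion({
    modelId, // an already-loaded SD 1.x/2.x model
    prompt: "watercolor painting, soft light",
    init_image,
    strength: 0.6, // 0 = keep source, 1 = ignore source
});
const buffers = await outputs;
fs.writeFileSync("img2img.png", buffers[0]);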

The following script loads FLUX.2-klein in split-layout and transforms an input image using in-context conditioning (prediction: "flux2_flow"):

diffusion-flux2-klein-img2img.js
import { loadModel, unloadModel, diffusion, FLUX_2_KLEIN_4B_Q4_0, FLUX_2_KLEIN_4B_VAE, QWEN3_4B_Q4_K_M } from "@qvac/sdk";
import fs from "fs";
import path from "path";
// img2img with FLUX.2 [klein] split-layout — uses in-context conditioning ("flux2_flow").
const inputPath = process.argv[2];
const prompt = process.argv[3] || "oil painting style, vibrant colors";
const outputDir = process.argv[4] || ".";
const diffusionModelSrc = process.argv[5] || FLUX_2_KLEIN_4B_Q4_0;
const llmModelSrc = process.argv[6] || QWEN3_4B_Q4_K_M;
const vaeModelSrc = process.argv[7] || FLUX_2_KLEIN_4B_VAE;
if (!inputPath) {
    console.error("❌ Error: input image path is required");
    console.error("Usage: bun run bare:example dist/examples/diffusion-flux2-klein-img2img.js <inputImage> [prompt] [outputDir] [diffusionModelSrc] [llmModelSrc] [vaeModelSrc]");
    process.exit(1);
}
try {
    console.log("Loading FLUX.2 [klein] split-layout model...");
    const modelId = await loadModel({
        modelSrc: diffusionModelSrc,
        modelType: "diffusion",
        modelConfig: {
            device: "gpu",
            threads: 4,
            llmModelSrc,
            vaeModelSrc,
            prediction: "flux2_flow",
        },
        onProgress: (p) => console.log(`Loading: ${p.percentage.toFixed(1)}%`),
    });
    console.log(`Model loaded: ${modelId}`);
    const init_image = new Uint8Array(fs.readFileSync(inputPath));
    console.log(`\nTransforming "${inputPath}" with prompt: "${prompt}"`);
    const { progressStream, outputs, stats } = diffusion({
        modelId,
        prompt,
        init_image,
        steps: 20,
        guidance: 3.5, // FLUX-style guidance scale
        cfg_scale: 1, // 1 effectively disables classifier-free guidance
        seed: -1, // negative value selects a random seed
    });
    for await (const { step, totalSteps } of progressStream) {
        process.stdout.write(`\rStep ${step}/${totalSteps}`);
    }
    console.log();
    const buffers = await outputs;
    for (let i = 0; i < buffers.length; i++) {
        const outputPath = path.join(outputDir, `flux2_img2img_${i}.png`);
        fs.writeFileSync(outputPath, buffers[i]);
        console.log(`Saved: ${outputPath}`);
    }
    console.log("\nStats:", await stats);
    await unloadModel({ modelId, clearStorage: false });
    console.log("Done.");
    process.exit(0);
}
catch (error) {
    console.error("❌ Error:", error);
    process.exit(1);
}

Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.
