# OCR


## Overview

OCR uses **ONNX Runtime** as the inference engine. It runs a two-stage pipeline and requires a compatible model for each stage:

* **Text detection**: locate text regions in an image
* **Text recognition**: decode characters in detected regions

Load a supported model with `modelType: "ocr"`, then pass `ocr()` an image as either a file path (string) or an in-memory buffer. The result is a list of blocks. Each block contains the extracted `text` and may include `bbox` (bounding box coordinates) and `confidence` (recognition score).
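Because `bbox` and `confidence` are optional, code that consumes blocks needs to handle their absence. The following is a minimal sketch of one way to post-process results. It does not call the SDK: the `OcrBlock` interface, the helper names `filterByConfidence` and `joinText`, the `number[]` type for `bbox`, and the sample data are all assumptions made for illustration.

```typescript
// Assumed block shape, inferred from the fields described above.
// The element type of `bbox` is a guess; check the API reference.
interface OcrBlock {
  text: string;
  bbox?: number[];
  confidence?: number;
}

// Keep blocks whose confidence meets the threshold. Blocks without a
// confidence score are kept, since the field is optional.
function filterByConfidence(
  blocks: OcrBlock[],
  minConfidence: number,
): OcrBlock[] {
  return blocks.filter(
    (block) =>
      block.confidence === undefined || block.confidence >= minConfidence,
  );
}

// Join the text of the given blocks, one block per line.
function joinText(blocks: OcrBlock[]): string {
  return blocks.map((block) => block.text).join("\n");
}

// Hypothetical sample data, not real OCR output.
const sample: OcrBlock[] = [
  { text: "Invoice", bbox: [10, 10, 120, 40], confidence: 0.98 },
  { text: "T0tal", confidence: 0.31 },
  { text: "42.00" },
];

console.log(joinText(filterByConfidence(sample, 0.5)));
```

Whether to keep or drop blocks that have no confidence score depends on the use case. This sketch keeps them so that no text is silently lost.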

## Functions

Use the following sequence of function calls:

1. [`loadModel()`](/reference/api#loadmodel)
2. [`ocr()`](/reference/api#ocr)
3. [`unloadModel()`](/reference/api#unloadmodel)

For how to use each function, see [SDK — API reference](/reference/api/).
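The load, run, unload sequence can be wrapped so that the model is unloaded even when the run step throws. The following is a sketch of that pattern, not part of the SDK: `withModel` and the `Load` and `Unload` types are hypothetical names, and the demo uses stub functions in place of the real `loadModel()`, `ocr()`, and `unloadModel()` calls.

```typescript
// Stand-in signatures for illustration only. The real functions come from
// the SDK and take option objects; wrap them in closures to fit these types.
type Load = () => Promise<string>;
type Unload = (modelId: string) => Promise<void>;

// Load a model, run a task with its ID, and always unload afterwards.
async function withModel<T>(
  load: Load,
  run: (modelId: string) => Promise<T>,
  unload: Unload,
): Promise<T> {
  const modelId = await load();
  try {
    return await run(modelId);
  } finally {
    // Runs whether `run` resolved or threw.
    await unload(modelId);
  }
}

// Demo with hypothetical stubs in place of the SDK calls.
const callLog: string[] = [];
withModel(
  async () => {
    callLog.push("load");
    return "demo-model";
  },
  async (modelId) => {
    callLog.push(`run ${modelId}`);
    return ["hello"];
  },
  async (modelId) => {
    callLog.push(`unload ${modelId}`);
  },
).then((blocks) => console.log(blocks, callLog));
```

With the SDK, `load` would wrap the `loadModel()` call, `run` would await the `blocks` value returned by `ocr()`, and `unload` would wrap `unloadModel()`.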

## Models

You can load any ONNX Runtime-compatible OCR pipeline. A pipeline requires two files in ONNX format (`*.onnx`): a detector (`detector_craft.onnx`) and a recognizer (`recognizer_<lang>.onnx`).

For models available as constants, see [SDK — Models](/introduction#models).

## Example

The following script loads an OCR model, runs it on an image, prints each detected block, and unloads the model:

<Tabs>
  <Tab value="js" label="JavaScript" default>
    <WrapCode>
      ```js file=<rootDir>/packages/sdk/dist/examples/ocr-fasttext.js title="ocr.js" lineNumbers
      /**
       * OCR example using the QVAC SDK.
       *
       * Usage:
       *   bun examples/ocr-fasttext.ts [path-to-image]
       *
       * This example requires a test image (default: examples/image/basic_test.bmp).
       * Sample images are available in the QVAC source repository, but not included in the published npm package.
       * Pass a custom image path, or download the default image into examples/image/:
       *   https://github.com/tetherto/qvac/blob/main/packages/sdk/examples/image/basic_test.bmp
       */
      import { close, loadModel, ocr, OCR_LATIN_RECOGNIZER_1, unloadModel } from "@qvac/sdk";
      import path from "path";
      import { fileURLToPath } from "url";
      const __dirname = path.dirname(fileURLToPath(import.meta.url));
      const imagePath = process.argv[2] || path.join(__dirname, "image/basic_test.bmp");
      try {
          console.log("▸ Loading OCR model...");
          const modelId = await loadModel({
              modelSrc: OCR_LATIN_RECOGNIZER_1,
              modelConfig: {
                  langList: ["en"],
                  useGPU: true,
                  timeout: 30000,
                  magRatio: 1.5,
                  defaultRotationAngles: [90, 180, 270],
                  contrastRetry: false,
                  lowConfidenceThreshold: 0.5,
                  recognizerBatchSize: 1,
              },
          });
          console.log(`▸ Model loaded successfully! Model ID: ${modelId}`);
          console.log(`\n▸ Running OCR on: ${imagePath}`);
          const { blocks } = ocr({
              modelId,
              image: imagePath,
              options: {
                  paragraph: false,
              },
          });
          const result = await blocks;
          console.log("\n▸ OCR Results:");
          console.log("▸ ================================");
          for (const block of result) {
              console.log(block.text);
              if (block.bbox) {
                  console.log(`▸ BBox: [${block.bbox.join(", ")}]`);
              }
              if (block.confidence !== undefined) {
                  console.log(`▸ Confidence: ${block.confidence}`);
              }
          }
          console.log("\n▸ ================================");
          console.log("\n▸ Unloading model...");
          await unloadModel({ modelId, clearStorage: false });
          console.log("▸ Model unloaded successfully.");
          process.exit(0);
      }
      catch (error) {
          console.error("✖", error);
          await close();
      }
      ```
    </WrapCode>
  </Tab>

  <Tab value="ts" label="TypeScript">
    <WrapCode>
      ```ts file=<rootDir>/packages/sdk/examples/ocr-fasttext.ts title="ocr.ts" lineNumbers
      /**
       * OCR example using the QVAC SDK.
       *
       * Usage:
       *   bun examples/ocr-fasttext.ts [path-to-image]
       *
       * This example requires a test image (default: examples/image/basic_test.bmp).
       * Sample images are available in the QVAC source repository, but not included in the published npm package.
       * Pass a custom image path, or download the default image into examples/image/:
       *   https://github.com/tetherto/qvac/blob/main/packages/sdk/examples/image/basic_test.bmp
       */
      import {
        close,
        loadModel,
        ocr,
        OCR_LATIN_RECOGNIZER_1,
        unloadModel,
      } from "@qvac/sdk";
      import path from "path";
      import { fileURLToPath } from "url";

      const __dirname = path.dirname(fileURLToPath(import.meta.url));
      const imagePath =
        process.argv[2] || path.join(__dirname, "image/basic_test.bmp");

      try {
        console.log("▸ Loading OCR model...");
        const modelId = await loadModel({
          modelSrc: OCR_LATIN_RECOGNIZER_1,
          modelConfig: {
            langList: ["en"],
            useGPU: true,
            timeout: 30000,
            magRatio: 1.5,
            defaultRotationAngles: [90, 180, 270],
            contrastRetry: false,
            lowConfidenceThreshold: 0.5,
            recognizerBatchSize: 1,
          },
        });
        console.log(`▸ Model loaded successfully! Model ID: ${modelId}`);

        console.log(`\n▸ Running OCR on: ${imagePath}`);
        const { blocks } = ocr({
          modelId,
          image: imagePath,
          options: {
            paragraph: false,
          },
        });

        const result = await blocks;

        console.log("\n▸ OCR Results:");
        console.log("▸ ================================");
        for (const block of result) {
          console.log(block.text);
          if (block.bbox) {
            console.log(`▸ BBox: [${block.bbox.join(", ")}]`);
          }
          if (block.confidence !== undefined) {
            console.log(`▸ Confidence: ${block.confidence}`);
          }
        }
        console.log("\n▸ ================================");
        console.log("\n▸ Unloading model...");
        await unloadModel({ modelId, clearStorage: false });
        console.log("▸ Model unloaded successfully.");
        process.exit(0);
      } catch (error) {
        console.error("✖", error);
        await close();
      }
      ```
    </WrapCode>
  </Tab>
</Tabs>

<Callout type="success">
  **Tip:** All examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see [SDK quickstart](/quickstart).
</Callout>