# Build an Expo app



## What we'll build

We'll build an LLM chat mobile application using the following stack:

* [**Expo**](https://expo.dev) for building and running a React Native app; and
* **QVAC** to run LLM inference locally.

## Prerequisites

* npm $\geq$ v10.9
* Linux/macOS (Windows with small adjustments)
* Mobile development environment with a physical device (iOS or Android)

<Callout type="info">
  Due to limitations with `llamacpp`, QVAC currently does not run on emulators.
  You **must** use a physical device.
</Callout>

<Callout title="On Windows" type="info">
  Some commands are Bash‑specific. On Windows, use PowerShell/WSL or adapt them.
</Callout>

## Step 1: set up your development environment

Follow Expo's official documentation to set up the toolchain on your machine:

* [Expo docs — get started](https://docs.expo.dev/get-started/introduction/)
* [Expo docs — set up your environment](https://docs.expo.dev/get-started/set-up-your-environment/)

Confirm you can run the default template on your device.

## Step 2: set up an Expo project

Let's use the official Expo scaffold to create a minimal app structure.

<Steps>
  <Step>
    Create a new project:

    ```bash
    npx create-expo-app@latest qvac-expo-chat --template blank-typescript@sdk-54
    cd qvac-expo-chat
    ```

    This creates a minimal TypeScript Expo project with a single `App.tsx` entry point.

    <Callout type="info">
      During the SDK 55 transition period, Expo Go on a physical device requires an SDK 54 project. We pin `blank-typescript@sdk-54` to ensure compatibility with Expo Go. See [Expo docs — create a project](https://docs.expo.dev/get-started/create-a-project/) for details.
    </Callout>
  </Step>

  <Step>
    Start the Expo dev server:

    ```bash
    npx expo start
    ```
  </Step>

  <Step>
    Then run the app on your physical device following Expo's official workflow.
  </Step>
</Steps>

At this point, you should see the default Expo screen rendered on your phone. This verifies your Expo setup before we add QVAC.

## Step 3: install QVAC

In this step we'll add the QVAC SDK dependency and complete the Expo-specific installation steps required for running local inference on a physical mobile device.

<Steps>
  <Step>
    Install the SDK:

    ```bash
    npm i @qvac/sdk
    ```
  </Step>

  <Step>
    Add the following peer dependencies to your `package.json`:

    ```json title="package.json"
    {
      "dependencies": {
        "@qvac/sdk": "^0.7.0",
        "bare-rpc": "^1.0.0", // [!code ++]
        "expo": "~54.0.33",
        "expo-status-bar": "~3.0.9",
        "react": "19.1.0",
        "react-native": "0.81.5",
        "react-native-bare-kit": "^0.11.5" // [!code ++]
      },
      "devDependencies": {
        "@types/react": "~19.1.0",
        "bare-pack": "^1.5.1", // [!code ++]
        "typescript": "~5.9.2"
      }
    }
    ```
  </Step>

  <Step>
    Install the dependencies:

    ```bash
    npm install
    npx expo install expo-file-system expo-build-properties expo-device
    ```

    <Callout type="success">
      **Tip:** use `npx expo install` for all `expo-*` packages to ensure compatibility with your project's Expo SDK version.
    </Callout>
  </Step>

  <Step>
    Configure `expo-build-properties` and add `@qvac/sdk/expo-plugin` to the `plugins` array in your `app.json`:

    ```json title="app.json"
    {
      "expo": {
        "plugins": [
          ["expo-build-properties", { // [!code ++]
            "android": { "minSdkVersion": 29 } // [!code ++]
          }], // [!code ++]
          "@qvac/sdk/expo-plugin" // [!code ++]
        ]
      }
    }
    ```
  </Step>

  <Step>
    Prebuild your project to generate the native files:

    ```bash
    npx expo prebuild
    ```
  </Step>
</Steps>
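
Because QVAC needs a physical device, it can help to fail early with a clear message instead of a confusing native error. The sketch below is a hypothetical helper (`checkRuntime` is not part of the QVAC SDK) that turns a "real device" flag into a result the UI can display. It assumes you pass in `Device.isDevice` from `expo-device`, which was installed in the previous step.

```ts
// Hypothetical helper, not part of @qvac/sdk: decide whether local inference
// should be attempted, based on a "running on real hardware" flag.
type RuntimeCheck = { ok: true } | { ok: false; reason: string };

function checkRuntime(isPhysicalDevice: boolean): RuntimeCheck {
  if (!isPhysicalDevice) {
    return {
      ok: false,
      reason: "QVAC requires a physical device; emulators and simulators are not supported.",
    };
  }
  return { ok: true };
}

// In the app (assumption: expo-device is installed), the call would look like:
//   import * as Device from "expo-device";
//   const check = checkRuntime(Device.isDevice);
//   if (!check.ok) setStatus(check.reason);
```

Keeping the check as a pure function makes it easy to test off-device; only the call site depends on `expo-device`.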

## Step 4: run a smoke test

Before creating the chat UI, we'll run a smoke test to validate the full QVAC lifecycle in an Expo app.

<Steps>
  <Step>
    Replace the contents of `App.tsx` with the following code:

    <WrapCode>
      ```tsx title="App.tsx" lineNumbers
      import React, { useEffect, useState } from "react";
      import { Platform, SafeAreaView, StatusBar, StyleSheet, Text, View } from "react-native";

      import {
        completion,
        downloadAsset,
        LLAMA_3_2_1B_INST_Q4_0,
        loadModel,
        type ModelProgressUpdate,
        unloadModel,
        VERBOSITY,
      } from "@qvac/sdk";

      export default function App() {
        const [status, setStatus] = useState("Starting…");
        const [output, setOutput] = useState("");
        const [progressPct, setProgressPct] = useState<number | null>(null);

        useEffect(() => {
          // Track model lifecycle and cancellation across async steps.
          let modelId: string | null = null;
          let cancelled = false;

          (async () => {
            try {
              setStatus("Downloading model…");
              await downloadAsset({
                assetSrc: LLAMA_3_2_1B_INST_Q4_0,
                onProgress: (progress: ModelProgressUpdate) => {
                  if (!cancelled) setProgressPct(Math.round(progress.percentage));
                },
              });

              if (cancelled) return;
              setProgressPct(null);

              setStatus("Loading model…");
              modelId = await loadModel({
                modelSrc: LLAMA_3_2_1B_INST_Q4_0,
                modelType: "llm",
                modelConfig: {
                  device: "gpu",
                  ctx_size: 2048,
                  verbosity: VERBOSITY.ERROR,
                },
                onProgress: (progress: ModelProgressUpdate) => {
                  if (!cancelled) setProgressPct(Math.round(progress.percentage));
                },
              });
              setProgressPct(null);

              if (cancelled) return;

              // Run a streaming completion and update the UI as tokens arrive.
              setStatus("Running completion…");

              const result = completion({
                modelId,
                history: [{ role: "user", content: "Say hello in one short sentence." }],
                stream: true,
              });

              let acc = "";
              for await (const token of result.tokenStream) {
                acc += token;
                if (!cancelled) setOutput(acc);
              }

              setStatus("Done ✅");
            } catch (e: any) {
              setStatus(`Error: ${e?.message ?? String(e)}`);
            }
          })();

          return () => {
            // Cleanup: cancel any in-flight work and unload the model.
            cancelled = true;
            if (modelId) {
              void unloadModel({ modelId, clearStorage: false }).catch(() => {});
            }
          };
        }, []);

        return (
          <SafeAreaView style={styles.safe}>
            {/* Minimal UI for the smoke test: title, status, streamed output. */}
            <View style={styles.container}>
              <Text style={styles.h1}>QVAC Smoke Test</Text>
              <Text style={styles.status}>
                {status}
                {progressPct != null ? ` (${progressPct}%)` : ""}
              </Text>
              {progressPct != null && (
                <View style={styles.progressBar}>
                  <View style={[styles.progressFill, { width: `${progressPct}%` }]} />
                </View>
              )}
              <Text style={styles.output}>{output}</Text>
            </View>
          </SafeAreaView>
        );
      }

      const styles = StyleSheet.create({
        // Basic dark theme layout.
        safe: { flex: 1, backgroundColor: "#0B0B0F", paddingTop: Platform.OS === "android" ? StatusBar.currentHeight : 0 },
        container: { flex: 1, padding: 16, gap: 12 },
        h1: { color: "white", fontSize: 18, fontWeight: "600" },
        status: { color: "#A7A7B3" },
        progressBar: {
          height: 8,
          backgroundColor: "#1A1A22",
          borderRadius: 4,
          overflow: "hidden",
        },
        progressFill: {
          height: "100%",
          backgroundColor: "#22C55E",
          borderRadius: 4,
        },
        output: { color: "white", fontSize: 16, lineHeight: 22 },
      });
      ```
    </WrapCode>
  </Step>

  <Step>
    Run the app on your physical device:

    ```bash
    # From the project root:
    npx expo run:android --device
    # or
    npx expo run:ios --device
    ```
  </Step>
</Steps>

On the first run, the model may take a while to download and load. Keep an eye on the terminal logs. To confirm the smoke test worked you should see:

* A status line progressing through **Downloading model…**, **Loading model…**, and **Running completion…**
* A short assistant output streaming into the UI
* A final status of **Done ✅**

## Step 5: add the chat UI

Now that QVAC is working in your Expo app, we'll replace the smoke test UI with a minimal chat interface:

* A message list with left/right bubbles
* A text input for composing messages
* Streaming updates into the latest assistant message bubble
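
Streaming into the latest assistant bubble comes down to an immutable array update: each new token produces a new message array in which only the target message is replaced. A minimal sketch of that update as a pure function, using the same `ChatMessage` shape as the chat code (`withContent` is a hypothetical helper name):

```ts
type Role = "user" | "assistant";
type ChatMessage = { id: string; role: Role; content: string };

// Return a new array where only the message with the given id gets new
// content. Other message objects are reused, so React can skip re-rendering
// bubbles whose props did not change.
function withContent(
  messages: ChatMessage[],
  id: string,
  content: string,
): ChatMessage[] {
  return messages.map((m) => (m.id === id ? { ...m, content } : m));
}
```

In the component, this is the function you would pass to the state setter, for example `setMessages((prev) => withContent(prev, assistantId, acc))`.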

<Steps>
  <Step>
    Replace the contents of `App.tsx` with the following code:

    <WrapCode>
      ```tsx title="App.tsx" lineNumbers
      import React, { useEffect, useMemo, useRef, useState } from "react";
      import {
        ActivityIndicator,
        FlatList,
        KeyboardAvoidingView,
        Platform,
        SafeAreaView,
        StatusBar,
        StyleSheet,
        Text,
        TextInput,
        View,
      } from "react-native";

      import {
        completion,
        downloadAsset,
        LLAMA_3_2_1B_INST_Q4_0,
        loadModel,
        type ModelProgressUpdate,
        unloadModel,
        VERBOSITY,
      } from "@qvac/sdk";

      // Basic chat message shape for the UI.
      type Role = "user" | "assistant";
      type ChatMessage = { id: string; role: Role; content: string };

      function makeId() {
        // Lightweight unique-ish ID for list keys and message tracking.
        return `${Date.now()}-${Math.random().toString(16).slice(2)}`;
      }

      export default function App() {
        // Model lifecycle state.
        const [modelId, setModelId] = useState<string | null>(null);
        const [status, setStatus] = useState<string>("Initializing…");
        const [downloadPct, setDownloadPct] = useState<number | null>(null);

        // Chat UI state.
        const [input, setInput] = useState("");
        const [messages, setMessages] = useState<ChatMessage[]>([]);
        const [isGenerating, setIsGenerating] = useState(false);

        // Keep refs to the list and latest message array for async usage.
        const listRef = useRef<FlatList<ChatMessage>>(null);
        const messagesRef = useRef<ChatMessage[]>([]);
        messagesRef.current = messages;

        // Enable send only when ready and input isn't empty.
        const canSend = useMemo(() => {
          return !!modelId && !isGenerating && input.trim().length > 0;
        }, [modelId, isGenerating, input]);

        // Keep scrolled to bottom as messages grow.
        useEffect(() => {
          const t = setTimeout(() => {
            listRef.current?.scrollToEnd({ animated: true });
          }, 0);
          return () => clearTimeout(t);
        }, [messages]);

        useEffect(() => {
          // Initialize the model once on mount.
          let cancelled = false;
          // Track the loaded model in a local variable: the cleanup closure below
          // only ever sees the initial (null) value of the `modelId` state.
          let loadedId: string | null = null;

          (async () => {
            try {
              setStatus("Downloading model…");

              await downloadAsset({
                assetSrc: LLAMA_3_2_1B_INST_Q4_0,
                onProgress: (progress: ModelProgressUpdate) => {
                  if (!cancelled) setDownloadPct(Math.round(progress.percentage));
                },
              });

              if (cancelled) return;

              // Load the model into memory so we can run completions.
              setStatus("Loading model into memory…");

              const id = await loadModel({
                modelSrc: LLAMA_3_2_1B_INST_Q4_0,
                modelType: "llm",
                modelConfig: {
                  device: "gpu",
                  ctx_size: 2048,
                  verbosity: VERBOSITY.ERROR,
                },
                onProgress: (progress: ModelProgressUpdate) => {
                  if (!cancelled) setDownloadPct(Math.round(progress.percentage));
                },
              });

              if (cancelled) {
                // Unmounted while loading: release the model right away.
                void unloadModel({ modelId: id, clearStorage: false }).catch(() => {});
                return;
              }

              loadedId = id;
              setModelId(id);
              setStatus("Ready");
              setDownloadPct(null);
            } catch (e: any) {
              if (!cancelled) {
                setStatus(`Init failed: ${e?.message ?? String(e)}`);
              }
            }
          })();

          return () => {
            // Cleanup on unmount: stop updates and unload the model
            // (without clearing the downloaded files).
            // Note: React cleanup can’t be async directly, so we fire-and-forget.
            cancelled = true;
            if (loadedId) {
              void unloadModel({ modelId: loadedId, clearStorage: false }).catch(() => {});
            }
          };
        }, []);

        async function handleSend() {
          // Guard against sending before the model is ready or while generating.
          if (!modelId || isGenerating) return;

          const trimmed = input.trim();
          if (!trimmed) return;

          setInput("");
          setIsGenerating(true);

          // Append user message and a placeholder assistant message for streaming.
          const userMsg: ChatMessage = { id: makeId(), role: "user", content: trimmed };
          const assistantId = makeId();
          const assistantMsg: ChatMessage = { id: assistantId, role: "assistant", content: "" };

          setMessages((prev) => [...prev, userMsg, assistantMsg]);

          try {
            // Build chat history for the completion request.
            const history = [...messagesRef.current, userMsg].map((m) => ({
              role: m.role,
              content: m.content,
            }));

            // Run a streaming completion and update the last assistant bubble.
            const result = completion({
              modelId,
              history,
              stream: true,
            });

            let acc = "";

            for await (const token of result.tokenStream) {
              acc += token;

              // Update only the last assistant message content
              setMessages((prev) =>
                prev.map((m) => (m.id === assistantId ? { ...m, content: acc } : m))
              );
            }

            // Optional: stats (log only)
            try {
              const stats = await result.stats;
              console.log("📊 Completion stats:", stats);
            } catch {}
          } catch (e: any) {
            // Show any error in the assistant bubble.
            setMessages((prev) =>
              prev.map((m) =>
                m.id === assistantId
                  ? { ...m, content: `❌ Error: ${e?.message ?? String(e)}` }
                  : m
              )
            );
          } finally {
            setIsGenerating(false);
          }
        }

        return (
          <SafeAreaView style={styles.safe}>
            {/* Chat layout: header, message list, input row, and hint. */}
            <KeyboardAvoidingView
              style={styles.safe}
              behavior="padding"
              keyboardVerticalOffset={Platform.OS === "ios" ? 8 : StatusBar.currentHeight || 0}
            >
              <View style={styles.header}>
                <Text style={styles.title}>QVAC Expo Chat</Text>
                <Text style={styles.subtitle}>
                  {status}
                  {downloadPct != null ? ` (${downloadPct}%)` : ""}
                </Text>
                {downloadPct != null && (
                  <View style={styles.progressBar}>
                    <View style={[styles.progressFill, { width: `${downloadPct}%` }]} />
                  </View>
                )}
              </View>

              <View style={styles.chat}>
                <FlatList
                  ref={listRef}
                  data={messages}
                  keyExtractor={(m) => m.id}
                  renderItem={({ item }) => (
                    <View
                      style={[
                        styles.bubble,
                        item.role === "user" ? styles.bubbleUser : styles.bubbleAssistant,
                      ]}
                    >
                      <Text style={styles.bubbleText}>{item.content}</Text>
                    </View>
                  )}
                  contentContainerStyle={styles.chatContent}
                />
              </View>

              <View style={styles.inputRow}>
                <TextInput
                  style={styles.input}
                  value={input}
                  onChangeText={setInput}
                  placeholder={modelId ? "Type a message…" : "Loading model…"}
                  editable={!!modelId && !isGenerating}
                  returnKeyType="send"
                  onSubmitEditing={handleSend}
                  blurOnSubmit={false}
                />
                {isGenerating ? <ActivityIndicator /> : null}
              </View>

              <Text style={styles.hint}>
                Press “send/enter” to submit. Messages are streamed token-by-token.
              </Text>
            </KeyboardAvoidingView>
          </SafeAreaView>
        );
      }

      const styles = StyleSheet.create({
        // Simple dark theme chat UI.
        safe: { flex: 1, backgroundColor: "#0B0B0F", paddingTop: Platform.OS === "android" ? StatusBar.currentHeight : 0 },
        header: { paddingHorizontal: 16, paddingTop: 12, paddingBottom: 8 },
        title: { color: "white", fontSize: 18, fontWeight: "600" },
        subtitle: { color: "#A7A7B3", marginTop: 4 },
        progressBar: {
          height: 8,
          backgroundColor: "#1A1A22",
          borderRadius: 4,
          overflow: "hidden",
          marginTop: 8,
        },
        progressFill: {
          height: "100%",
          backgroundColor: "#22C55E",
          borderRadius: 4,
        },

        chat: { flex: 1 },
        chatContent: { paddingHorizontal: 16, paddingVertical: 12, gap: 10 },

        bubble: {
          maxWidth: "85%",
          paddingHorizontal: 12,
          paddingVertical: 10,
          borderRadius: 14,
        },
        bubbleUser: {
          alignSelf: "flex-end",
          backgroundColor: "#2B2BFF",
        },
        bubbleAssistant: {
          alignSelf: "flex-start",
          backgroundColor: "#1A1A22",
        },
        bubbleText: { color: "white", lineHeight: 20 },

        inputRow: {
          paddingHorizontal: 16,
          paddingVertical: 10,
          borderTopWidth: StyleSheet.hairlineWidth,
          borderTopColor: "#2A2A33",
          flexDirection: "row",
          gap: 10,
          alignItems: "center",
        },
        input: {
          flex: 1,
          backgroundColor: "#121219",
          color: "white",
          paddingHorizontal: 12,
          paddingVertical: 10,
          borderRadius: 12,
        },
        hint: {
          paddingHorizontal: 16,
          paddingBottom: 12,
          color: "#7E7E8A",
          fontSize: 12,
        },
      });
      ```
    </WrapCode>
  </Step>
</Steps>

The assistant response is streamed **token by token** as the model generates text.
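
Each request in `handleSend` sends the whole conversation so far, reduced to `role` and `content`. The sketch below isolates that step as a pure function. `buildHistory` and `HistoryEntry` are illustrative names that mirror the object shape used in this tutorial, not types exported by the SDK, and skipping empty messages (such as the streaming placeholder) is an extra safeguard assumed here.

```ts
type Role = "user" | "assistant";
type ChatMessage = { id: string; role: Role; content: string };
// Mirrors the objects this tutorial passes as `history` (an assumption,
// not an SDK type).
type HistoryEntry = { role: Role; content: string };

// Build the history for the next completion: previous messages plus the new
// user message, without UI-only fields and without empty placeholders.
function buildHistory(messages: ChatMessage[], next: ChatMessage): HistoryEntry[] {
  return [...messages, next]
    .filter((m) => m.content.trim().length > 0)
    .map(({ role, content }) => ({ role, content }));
}
```

Because the history grows with every turn while the model is loaded with `ctx_size: 2048`, long conversations may eventually need trimming; a function like this is a natural place to add that later.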

## Task completed

Run the app again on your physical device:

```bash
npx expo run:ios --device
# or
npx expo run:android --device
```

On the first run, the model may download from peers (watch the terminal for progress). Once it finishes, type a message and press the keyboard's **Send** (or **Enter**) key. The response should stream into the UI token by token.
