AiSearchInstance

Source: src/Cloudflare/AiSearch/AiSearchInstance.ts

A Cloudflare AI Search (formerly AutoRAG) instance — a fully managed retrieval-augmented generation pipeline over your own data.

An instance continuously indexes a data source (an R2 bucket or a web crawl), embeds it into a managed Vectorize index, and answers search and chat queries against it. Creation returns immediately; the initial indexing run happens asynchronously.

The instanceId, namespace, type, source, and embeddingModel are fixed at creation; changing any of them triggers a replacement. Everything else (models, chunking, caching, reranking, public endpoint, sync interval) can be updated in place.
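The replace-versus-update rule can be sketched as a small diff over the props. This is an illustration only, not the resource's actual implementation: the InstanceProps shape is a trimmed, hypothetical subset and planChange is a made-up helper name.

```typescript
// Hypothetical, trimmed-down props shape; the real resource accepts more fields.
interface InstanceProps {
  instanceId?: string;
  namespace?: string;
  type?: "r2" | "web-crawler";
  source: string;
  embeddingModel?: string;
  chunkSize?: number;
  maxNumResults?: number;
}

// The five properties the documentation lists as fixed at creation.
const IMMUTABLE_KEYS = [
  "instanceId",
  "namespace",
  "type",
  "source",
  "embeddingModel",
] as const;

type Plan = "noop" | "update" | "replace";

// Decide what a deploy would have to do to move from `prev` to `next`.
function planChange(prev: InstanceProps, next: InstanceProps): Plan {
  // Any change to an immutable property forces a replacement.
  if (IMMUTABLE_KEYS.some((key) => prev[key] !== next[key])) {
    return "replace";
  }
  // Otherwise, any other difference can be applied in place.
  const keys = [
    ...Object.keys(prev),
    ...Object.keys(next),
  ] as (keyof InstanceProps)[];
  return keys.some((key) => prev[key] !== next[key]) ? "update" : "noop";
}
```

Under these assumptions, changing chunkSize yields "update", while pointing source at a different bucket yields "replace".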

For the common R2 case, prefer the {@link AiSearch} construct, which also mints the service token the indexer needs to read your bucket. Use this low-level resource directly when you manage the token yourself, share one token across instances, or group instances under an {@link AiSearchNamespace}.

Creating an Instance

R2-backed instance

An R2 source needs a service token to read the bucket. Either pass a tokenId (see {@link AiSearchToken}) or let the {@link AiSearch} construct provision one for you.

const bucket = yield* Cloudflare.R2Bucket("docs", {});
// `serviceToken` is an existing AiSearchToken declared elsewhere in your stack.
const search = yield* Cloudflare.AiSearchInstance("docs-search", {
  source: bucket.bucketName,
  tokenId: serviceToken.id,
});

Tuned retrieval settings

const search = yield* Cloudflare.AiSearchInstance("docs-search", {
  source: bucket.bucketName,
  aiSearchModel: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  chunkSize: 512,
  chunkOverlap: 64,
  maxNumResults: 20,
  cache: true,
  cacheThreshold: "close_enough",
});

R2 source options

For an r2 source, sourceParams filters which objects are indexed (all fields optional):

prefix — only index keys under this prefix.
includeItems / excludeItems — micromatch glob patterns (* within a path segment, ** across segments; max 10 each). Only objects matching an includeItems pattern are indexed; excludeItems takes precedence.
r2Jurisdiction — R2 data-residency jurisdiction of the source bucket.

const search = yield* Cloudflare.AiSearchInstance("docs-search", {
  source: bucket.bucketName,
  tokenId: serviceToken.id,
  sourceParams: {
    prefix: "docs/",
    includeItems: ["/docs/**"],
    excludeItems: ["/docs/drafts/**"],
  },
});
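The include/exclude precedence described above can be illustrated with a minimal sketch. It is not micromatch and not Cloudflare's matcher: globToRegExp and shouldIndex are hypothetical helpers that understand only * and **. It also assumes an unset includeItems means "index everything". The exact string form the patterns are tested against (with or without a leading slash) is not specified here, so the sketch simply matches whatever path it is given.

```typescript
// Hypothetical filter shape mirroring the two glob fields of sourceParams.
interface ItemFilter {
  includeItems?: string[];
  excludeItems?: string[];
}

// Simplified glob translation: `**` crosses path segments, `*` stays within one.
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        pattern += ".*"; // `**`: any characters, including "/"
        i++;
      } else {
        pattern += "[^/]*"; // `*`: any characters except "/"
      }
    } else {
      // Escape regex metacharacters so they match literally.
      pattern += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${pattern}$`);
}

// Exclusions win; otherwise a path must match an include pattern when any are set.
function shouldIndex(path: string, filter: ItemFilter): boolean {
  const matchesAny = (globs: string[]) =>
    globs.some((glob) => globToRegExp(glob).test(path));
  if (filter.excludeItems && matchesAny(filter.excludeItems)) return false;
  if (filter.includeItems && filter.includeItems.length > 0) {
    return matchesAny(filter.includeItems);
  }
  return true;
}
```

With the patterns from the example above, a path such as /docs/guide/intro.md passes, /docs/drafts/wip.md is rejected by the exclusion, and /blog/post.md is rejected because it matches no include pattern.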

Web-crawler source options

sourceParams.webCrawler tunes how a web-crawler source is fetched, parsed, and stored. All fields are optional.

parseType selects how pages are discovered:

"sitemap" (Cloudflare default) — read <seed>/sitemap.xml (discovered via robots.txt) and index the URLs it lists.
"crawl" — start at source and follow links.
"feed-rss" — treat the seed as an RSS / Atom feed.

crawlOptions controls link discovery (mainly for parseType: "crawl"):

depth — how many links deep to follow from the seed.
includeSubdomains — also crawl subdomains of the seed host.
includeExternalLinks — follow links off the seed host.
maxAge — skip re-fetching pages younger than this (seconds).
source — where links come from: "all", "sitemaps", or "links".

parseOptions controls how each page is parsed:

useBrowserRendering — render JS in a headless browser before parsing.
includeImages — index image content.
specificSitemaps — explicit sitemap URLs to read (for "sitemap").
contentSelector — { path, selector }[] CSS selectors scoping which part of a page is indexed per URL path.
includeHeaders — extra request headers sent while crawling.

storeOptions overrides where crawled content is stored — Cloudflare provisions managed storage by default:

storageId — R2 bucket name to store crawl output in.
storageType — "r2".
r2Jurisdiction — R2 data-residency jurisdiction for the store bucket.

Basic web-crawler instance

const search = yield* Cloudflare.AiSearchInstance("site-search", {
  type: "web-crawler",
  source: "https://example.com",
  sourceParams: { webCrawler: { parseType: "crawl" } },
});

Fully-configured crawl

const search = yield* Cloudflare.AiSearchInstance("site-search", {
  type: "web-crawler",
  source: "https://example.com",
  sourceParams: {
    webCrawler: {
      parseType: "crawl",
      crawlOptions: {
        depth: 3,
        includeSubdomains: true,
        includeExternalLinks: false,
        maxAge: 86_400,
        source: "all",
      },
      parseOptions: {
        useBrowserRendering: true,
        includeImages: false,
        contentSelector: [{ path: "/docs", selector: "main" }],
      },
    },
  },
});
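To make the link-discovery flags concrete, here is one plausible reading of depth, includeSubdomains, includeExternalLinks, and maxAge as plain predicates. This is an interpretation of the option descriptions, not Cloudflare's crawler logic. The helper names shouldFollow and needsRefetch are hypothetical, and treating subdomains as governed only by includeSubdomains is an assumption.

```typescript
// Hypothetical subset of crawlOptions used by this sketch.
interface CrawlOptions {
  depth: number;
  includeSubdomains: boolean;
  includeExternalLinks: boolean;
  maxAge: number; // seconds
}

// Would a link found `linkDepth` hops from the seed be followed?
function shouldFollow(
  link: string,
  seed: string,
  linkDepth: number,
  options: CrawlOptions,
): boolean {
  if (linkDepth > options.depth) return false; // too far from the seed
  const seedHost = new URL(seed).hostname;
  const linkHost = new URL(link).hostname;
  if (linkHost === seedHost) return true; // same host is always in scope
  if (linkHost.endsWith(`.${seedHost}`)) return options.includeSubdomains;
  return options.includeExternalLinks; // any other host counts as external
}

// Pages younger than maxAge are skipped; older ones are fetched again.
function needsRefetch(
  lastFetchedMs: number,
  nowMs: number,
  maxAgeSeconds: number,
): boolean {
  return nowMs - lastFetchedMs >= maxAgeSeconds * 1000;
}
```

Under this reading, the configuration above follows https://blog.example.com links up to three hops from the seed, ignores other hosts, and leaves pages fetched less than a day ago untouched.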

Sitemap and RSS sources

// Index the URLs listed in one or more sitemaps (the default parse mode).
const fromSitemap = yield* Cloudflare.AiSearchInstance("sitemap-search", {
  type: "web-crawler",
  source: "https://example.com",
  sourceParams: {
    webCrawler: {
      parseType: "sitemap",
      parseOptions: { specificSitemaps: ["https://example.com/sitemap.xml"] },
    },
  },
});

// Treat the seed as an RSS / Atom feed.
const fromFeed = yield* Cloudflare.AiSearchInstance("feed-search", {
  type: "web-crawler",
  source: "https://example.com/feed.xml",
  sourceParams: { webCrawler: { parseType: "feed-rss" } },
});

Store crawl output in a specific R2 bucket

const search = yield* Cloudflare.AiSearchInstance("site-search", {
  type: "web-crawler",
  source: "https://example.com",
  sourceParams: {
    webCrawler: {
      parseType: "crawl",
      storeOptions: { storageId: "my-crawl-bucket", storageType: "r2" },
    },
  },
});

Grouping under a namespace

Instances live in a namespace (the account-provided default when unspecified). Pass an {@link AiSearchNamespace}’s name to group related instances. The engine then orders this instance after the namespace on deploy. The namespace is immutable; changing it replaces the instance.

const ns = yield* Cloudflare.AiSearchNamespace("docs-ns", {});
const search = yield* Cloudflare.AiSearchInstance("docs-search", {
  source: bucket.bucketName,
  namespace: ns.name,
});

Binding to an Effect Worker

Bind the instance during the Worker’s init phase with Cloudflare.AiSearchInstance.bind(instance), which attaches the single-instance ai_search binding and returns an Effect-native client whose search / chatCompletions methods return Effects. Provide {@link AiSearchInstanceBindingLive} in the Worker’s runtime layer.

import * as Cloudflare from "alchemy/Cloudflare";
import * as Effect from "effect/Effect";
import { HttpServerRequest } from "effect/unstable/http/HttpServerRequest";
import * as HttpServerResponse from "effect/unstable/http/HttpServerResponse";

export default class Api extends Cloudflare.Worker<Api>()(
  "api",
  { main: import.meta.filename },
  Effect.gen(function* () {
    // `Search` is the AiSearchInstance resource declared elsewhere in your stack.
    const search = yield* Cloudflare.AiSearchInstance.bind(Search);

    return {
      fetch: Effect.gen(function* () {
        const request = yield* HttpServerRequest;
        const query = new URL(request.url).searchParams.get("q") ?? "";
        const answer = yield* search.chatCompletions({
          messages: [{ role: "user", content: query }],
        });
        return yield* HttpServerResponse.json(answer);
      }),
    };
  }).pipe(Effect.provide(Cloudflare.AiSearchInstanceBindingLive)),
) {}

Binding to an Async Worker

For a vanilla async fetch Worker, pass the instance under Worker.env. The engine attaches the same ai_search binding, and InferEnv types env.SEARCH as the runtime AiSearchInstance handle.

export const Api = Cloudflare.Worker("api", {
  main: "./worker.ts",
  env: { SEARCH: search },
});
export type ApiEnv = Cloudflare.InferEnv<typeof Api>;

// worker.ts
export default {
  async fetch(request: Request, env: ApiEnv): Promise<Response> {
    const query = new URL(request.url).searchParams.get("q") ?? "";
    return Response.json(
      await env.SEARCH.chatCompletions({
        messages: [{ role: "user", content: query }],
      }),
    );
  },
};