API

Every public symbol exported from @nkwib/tapedeck. Source of truth: src/index.ts.

| Export | Kind | What it does |
| --- | --- | --- |
| cassetteMiddleware | function | Record/replay/live middleware for wrapLanguageModel. |
| withCassette | function | Vitest helper: pin a test to a named cassette. |
| toFollowRoute | function | Vitest matcher: assert a tool trajectory follows a ToolRoute router. |
| CassetteError | class | Base class for the whole error family. |
| CassetteMissError | class | Replay miss: no cassette matched the hash. |
| CassetteSecretError | class | A replayed cassette still holds unredacted secrets. |
| CassetteCorruptError | class | Unreadable cassette: bad JSON, version, or shape. |
| CassetteModeError | class | Invalid mode string. |
| computeCassetteHash | function | The stable hash used for cassette identity (async, WebCrypto). |
| loadCassette | function | Read a hash-addressed cassette from a directory. |
| saveCassette | function | Write a hash-addressed cassette into a directory. |
| parseCassette | function | Parse + validate raw cassette text (single or multi). |
| serializeCassette | function | Serialize a cassette file to its on-disk form. |
| isMultiCassette | function | Narrow a CassetteFile to the multi-interaction format. |
| diffCassettes | function | Semantic field-level diff of two single cassettes. |
| diffCassetteFiles | function | Diff any two cassette files, pairing interactions by hash. |
| mergeCassetteDirs | function | Merge cassette directories with conflict reporting. |
| fileCassetteStore | function | The default filesystem CassetteStore. |
| memoryCassetteStore | function | In-memory CassetteStore for tests and edge runtimes. |
| withSpan | function | Run a function inside an OTel-compatible span. |
| stableStringify | function | Deterministic, key-sorted JSON. |
| normalizeTools | function | Strip tool descriptions before hashing. |
| cassetteFilename | function | On-disk filename for a hash. |
| CASSETTE_VERSION | const | Single-interaction cassette format version. |
| MULTI_CASSETTE_VERSION | const | Multi-interaction cassette format version. |
| REDACTED | const | Placeholder written in place of a secret. |
| DEFAULT_REDACT | const | Built-in key matchers. |

# function cassetteMiddleware
function cassetteMiddleware(options?: CassetteMiddlewareOptions): LanguageModelV3Middleware

Returns an AI SDK LanguageModelV3Middleware for use with wrapLanguageModel. Intercepts both doGenerate (one-shot) and doStream (streaming). Behaviour is driven by mode, typically read from an environment variable, so you can switch between recording and replaying with no other code changes. An invalid static mode throws CassetteModeError at construction, before the first model call.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| mode | 'record' \| 'replay' \| 'live' | 'live' | Operating mode. record calls the real model and persists request + response; replay serves a cassette by hash and throws on a miss; live is passthrough. |
| cassetteDir | string | './cassettes' | Directory cassettes are read from / written to. |
| redact | (string \| RegExp)[] | [] | Extra key matchers, merged with DEFAULT_REDACT. Strings match field/header names case-insensitively; RegExps test the raw key. |
| cassetteName | string | | Force a specific filename instead of hash-addressed lookup. Named cassettes are multi-interaction: every call is stored in the file keyed by request hash. Mostly used internally by withCassette; set it for fixed fixtures. |
| store | CassetteStore | filesystem | Storage backend (read/write/list). Pass memoryCassetteStore() on edge runtimes where there is no filesystem. |
| tracer | TapedeckTracer | | OTel-compatible tracer (e.g. trace.getTracer('tapedeck')). Emits tapedeck.generate / tapedeck.stream spans with mode, hash, path, and hit/miss attributes; misses record the exception with an error status. |

import { openai } from '@ai-sdk/openai';
import { generateText, wrapLanguageModel } from 'ai';
import { cassetteMiddleware } from '@nkwib/tapedeck';

const model = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: cassetteMiddleware({
    mode: process.env.CASSETTE_MODE ?? 'live', // record | replay | live
    cassetteDir: './cassettes',
    redact: ['apiKey', 'authorization', /token/i],
  }),
});

const { text } = await generateText({ model, prompt: 'Say hi' });

When the ambient withCassette context is active, its mode, cassetteDir, and cassetteName take precedence over the values passed here.

Streaming is first-class
In record mode tapedeck drains the live stream, persists the ordered parts, then re-serves them so your code still receives the response. In replay mode the parts are replayed as a genuine ReadableStream via the SDK's own simulateReadableStream.
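
The drain-then-re-serve flow can be sketched with plain web streams. This is a minimal illustration, not tapedeck's implementation: drainStream and streamFromParts are hypothetical helper names, and the replay side builds a ReadableStream by hand instead of calling the SDK's simulateReadableStream.

```typescript
// Read every part of a stream into an array, preserving order.
// This is the "drain" half of record mode: the parts can now be persisted.
async function drainStream<T>(stream: ReadableStream<T>): Promise<T[]> {
  const parts: T[] = [];
  const reader = stream.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    parts.push(value);
  }
  return parts;
}

// Build a fresh stream that emits the stored parts in order.
// This is the "re-serve" half, and also what a replay would hand back.
function streamFromParts<T>(parts: readonly T[]): ReadableStream<T> {
  return new ReadableStream<T>({
    start(controller) {
      for (const part of parts) controller.enqueue(part);
      controller.close();
    },
  });
}
```

A caller that drains a live stream and then consumes streamFromParts(parts) sees the same parts in the same order, which is why the recording step is invisible to application code.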
# function withCassette
function withCassette<T>(
  cassetteName: string,
  testFn: () => T | Promise<T>,
  options?: WithCassetteOptions
): Promise<T>

From @nkwib/tapedeck/vitest. Runs testFn with cassetteName pinned and replay forced for its duration, publishing an ambient context (via AsyncLocalStorage) that any active cassetteMiddleware instance picks up. The context tears down automatically on exit — no global setup/teardown needed.

options.mode overrides the forced replay; options.cassetteDir overrides the directory. The WithCassetteOptions shape is { cassetteDir?: string; mode?: CassetteMode }.

The named cassette is multi-interaction: every model call inside testFn is stored in the one file keyed by request hash, and each call replays its own response in any order. Each withCassette run is one recording session — in record mode the first write starts the file fresh, so re-recording never leaves stale interactions behind.

import { describe, it, expect } from 'vitest';
import { withCassette } from '@nkwib/tapedeck/vitest';

describe('checkout agent', () => {
  it('runs the checkout flow', async () => {
    await withCassette('checkout-flow.json', async () => {
      const result = await runAgent({ prompt: 'buy a t-shirt' });
      expect(result.steps).toHaveLength(3);
    });
  });
});
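
The ambient-context mechanism can be sketched with Node's AsyncLocalStorage. This is an assumption-laden illustration of the technique, not tapedeck's source: AmbientCassette, withAmbientCassette, and currentCassette are hypothetical names, and the real context may carry more fields.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical shape of the ambient context.
interface AmbientCassette {
  cassetteName: string;
  mode: 'record' | 'replay' | 'live';
}

const ambient = new AsyncLocalStorage<AmbientCassette>();

// Run fn with a cassette pinned. The store is visible to everything fn awaits
// and is no longer visible once control leaves the run() callback.
function withAmbientCassette<T>(
  cassetteName: string,
  fn: () => T | Promise<T>,
  mode: AmbientCassette['mode'] = 'replay',
): Promise<T> {
  return ambient.run({ cassetteName, mode }, async (): Promise<T> => fn());
}

// What a middleware could call on each request to find the active cassette.
function currentCassette(): AmbientCassette | undefined {
  return ambient.getStore();
}
```

Because the store is scoped to the async call tree, concurrent tests each see their own cassette without any global setup or teardown.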
Re-exported core surface
tapedeck/vitest also re-exports cassetteMiddleware and the CassetteMiddlewareOptions / CassetteMode types, so a test file can import everything it needs from the one entry point.
# class CassetteError
class CassetteError extends Error {}

The base class every tapedeck error extends. Catch it to handle the whole family with a single instanceof. The constructor sets name from the concrete subclass and restores the prototype chain (Object.setPrototypeOf), so instanceof works across module boundaries and downlevel transpilation.

import { CassetteError } from '@nkwib/tapedeck';

try {
  await generateText({ model, prompt });
} catch (err) {
  if (err instanceof CassetteError) {
    // a miss, a leaked secret, a corrupt file, or a bad mode
  }
  throw err;
}
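
The constructor behaviour described above (name taken from the concrete subclass, prototype chain restored) is a common pattern for custom errors. A minimal sketch, with hypothetical class names SketchBaseError and SketchMissError standing in for the real classes:

```typescript
class SketchBaseError extends Error {
  constructor(message: string) {
    super(message);
    // new.target is the class actually being constructed, so subclasses
    // get their own name without overriding anything.
    this.name = new.target.name;
    // When compiled to ES5, `extends Error` loses the subclass prototype;
    // restoring it keeps instanceof working.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class SketchMissError extends SketchBaseError {
  readonly hash: string;
  constructor(hash: string) {
    super(`no cassette matched ${hash}`);
    this.hash = hash;
  }
}
```

With this in place a single `instanceof SketchBaseError` check covers every subclass, and `err.name` still identifies the concrete one.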
# class CassetteMissError
class CassetteMissError extends CassetteError {
  readonly hash: string;
  readonly cassetteDir: string;
  readonly cassettePath: string;
}

Thrown in replay mode when no cassette matches the request hash. The message embeds the computed hash, the path searched, and a hint to re-run with CASSETTE_MODE=record. This is the load-bearing failure: a changed prompt or tool schema produces a different hash, misses, and fails CI loudly instead of replaying stale data.

# class CassetteSecretError
class CassetteSecretError extends CassetteError {
  readonly paths: string[];
  readonly cassettePath: string | undefined;
}

Thrown when a cassette being replayed still contains a value that the active redact matchers would have stripped — i.e. a secret leaked into a committed cassette. paths lists the offending dotted field paths so the leak is easy to locate before it ships.

# class CassetteCorruptError
class CassetteCorruptError extends CassetteError {
  readonly cassettePath: string;
  readonly reason: string;
}

Thrown when a cassette file exists but is unreadable: invalid JSON, an unknown or missing version, a malformed response shape, or a response type that doesn't match the call (a stream cassette served to doGenerate, or vice versa).

# class CassetteModeError
class CassetteModeError extends CassetteError {
  readonly mode: string;
}

Thrown when an invalid mode string is supplied — anything other than record, replay, or live. Raised eagerly at middleware construction when mode is statically set, and otherwise when the ambient context resolves.

# function computeCassetteHash
function computeCassetteHash(request: CassetteRequestKey): Promise<string>

The stable hash that gives a cassette its identity. Resolves to the bare hex SHA-256 digest of the canonicalized, key-sorted JSON of { modelProvider, modelId, prompt, toolSchemas, maxOutputTokens, temperature, topP }. Tool schemas are normalized first (see normalizeTools), so cosmetic description changes don't invalidate a cassette but a changed prompt, input schema, or sampling param does. Async since 0.2.0 — hashing uses WebCrypto (crypto.subtle), available in Node ≥18, Workers, and browsers; digests are identical to the previous node:crypto implementation.

import { computeCassetteHash } from '@nkwib/tapedeck';

const hash = await computeCassetteHash({
  modelProvider: 'openai',
  modelId: 'gpt-4o',
  prompt: [{ role: 'user', content: [{ type: 'text', text: 'Say hi' }] }],
  temperature: 0.7,
});
// '9f2c…' (bare hex; callers prefix 'sha256:' for display)
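
The digest step can be sketched on its own. The helper below (sha256Hex, a hypothetical name) hashes an already-canonicalized string with WebCrypto and returns bare hex. It imports webcrypto from node:crypto so it runs on Node versions without a global crypto; in Workers or browsers the global crypto.subtle would be used instead.

```typescript
import { webcrypto } from 'node:crypto';

// Hash a canonical string with SHA-256 and return the lowercase hex digest.
async function sha256Hex(canonicalText: string): Promise<string> {
  const bytes = new TextEncoder().encode(canonicalText);
  const digest = await webcrypto.subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(digest), (byte) =>
    byte.toString(16).padStart(2, '0'),
  ).join('');
}
```

The result is stable only if the input text is stable, which is why the request is canonicalized with key-sorted JSON before it reaches this step.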
# function loadCassette
function loadCassette(hash: string, dir: string): Promise<CassetteFile | null>

Read a hash-addressed cassette from dir. Resolves to null on a miss (the file doesn't exist) and throws CassetteCorruptError for a file that exists but is unreadable or malformed. CassetteFile is the Cassette | MultiCassette union — narrow with isMultiCassette.

# function saveCassette
function saveCassette(hash: string, dir: string, cassette: Cassette): Promise<void>

Write cassette into dir under its hash-addressed filename, creating parent directories as needed. The file is pretty-printed JSON so it diffs cleanly in PRs.

# function parseCassette
function parseCassette(raw: string, path: string): CassetteFile

Parse and validate raw cassette text — single (v1) or multi-interaction (v2). Throws CassetteCorruptError for bad JSON, an unknown version, or a malformed response shape. path is only used in error messages.

# function serializeCassette
function serializeCassette(cassette: CassetteFile): string

Serialize a cassette file to its on-disk form: pretty-printed JSON with a trailing newline, so cassettes diff cleanly in PRs.

# function isMultiCassette
function isMultiCassette(file: CassetteFile): file is MultiCassette

Narrow a CassetteFile to the multi-interaction format. Multi cassettes hold interactions: { hash, request, response }[] — one entry per model call a named-cassette test makes.

# function diffCassettes
function diffCassettes(a: Cassette, b: Cassette): CassetteDiffResult

Structurally diff two single cassettes, ignoring recordedAt. The result lists leaf-level divergences as dotted paths (request.prompt[0].content[0].text) plus a hashChanged flag. formatCassetteDiff(result) renders it as human-readable text — this is what npx tapedeck diff prints.
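
To illustrate what a leaf-level diff with dotted paths looks like, here is a minimal sketch. leafDiffPaths is a hypothetical helper, not the library's algorithm; it does not skip recordedAt or compute hashChanged, and it only returns the paths that differ.

```typescript
type Container = Record<string, unknown> | unknown[];

function isContainer(value: unknown): value is Container {
  return typeof value === 'object' && value !== null;
}

// Walk two values in parallel and collect the path of every differing leaf.
function leafDiffPaths(a: unknown, b: unknown, path = ''): string[] {
  if (isContainer(a) && isContainer(b) && Array.isArray(a) === Array.isArray(b)) {
    const left = a as Record<string, unknown>;
    const right = b as Record<string, unknown>;
    const keys = Array.from(new Set([...Object.keys(left), ...Object.keys(right)]));
    const paths: string[] = [];
    for (const key of keys) {
      // Array indices render as [0]; object keys render as .key
      const childPath = Array.isArray(a)
        ? `${path}[${key}]`
        : path === '' ? key : `${path}.${key}`;
      paths.push(...leafDiffPaths(left[key], right[key], childPath));
    }
    return paths;
  }
  // Leaves (or mismatched container kinds) differ when not strictly equal.
  return a === b ? [] : [path];
}
```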

# function diffCassetteFiles
function diffCassetteFiles(a: CassetteFile, b: CassetteFile): CassetteFileDiffResult

Diff two cassette files of any format, pairing interactions by request hash. The result reports hashes present only on one side (onlyA / onlyB) and field-level divergences for shared hashes (changed). Render with formatCassetteFileDiff(result).

# function mergeCassetteDirs
function mergeCassetteDirs(
  srcDir: string,
  destDir: string,
  options?: { force?: boolean; store?: CassetteStore }
): Promise<MergeCassettesResult>

Merge every cassette in srcDir into destDir. New files are copied, identical files skipped, and same-name files with different content are reported as conflicts — left untouched unless force is set. Source files are validated before propagating, so a corrupt fixture fails the merge instead of spreading. Backs npx tapedeck merge.
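
The copy / skip / conflict rules can be sketched over two in-memory maps of filename to file text. Everything here is illustrative: mergeFileMaps and MergeSketchResult are hypothetical, JSON.parse stands in for real cassette validation, and whether a forced overwrite is still listed as a conflict is an assumption of this sketch.

```typescript
interface MergeSketchResult {
  copied: string[];
  skipped: string[];
  conflicts: string[];
}

function mergeFileMaps(
  src: Map<string, string>,
  dest: Map<string, string>,
  force = false,
): MergeSketchResult {
  const result: MergeSketchResult = { copied: [], skipped: [], conflicts: [] };

  // Validate every source file first so a corrupt one aborts before any write.
  src.forEach((text) => {
    JSON.parse(text); // throws SyntaxError on invalid JSON
  });

  src.forEach((text, name) => {
    const existing = dest.get(name);
    if (existing === undefined) {
      dest.set(name, text); // new file: copy
      result.copied.push(name);
    } else if (existing === text) {
      result.skipped.push(name); // identical: nothing to do
    } else {
      result.conflicts.push(name); // same name, different content
      if (force) dest.set(name, text); // overwrite only when forced
    }
  });
  return result;
}
```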

# function fileCassetteStore
function fileCassetteStore(): CassetteStore

The default storage backend: cassettes as files on disk. node:fs is imported lazily inside each method, so importing tapedeck has no Node-only side effects — the core stays edge-importable.

# function memoryCassetteStore
function memoryCassetteStore(
  seed?: Record<string, string> | Map<string, string>
): CassetteStore & { entries: Map<string, string> }

An in-memory CassetteStore for tests and edge runtimes — seed it with cassette text at build time (e.g. bundled into a Worker) or back a custom store with KV/R2 using the same three-method interface (read / write / list).

import { cassetteMiddleware, memoryCassetteStore } from '@nkwib/tapedeck';

const store = memoryCassetteStore({
  'cassettes/abc….cassette.json': cassetteJsonText,
});
cassetteMiddleware({ mode: 'replay', store });
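
For a custom backend, the three-method shape might look like the sketch below. The method names read, write, and list come from the description above, but their signatures here are assumptions, as are the names SketchStore and mapBackedStore; check the exported CassetteStore type before relying on them.

```typescript
// Assumed interface: the method names are documented, the signatures are not.
interface SketchStore {
  read(path: string): Promise<string | null>;
  write(path: string, text: string): Promise<void>;
  list(dir: string): Promise<string[]>;
}

function mapBackedStore(
  seed: Record<string, string> | Map<string, string> = {},
): SketchStore & { entries: Map<string, string> } {
  // Copy the seed so later writes never mutate the caller's object.
  const entries =
    seed instanceof Map ? new Map(seed) : new Map(Object.entries(seed));
  return {
    entries,
    async read(path) {
      return entries.get(path) ?? null; // null signals "no such cassette"
    },
    async write(path, text) {
      entries.set(path, text);
    },
    async list(dir) {
      const prefix = dir.endsWith('/') ? dir : `${dir}/`;
      return Array.from(entries.keys()).filter((key) => key.startsWith(prefix));
    },
  };
}
```

A KV- or R2-backed store would keep the same three methods and swap the Map operations for the platform's get, put, and list calls.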
# function withSpan
function withSpan<T>(
  tracer: TapedeckTracer | undefined,
  name: string,
  attributes: Record<string, string | number | boolean>,
  fn: (span?: TapedeckSpan) => Promise<T>
): Promise<T>

Run fn inside a span when a tracer is configured: sets attributes up front, marks OK/ERROR status (SPAN_STATUS_OK / SPAN_STATUS_ERROR), records exceptions, and always ends the span. With no tracer it's a plain call. The TapedeckTracer / TapedeckSpan types are structural subsets of the OpenTelemetry interfaces, so trace.getTracer('tapedeck') just works — and tapedeck keeps zero runtime dependencies.
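
The span lifecycle described above can be sketched as follows. SketchTracer, SketchSpan, and runInSpan are hypothetical stand-ins for the real types and function; the status codes are OpenTelemetry's SpanStatusCode values (OK = 1, ERROR = 2).

```typescript
// Minimal structural subset of an OTel span and tracer.
interface SketchSpan {
  setAttribute(key: string, value: string | number | boolean): unknown;
  setStatus(status: { code: number; message?: string }): unknown;
  recordException(error: Error): unknown;
  end(): void;
}
interface SketchTracer {
  startSpan(name: string): SketchSpan;
}

const STATUS_OK = 1;
const STATUS_ERROR = 2;

async function runInSpan<T>(
  tracer: SketchTracer | undefined,
  name: string,
  attributes: Record<string, string | number | boolean>,
  fn: (span?: SketchSpan) => Promise<T>,
): Promise<T> {
  if (!tracer) return fn(); // no tracer configured: plain call
  const span = tracer.startSpan(name);
  for (const [key, value] of Object.entries(attributes)) {
    span.setAttribute(key, value); // attributes are set up front
  }
  try {
    const result = await fn(span);
    span.setStatus({ code: STATUS_OK });
    return result;
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    span.recordException(error);
    span.setStatus({ code: STATUS_ERROR, message: error.message });
    throw err; // the caller still sees the original failure
  } finally {
    span.end(); // runs on both the success and the error path
  }
}
```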

# function toFollowRoute
function toFollowRoute(received: unknown, router: RouteLike): ToFollowRouteResult

From @nkwib/tapedeck/vitest. A vitest matcher asserting that a tool-call trajectory only makes transitions a ToolRoute router allows. Accepts AI SDK result.steps, a flat { toolName }[] list, or bare tool-name strings, and pinpoints the first illegal transition. The router is typed structurally ({ adjacency, routerVersion }), so any ToolRoute version works — and toolroute doesn't need to be installed at all.

import { expect } from 'vitest';
import { toFollowRoute } from '@nkwib/tapedeck/vitest';

expect.extend({ toFollowRoute });
expect(result.steps).toFollowRoute(router);
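
The core check behind such a matcher can be sketched as a walk over consecutive tool names. This assumes adjacency maps each tool name to the list of tool names allowed next; the source does not specify that shape, and firstIllegalTransition is a hypothetical helper.

```typescript
// Assumed adjacency shape: tool name -> tool names allowed to follow it.
type Adjacency = Record<string, string[]>;

interface IllegalTransition {
  index: number; // position of the offending call in the trajectory
  from: string;
  to: string;
}

// Return the first transition the adjacency map does not allow, or null.
function firstIllegalTransition(
  trajectory: string[],
  adjacency: Adjacency,
): IllegalTransition | null {
  for (let i = 1; i < trajectory.length; i++) {
    const from = trajectory[i - 1];
    const to = trajectory[i];
    const allowed = adjacency[from] ?? []; // unknown tools allow nothing
    if (!allowed.includes(to)) return { index: i, from, to };
  }
  return null;
}
```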
# function stableStringify
function stableStringify(value: unknown): string

Deterministic JSON.stringify: object keys are emitted in sorted order at every level, so semantically equal requests serialize identically. The canonicalization primitive underneath computeCassetteHash.
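
A key-sorted serializer is short enough to sketch in full. sortedStringify below is a hypothetical illustration of the technique, not the library's code; how the real function treats undefined values or non-plain objects is not specified here, so this sketch drops undefined object properties and emits null for undefined array items, as JSON.stringify does.

```typescript
function sortedStringify(value: unknown): string {
  // Primitives and null serialize exactly as JSON.stringify would.
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value) ?? 'null';
  }
  if (Array.isArray(value)) {
    // Array order is meaningful, so items keep their positions.
    return `[${value.map((item) => sortedStringify(item)).join(',')}]`;
  }
  const record = value as Record<string, unknown>;
  const entries = Object.keys(record)
    .filter((key) => record[key] !== undefined)
    .sort() // the only difference from JSON.stringify: keys in sorted order
    .map((key) => `${JSON.stringify(key)}:${sortedStringify(record[key])}`);
  return `{${entries.join(',')}}`;
}
```

Two objects that differ only in key insertion order produce the same string, so hashing that string gives them the same identity.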

# function normalizeTools
function normalizeTools(tools: CassetteRequestKey['tools']): unknown

Strip description fields recursively from a tool array before hashing. Descriptions are irrelevant to behaviour and churn frequently, so dropping them keeps a cassette stable across doc-only edits while still keying on the input schema. Returns undefined when tools is absent.
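
Recursive description stripping can be sketched as below. stripDescriptions is a hypothetical helper, and this naive version has a limitation worth noting: it also drops a schema property that happens to be named description. Whether tapedeck guards against that case is not stated in this reference.

```typescript
// Return a deep copy of value with every `description` key removed.
function stripDescriptions(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => stripDescriptions(item));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      if (key === 'description') continue; // cosmetic, so not part of identity
      out[key] = stripDescriptions(child);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```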

# function cassetteFilename
function cassetteFilename(hash: string): string

The on-disk filename for a hash-addressed cassette: `${hash}.cassette.json`.

# const CASSETTE_VERSION
const CASSETTE_VERSION = 'tapedeck@0.1.0';

The single-interaction (v1) cassette format version, stamped into every hash-addressed file's version field. On read, a cassette whose version doesn't start with tapedeck@ is rejected with CassetteCorruptError.

# const MULTI_CASSETTE_VERSION
const MULTI_CASSETTE_VERSION = 'tapedeck@0.3.0';

The multi-interaction (v2) cassette format version, stamped into named cassette files. A v2 file holds interactions: { hash, request, response }[] instead of a single request/response pair.

# const REDACTED
const REDACTED = '[REDACTED]';

The placeholder string written in place of a matched secret value at record time.

# const DEFAULT_REDACT
const DEFAULT_REDACT: (string | RegExp)[] = [
  'apiKey',
  'authorization',
  'x-api-key',
  'bearer',
  'token',
];

The built-in key matchers, always applied even when the caller passes none. Any redact option you supply is merged on top of these — you extend the defaults, never replace them.
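
Key-based redaction with mixed string and RegExp matchers can be sketched like this. The helper names are hypothetical, and treating a string matcher as a case-insensitive exact match on the key is an assumption; the reference only says strings match names case-insensitively.

```typescript
type KeyMatcher = string | RegExp;
const PLACEHOLDER = '[REDACTED]';

// Strings compare case-insensitively (assumed exact match); RegExps test the raw key.
// Avoid the g or y flag on matchers: test() is stateful with those flags.
function keyMatches(key: string, matchers: KeyMatcher[]): boolean {
  return matchers.some((matcher) =>
    typeof matcher === 'string'
      ? matcher.toLowerCase() === key.toLowerCase()
      : matcher.test(key),
  );
}

// Return a deep copy with the value of every matching key replaced.
function redactByKey(value: unknown, matchers: KeyMatcher[]): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactByKey(item, matchers));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = keyMatches(key, matchers)
        ? PLACEHOLDER
        : redactByKey(child, matchers);
    }
    return out;
  }
  return value;
}
```

Merging caller-supplied matchers onto the defaults is then a plain array concatenation, which is why the defaults can be extended but not removed.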