Skip to content

@web-ai-sdk/prompt

web-ai-sdk building block for the Web’s Built-in Prompt API (LanguageModel). Single-shot prompts with system message, sampling controls, streaming, session reuse, and pluggable result caching.

Prompt API ships stable in Chrome 148+ — no flag required. Chrome 138–147 still works with chrome://flags/#prompt-api-for-gemini-nano enabled. On Edge it remains a developer preview in Canary/Dev 138+ behind edge://flags/#prompt-api-for-phi-mini, with Phi-4-mini’s stricter safety pipeline often refusing output (see Browser support). On any other browser this library is a no-op for the React hook (it stays in "unavailable"). The vanilla ask() throws PromptUnavailableError so callers can branch explicitly.

Terminal window
pnpm add @web-ai-sdk/prompt
# or: npm i @web-ai-sdk/prompt / bun add @web-ai-sdk/prompt

The React adapter ships as a subpath export, with no extra install. react is a peer dependency only when you import the /react entry.

import { ask } from "@web-ai-sdk/prompt";
const result = await ask({
input: "Summarize this in one sentence: WebMCP lets web pages expose tools to agents.",
systemPrompt: "You are concise. Reply with a single sentence.",
temperature: 0.2,
onUpdate: (text) => console.log("partial", text), // cumulative buffer
});
console.log(result.output, result.cached);

ask() shares a warm LanguageModel instance across same-shape callers so the cold start is paid once per persona. That’s right for embeds, widgets, ask-and-display flows. It’s the wrong shape for chat: two callers with the same mode would share one instance, so conversation history cross-bleeds and abort() on one caller kills the other.

import { usePrompt } from "@web-ai-sdk/prompt/react";
export function AskBox() {
const { status, output, error, ask, abort } = usePrompt({
systemPrompt: "You are a helpful assistant. Be concise.",
temperature: 0.7,
});
if (status === "unavailable") return null;
return (
<form
onSubmit={(e) => {
e.preventDefault();
const input = new FormData(e.currentTarget).get("q") as string;
if (input) ask(input);
}}
>
<input name="q" placeholder="Ask me anything" />
<button type="submit" disabled={status === "loading" || status === "streaming"}>
{status === "streaming" ? "Streaming…" : "Ask"}
</button>
{output && <p>{output}</p>}
{error && <small>{error.message}</small>}
</form>
);
}

State machine: idle | loading | streaming | done | unavailable. ask(input) triggers a request, cancels any in-flight one, and updates output as chunks stream. abort() cancels the current request; reset() clears state.

interface AskOptions {
input: string;
systemPrompt?: string;
temperature?: number;
topK?: number;
language?: string; // BCP-47 hint, folded into expectedInputs/Outputs
supportedLanguages?: readonly string[]; // default ["en"]
expectedInputs?: LanguageModelExpectedInput[]; // advanced passthrough
expectedOutputs?: LanguageModelExpectedOutput[]; // advanced passthrough
tools?: LanguageModelTool[]; // experimental: native function-calling passthrough
monitor?: (m: CreateMonitor) => void;
responseConstraint?: object; // JSON Schema for structured output
omitResponseConstraintInput?: boolean;
cache?: ResponseCache;
cacheKey?: string;
onUpdate?: (text: string) => void; // CUMULATIVE buffer
signal?: AbortSignal;
}
interface AskResult {
output: string | null;
cached: boolean;
}

Feature-detect helper.

checkAvailability(opts?): Promise<LanguageModelAvailability | null>

Section titled “checkAvailability(opts?): Promise<LanguageModelAvailability | null>”

Forwards to LanguageModel.availability(). Returns null if the global is missing or the call throws.

getLanguageModelApi, getOrCreateLanguageModel, defaultCacheKey; exported so you can compose your own pipeline (e.g. share one cached session across multiple call sites, or roll your own retry).

Two layers, same as @web-ai-sdk/summarizer:

  • Session cache (internal, in-memory, always on): a Map<stringifiedCreateOptions, LanguageModel> so consecutive calls with the same shape (system prompt, temperature, topK, language hints) reuse the warm session. Cold-start ≈ 1-3s; warm calls are sub-second.
  • Result cache (opt-in): pass a cache (anything matching { get, set }) to memoize final responses by (prompt, systemPrompt, temperature, topK). Omit it for a fresh model call every time.
// Off by default; every call hits the model.
ask({ input: "hi" });
// Opt in for sessionStorage-backed caching.
ask({ input: "hi", cache: "session" });
// Or persistent localStorage-backed caching.
ask({ input: "hi", cache: "local" });
// Or roll your own.
ask({ input: "hi", cache: myMap, cacheKey: "greeting" });

The vanilla ask() throws PromptUnavailableError when the API is missing or reports availability: "unavailable". The React hook absorbs this and returns status: "unavailable" instead.

AbortSignal is supported on both surfaces. Aborting mid-stream resolves cleanly; the result cache is not written for aborted runs.

MIT © Beto Muniz