API

Search endpoint

The integration injects one on-demand route tree (default /api/ask). The base route serves the overlay: keyword mode returns JSON; agentic mode streams a grounded answer as Server-Sent Events (text/event-stream). Keyless sub-routes expose the committed knowledge graph for CLIs, MCP servers, and generated clients.

The OpenAPI 3.1 contract is published at /openapi.yaml.

Knowledge graph reads (GET)

These routes read virtual:hev-ask/kg, never call a model, and never require an API key:

Route	Response
`GET /api/ask/glossary`	`{ "terms": GlossaryEntry[] }`
`GET /api/ask/glossary/{term}`	one `GlossaryEntry`, matched by term or alias
`GET /api/ask/sections`	`{ "sections": SectionSummary[] }`
`GET /api/ask/sections?group=API`	section summaries filtered by group
`GET /api/ask/sections/{id}`	one full `KnowledgeNode`
`GET /api/ask/overview`	`{ "overview": string, "context": string }`

A SectionSummary is the lightweight shape { id, title, heading, group, url }. For section IDs that contain / or #, URL-encode the ID when placing it in the path, for example /api/ask/sections/api%2Fcli%23flags.
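
As a minimal sketch of the encoding rule above, the helper below builds a section path with `encodeURIComponent`. The name `sectionPath` and its `base` parameter are illustrative assumptions, not part of the integration.

```typescript
// Hypothetical helper: builds the path for GET {base}/sections/{id}.
// The "/api/ask" default mirrors the documented default route.
function sectionPath(id: string, base: string = "/api/ask"): string {
  // encodeURIComponent escapes "/" as %2F and "#" as %23, so the whole ID
  // stays inside a single path segment.
  return `${base}/sections/${encodeURIComponent(id)}`;
}

console.log(sectionPath("api/cli#flags"));
// prints "/api/ask/sections/api%2Fcli%23flags"
```

Encoding the ID as one component, rather than the whole URL, is what keeps `/` and `#` from being read as a path separator or a fragment.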

A missing glossary term, a missing section ID, or an unknown read route returns 404 with a JSON error:

{ "error": "Not found." }
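
A client might treat that 404 as "no such entry" rather than as a failure. The sketch below assumes this policy; `readOrNull`, `Fetcher`, and `ReadResponse` are hypothetical names, and the fetch function is passed in so the helper can run without a network.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface ReadResponse {
  status: number;
  json(): Promise<unknown>;
}
type Fetcher = (url: string) => Promise<ReadResponse>;

// Hypothetical helper: resolves to the parsed JSON body, or null on 404.
async function readOrNull(url: string, fetcher: Fetcher): Promise<unknown> {
  const res = await fetcher(url);
  // A 404 from a read route carries { "error": "Not found." }.
  if (res.status === 404) return null;
  // Assumption: any other non-200 status is unexpected for a read route.
  if (res.status !== 200) {
    throw new Error(`Unexpected status ${res.status} for ${url}`);
  }
  return res.json();
}
```

In a runtime with a global `fetch`, a call could look like `readOrNull("/api/ask/glossary/autoscaling", (u) => fetch(u))`, where the term `autoscaling` is only an example.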

Request

POST to the base route (default /api/ask) with a JSON body:

{
  "query": "how does autoscaling work",
  "mode": "agentic"
}

Field	Type	Description
`query`	`string`	The search query. Empty or whitespace returns an empty result set.
`mode`	`'keyword' | 'agentic'`	Optional. `keyword` forces the instant path; `agentic` requests the loop. When omitted, the endpoint behaves as if `agentic` were requested if a key is present, and runs `keyword` otherwise.
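
The request shape can be sketched as TypeScript types plus a small builder for the `fetch` init object. `AskMode`, `AskRequest`, and `askInit` are hypothetical names; the `content-type` header is the usual convention for JSON bodies, and whether the endpoint requires it is an assumption.

```typescript
type AskMode = "keyword" | "agentic";

interface AskRequest {
  query: string;
  mode?: AskMode; // optional, as in the field table
}

// Hypothetical helper: returns an init object suitable for fetch(url, init).
function askInit(query: string, mode?: AskMode) {
  // Leave `mode` out entirely when the caller did not choose one.
  const body: AskRequest = mode === undefined ? { query } : { query, mode };
  return {
    method: "POST" as const,
    // Assumption: the endpoint accepts the conventional JSON content type.
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  };
}

console.log(askInit("how does autoscaling work", "agentic").body);
// prints {"query":"how does autoscaling work","mode":"agentic"}
```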

Keyword response (JSON)

Keyword mode returns a 200 JSON envelope:

{
  "results": [
    {
      "title": "Concepts",
      "heading": "The agentic search loop",
      "url": "/docs/concepts#the-agentic-search-loop",
      "group": "Overview",
      "snippet": "When the reader presses Enter, the query goes to a bounded loop…"
    }
  ],
  "query": "how does agentic search work",
  "model": "claude-haiku-4-5",
  "mode": "keyword"
}

Field	Type	Description
`results`	`Result[]`	Ranked keyword matches (`title`, `heading?`, `url`, `group?`, `snippet`).
`query`	`string`	Echoed back.
`model`	`string`	The configured loop model.
`mode`	`'keyword'`	The mode that ran.
`warning`	`string?`	Present when agentic was requested but no key is configured (downgrade).

The url field carries the deep link: the page URL with #anchor appended for a section. The anchor is absent only for a document’s intro chunk.
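
For typed clients, the envelope and the deep-link rule might be modelled as below. The interfaces mirror the field table; `splitDeepLink` is a hypothetical helper, not something the integration exports.

```typescript
interface Result {
  title: string;
  heading?: string;
  url: string;
  group?: string;
  snippet: string;
}

interface KeywordResponse {
  results: Result[];
  query: string;
  model: string;
  mode: "keyword";
  warning?: string; // set on a downgrade from agentic
}

// Hypothetical helper: splits a result URL into its page path and anchor.
function splitDeepLink(url: string): { page: string; anchor?: string } {
  const hash = url.indexOf("#");
  // No "#": the result points at a document's intro chunk.
  if (hash === -1) return { page: url };
  return { page: url.slice(0, hash), anchor: url.slice(hash + 1) };
}

console.log(splitDeepLink("/docs/concepts#the-agentic-search-loop"));
// prints { page: '/docs/concepts', anchor: 'the-agentic-search-loop' }
```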

Agentic response (SSE)

When a key is present and mode is agentic (or omitted), the endpoint responds with content-type: text/event-stream and streams the answer as it is generated. Each event is a named SSE frame:

event: search
data: {"query":"autoscaling"}

event: sources
data: {"sources":[{"title":"Core Concepts","heading":"Kubernetes autoscaling","url":"/docs/concepts#kubernetes-autoscaling","group":"Overview"}],"model":"claude-haiku-4-5","mode":"agentic"}

event: token
data: {"text":"Autoscaling scales workers based on "}

event: token
data: {"text":"lag signals. See [autoscaling](/docs/concepts#kubernetes-autoscaling)."}

event: done
data: {}

Event	Data	Meaning
`search`	`{ query }`	Context the model gathered — a search sub-query, or the heading of a section it opened. May fire several times.
`sources`	`{ sources: Source[], model, mode }`	The grounding source set, sent once before any `token`. Clients validate answer links against it.
`token`	`{ text }`	One delta of the streamed Markdown answer.
`done`	`{}`	The stream is complete.
`error`	`{ error }`	A failure that occurred after streaming began (HTTP status is already `200`).
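
A client could consume these frames roughly as follows. This is a simplified sketch that parses text containing only complete frames; a real streaming client would buffer incoming chunks until it sees a blank line. `parseSseFrames`, `answerText`, and `AskEvent` are hypothetical names.

```typescript
interface AskEvent {
  event: string;
  data: unknown;
}

// Parses complete SSE frames; frames are separated by a blank line.
function parseSseFrames(text: string): AskEvent[] {
  const events: AskEvent[] = [];
  for (const frame of text.replace(/\r\n/g, "\n").split("\n\n")) {
    let event = "message"; // SSE default when a frame has no event: line
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        // Drop the single optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}

// Concatenates token deltas into the Markdown answer; throws on an error event.
function answerText(events: AskEvent[]): string {
  let out = "";
  for (const e of events) {
    if (e.event === "error") {
      throw new Error(String((e.data as { error: unknown }).error));
    }
    if (e.event === "token") out += (e.data as { text: string }).text;
  }
  return out;
}
```

Because the HTTP status is already `200` once streaming starts, checking for the `error` event inside the stream is the only way this sketch can surface a mid-stream failure.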

A Source is { title, heading?, url, group? } — note there is no snippet; the answer prose carries the substance, and links point at url.
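
One way a client might validate answer links against the source set is sketched below. The source does not say how strict the check is; exact string equality on `url` is an assumption, and `ungroundedLinks` is a hypothetical helper.

```typescript
interface Source {
  title: string;
  heading?: string;
  url: string;
  group?: string;
}

// Hypothetical helper: returns link targets in the answer that are not
// present in the grounding source set.
function ungroundedLinks(markdown: string, sources: Source[]): string[] {
  const allowed = new Set(sources.map((s) => s.url));
  const bad: string[] = [];
  // Matches inline Markdown links of the form [label](target).
  const link = /\[[^\]]*\]\(([^)\s]+)\)/g;
  let m: RegExpExecArray | null;
  while ((m = link.exec(markdown)) !== null) {
    // Assumption: a link is grounded only if its target equals a source url.
    if (!allowed.has(m[1])) bad.push(m[1]);
  }
  return bad;
}
```

A client could drop or unlink any target this returns before rendering the answer.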

Mode selection

The endpoint decides what to run:

Empty query → { results: [], query: "", model, mode: "keyword" } (JSON).
mode: "keyword", or mode omitted with no API key → keyword JSON, mode: "keyword".
mode: "agentic" but no key → keyword JSON plus a warning, and mode: "keyword".
Otherwise → the agentic SSE stream.
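
The rules above can be sketched as a single decision function. It assumes the rules are checked top to bottom, so an empty query wins over everything else; `decide` and `Decision` are hypothetical names, and this is not the endpoint's actual code.

```typescript
type Mode = "keyword" | "agentic";

type Decision =
  | { kind: "json"; mode: "keyword"; empty: boolean; warning: boolean }
  | { kind: "sse"; mode: "agentic" };

function decide(query: string, mode: Mode | undefined, hasKey: boolean): Decision {
  // Empty or whitespace-only query: empty keyword result set.
  if (query.trim() === "") {
    return { kind: "json", mode: "keyword", empty: true, warning: false };
  }
  // Keyword was requested explicitly.
  if (mode === "keyword") {
    return { kind: "json", mode: "keyword", empty: false, warning: false };
  }
  // No key: downgrade to keyword; warn only if agentic was asked for.
  if (!hasKey) {
    return { kind: "json", mode: "keyword", empty: false, warning: mode === "agentic" };
  }
  // Key present and mode is agentic or omitted.
  return { kind: "sse", mode: "agentic" };
}
```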

Errors

Status	Body	Cause
`400`	`{ "error": "Invalid JSON body." }`	The request body wasn’t valid JSON.
`404`	`{ "error": "…" }`	A knowledge-graph read route, glossary term, or section ID wasn’t found.
`500`	`{ "error": "…" }`	The chunk index failed to build (e.g. a misconfigured collection).
—	`event: error`	A failure during the agentic stream. The HTTP status is already `200`, so errors arrive as a final SSE `error` event rather than a status code.

The API key

The endpoint resolves ANTHROPIC_API_KEY from, in order: the adapter runtime env (locals.runtime.env, e.g. Cloudflare), process.env, then import.meta.env. Set it wherever your host injects server secrets; it is never sent to the browser.
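
The lookup order can be illustrated with a small function that takes the three environments as plain objects. `resolveApiKey` is a hypothetical name, and treating an empty string as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Returns the first ANTHROPIC_API_KEY found, checking the adapter runtime env,
// then process.env, then import.meta.env.
function resolveApiKey(runtimeEnv?: Env, processEnv?: Env, metaEnv?: Env): string | undefined {
  for (const env of [runtimeEnv, processEnv, metaEnv]) {
    const key = env?.["ANTHROPIC_API_KEY"];
    // Assumption: an empty string counts as "not set".
    if (key) return key;
  }
  return undefined;
}

console.log(resolveApiKey({ ANTHROPIC_API_KEY: "from-runtime" }, { ANTHROPIC_API_KEY: "from-process" }));
// prints "from-runtime"
```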

LLM tracing

Set POSTHOG_KEY (or POSTHOG_API_KEY) in the same environment and every agentic answer emits a PostHog $ai_generation trace with the model, token usage, latency, and the loop’s tool calls. POSTHOG_HOST overrides the default US-cloud ingestion host, and POSTHOG_CAPTURE_CONTENT (off | redacted | full, default full) controls how much prompt and answer text ships with each event. Without a key, tracing is a no-op; the answer path never depends on telemetry.
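
Read as configuration, these variables could be interpreted as in the sketch below. `tracingConfig` is a hypothetical name, and two behaviours are assumptions: `POSTHOG_KEY` taking precedence over `POSTHOG_API_KEY`, and an unrecognised `POSTHOG_CAPTURE_CONTENT` value falling back to `full`.

```typescript
type CaptureContent = "off" | "redacted" | "full";

interface TracingConfig {
  apiKey: string;
  host: string | undefined; // undefined: use the default ingestion host
  captureContent: CaptureContent;
}

function tracingConfig(env: Record<string, string | undefined>): TracingConfig | null {
  // Assumption: POSTHOG_KEY wins when both variables are set.
  const apiKey = env["POSTHOG_KEY"] || env["POSTHOG_API_KEY"];
  if (!apiKey) return null; // no key: tracing is a no-op
  const raw = env["POSTHOG_CAPTURE_CONTENT"];
  // Assumption: anything other than "off" or "redacted" means the default, "full".
  const captureContent: CaptureContent = raw === "off" || raw === "redacted" ? raw : "full";
  return { apiKey, host: env["POSTHOG_HOST"], captureContent };
}
```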

Index lifecycle

The chunk index is built once per server instance on the first request and cached for the process lifetime. On the first request the endpoint also compares the live content hash against the knowledge graph’s hash and logs a one-time warning if they differ — your cue to run ask kg build.
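
The build-once behaviour can be pictured as a lazily initialised cache. In this sketch `ChunkIndex`, `createIndexCache`, and the warning text are all illustrative assumptions, since the source does not describe the index type or the exact log message.

```typescript
// Hypothetical shape: only the content hash matters for this sketch.
interface ChunkIndex {
  contentHash: string;
}

function createIndexCache(
  build: () => ChunkIndex,
  kgHash: string,
  warn: (msg: string) => void = console.warn,
): () => ChunkIndex {
  let cached: ChunkIndex | undefined;
  return () => {
    if (cached === undefined) {
      // Runs on the first request only; later calls reuse the cached index.
      cached = build();
      // The hash comparison lives in the same branch, so it also runs once.
      if (cached.contentHash !== kgHash) {
        warn("Content hash differs from the knowledge graph; run `ask kg build`.");
      }
    }
    return cached;
  };
}
```

Since the cache lives in a closure, it lasts as long as the process does, and a restart or a new server instance rebuilds the index on its first request.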