Skip to main content

RAG: ground Claude's answer in your documents

Retrieval-augmented generation grounds an answer in your own data. Anthropic has no embeddings endpoint, so this example embeds a small knowledge base and the question with Voyage AI, finds the closest passages by cosine similarity, and passes them to Claude as context. Set ANTHROPIC_API_KEY and VOYAGE_API_KEY before running.

import Anthropic from "npm:@anthropic-ai/sdk";

const anthropic = new Anthropic();
A tiny knowledge base. In a real app these passages would live in a vector database rather than an array.
const documents = [
  "Deno is secure by default: network, file system, and environment access " +
  "must be granted explicitly with flags like --allow-net.",
  "Deno has a built-in test runner. Write Deno.test() and run `deno test`.",
  "Deno supports npm packages through the npm: specifier, for example " +
  "import express from 'npm:express'.",
];
Embed text with Voyage AI's REST API. input_type marks whether the text is a stored document or a search query, which improves retrieval quality.
async function embed(
  input: string[],
  inputType: "document" | "query",
): Promise<number[][]> {
  const res = await fetch("https://api.voyageai.com/v1/embeddings", {
    method: "POST",
    headers: {
      authorization: `Bearer ${Deno.env.get("VOYAGE_API_KEY")}`,
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "voyage-3.5-lite",
      input,
      input_type: inputType,
    }),
  });
  if (!res.ok) {
    throw new Error(`Voyage error ${res.status}: ${await res.text()}`);
  }
  const json = await res.json();
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}
Cosine similarity scores how close two embedding vectors are.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const question = "How do I run tests in Deno?";
Embed the documents once (this is the "index") and the incoming question.
const docVectors = await embed(documents, "document");
const [queryVector] = await embed([question], "query");
Rank documents by similarity to the question and keep the top two.
const retrieved = documents
  .map((text, i) => ({ text, score: cosine(queryVector, docVectors[i]) }))
  .sort((a, b) => b.score - a.score)
  .slice(0, 2);
Hand the retrieved passages to Claude as context and ask it to answer from them, which keeps the response grounded in your data.
const context = retrieved.map((d) => `- ${d.text}`).join("\n");
const message = await anthropic.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: `Answer the question using only the context below.\n\n` +
        `Context:\n${context}\n\nQuestion: ${question}`,
    },
  ],
});

for (const block of message.content) {
  if (block.type === "text") console.log(block.text);
}

Run this example locally using the Deno CLI:

deno run -N -E https://docs.deno.com/examples/scripts/anthropic_rag.ts

Did you find what you needed?

Privacy policy