Knowledge-base Q&A bot

Build a chatbot that grounds every answer in your content. Customers ask questions in Slack, on your site, or via email; the bot calls brain.get_context, hands the result to an LLM, and replies with the answer plus citations.

The flow

User asks "what's your refund policy?"
        ↓
brain.get_context({ topic: "refund policy", maxChunks: 4 })
        ↓
LLM gets: "Here's relevant content from your KB:\n[chunk 1]...\n[chunk 2]..."
        ↓
LLM answers grounded in real KB content, with sources cited
        ↓
"Per our policy: customers may request a refund within 30 days... [doc_xxx]"

One-time: ingest your knowledge

Before the bot can answer anything, your knowledge needs to be in the Brain.

# Ingest a single doc
curl -X POST https://mcp.vlozi.app/tools/brain.ingest \
  -H "Authorization: Bearer $VLOZI_API_KEY" \
  -H "content-type: application/json" \
  -d "{
    \"filename\": \"refund-policy.md\",
    \"content\": $(cat refund-policy.md | jq -Rs .),
    \"fileType\": \"md\"
  }"

For batch ingestion of a whole folder:

import fs from "node:fs/promises";
import path from "node:path";
 
const folder = "./knowledge-base";
for (const file of await fs.readdir(folder)) {
  if (!file.endsWith(".md")) continue;
  const content = await fs.readFile(path.join(folder, file), "utf8");
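  const res = await fetch("https://mcp.vlozi.app/tools/brain.ingest", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.VLOZI_API_KEY}`,
      "content-type": "application/json",
    },
    body: JSON.stringify({ filename: file, content, fileType: "md" }),
  });
  // Stop on the first failure so a bad key or payload is not silently skipped
  if (!res.ok) throw new Error(`Failed to ingest ${file}: HTTP ${res.status}`);
  console.log(`Ingested ${file}`);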
}

TIP

Re-ingesting is safe: call brain.list_memories, find the old doc by filename, remove it with brain.delete_memory, then ingest the new version. This is the pattern to use in a CI/CD pipeline.
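
The list, delete, ingest sequence from the tip could be sketched roughly as follows. The request and response shapes for brain.list_memories and brain.delete_memory are assumptions here (a `data.memories` array of `{ docId, filename }` entries, and a `{ docId }` request body); check them against the actual tool schemas before relying on this. `callTool` and `reingest` are hypothetical helper names, not part of the API.

```typescript
// Assumed shape of one entry returned by brain.list_memories (not confirmed by the docs).
interface MemoryEntry {
  docId: string;
  filename: string;
}

type ToolCaller = (tool: string, body: unknown) => Promise<any>;

// Thin wrapper over the HTTP tool endpoints used elsewhere on this page.
const callTool: ToolCaller = async (tool, body) => {
  const res = await fetch(`https://mcp.vlozi.app/tools/${tool}`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.VLOZI_API_KEY}`,
      "content-type": "application/json",
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`${tool} failed with HTTP ${res.status}`);
  return res.json();
};

// Deletes every stored doc with the same filename, then ingests the new content.
// Returns how many stale docs were removed.
async function reingest(
  filename: string,
  content: string,
  call: ToolCaller = callTool,
): Promise<number> {
  const listed = await call("brain.list_memories", {});
  const memories: MemoryEntry[] = listed?.data?.memories ?? []; // assumed response path
  const stale = memories.filter((m) => m.filename === filename);
  for (const m of stale) {
    await call("brain.delete_memory", { docId: m.docId }); // assumed request body
  }
  await call("brain.ingest", { filename, content, fileType: "md" });
  return stale.length;
}
```

In a pipeline, this would run once per changed file, for example `await reingest("refund-policy.md", content)`. Passing the caller as a parameter keeps the sequence testable without network access.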

The bot

Slack bot (Bolt)

import { App } from "@slack/bolt";
import Anthropic from "@anthropic-ai/sdk";
 
const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});
const anthropic = new Anthropic();
 
app.message(async ({ message, say }) => {
  if (!("text" in message) || !message.text) return;
 
  // 1. Get grounded context
  const ctxRes = await fetch("https://mcp.vlozi.app/tools/brain.get_context", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.VLOZI_API_KEY}`,
      "content-type": "application/json",
      "x-agent-id": "slack-kb-bot",
    },
    body: JSON.stringify({ topic: message.text, maxChunks: 5 }),
  });
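  // Fail loudly rather than answering without grounding
  if (!ctxRes.ok) throw new Error(`brain.get_context failed: HTTP ${ctxRes.status}`);
  const { data } = await ctxRes.json();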
 
  // 2. Ask Claude to answer with that context
  const reply = await anthropic.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 800,
    system: `You are a customer support assistant for Acme Co.
Answer the user's question using ONLY the provided context. If the context
doesn't cover it, say "I don't have that in our knowledge base — let me
loop in a human." Cite source IDs in [brackets] after relevant sentences.
 
Context:
${data.context}
 
Sources: ${data.sources.map((s: any) => s.docId).join(", ")}`,
    messages: [{ role: "user", content: message.text }],
  });
 
  const text = reply.content[0].type === "text" ? reply.content[0].text : "";
  await say(text);
});
 
await app.start();

Web widget (Next.js route handler)

// app/api/support/route.ts
export async function POST(req: Request) {
  const { question } = await req.json();
 
  const ctxRes = await fetch("https://mcp.vlozi.app/tools/brain.get_context", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.VLOZI_API_KEY!}`,
      "content-type": "application/json",
      "x-agent-id": "web-support-widget",
    },
    body: JSON.stringify({ topic: question, maxChunks: 6 }),
  });
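  // Surface an upstream failure instead of streaming an ungrounded answer
  if (!ctxRes.ok) return new Response("Knowledge base unavailable", { status: 502 });
  const { data } = await ctxRes.json();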
 
  // Stream the LLM response
  return new Response(/* ... pipe Claude's streamed reply, including data.sources for citations */);
}
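
Both handlers above repeat the same brain.get_context request, so one option is to pull it into a shared helper. The sketch below is one way to do that. It types the response the way the examples on this page consume it (`data.context` and `data.sources`). `getContext`, `BrainContext`, and the injectable `fetchImpl` parameter are hypothetical names introduced for illustration, and it assumes failures are reported as non-2xx HTTP statuses.

```typescript
// Response fields as used by the examples on this page.
interface BrainSource {
  docId: string;
  similarity: number;
}
interface BrainContext {
  context: string;
  sources: BrainSource[];
}

// fetchImpl is injectable so the helper can be exercised without network access.
async function getContext(
  topic: string,
  agentId: string,
  maxChunks = 5,
  fetchImpl: typeof fetch = fetch,
): Promise<BrainContext> {
  const res = await fetchImpl("https://mcp.vlozi.app/tools/brain.get_context", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.VLOZI_API_KEY}`,
      "content-type": "application/json",
      "x-agent-id": agentId,
    },
    body: JSON.stringify({ topic, maxChunks }),
  });
  // Assumption: errors surface as non-2xx statuses.
  if (!res.ok) {
    throw new Error(`brain.get_context failed with HTTP ${res.status}`);
  }
  const { data } = (await res.json()) as { data: BrainContext };
  return data;
}
```

With this in place, the Slack handler would call `getContext(message.text, "slack-kb-bot")` and the web route `getContext(question, "web-support-widget", 6)`.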

Why grounded answers beat a raw LLM

The bot says "your refund window is 30 days" because your refund policy doc says exactly that. Without brain.get_context, the LLM has no access to your policy and may produce a plausible-sounding number from its training data instead.

The brain.get_context response includes sources, an array of { docId, similarity } objects. Surface these as citation chips in the UI so users can click through to the original document.
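
As a rough illustration of turning sources into citation chips, the sketch below deduplicates by docId (in case several chunks come from the same document), sorts by similarity, and builds a link per document. `toCitationChips`, `CitationChip`, and `docUrlBase` are hypothetical: where source documents are served is up to your app. The percentage label assumes similarity falls in the 0 to 1 range, which is not confirmed here.

```typescript
interface BrainSource {
  docId: string;
  similarity: number;
}
interface CitationChip {
  docId: string;
  label: string;
  href: string;
}

// docUrlBase is a hypothetical base URL where your own app serves source documents.
function toCitationChips(
  sources: BrainSource[],
  docUrlBase: string,
  minSimilarity = 0,
): CitationChip[] {
  // Keep the best similarity seen for each document.
  const best = new Map<string, number>();
  for (const s of sources) {
    if (s.similarity < minSimilarity) continue;
    best.set(s.docId, Math.max(best.get(s.docId) ?? -Infinity, s.similarity));
  }
  // Most relevant document first.
  return Array.from(best.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([docId, similarity]) => ({
      docId,
      label: `${docId} (${Math.round(similarity * 100)}%)`, // assumes a 0..1 score
      href: `${docUrlBase}/${encodeURIComponent(docId)}`,
    }));
}
```

The web route could return `toCitationChips(data.sources, "https://example.com/kb")` alongside the streamed answer, and the widget would render one chip per entry.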

Operations