Vector RAG Operations Control Plane
Operate RAG pipelines through Console by combining file ingestion, vector index management, embeddings, and chat retrieval in one control surface.
Overview
Many RAG projects stall after the first demo because document ingestion, vector operations, and retrieval quality are spread across unrelated tools. Console brings those tasks into one platform so teams can run RAG like an operational system rather than a script.
This scenario fits internal knowledge assistants, support knowledge bases, compliance search, and document-grounded copilots.
Architecture
Console provides the files pipeline, the vector provider abstraction, and the model gateway. The Console SDK lets you wire the whole flow from a Node.js service or an internal admin tool.
The key advantage is operational consistency: the same platform handles uploads, embeddings, indexes, retrieval, and the final grounded answer.
1. Upload Source Files And Create The Index
Start by loading documents and provisioning the index that will serve retrieval.
import fs from 'node:fs';
import { ConsoleClient } from '@cognipeer/console-sdk';

const client = new ConsoleClient({
  apiKey: process.env.COGNIPEER_API_KEY!,
  baseURL: 'https://console.example.com',
});

await client.files.upload({
  file: fs.createReadStream('./docs/expense-policy.pdf'),
  purpose: 'assistants',
});

await client.vectors.indexes.create('qdrant-main', {
  name: 'policy-knowledge-base',
  dimension: 1536,
  metric: 'cosine',
});
2. Embed And Upsert Chunks
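The next step uses two hand-written example chunks. In a real pipeline you would split parsed documents into retrieval-sized pieces first. A minimal sketch of a naive fixed-size chunker with overlap follows; the `chunkText` helper and its default sizes are illustrative assumptions, not part of the Console SDK.

```typescript
// Illustrative helper, not part of the Console SDK: splits plain text into
// fixed-size chunks with a small overlap so sentences that straddle a
// boundary still appear intact in at least one chunk.
function chunkText(text: string, size = 400, overlap = 50): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + size));
    start += size - overlap; // advance by the stride, keeping `overlap` chars shared
  }
  return chunks;
}
```

Production pipelines usually chunk on semantic boundaries (paragraphs, headings) rather than raw character counts, but the fixed-size version is enough to feed the embedding step below.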
Once files are parsed into chunks, embed them and upsert them through the vector API.
const chunks = [
  'Taxi receipts are reimbursable when attached within 10 business days.',
  'International hotel expenses require manager approval above 300 EUR per night.',
];

const embeddingResponse = await client.embeddings.create({
  model: 'text-embedding-3-small',
  input: chunks,
});

await client.vectors.upsert('qdrant-main', 'policy-knowledge-base', {
  vectors: chunks.map((text, index) => ({
    id: 'policy-' + (index + 1),
    values: embeddingResponse.data[index].embedding,
    metadata: { text, source: 'expense-policy.pdf' },
  })),
});
3. Run Retrieval And Grounded Answer Generation
Query the vector index first, then pass the retrieved context into chat completion through the same Console surface.
const question = 'Can I expense a 340 EUR hotel room during an international event?';

const queryEmbedding = await client.embeddings.create({
  model: 'text-embedding-3-small',
  input: question,
});

const matches = await client.vectors.query('qdrant-main', 'policy-knowledge-base', {
  query: {
    vector: queryEmbedding.data[0].embedding,
    topK: 3,
  },
});

const context = matches.result.matches
  .map((match) => match.metadata?.text)
  .filter(Boolean)
  .join('\n');

const answer = await client.chat.completions.create({
  model: 'rag-answer-model',
  messages: [
    {
      role: 'system',
      content: 'Answer only with the provided policy context:\n' + context,
    },
    { role: 'user', content: question },
  ],
});

console.log(answer.choices[0].message.content);
Result
You get a RAG operations pattern that:
- Unifies files, embeddings, vector indexes, and chat in one platform
- Lets platform teams manage vector backends without rewriting app code
- Improves traceability for grounded answers and data sources
- Fits policy assistants, knowledge search, and document-heavy internal workflows
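The traceability point can be made concrete in the retrieval step: instead of joining raw chunk text, format each match with its source file so the grounded answer can be traced back to a document. A sketch, assuming matches shaped like the query response above; the `formatContext` helper is hypothetical, not an SDK API.

```typescript
// Illustrative helper, not part of the Console SDK: builds the grounding
// context while keeping each chunk's source file visible for traceability.
interface RetrievedMatch {
  metadata?: { text?: string; source?: string };
}

function formatContext(matches: RetrievedMatch[]): string {
  return matches
    .filter((m) => m.metadata?.text) // drop matches with no recoverable text
    .map((m) => `[${m.metadata?.source ?? 'unknown'}] ${m.metadata?.text}`)
    .join('\n');
}
```

Passing this string as the system-prompt context instead of the bare `join('\n')` lets reviewers see which source document backed each retrieved rule.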