What is a vector database?

A vector database stores high-dimensional embeddings (numerical representations of meaning). It enables semantic search—finding results by meaning instead of keywords. CogniWeave uses vector databases to index documents and power AI-driven retrieval and citation.

How does CogniWeave differ from Weaviate or Pinecone?

CogniWeave runs its own managed memory engine, informed by published research on lifelong memory for LLM agents. It compresses documents at ingestion time, consolidates related memories, and retrieves only what each query needs, which reduces noise and lowers latency compared with vector-only systems.

Can I use CogniWeave with Claude or other LLMs?

Yes. CogniWeave works as a retrieval layer for any LLM via RAG (Retrieval Augmented Generation). Index your documents, query CogniWeave for relevant context, and pass results to Claude, GPT, or your preferred LLM for cited answers.
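As a rough illustration of the last step, here is a minimal sketch of how retrieved passages could be assembled into a prompt for an LLM. The `Passage` type, the `build_prompt` function, the prompt wording, and the sample passage are all hypothetical; none of them is CogniWeave's API or real data.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical: name of the file the passage came from
    text: str


def build_prompt(question, passages):
    """Number each passage so the model can cite it as [n]."""
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered passages below. "
        "Cite passages as [n].\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Made-up passage, for illustration only.
passages = [
    Passage(
        "refund-policy.pdf",
        "Orders delivered after the agreed window qualify for a partial refund.",
    ),
]
prompt = build_prompt(
    "Is the customer entitled to a refund on order #4821?", passages
)
print(prompt)
```

The resulting string can be sent to Claude, GPT, or any other model; numbering the passages is what lets the answer point back to a specific source.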

What is semantic search?

Semantic search finds results based on meaning, not keywords. For example, querying 'ways to cool a room without AC' returns results about fans and ventilation—even if you didn't type those exact words. It uses transformer embeddings to understand intent.
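To make the idea concrete, here is a small sketch of ranking by cosine similarity, a common way to compare embeddings. The three-dimensional vectors are hand-written stand-ins, not output from a real embedding model, and the document titles are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumed). Axes, loosely: [cooling, airflow, cooking].
query = [0.9, 0.7, 0.0]  # "ways to cool a room without AC"
documents = {
    "Box fans and cross-ventilation": [0.8, 0.9, 0.0],
    "Slow-cooker chili recipe": [0.0, 0.1, 0.9],
}

# Rank documents from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query, documents[title]),
    reverse=True,
)
print(ranked[0])
```

The fan document ranks first even though it shares no words with the query, because its vector points in nearly the same direction.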

AI memory workspace

Long-term memory
for your AI.

CogniWeave indexes your team's documents, emails, and files into a persistent vector memory. Every answer is grounded in a specific passage, with a citation that traces back to the source.

See how it works

Files

ingest

acme-msa-2024.pdf · PDF · 12p

kickoff thread.eml · EML

Q3-numbers.xlsx · XLSX

Pipeline

Parse
Chunk
Embed
Store

Long-term memory · live

Vectors cluster by meaning, not keywords.

AES-256 encryption at rest

Tenant-isolated per workspace

SAML 2.0 SSO

Append-only audit log

No model training on your data

Features

Every capability in service of one outcome:
an answer you can verify.

Retrieval is only as good as what precedes it. We built the full pipeline — parse, chunk, embed, store, retrieve — because each stage determines the quality of the next.

Parse

Every format, one pipeline.

Emails, PDFs, scans, spreadsheets, images — normalised into clean, structured text.

OCR

Read the unreadable.

Image-based pages and handwritten scans become searchable text with layout intact.

Chunk

Right-sized context.

Documents are split into semantically coherent passages so retrieval stays precise.

Embed

Meaning as geometry.

Each chunk becomes a dense vector that captures intent, not just keywords.

Retrieve & cite

Grounded answers with sources.

Ask a question. CogniWeave returns the answer with citations linked to the original chunk.

Question

Is the customer entitled to a refund on order #4821?

Answer

Based on the refund policy [1], the customer is eligible for a partial refund because the order was delivered after the agreed window [2]. A similar case was resolved by the support team last month [3].

Audit

Every answer traceable.

Lineage from query to vector to source — visible in the workspace.

How it works

A pipeline, not a black box.

Five stages between raw files and a cited answer.

Ingest

Drop in emails, PDFs, scans, documents, spreadsheets, and images.

Supported: PDF, DOCX, XLSX, PNG, JPG, EML, MBOX, TXT, MD, HTML.

Parse & chunk

Extract text and layout, run OCR on images, split into coherent passages.

Target chunk size: ~512 tokens, 64-token overlap.
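The size and overlap targets above can be sketched as a sliding window. This is a simplification: it cuts at fixed token counts, whereas the page describes splitting into semantically coherent passages, and the `chunk_tokens` function and placeholder tokens are assumptions made for the example.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=64):
    """Split a token list into windows that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


# Placeholder tokens; a real pipeline would use a model tokenizer.
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some tokens twice.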

Embed

Each chunk is encoded into a vector that places it in semantic space.

Embedding dimension: 1536.

Store

Vectors land in an indexed long-term store — your data, your boundary.

Index: HNSW. Tenant isolation: per-workspace.

Retrieve & cite

Questions trigger nearest-neighbour retrieval; the model answers with citations.

Target retrieval latency: under 200ms.
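The retrieval stage can be illustrated with an exact, brute-force scan. This is deliberately not HNSW, which is an approximate graph index built for much larger collections; the scan is swapped in because it fits in a few lines. The `StoredChunk` type, the chunk ids, and the toy vectors are assumptions, and the file names are the sample files shown earlier on this page.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass
class StoredChunk:
    chunk_id: str
    document: str
    vector: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve(query_vector, store, k=2):
    """Return the k chunks most similar to the query, best first."""
    return heapq.nlargest(k, store, key=lambda c: cosine(query_vector, c.vector))


# Toy 3-dimensional vectors (assumed); real embeddings here have 1536 dimensions.
store = [
    StoredChunk("c1", "acme-msa-2024.pdf", [0.9, 0.1, 0.0]),
    StoredChunk("c2", "Q3-numbers.xlsx", [0.0, 0.2, 0.9]),
    StoredChunk("c3", "kickoff thread.eml", [0.7, 0.6, 0.1]),
]
hits = retrieve([1.0, 0.2, 0.0], store, k=2)
citations = [(h.chunk_id, h.document) for h in hits]
print(citations)
```

Because each hit keeps its chunk id and document name, the answer that is generated from these hits can cite exactly where each claim came from.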

The difference

Not search. Not storage. Memory.

Keyword search returns the documents that contain your words. CogniWeave returns the passages that address your question — regardless of whether they share a single word with your query.

Search finds keywords.

Keyword search is fast and literal. It works well when you know exactly what you are looking for. Vector retrieval works differently: it finds documents that are semantically related to your query, even when they share no words in common. The distinction matters most when you do not yet know what to search for.

Storage holds files.

A file system organises by location. CogniWeave organises by meaning. The same document surfaces wherever it is relevant — not only in the folder you placed it in. Your knowledge becomes navigable by what it contains, not by where it lives.

Chatbots forget.

Language models do not have persistent memory. CogniWeave gives them one. Every answer draws from an index that grows as your team adds knowledge. Every retrieval is logged. Nothing is inferred without a traceable source.

Long-term memory

Meaning in geometry.

Every passage we index becomes a point in a high-dimensional space. Passages that address the same idea sit close together, regardless of the words they use.

When you ask a question, retrieval is a traversal of that geometry — finding the nearest neighbours by meaning, not by lexical overlap. The practical result: your question surfaces relevant material even when it was written differently from how you asked.

Clustering by meaning

Semantic neighbourhoods surface related work automatically.

Drift detection

Surface stale or contradicting evidence as your corpus evolves.

Citation lineage

Every answer carries the chunk, the document, and the version.
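One way to picture citation lineage is as a small record attached to every answer. The sketch below is an assumption about shape only: the `Citation` and `CitedAnswer` types, their field names, and the sample values are hypothetical and do not describe CogniWeave's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    chunk_id: str
    document: str
    version: str


@dataclass(frozen=True)
class CitedAnswer:
    text: str
    citations: tuple  # Citation objects, in the order they are cited

    def footnotes(self):
        """Render one footnote line per citation, numbered from 1."""
        return [
            f"[{i}] {c.document} (version {c.version}, chunk {c.chunk_id})"
            for i, c in enumerate(self.citations, start=1)
        ]


# Made-up values, for illustration only.
answer = CitedAnswer(
    text="The customer is eligible for a partial refund [1].",
    citations=(Citation("chunk-17", "refund-policy.pdf", "v3"),),
)
print(answer.footnotes())
```

Recording the version alongside the chunk and document is what would let a reader check whether an answer was based on a document that has since changed.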

Security

Your data stays yours.

Every boundary is explicit. Every access is logged. We do not train on your data.

Your files, vectors, and queries are yours. We process them to deliver the service. We do not use them to train models, and we do not share them with third parties. You can export or delete your workspace at any time.

Read the security overview

Encryption in transit

TLS 1.2+ on every request.

Encryption at rest

AES-256 on stored vectors and source files.

SSO

SAML 2.0 single sign-on.

Audit logs

Append-only event log for every retrieval.

Regional data residency

Choose where vectors and files live.

SOC 2

In progress — not yet certified.

Private VPC

Dedicated deployment available on request.

You own your data

Export or delete at any time. No silent training.
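The append-only audit log listed above can be illustrated with a hash chain, a common technique for making after-the-fact edits detectable. Hash chaining is an assumption chosen for this sketch, not a description of how CogniWeave implements its log, and the `AuditLog` class and sample events are hypothetical.

```python
import hashlib
import json


class AuditLog:
    """In-memory log where each entry's hash covers the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder hash that precedes the first entry

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, event):
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event):
        """Add an event; there is intentionally no update or delete method."""
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        digest = self._digest(prev_hash, event)
        self._entries.append({"event": event, "prev": prev_hash, "hash": digest})
        return digest

    def verify(self):
        """Recompute the chain and report whether every entry still matches."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


# Made-up retrieval events, for illustration only.
log = AuditLog()
log.append({"query": "refund on order #4821", "chunks": ["c1", "c3"]})
log.append({"query": "contract renewal date", "chunks": ["c1"]})
print(log.verify())
```

If any stored event is altered later, its recomputed hash no longer matches and `verify()` returns `False`.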

MEMORY ARCHITECTURE

Memory built for long horizons.

Most AI memory systems either carry the full history forward and drown in noise, or filter aggressively and pay for it in latency. CogniWeave runs its own managed memory engine, informed by published research on lifelong memory for LLM agents. It compresses interactions at the point of capture, consolidates related memories over time, and retrieves only what each query needs.

Give your AI a memory.

Vector memory for teams that need answers they can verify and data they can trust.

Integrations

Connects to your stack

CogniWeave runs on the Model Context Protocol. Any MCP-compatible client, agent, or RPA platform can plug in.

Open standard

MCP is the open protocol that lets AI clients and tools talk to each other without bespoke adapters.

Two-way

CogniWeave exposes tools and consumes them. Use it as a host, a client, or both.

No lock-in

Bring your own model, memory, automation, and cloud. Swap any layer without rebuilding the rest.

Need a provider that is not listed? CogniWeave can connect to any MCP-compatible service. Talk to us.
