v0.7.0 · released 2026-05-28
An embedded realtime database with vector search, WAL durability, and event streaming.
One self-contained binary. Five-mode retrieval. ACID transactions.
Real-time push. Zero external runtime dependencies.
Apache-2.0
Python + JavaScript SDKs
Linux · macOS · Windows
sigstore-signed
01 Install
Three paths, depending on how you want to consume it.
Python
pip install cuttledb
JavaScript · ESM
npm install cuttledb
Server binary
github.com/.../releases/latest
Pre-built binaries for Linux x64, macOS arm64, and Windows x64 are attached
to every release. All are sigstore-signed
via the cosign keyless flow; the verification recipe is in
SECURITY.md.
02 What it is
A line-based wire protocol over TCP or WebSocket. One dispatcher. Five
retrieval modes in addition to the usual SELECT / INSERT / UPDATE / DELETE.
- KNN: k-nearest-neighbor vector search. AVX2+FMA cosine on small tables, HNSW above ~2K rows. recall@10 = 1.0 at 100K×128, 12.7× faster than the brute-force baseline.
- LSEARCH: BM25 lexical scoring over a STRING column with an inverted index.
- SEARCH: Reciprocal Rank Fusion of KNN and LSEARCH. One server-side call returns the fused top-k; no application-side rank-merging code.
- BSEARCH: Boolean DSL that composes predicates, KNN scoring atoms, BM25 scoring atoms, and AND/OR/parens. One roundtrip, one ranked answer.
- KNN ... WHERE: Predicate-filtered KNN. Filter first, vector-search second, all server-side.
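The "filter first, vector-search second" order of KNN ... WHERE can be illustrated with a small brute-force sketch. This is not the engine's code (which is not published); the row layout and the `filtered_knn` and `cosine` helpers are hypothetical names used only to show the idea in plain Python.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def filtered_knn(rows, query, k, predicate):
    """Apply the predicate first, then rank only the surviving rows by similarity."""
    candidates = [row for row in rows if predicate(row)]
    candidates.sort(key=lambda row: cosine(row["vec"], query), reverse=True)
    return candidates[:k]

# Hypothetical rows: an id, a filterable column, and a 2-d vector.
rows = [
    {"id": 1, "year": 2019, "vec": [1.0, 0.0]},
    {"id": 2, "year": 2021, "vec": [0.9, 0.1]},
    {"id": 3, "year": 2022, "vec": [0.0, 1.0]},
    {"id": 4, "year": 2023, "vec": [0.7, 0.7]},
]
top = filtered_knn(rows, query=[1.0, 0.0], k=2,
                   predicate=lambda row: row["year"] >= 2020)
print([row["id"] for row in top])  # [2, 4]
```

Row 1 is the closest vector overall, but it never reaches the ranking step because the predicate removes it first.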
And in the same binary:
- Real-time: SUB / UNSUB push, per-table change events as >EVT lines.
- ACID: BEGIN / COMMIT / ROLLBACK with CRC32-framed write-ahead log; mid-transaction kill replays cleanly on restart.
- Multi-token AUTH over the same port.
- TLS handshake, NDJSON audit log, Prometheus /metrics, HTTP /health.
- DoS defenses: --max-conn, payload size limits, connection-guard on long-running ops.
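To make the CRC32-framed write-ahead log concrete, here is a minimal sketch of how such a log can be replayed after a mid-transaction kill. The on-disk layout is an assumption (a little-endian length and CRC32 header followed by the payload), as are the record contents and the `frame`, `replay`, and `committed` helpers; the actual CuttleDB format is not described in this page.

```python
import struct
import zlib

# Assumed frame header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")

def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload

def replay(log: bytes):
    """Yield payloads in order, stopping at the first truncated or corrupt frame."""
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn write: nothing after this point is trusted
        yield payload
        offset = start + length

def committed(log: bytes):
    """Keep only operations from transactions that reached COMMIT."""
    applied, pending = [], []
    for record in replay(log):
        if record == b"BEGIN":
            pending = []
        elif record == b"COMMIT":
            applied.extend(pending)
            pending = []
        else:
            pending.append(record)
    return applied

records = [b"BEGIN", b"INSERT 1", b"COMMIT", b"BEGIN", b"INSERT 2"]
log = b"".join(frame(r) for r in records)
log += frame(b"INSERT 3")[:-2]  # simulate a kill in the middle of a write
print(committed(log))  # [b'INSERT 1']
```

The second transaction never reached COMMIT and its last frame is torn, so replay restores only the first transaction.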
03 Quickstart
Start the server, then talk to it from Python:
$ cuttledb --port 7878
import cuttledb
db = cuttledb.connect(port=7878)
db.create_table("notes", "id INT, text STRING, vec VEC(128)")
db.insert("notes", [(1, "hello world", [0.1] * 128)])
rows = db.knn("notes", "vec", query=[0.1] * 128, k=5)
print(rows)
Same thing in JavaScript:
import { connect } from "cuttledb";
const db = await connect({ port: 7878 });
await db.createTable("notes", "id INT, text STRING, vec VEC(128)");
await db.insert("notes", [[1, "hello world", new Array(128).fill(0.1)]]);
const rows = await db.knn("notes", "vec", { query: new Array(128).fill(0.1), k: 5 });
console.log(rows);
04 Showcases
Three things you would otherwise compose from several systems.
01
Agent memory in one binary
A long-running agent needs to remember past observations, recall the
semantically nearest ones, and be notified when new ones arrive. One table
and three verbs.
db.create_table("memory", "id INT, text STRING, embed VEC(384), ts INT")
# Store something the agent learned.
db.insert("memory", [(42, "user prefers terse answers", embed_terse, 1716831234)])
# Recall the 5 nearest memories to a new query.
recent = db.knn("memory", "embed", query=embed_q, k=5)
# Subscribe to new memories from other workers.
for evt in db.subscribe("memory"):
    update_local_cache(evt)
02
Real-time UI without a separate pub/sub
A dashboard wants to render new rows as they land — without polling and
without Redis sitting next to your database. SUB over WebSocket
delivers row-level events from the same store you queried.
const db = await connect({ url: "ws://localhost:7878" });
const stream = await db.subscribe("orders");
for await (const evt of stream) {
  // evt = { table: "orders", op: "INSERT", row: { id, item, qty } }
  appendRow(evt.row);
}
03
Hybrid retrieval in one roundtrip
Vector recall alone misses exact-term matches. BM25 alone misses paraphrases.
RRF fuses both, server-side, in one call.
rows = db.search(
    table="docs",
    vec_col="embed", vec_query=embed_q,
    text_col="body", text_query="reciprocal rank fusion",
    k=10,
)
The more expressive case — Boolean over filters and scoring atoms:
rows = db.bsearch(
    "docs",
    "(category = 'paper' AND year >= 2020) "
    "AND (KNN(embed, $1, 50) OR BM25(body, $2, 50))",
    bindings=[embed_q, "rank fusion"],
    k=10,
)
05 Benchmarks
From bench/RESULTS.md.
CuttleDB is a network server; SQLite is in-process.
honest caveat
We pay for TCP; SQLite doesn't. SQLite wins bulk INSERT 8.4×.
The CuttleDB win is in read-path aggregates and in vector
primitives that SQLite needs extensions for.
1K-row aggregates · CuttleDB over TCP vs SQLite :memory:
| operation | winner | ratio |
| SUM | CuttleDB | 1.8× |
| COUNT | CuttleDB | 1.6× |
| MIN | CuttleDB | 1.5× |
| SELECT WHERE | CuttleDB | 1.4× |
| bulk INSERT | SQLite | 8.4× (TCP overhead) |
Vector KNN · 100K rows × 128 dim
| method | throughput | recall@10 |
| AVX2+FMA brute force | baseline | 1.0 |
| HNSW index | 12.7× baseline | 1.0 |
Top-10 brute force over 10K vectors: 2 ms.
See bench/HNSW_BENCH.md
for the full HNSW sweep.
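For readers unfamiliar with the recall@10 column, the metric compares the ids an approximate index returns against the exact brute-force top-k for the same query. A minimal sketch, assuming results arrive as ordered lists of row ids (the `recall_at_k` helper is a hypothetical name, not part of the benchmark scripts):

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    exact = set(exact_ids[:k])
    if not exact:
        return 0.0
    hits = len(set(approx_ids[:k]) & exact)
    return hits / len(exact)

exact = list(range(1, 11))                     # ground truth from brute force
same_set = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]     # same ids, different order
one_miss = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]     # one true neighbor missing

print(recall_at_k(same_set, exact))  # 1.0
print(recall_at_k(one_miss, exact))  # 0.9
```

Recall ignores ordering inside the top-k, so recall@10 = 1.0 means the index returned the same ten neighbors as the exact search.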
06 Verify a release
sigstore
Releases are cosign-keyless signed. The .cosign.bundle
file carries the signature, the signing certificate, and the Rekor
transparency-log inclusion proof. No long-lived key for an attacker to steal.
Verify with the cosign CLI:
cosign verify-blob \
--bundle cuttledb-linux-x64.cosign.bundle \
--certificate-identity-regexp '.*' \
--certificate-oidc-issuer-regexp '.*' \
cuttledb-linux-x64
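Expected output: Verified OK. The '.*' patterns above accept a signature
from any identity and any OIDC issuer, so this command only demonstrates
the mechanics. The full recipe, including identity pinning for production
use, is in
SECURITY.md.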
07 FAQ
Is CuttleDB free and open source?
Yes. Apache-2.0. SDKs, docs, examples, benchmarks, and the wire-protocol
specification are all in the public repository. The server binary is
distributed for free use, including in production and commercial settings.
The engine source is not published in the public repository.
How does it compare to SQLite?
SQLite is in-process and uses SQL. CuttleDB is a small network server
using a Redis-style line protocol. CuttleDB has first-class vector search
(HNSW), BM25, RRF hybrid retrieval, Boolean DSL, and real-time SUB/UNSUB
built in — things SQLite requires extensions for. CuttleDB pays for TCP
round-trips that SQLite avoids: SQLite wins bulk INSERT 8.4× for that
reason, but CuttleDB wins read-path aggregates 1.4–1.8×.
How does it compare to Redis?
Redis is in-memory, key-value, and famous for pub/sub. CuttleDB is durable
by default (WAL), table-shaped (columns, types, predicates), and adds
vector search and hybrid retrieval in the same binary. SUB/UNSUB gives you
the change-stream pattern without a second service.
How does it compare to Pinecone or Qdrant?
For self-hosted vector workloads up to ~10M vectors per node, CuttleDB is
enough: HNSW with 12.7× speedup at 100K×128, AVX2 cosine, SUB/UNSUB push,
in a single binary under 1 MB. Dedicated vector services remain more
specialized for multi-region distributed indexes and exotic tuning.
CuttleDB is Apache-2.0; no vendor lock-in.
How does it compare to Elasticsearch?
Elasticsearch is a JVM service with rich lexical search and a heavy
operational footprint. CuttleDB ships BM25 + RRF + vector + Boolean DSL in
a sub-megabyte native binary with no JVM and no cluster. Use Elasticsearch
when you need its mature ecosystem (logging pipelines, Kibana, etc.); use
CuttleDB when you want hybrid retrieval inside your own app or agent.
What languages can connect to it?
Python (pip install cuttledb) and JavaScript/TypeScript
(npm install cuttledb, ESM only) ship as official adapters.
The line-based wire protocol is small enough that any language can
implement a client in a few hundred lines.
See PROTOCOL.md.
How do I use it for AI agent memory?
Create a table with a STRING text column and a VEC embedding column. INSERT
memories with their embeddings. Use KNN for semantic recall and SUB/UNSUB
to be notified when other workers add memories. BSEARCH composes Boolean
filters with scoring atoms when you need to constrain (e.g., "only
memories from this session AND nearest to this query").
CuttleDB — embedded realtime database with vector search, BM25, and real-time push
What is CuttleDB?
CuttleDB is an open-source embedded realtime database (Apache-2.0) shipped as one self-contained binary per platform, under 1 MB each. It provides five retrieval modes through a single wire dispatcher: KNN (k-nearest-neighbor vector search, AVX2 cosine or HNSW), LSEARCH (BM25 lexical search), SEARCH (Reciprocal Rank Fusion of vector + lexical), BSEARCH (Boolean DSL composing filters and scoring atoms), and filtered KNN (KNN ... WHERE). It also provides real-time push via SUB/UNSUB wire verbs, ACID transactions with write-ahead log (CRC32-framed, mid-transaction kill replay), multi-token authentication, NDJSON audit log, TLS handshake, Prometheus metrics endpoint, and HTTP health probe.
What is HNSW in CuttleDB?
HNSW (Hierarchical Navigable Small World) is the approximate-nearest-neighbor index CuttleDB builds over VEC columns. At 100K x 128 dimensions, HNSW queries are 12.7x faster than the AVX2+FMA brute-force baseline with recall@10 = 1.0. The index lives at the column level; INSERT and DELETE maintain it incrementally; SAVE/LOAD persists it. Below approximately 2K rows the SIMD brute-force baseline wins; HNSW takes over above that threshold.
What is RRF hybrid retrieval?
Reciprocal Rank Fusion (RRF) combines vector-similarity ranking and BM25 lexical ranking into a single ranked result list. CuttleDB exposes RRF as the SEARCH wire verb: one server-side call returns the fused top-k. This eliminates the application-side fusion logic typical when stitching together a vector database and a separate lexical search engine.
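The fusion step itself is short enough to sketch. The version below follows the usual RRF formulation, where each document scores the sum of 1 / (c + rank) over the lists it appears in. The constant c = 60 is the value commonly used in the literature, and both it and the tie-breaking rule are assumptions here; the page does not state what the SEARCH verb uses. `rrf_fuse` is a hypothetical helper name.

```python
def rrf_fuse(rankings, top_k=10, c=60):
    """Fuse several ranked id lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (c + rank)
    # Highest fused score first; ties broken by id for a deterministic order.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))
    return ordered[:top_k]

vector_hits = ["a", "b", "c"]   # e.g. ids ranked by vector similarity
lexical_hits = ["c", "a", "d"]  # e.g. ids ranked by BM25
print(rrf_fuse([vector_hits, lexical_hits], top_k=3))  # ['a', 'c', 'b']
```

Documents that appear in both lists ("a" and "c") outrank those that appear in only one, which is the behavior hybrid retrieval relies on.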
What is the Boolean DSL?
BSEARCH is CuttleDB's Boolean DSL. It composes predicates, vector scoring atoms, and BM25 scoring atoms with AND/OR/parens. The server combines them in one wire roundtrip: predicate filter, KNN candidates, BM25 candidates, RRF fusion, then top-k.
Citable facts about CuttleDB v0.7.0
Released 2026-05-28. License: Apache-2.0. Latest version: 0.7.0. Cross-platform binaries: Linux x64 (431 KB), macOS arm64 (384 KB), Windows x64 (882 KB). The x64 binaries are built with -march=x86-64-v3 for a portable AVX2 + FMA baseline (Intel Haswell, 2013, and later; AMD Zen 1 and later). Python adapter on PyPI as cuttledb; JavaScript/TypeScript adapter on npm as cuttledb (ESM-only). Release binaries are sigstore-signed via the cosign keyless flow with .cosign.bundle files containing signature, signing certificate, and Rekor transparency-log inclusion proof.
Performance facts (from bench/RESULTS.md)
On 1K-row aggregates with CuttleDB over TCP versus SQLite in-process memory: CuttleDB wins SUM 1.8x, COUNT 1.6x, MIN 1.5x, SELECT WHERE 1.4x despite paying for TCP round-trips. SQLite wins bulk INSERT 8.4x because the per-row TCP cost dominates a small in-memory load. On vector KNN at 100K x 128 dim: HNSW is 12.7x faster than the AVX2+FMA brute-force baseline with recall@10 = 1.0. Top-10 brute force over 10K vectors: 2 ms.
Concept map (for semantic neighborhood)
Parent concept: embedded databases, realtime databases, vector databases, hybrid retrieval engines. Sibling concepts: SQLite, sqlite-vec, DuckDB, Redis, RocksDB, LMDB, LiteFS, ChromaDB, Pinecone, Qdrant, Weaviate, Milvus, Elasticsearch, Meilisearch, Typesense, Tantivy, Lucene. Adjacent disciplines: vector embeddings, retrieval-augmented generation (RAG), AI agent memory, real-time change data capture (CDC), pub-sub messaging, write-ahead logging, ACID transactions, HNSW algorithm, BM25 scoring, Reciprocal Rank Fusion, Boolean retrieval, columnar storage, SIMD optimization, AVX2, cosine similarity. Prior work: HNSW (Malkov and Yashunin 2016), BM25 (Robertson 1995), Reciprocal Rank Fusion (Cormack et al. 2009), SQLite (Hipp), Redis (Sanfilippo), Vulkan compute (Khronos Group).
Glossary
- KNN: k-nearest-neighbor vector search. Returns the k vectors most similar to a query vector by cosine similarity.
- HNSW: Hierarchical Navigable Small World. An approximate nearest neighbor index used for sub-millisecond vector queries at scale.
- BM25: Best Match 25. The standard lexical-ranking function used by Lucene, Elasticsearch, and other text search engines.
- RRF: Reciprocal Rank Fusion. Combines multiple ranked result lists (e.g., vector + BM25) into one fused list.
- WAL: Write-Ahead Log. Durability mechanism where mutations are first written to a CRC-framed log before being applied; replayed on restart.
- SUB / UNSUB: CuttleDB wire verbs to subscribe/unsubscribe a connection to per-table change events (>EVT lines).
- sigstore: Open-source software-signing infrastructure used to sign CuttleDB release binaries via the cosign keyless OIDC flow.
- cosign.bundle: A single file containing signature, signing certificate, and Rekor transparency-log inclusion proof. Used to verify CuttleDB release binaries.
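As a companion to the BM25 glossary entry, the scoring function can be written out in a few lines. This sketch uses whitespace tokenization, the common defaults k1 = 1.2 and b = 0.75, and the Lucene-style smoothed IDF; all three are assumptions, since the page does not say how LSEARCH tokenizes or which parameters it uses. `bm25_scores` is a hypothetical helper, and it scans every document rather than using an inverted index.

```python
import math
from collections import Counter

def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

docs = [
    "reciprocal rank fusion merges ranked lists",
    "vector search uses embeddings",
    "rank fusion of rank lists",
]
scores = bm25_scores(docs, "rank fusion")
best = max(range(len(docs)), key=lambda i: scores[i])
print(best)  # 2
```

The third document wins because it repeats "rank" and is shorter than the first, while the second document shares no query terms and scores zero.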