768D embeddings → 72D. Same Pinecone index. 90.6% less storage.
Meaning Retention
99.6%
Cosine similarity preserved after compression. Better than PCA.
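A retention number like this can be reproduced independently: compare pairwise cosine similarities before and after compression. The sketch below uses a random Gaussian projection as a stand-in compressor (it is not the arbiter-engine method, and `pairwise_cosine` is a hypothetical helper) purely to show the measurement itself.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(200, 768))          # synthetic 768D embeddings

# Stand-in compressor: a random projection to 72D. The real engine's
# method is not public; this only illustrates how retention is scored.
proj = rng.normal(size=(768, 72)) / np.sqrt(72)
compressed = docs @ proj

def pairwise_cosine(x):
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x @ x.T

before = pairwise_cosine(docs)
after = pairwise_cosine(compressed)

# Correlate off-diagonal similarities before vs. after compression.
mask = ~np.eye(len(docs), dtype=bool)
retention = np.corrcoef(before[mask], after[mask])[0, 1]
print(f"similarity correlation: {retention:.3f}")
```

A naive random projection scores far below the 99.6% quoted above; that gap is the point of a learned compressor.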
Query Latency
<50ms
Deterministic. CPU only. No GPU. Air-gap deployable.
Take this to production.
You just searched a compressed document by meaning. Now imagine your entire knowledge base — contracts, research, tickets, emails — all stored as 72 floats each. Queried in milliseconds. 90% cheaper than what you're running today.
1B vectors: 768D = 3 TB storage → 72D = 288 GB
Cost: ~$9,000/mo Pinecone → ~$850/mo
Query speed: 10× faster (smaller vectors fit in cache)
Deployment: CPU only · air-gapped · 26 MB model
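The storage rows above are straight arithmetic on float32 vectors, which is worth checking:

```python
# 1B vectors at 4 bytes per dimension (float32).
n = 1_000_000_000
bytes_per_dim = 4

tb_768 = n * 768 * bytes_per_dim / 1e12   # terabytes at 768D
gb_72 = n * 72 * bytes_per_dim / 1e9      # gigabytes at 72D
saved = 1 - 72 / 768                      # fraction of storage saved

print(f"{tb_768:.3f} TB -> {gb_72:.0f} GB ({saved:.1%} less)")
# -> 3.072 TB -> 288 GB (90.6% less)
```

The 90.6% headline figure falls directly out of the 72/768 dimension ratio.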
pip install arbiter-engine
from arbiter_engine import rank, embed
# Your existing pipeline
doc_vector = embed(document)  # 72 floats

# Drop into Pinecone / Weaviate / FAISS
index.upsert([(doc_id, doc_vector)])
# Query by meaning
results = rank(query, candidates)
print(results.top.score) # 0.731
Same database. Same queries. Just 72D vectors instead of 768D or 1536D.
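"Same queries" holds because nearest-neighbor search is dimension-agnostic: the code path is identical whether rows are 768 or 72 floats wide, only the array width changes. A minimal brute-force cosine search (a stand-in for Pinecone/Weaviate/FAISS; `search` is a hypothetical helper, not the engine's API) makes this concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

def search(index_vecs, query_vec, k=3):
    # Normalize rows and the query so dot product == cosine similarity.
    index_n = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = index_n @ q
    top = np.argsort(-scores)[:k]          # best k, highest score first
    return list(zip(top.tolist(), scores[top].tolist()))

# Identical call at either dimension; 72D just moves 10x less data.
for dim in (768, 72):
    vecs = rng.normal(size=(10_000, dim))
    hits = search(vecs, rng.normal(size=dim))
    print(dim, hits[0])
```

The cache-locality claim above follows from the same fact: a 72-float row is 288 bytes, so far more candidates fit per cache line fetch than at 3 KB per 768D row.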
Enterprise deployment, air-gapped, on-prem: [email protected]