RAG Cost Calculator

Calculate the full cost of your RAG pipeline — embeddings, vector storage and retrieval — and compare Pinecone, Supabase and Weaviate to find the cheapest option.

Inputs

Embedding model

3-small costs about 6.5× less than 3-large. Use 3-large only when retrieval quality is critical.

Vector store

Select your current or planned vector database provider.

Total documents

Total number of chunks (not source files) stored in your vector DB. One page ≈ 3–5 chunks.

Avg tokens per chunk

Typical chunk size: 256–1024 tokens. Smaller = more precise retrieval.

Monthly vector queries

Each user question typically triggers 1–3 retrieval queries.

Avg query tokens

Length of the embedded query string. Usually 50–300 tokens.

Monthly re-embed ratio: 0.10

Fraction of documents re-embedded each month due to updates. 0.1 = 10% monthly churn.

Monthly RAG pipeline cost

$1.29/mo

Current setup is optimal

Pinecone Serverless is already the cheapest option for this workload.

Yearly cost

$15.53/yr

Embedding cost

$0.88/mo

Storage cost

$0.01/mo

Query cost

$0.40/mo

Cost per query

$0.000013

Embedding model

text-embedding-3-small

Cost breakdown

Item	Monthly	Yearly
Embeddings (initial + re-embed)	$0.88	$10.56
Vector storage	$0.01	$0.12
Query + retrieval	$0.40	$4.85
Total	$1.29	$15.53

Comparison

Option	Monthly	Yearly
Pinecone Serverless (current, cheapest)	$1.29	$15.53
Supabase pgvector	$1.32	$15.82
Weaviate Serverless	$1.29	$15.52

Data updated 2026-06-30 · Embedding: openai.com/api/pricing · Pinecone: www.pinecone.io/pricing/ · Supabase: supabase.com/pricing · Weaviate: weaviate.io/pricing

Industry Benchmark

Cost per document vs. industry average

Industry avg: $0.0008/doc/mo

You are at the 98th percentile


Trends & comparison

Charts: Trend; Comparison (monthly vs. yearly)

RAG pipeline cost breakdown

A production RAG system has three cost components: (1) embedding creation — converting documents into vectors once, plus re-embedding on updates; (2) vector storage — storing vectors in your database monthly; (3) query retrieval — embedding each user query and running similarity search. This calculator makes all three transparent.
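The three components can be sketched as a small cost model. This is a minimal illustration, not the calculator's actual formula: the `Workload` class, the function names, and the `storage_fee` and `price_per_read` values are hypothetical placeholders, and the one-time initial embedding is kept separate from the recurring monthly cost as an assumption.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass
class Workload:
    chunks: int              # total chunks stored in the vector DB
    tokens_per_chunk: int    # average tokens per chunk
    monthly_queries: int     # vector queries per month
    query_tokens: int        # average tokens per embedded query
    reembed_ratio: float     # fraction of chunks re-embedded each month


def initial_embedding_cost(w: Workload, embed_price_per_mtok: float) -> float:
    """One-time cost (USD) of embedding every chunk once."""
    return w.chunks * w.tokens_per_chunk / TOKENS_PER_MILLION * embed_price_per_mtok


def monthly_costs(w: Workload, embed_price_per_mtok: float,
                  storage_fee: float, price_per_read: float) -> dict:
    """Recurring monthly cost (USD) of each component.

    storage_fee and price_per_read are placeholders: take the real
    values from your provider's pricing page.
    """
    # (1) Re-embedding the fraction of chunks that changed this month.
    reembed = initial_embedding_cost(w, embed_price_per_mtok) * w.reembed_ratio
    # (3) Each query is embedded, then billed as one read (assumption).
    query_embed = w.monthly_queries * w.query_tokens / TOKENS_PER_MILLION * embed_price_per_mtok
    reads = w.monthly_queries * price_per_read
    # (2) Storage is passed in as a flat monthly fee for simplicity.
    costs = {"embedding": reembed, "storage": storage_fee, "query": query_embed + reads}
    costs["total"] = sum(costs.values())
    return costs


# Example inputs: 50,000 chunks and 100,000 queries/month, 10% monthly churn,
# text-embedding-3-small at $0.02/MTok. The last two prices are made up.
workload = Workload(chunks=50_000, tokens_per_chunk=500,
                    monthly_queries=100_000, query_tokens=100, reembed_ratio=0.10)
costs = monthly_costs(workload, embed_price_per_mtok=0.02,
                      storage_fee=0.01, price_per_read=0.000002)
one_time = initial_embedding_cost(workload, embed_price_per_mtok=0.02)
```

Because the storage and read prices here are invented, the resulting figures will not match the calculator's output. The sketch is only meant to show how each input feeds into each cost line.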

Choosing the right vector database

Pinecone Serverless is purpose-built for vector search with managed infrastructure. Supabase pgvector runs on PostgreSQL and is free for reads — ideal if you already use Supabase. Weaviate offers a managed serverless option with good developer tooling. Compare total monthly cost for your actual query volume before committing.

Frequently asked questions

What does a RAG pipeline cost per month?

For a typical knowledge base of 50,000 chunks with 100,000 queries/month using text-embedding-3-small, expect $20–$80/month depending on your vector store. Supabase pgvector is often the cheapest for small-to-medium datasets.

Pinecone vs Supabase pgvector: which is cheaper?

Supabase pgvector is generally cheaper for lower query volumes because reads are free (queries run inside your database). Pinecone charges per read operation, which adds up at high query volume. Run the calculator with your actual numbers.
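To compare a store that bills per read against one where reads are not billed separately, it can help to compute the query volume at which the two cost the same. The sketch below assumes a simple linear model; the function names and every fee in it are hypothetical, and which store wins on either side of the crossover depends entirely on the prices you plug in.

```python
from typing import Optional


def per_read_monthly_cost(monthly_queries: int, price_per_read: float,
                          base_fee: float = 0.0) -> float:
    """Monthly cost (USD) for a store that bills each read operation."""
    return base_fee + monthly_queries * price_per_read


def flat_monthly_cost(plan_fee: float) -> float:
    """Monthly cost (USD) for a store whose reads are not billed separately."""
    return plan_fee


def crossover_queries(price_per_read: float, plan_fee: float,
                      base_fee: float = 0.0) -> Optional[float]:
    """Query volume at which both stores cost the same.

    Returns None when there is no positive crossover, for example when
    the flat plan costs no more than the per-read store's base fee.
    """
    if price_per_read <= 0 or plan_fee <= base_fee:
        return None
    return (plan_fee - base_fee) / price_per_read


# Hypothetical prices, for illustration only.
volume = crossover_queries(price_per_read=0.00001, plan_fee=25.0)
```

If you already pay for the database for other reasons, passing `plan_fee=0.0` models the marginal cost of adding vector search to it, and the function returns `None` because the two lines never cross.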

Should I use text-embedding-3-small or text-embedding-3-large?

text-embedding-3-small costs $0.02/MTok and text-embedding-3-large costs $0.13/MTok, which is 6.5× the price. For most production RAG systems, 3-small provides sufficient quality. Use 3-large only for highly technical or specialized corpora where retrieval accuracy is critical.
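The price gap is easy to check against your own corpus size. The snippet below uses the two per-million-token prices quoted above; the `embedding_cost` helper and the 25-million-token corpus (for example 50,000 chunks of 500 tokens) are illustrative assumptions.

```python
# Prices in USD per million tokens, as quoted in this FAQ.
PRICE_PER_MTOK = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}


def embedding_cost(total_tokens: int, model: str) -> float:
    """Cost (USD) of embedding total_tokens once with the given model."""
    return total_tokens / 1_000_000 * PRICE_PER_MTOK[model]


# Price ratio between the two models (about 6.5).
ratio = PRICE_PER_MTOK["text-embedding-3-large"] / PRICE_PER_MTOK["text-embedding-3-small"]

# Hypothetical corpus: 50,000 chunks x 500 tokens = 25 million tokens.
corpus_tokens = 50_000 * 500
small_cost = embedding_cost(corpus_tokens, "text-embedding-3-small")  # about $0.50
large_cost = embedding_cost(corpus_tokens, "text-embedding-3-large")  # about $3.25
```

At this corpus size the absolute difference is a few dollars, so the ratio matters mainly for large corpora or high re-embedding churn.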

Does this calculator include the LLM inference cost?

No — this calculator covers only the embedding and vector retrieval pipeline. Use the OpenAI Cost Calculator or Claude Cost Calculator for the generation (LLM inference) cost of your RAG responses.