Vector Index Architecture Migration Plan: Pinecone to Supabase pgvector
1. Overview and Motivation
Currently, the vector database (Pinecone) partitions curriculum data into hard isolation boundaries using namespaces (e.g., grade-elementary-6-geography). While this design natively filters results, it prevents true broad Retrieval-Augmented Generation (RAG) across the entire curriculum and introduces architectural complexity with dual storage (vectors in Pinecone, text/metadata in Postgres).
The Goal: Migrate away from Pinecone and replace it with Supabase pgvector. This simplifies the architecture by unifying data storage and vector search within the same Postgres database. It naturally solves the "namespace" problem by storing vectors alongside chunks, allowing true broad semantic retrieval across the entire knowledge base, utilizing metadata to filter or softly boost (bias) results based on user profiles or request hints, and natively combining semantic and lexical searches in SQL.
2. Target Architecture
2.1. Vector DB Schema (Postgres)
- Target Store: Supabase Postgres (
chunksorembeddingstable). - Column: Add an
embedding vector(1536)column to the appropriate table. - Index: Create an
hnsworivfflatindex on theembeddingcolumn for fast vector similarity search using cosine distance (<=>). - Existing Metadata: Metadata filtering is inherently solved because the vectors live in the same row as all document metadata (
grade,subject,document_id,language, etc.).
2.2. Query Execution Paradigm
- Global Search Default: Semantic queries search the entire table by default.
- Hybrid RPC: Create a single Postgres RPC (e.g.,
search_chunks_hybrid) that:- Retrieves: Performs a vector similarity search (
<=>) for the top $K$ semantic matches and a full-text search (websearch_to_tsquery) for the top $K$ lexical matches. - RRF Fusion: Applies Reciprocal Rank Fusion (RRF) natively in SQL to blend the ranked semantic and lexical sets.
- Profile Bias / Filter: Optionally accepts
p_gradeandp_subjectparameters to strictly filter (WHERE clause) or softly boost results.
- Retrieves: Performs a vector similarity search (
- LLM Rerank: Pass the top fused candidates to the GPT-mini reranker for final semantic validation in the Python application.
3. Component Migration Breakdown
Phase 1: Database Schema Updates
- Target Files: New DB Migration (
db/migrations/XXXX_add_vector_column.sql). - Changes:
- Enable
pgvectorextension if not already enabled. - Add the
embeddingcolumn (typevector(1536)) to the table storing text chunks (e.g.,chunkstable). - Create an index (HNSW recommended) for optimal distance queries.
- Enable
Phase 2: Ingestion Pipeline Consolidation
- Target Files:
app/services/embedding_service.py,app/services/ingestion.py. - Changes:
- Deprecate/Remove
pinecone_adapter.py. - Update ingestion logic to write OpenAI embeddings directly to Supabase via the Python Supabase client during the chunk insertion phase.
- Deprecate/Remove
Phase 3: Unified Hybrid Retrieval Pipeline
- Target Files:
app/services/retrieval_pipeline.py, New DB Migration for RPC. - Changes:
- Write a new stored procedure
search_chunks_hybridthat performs the combined vector/text search and RRF. - Update
retrieval_pipeline.pyto call the single Supabase RPC instead of coordinating Pinecone (semantic) and Supabase (lexical) separately.
- Write a new stored procedure
Phase 4: Data Wiping and Re-ingestion
- Strategy: Because this is a pre-production environment, no backfill script is needed.
- Action: Wipe existing chunks/documents and run the standard ingestion pipeline to re-process the curriculum, creating the new
pgvectorembeddings natively.
4. Rollout Strategy & Risk Mitigation
- Code Deletion: Safely remove all Pinecone dependencies (
pinecone-clientfromrequirements.txt). - Validation: Run the existing test suites (
promptfooor pytest) to verify that hybrid retrieval still meets the relevance baseline using the newpgvectorimplementation. - Simplicity: Rely on the fact that production data does not exist to fast-track this migration without complex dual-write/dual-read logic.
5. Explicit Validation Plan
- Unit Tests:
- Verify that
retrieval_pipeline.pyonly calls Supabase and no longer references Pinecone. - Update ingestion pipeline tests to assert embeddings are passed to Supabase.
- Verify that
- Database Index Check: Ensure
EXPLAIN ANALYZEon the hybrid RPC uses the vector index rather than an exact sequential scan when tables grow. - End-to-End Broad Retrieval Test:
- Perform a manual chat query with no
gradeorsubjectprovided. - Verify the response pulls context seamlessly across the entire table.
- Perform a manual chat query with no
6. Explicit Go/No-Go Checklist
- [ ] Supabase project supports
pgvectorextension. - [ ] Pinecone SDK removed from dependencies.
- [ ] New
search_chunks_hybridRPC correctly implements RRF blending in SQL. - [ ] Ingestion pipeline modified to store vectors in Postgres directly.