Plan: GH-0029 Hybrid Retrieval with RRF and BM25

Context

This issue defines the backend retrieval fusion strategy that combines semantic Pinecone search with lexical Postgres search.

Problem

Dense retrieval alone misses exact-term matches, while lexical retrieval alone misses conceptual matches. The backend needs one ranking strategy that balances both without losing observability.

Current state in repo

  • app/services/retrieval_pipeline.py contains hybrid candidate fusion and logs HYBRID_RETRIEVAL markers.
  • db/migrations/20260301000031_hybrid_retrieval_lexical_rpc.sql adds the lexical RPC.
  • tests/services/test_retrieval_pipeline.py covers hybrid fusion and latency expectations.
  • docs/90_ops/hybrid_retrieval_rrf.md describes the current scoring model.

Target state

  • Retrieval combines semantic and lexical candidates using a documented fusion model.
  • The backend preserves a target 75/25 semantic-to-lexical weighting.
  • Fusion output is observable, testable, and stable enough for release use.

Constraints

  • Backend-only scope.
  • Retrieval must work with current Postgres and Pinecone boundaries.
  • The fusion layer must stay deterministic enough for CI regression tests.
  • The plan must not assume frontend filtering or ranking logic.

Proposed approach

  1. Query Pinecone and lexical RPC in the same retrieval flow.
  2. Normalize each source's candidate scores to a common scale and blend them, so downstream consumers still receive a single score per candidate.
  3. Use weighted Reciprocal Rank Fusion for final ordering.
  4. Emit structured fusion metadata and latency logs for debugging and benchmarking.
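The fusion step in the list above can be sketched as follows. This is a minimal illustration, not the code in app/services/retrieval_pipeline.py: the function name `weighted_rrf`, the document IDs, and the smoothing constant `k=60` (a commonly used RRF default) are assumptions. Each candidate scores `weight / (k + rank)` per source, and the per-source contributions are summed, with the 75/25 target expressed as the two weights.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked ID lists with weighted Reciprocal Rank Fusion.

    rankings: dict mapping source name -> list of doc IDs, best first.
    weights:  dict mapping source name -> weight for that source.
    k:        smoothing constant (60 is an assumed, conventional default).
    """
    scores = defaultdict(float)
    for source, ranked_ids in rankings.items():
        weight = weights[source]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Sort by score descending, then by ID, so ties resolve the same way
    # on every run (useful for CI regression tests).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical candidate lists standing in for Pinecone and lexical RPC output.
semantic = ["doc_a", "doc_b", "doc_c"]
lexical = ["doc_c", "doc_a", "doc_d"]

fused = weighted_rrf(
    {"semantic": semantic, "lexical": lexical},
    {"semantic": 0.75, "lexical": 0.25},
)
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

Because RRF works on ranks rather than raw scores, the Pinecone similarity values and the lexical scores never need to share a scale for ordering purposes. The explicit ID tie-break is one way to keep the output deterministic.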

Risks

  • Dual retrieval paths can increase latency.
  • Poor weighting can overfavor one source and reduce answer quality.
  • Fusion behavior can become hard to reason about without stable logging and tests.
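One way to limit the latency and observability risks above is to run both retrieval paths concurrently and log per-source timings in one structured record. The sketch below is illustrative only. `fetch_semantic` and `fetch_lexical` are hypothetical stand-ins for the Pinecone query and the lexical RPC call, and the log field names are assumptions. Only the HYBRID_RETRIEVAL marker comes from the existing pipeline.

```python
import asyncio
import json
import logging
import time

logger = logging.getLogger("retrieval")


async def timed(name, coro):
    """Await a coroutine and return (name, result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = await coro
    return name, result, (time.perf_counter() - start) * 1000.0


async def fetch_semantic(query):
    # Hypothetical stand-in for the Pinecone query.
    await asyncio.sleep(0.01)
    return ["doc_a", "doc_b"]


async def fetch_lexical(query):
    # Hypothetical stand-in for the Postgres lexical RPC.
    await asyncio.sleep(0.01)
    return ["doc_b", "doc_c"]


async def retrieve_candidates(query):
    start = time.perf_counter()
    # Run both sources concurrently so total latency tracks the slower
    # source rather than the sum of both.
    outcomes = await asyncio.gather(
        timed("semantic", fetch_semantic(query)),
        timed("lexical", fetch_lexical(query)),
    )
    total_ms = (time.perf_counter() - start) * 1000.0
    candidates = {name: ids for name, ids, _ in outcomes}
    record = {
        "marker": "HYBRID_RETRIEVAL",
        "latency_ms": {name: round(ms, 2) for name, _, ms in outcomes},
        "total_ms": round(total_ms, 2),
        "candidate_counts": {name: len(ids) for name, ids in candidates.items()},
    }
    # sort_keys keeps the serialized log line stable across runs.
    logger.info(json.dumps(record, sort_keys=True))
    return candidates, record


candidates, record = asyncio.run(retrieve_candidates("example query"))
```

A test can assert on the returned `record` instead of parsing log output, which keeps latency and candidate-count checks independent of the logging configuration.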

Open questions

  • Should the 75/25 target remain fixed or become configuration-driven later?
  • What latency budget should block merges as the hybrid path grows more complex?

Acceptance criteria

  • A plan doc exists for #29 under docs/plans/.
  • The doc defines semantic plus lexical retrieval and weighted RRF.
  • The doc names the lexical RPC and fusion test coverage.
  • The plan remains backend-only.

Files likely to change

  • docs/plans/gh-0029-hybrid-retrieval-rrf-bm25.md
  • app/services/retrieval_pipeline.py
  • db/migrations/20260301000031_hybrid_retrieval_lexical_rpc.sql
  • tests/services/test_retrieval_pipeline.py
  • app/services/gpt_mini.py

Related issue

  • #29 - [Backend][Retrieval] 75/25 hybrid search with RRF + BM25 fusion

Status

Backfilled planning stub