BacMR Backend Documentation

Current State

Retrieval

Current retrieval pipeline

The primary retrieval implementation is app/services/retrieval_pipeline.py.

The pipeline does the following:

Detect query language with GPTMiniService.detect_language
Translate the query when its language differs from the corpus language
Generate an embedding with EmbeddingService
Query Pinecone for semantic candidates
Query Supabase Postgres RPC search_chunks_lexical for lexical candidates
Fuse the semantic and lexical candidate rankings with weighted reciprocal-rank fusion
Apply deterministic profile-aware reranking
Apply GPT-mini reranking
Fetch full chunk text from Postgres-backed chunk storage
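
The order of the steps above can be sketched as a single function. This is a minimal illustration, not the code in app/services/retrieval_pipeline.py: the Steps container, every callable signature, and the "en" default for the corpus language are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Ranked = List[str]  # chunk ids, best first (hypothetical representation)


@dataclass
class Steps:
    """Hypothetical bundle of the pipeline's collaborators, injected as callables."""
    detect_language: Callable[[str], str]
    translate: Callable[[str, str], str]            # (text, target_language) -> text
    embed: Callable[[str], List[float]]
    semantic_search: Callable[[List[float]], Ranked]
    lexical_search: Callable[[str], Ranked]
    fuse: Callable[[Ranked, Ranked], Ranked]
    profile_rerank: Callable[[Ranked], Ranked]
    llm_rerank: Callable[[str, Ranked], Ranked]
    fetch_text: Callable[[Ranked], Dict[str, str]]


def retrieve(query: str, steps: Steps, corpus_language: str = "en") -> Dict[str, str]:
    # Steps 1-2: translate only when the detected language differs from the corpus.
    if steps.detect_language(query) != corpus_language:
        query = steps.translate(query, corpus_language)
    # Steps 3-5: gather semantic and lexical candidates for the same query.
    semantic = steps.semantic_search(steps.embed(query))
    lexical = steps.lexical_search(query)
    # Steps 6-8: fuse, then apply the deterministic and the model-based rerankers.
    ranked = steps.fuse(semantic, lexical)
    ranked = steps.profile_rerank(ranked)
    ranked = steps.llm_rerank(query, ranked)
    # Step 9: load full chunk text only for the candidates that survived.
    return steps.fetch_text(ranked)
```

Passing the collaborators in as callables keeps the sketch runnable without Pinecone, Supabase, or a model client, and makes the step order easy to test with stubs.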

Weighting and ranking

The current hard-coded fusion parameters are:

semantic weight: 0.75
lexical weight: 0.25
reciprocal-rank fusion constant: 60
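
To show how these three values interact, here is a minimal sketch of weighted reciprocal-rank fusion. It assumes the common formulation in which each list contributes weight / (k + rank) with 1-based ranks; the function names and the tie-breaking rule are illustrative and may differ from the actual pipeline.

```python
from typing import Dict, List

SEMANTIC_WEIGHT = 0.75
LEXICAL_WEIGHT = 0.25
RRF_K = 60  # reciprocal-rank fusion constant


def weighted_rrf(semantic_ids: List[str], lexical_ids: List[str]) -> Dict[str, float]:
    """Return a fused score per chunk id from two ranked lists (best first)."""
    scores: Dict[str, float] = {}
    for weight, ranked in ((SEMANTIC_WEIGHT, semantic_ids), (LEXICAL_WEIGHT, lexical_ids)):
        for rank, chunk_id in enumerate(ranked, start=1):
            # A chunk missing from a list simply gets no contribution from it.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (RRF_K + rank)
    return scores


def fuse(semantic_ids: List[str], lexical_ids: List[str]) -> List[str]:
    """Order chunk ids by fused score, breaking ties by id (an assumption)."""
    scores = weighted_rrf(semantic_ids, lexical_ids)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


print(fuse(["a", "b", "c"], ["c", "a"]))  # ['a', 'c', 'b']
```

In this example "c" overtakes "b" despite ranking lower semantically, because it also tops the lexical list. With k = 60, rank differences within a list move the score only slightly, so appearing in both lists matters more than a one-position change in either.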

Profile-aware reranking boosts candidates when chunk metadata matches the user's profile, using these signals:

grade
subject
major
subscription tier multiplier

Tier limits are resolved through app/services/tier_config.py.
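
A deterministic boost of this kind might look like the sketch below. The boost amounts, the Profile class, and the way the tier multiplier scales the boost are all assumptions for illustration; the real values and formula live in the pipeline and in app/services/tier_config.py.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical per-field boosts; the real values are not documented here.
FIELD_BOOSTS: Dict[str, float] = {"grade": 0.10, "subject": 0.15, "major": 0.05}


@dataclass(frozen=True)
class Profile:
    """Hypothetical view of the user profile used for reranking."""
    grade: Optional[str] = None
    subject: Optional[str] = None
    major: Optional[str] = None
    tier_multiplier: float = 1.0  # assumed to scale the total boost


def profile_boosted_score(base_score: float, metadata: Dict[str, str], profile: Profile) -> float:
    """Raise base_score for each metadata field that equals the profile's value."""
    boost = 0.0
    for field, amount in FIELD_BOOSTS.items():
        wanted = getattr(profile, field)
        # Unset profile fields never match, so they cannot boost by accident.
        if wanted is not None and metadata.get(field) == wanted:
            boost += amount
    return base_score * (1.0 + boost * profile.tier_multiplier)
```

Because the function depends only on its inputs, the same candidate and profile always yield the same score, which is what makes this stage deterministic in contrast to the GPT-mini reranker.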

Multilingual behavior

Chat language handling is split across:

app/services/language_context.py
app/services/gpt_mini.py
app/services/llm.py

Current behavior:

prompt and response language follow the user request
retrieval language is normalized to English in the chat flow
canonical chunk storage for hybrid ingestion is English-first
final answers are guarded for citation use and language consistency

Chat orchestration around retrieval

app/agents/teacher_agent.py wraps retrieval inside a LangGraph state machine:

check_wallet
retrieve
ask_clarifying
finalize

The graph asks for clarification when:

the question is vague and context is missing
retrieval returns no matches
the best retrieval score is weak
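
The three conditions can be expressed as a routing function that picks the node to run after retrieve. This is a sketch in plain Python rather than the LangGraph wiring in app/agents/teacher_agent.py; the function name, its parameters, and the 0.35 threshold are hypothetical.

```python
from typing import Sequence

MIN_TOP_SCORE = 0.35  # hypothetical cutoff for a "weak" best score


def route_after_retrieve(
    question_is_vague: bool,
    has_context: bool,
    scores: Sequence[float],
    min_top_score: float = MIN_TOP_SCORE,
) -> str:
    """Return the name of the next node: "ask_clarifying" or "finalize"."""
    if question_is_vague and not has_context:
        return "ask_clarifying"  # vague question and no context to resolve it
    if not scores:
        return "ask_clarifying"  # retrieval returned no matches
    if max(scores) < min_top_score:
        return "ask_clarifying"  # best match is too weak to answer from
    return "finalize"
```

A function with this shape could serve as the condition on the edge leaving retrieve, so that all branching between ask_clarifying and finalize stays in one place.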

Operational caveats

Retrieval depends on both Pinecone and the lexical SQL RPC being healthy.
The final answer generator enforces source-grounding heuristics, so an answer with weak citations can degrade into an uncertainty response even when retrieval succeeds.
Rerank results are cached in process memory, so the cache is not shared across processes and does not persist across deploys.
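
The last caveat follows directly from how a process-local cache works. The sketch below is an assumed design, not the actual implementation: the key scheme, the TTL, and all names are hypothetical.

```python
import hashlib
import time
from typing import Dict, List, Optional, Tuple

TTL_SECONDS = 300.0  # hypothetical expiry

# Module-level state: each worker process holds its own copy,
# and the contents are lost whenever the process restarts.
_CACHE: Dict[str, Tuple[float, List[str]]] = {}


def cache_key(query: str, candidate_ids: List[str]) -> str:
    """Key on the query plus the ordered candidate ids."""
    payload = query + "\x00" + "\x00".join(candidate_ids)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_cached(key: str, now: Optional[float] = None) -> Optional[List[str]]:
    now = time.monotonic() if now is None else now
    entry = _CACHE.get(key)
    if entry is None or now - entry[0] > TTL_SECONDS:
        _CACHE.pop(key, None)  # drop missing or expired entries
        return None
    return entry[1]


def put_cached(key: str, ranked_ids: List[str], now: Optional[float] = None) -> None:
    now = time.monotonic() if now is None else now
    _CACHE[key] = (now, list(ranked_ids))
```

With several workers behind a load balancer, the same query can miss in one process and hit in another, so rerank latency and cost vary by which worker handles the request.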