Plan: GH-0030 Profile-Aware Reranking

Context

This issue adds deterministic user-context signals to retrieval ordering before GPT reranking, using student profile fields such as grade, subject, major, and tier.

Problem

Hybrid retrieval can still rank off-profile chunks highly unless the backend injects explicit user context before the LLM reranker runs.

Current state in repo

  • app/services/retrieval_pipeline.py includes profile-aware rerank logic and metadata.
  • tests/services/test_retrieval_pipeline.py covers grade and subject boosting behavior.
  • The agent already passes grade and subject context into retrieval.

Target state

  • Matching grade, subject, and major context influences ranking deterministically.
  • Tier context applies a small multiplier to the contextual boost rather than acting as a separate match signal.
  • The rerank stage remains observable even if GPT reranking is unavailable.
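
The last target can be illustrated with a minimal sketch. It assumes the LLM reranker is passed in as a callable; `rerank_with_fallback` and the report keys are hypothetical names chosen for illustration, not identifiers from app/services/retrieval_pipeline.py.

```python
from __future__ import annotations

from typing import Callable, Sequence


def rerank_with_fallback(
    candidates: Sequence[str],
    llm_rerank: Callable[[Sequence[str]], list],
) -> tuple[list, dict]:
    """Return (ordered candidates, stage report).

    The report is always produced, so the stage stays visible in logs or
    metadata whether or not the LLM reranker succeeded.
    """
    try:
        ordered = llm_rerank(candidates)
        report = {"stage": "rerank", "llm_rerank": "applied", "error": None}
    except Exception as exc:  # deliberately broad: any LLM failure falls back
        # Keep the incoming (deterministic) order and record why.
        ordered = list(candidates)
        report = {
            "stage": "rerank",
            "llm_rerank": "skipped",
            "error": type(exc).__name__,
        }
    return ordered, report
```

In this sketch the caller always receives a report, so a dashboard or test can tell "reranked by the LLM" apart from "fell back to the deterministic order" without inspecting the results.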

Constraints

  • Backend-only scope.
  • Deterministic reranking must remain explainable and testable.
  • The plan must preserve current retrieval outputs and metadata contracts.
  • User profile context can be incomplete and must not break retrieval when missing.

Proposed approach

  1. Compute explicit context boosts for grade, subject, and major matches.
  2. Apply a tier multiplier to the contextual boost.
  3. Attach a full rerank breakdown to each candidate's metadata.
  4. Skip the stage cleanly when profile context is missing.
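
The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in app/services/retrieval_pipeline.py: the `Candidate` type, the `profile_rerank` function, the `profile_rerank` metadata key, the tier names, and every weight value are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical weights and tier names; the plan does not fix any values.
FIELD_WEIGHTS = {"grade": 0.10, "subject": 0.15, "major": 0.05}
TIER_MULTIPLIERS = {"free": 1.0, "premium": 1.2}


@dataclass
class Candidate:
    chunk_id: str
    score: float  # score produced by hybrid retrieval
    metadata: dict = field(default_factory=dict)


def profile_rerank(candidates: list, profile: Optional[dict]) -> list:
    """Reorder candidates by base score plus a tier-scaled context boost."""
    profile = profile or {}
    # Only profile fields that are present and non-empty can match.
    active = {k: profile[k] for k in FIELD_WEIGHTS if profile.get(k)}
    if not active:
        # Step 4: no usable profile context, so keep the order untouched.
        return list(candidates)

    # Step 2: unknown or missing tiers fall back to a neutral multiplier.
    multiplier = TIER_MULTIPLIERS.get(profile.get("tier"), 1.0)
    for cand in candidates:
        # Step 1: one explicit boost per field, 0.0 when there is no match.
        boosts = {
            name: weight
            if name in active and cand.metadata.get(name) == active[name]
            else 0.0
            for name, weight in FIELD_WEIGHTS.items()
        }
        final = cand.score + sum(boosts.values()) * multiplier
        # Step 3: record the full breakdown under a new, additive key.
        cand.metadata["profile_rerank"] = {
            "base_score": cand.score,
            "boosts": boosts,
            "tier_multiplier": multiplier,
            "final_score": final,
        }
    # sorted() is stable, so ties keep their incoming retrieval order.
    return sorted(
        candidates,
        key=lambda c: c.metadata["profile_rerank"]["final_score"],
        reverse=True,
    )
```

Because the breakdown stores the base score, each per-field boost, and the multiplier, a test can assert on exactly why one chunk outranked another, which keeps the stage explainable.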

Risks

  • Overweighting profile signals can hide globally relevant prerequisites.
  • Incomplete or stale profile data can reduce answer quality.
  • Boost weights can become arbitrary if they are not validated against real use cases.

Open questions

  • Should profile-aware reranking ever demote prerequisite material explicitly?
  • Should major and tier weights remain hardcoded or move into configuration later?
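
If the weights do move into configuration, one possible shape is a frozen dataclass with defaults that a config mapping can override. This sketch only illustrates the second question and does not answer it: `RerankWeights`, its field names, and the default values are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class RerankWeights:
    # Placeholder defaults, not values taken from the repository.
    grade: float = 0.10
    subject: float = 0.15
    major: float = 0.05
    tier_multiplier: float = 1.0

    @classmethod
    def from_mapping(cls, overrides: Mapping[str, object]) -> "RerankWeights":
        """Build weights from a config mapping, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: float(v) for k, v in overrides.items() if k in known})
```

Keeping the defaults in code means behavior is unchanged when no configuration is supplied, while an override such as `RerankWeights.from_mapping({"major": 0.2})` changes a single weight.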

Acceptance criteria

  • A plan doc exists for #30 under docs/plans/.
  • The doc defines deterministic pre-LLM reranking based on user profile data.
  • The doc includes fallback behavior for missing profile context.
  • The plan names the current retrieval service and tests.

Files likely to change

  • docs/plans/gh-0030-profile-aware-reranking.md
  • app/services/retrieval_pipeline.py
  • app/agents/teacher_agent.py
  • tests/services/test_retrieval_pipeline.py

Related issue

  • #30 - [Backend][Retrieval] User-profile-aware reranking (grade/subject/tier)

Status

Backfilled planning stub