Plan: GH-0031 Grounded Reasoning Synthesis
Context
This issue governs the final answer-generation stage of the backend chat pipeline, with emphasis on grounded synthesis, citation discipline, and safe multilingual output.
Problem
Even with good retrieval, answer generation can hallucinate, drop citations, or drift into the wrong language unless the backend enforces explicit guardrails.
Current state in repo
- app/services/llm.py implements grounded answer generation, citation checks, and language safety.
- tests/services/test_llm.py covers empty-context fallback, citation validation, and language-guard behavior.
- app/agents/teacher_agent.py calls into this synthesis path after retrieval.
Target state
- Final answers use only grounded context.
- Citation markers are preserved and validated.
- When context is insufficient, the backend returns explicit uncertainty instead of guessing.
- Output remains in the requested language.
Constraints
- Backend-only scope.
- Grounding guardrails must work for both streaming and non-streaming paths, as far as the current architecture allows.
- The plan must preserve citation formatting and math notation.
- The reasoning model should remain replaceable without changing the contract.
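The replaceability constraint can be illustrated with a small sketch. It assumes a hypothetical ReasoningModel protocol with a single complete method; none of these names come from app/services/llm.py, and the real contract may need more (streaming, metadata, error handling).

```python
from typing import Protocol


class ReasoningModel(Protocol):
    """Hypothetical contract: anything that maps a prompt to answer text."""

    def complete(self, prompt: str) -> str: ...


class CannedModel:
    """Test double that always returns the same reply."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def complete(self, prompt: str) -> str:
        return self.reply


class UppercaseModel:
    """Second stand-in, used only to show that the caller is model-agnostic."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def generate_answer(model: ReasoningModel, prompt: str) -> str:
    # The caller depends only on the protocol, so swapping the model
    # does not require any change here.
    return model.complete(prompt).strip()
```

Under this assumption, replacing the reasoning model means supplying another object with a compatible complete method; the guardrails around it stay untouched.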
Proposed approach
- Define a strict synthesis prompt that requires grounding and citations.
- Short-circuit to uncertainty responses when no usable context exists.
- Post-validate final answers against available citation IDs.
- Apply a language guard after synthesis to keep output consistent with the request.
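The four steps above can be sketched as one function. This is a minimal illustration under stated assumptions, not the code in app/services/llm.py: the [n] citation marker format, the fallback text, the function names, and the policy of falling back to uncertainty on any failed check are all hypothetical. The model call and the language detector are passed in as plain callables so the sketch does not depend on any particular library.

```python
import re
from typing import Callable

# Assumption: citations look like [1], [2]; the real marker format may differ.
CITATION_RE = re.compile(r"\[(\d+)\]")

# Hypothetical fallback text; a real one would be localized per language.
UNCERTAIN = "I do not have enough grounded context to answer this."


def build_prompt(question: str, context: dict[str, str], language: str) -> str:
    """Strict prompt: sources are listed by citation ID and must be cited."""
    sources = "\n".join(f"[{cid}] {text}" for cid, text in context.items())
    return (
        "Answer using ONLY the sources below. "
        "Cite every claim with its [id] marker. "
        f"Respond in language: {language}.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def invalid_citations(answer: str, context: dict[str, str]) -> set[str]:
    """Return cited IDs that do not exist in the available context."""
    return set(CITATION_RE.findall(answer)) - set(context)


def synthesize(
    question: str,
    context: dict[str, str],
    language: str,
    generate: Callable[[str], str],
    detect_language: Callable[[str], str],
) -> str:
    # Step 2: short-circuit before calling the model when nothing is usable.
    usable = {cid: text for cid, text in context.items() if text.strip()}
    if not usable:
        return UNCERTAIN

    # Step 1: strict, grounded prompt.
    answer = generate(build_prompt(question, usable, language))

    # Step 3: every answer must cite something, and only known IDs.
    if not CITATION_RE.search(answer) or invalid_citations(answer, usable):
        return UNCERTAIN

    # Step 4: language guard (assumed policy: fall back on mismatch).
    if detect_language(answer) != language:
        return UNCERTAIN

    return answer
```

Rejecting an answer outright is the strictest possible policy; a softer variant could strip unknown markers or retry once, which trades some safety for fewer rejected answers.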
Risks
- Guardrails that are too strict can reject helpful answers.
- Streaming behavior may lag behind non-streaming validation guarantees.
- Model changes can alter citation behavior or output formatting.
Open questions
- How much post-validation should happen for streaming responses?
- Should exercise-mode hinting stay in this issue or be separated into another planning track?
Acceptance criteria
- A plan doc exists for #31 under docs/plans/.
- The doc states grounding, citations, uncertainty fallback, and language safety as required outputs.
- The plan names the current synthesis service and tests.
- The scope stays backend-only.
Files likely to change
- docs/plans/gh-0031-grounded-reasoning-synthesis.md
- app/services/llm.py
- app/agents/teacher_agent.py
- tests/services/test_llm.py
- tests/agents/test_teacher_agent.py
Related issue
#31 - [Backend][LLM] Reasoning-model grounded synthesis + safe multilingual output
Status
Backfilled planning stub