Plan: GH-0031 Grounded Reasoning Synthesis

Context

This issue governs the final answer-generation stage of the backend chat pipeline, with emphasis on grounded synthesis, citation discipline, and safe multilingual output.

Problem

Even with good retrieval, answer generation can hallucinate, drop citations, or drift into the wrong language unless the backend enforces explicit guardrails.

Current state in repo

app/services/llm.py implements grounded answer generation, citation checks, and language safety.
tests/services/test_llm.py covers empty-context fallback, citation validation, and language-guard behavior.
app/agents/teacher_agent.py calls into this synthesis path after retrieval.

Target state

Final answers use only grounded context.
Citation markers are preserved and validated.
When context is insufficient, the backend returns explicit uncertainty instead of guessing.
Output remains in the requested language.

Constraints

Backend-only scope.
Grounding guardrails must apply to both streaming and non-streaming paths, to the extent the current architecture allows.
Generated answers must preserve citation formatting and math notation.
The reasoning model should remain replaceable without changing the contract.
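
To illustrate the replaceability constraint, here is a minimal sketch of a model-agnostic contract. Every name in it (ReasoningModel, SynthesisResult, EchoModel, synthesize) is hypothetical and not taken from app/services/llm.py, and the `[n]` citation marker format is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Protocol

CITATION_RE = re.compile(r"\[(\d+)\]")  # assumed marker format, e.g. "[1]"


@dataclass(frozen=True)
class SynthesisResult:
    """Hypothetical result shape; the field names are assumptions."""
    text: str
    citation_ids: tuple[str, ...]
    language: str


class ReasoningModel(Protocol):
    """Hypothetical interface: any object with complete() can be swapped in."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in model used only for illustration."""

    def complete(self, prompt: str) -> str:
        return "stub answer [1]"


def synthesize(model: ReasoningModel, prompt: str, language: str) -> SynthesisResult:
    # Callers receive SynthesisResult only, never a model-specific type,
    # so replacing the model does not change the contract.
    raw = model.complete(prompt)
    return SynthesisResult(
        text=raw,
        citation_ids=tuple(CITATION_RE.findall(raw)),
        language=language,
    )
```

Under this sketch, swapping the reasoning model means supplying a different object with a complete() method; callers such as the teacher agent would keep consuming the same result type.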

Proposed approach

Define a strict synthesis prompt that requires grounding and citations.
Short-circuit to uncertainty responses when no usable context exists.
Post-validate final answers against available citation IDs.
Apply a language guard after synthesis to keep output consistent with the request.
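
The short-circuit and post-validation steps above could be sketched roughly as follows. This is not the code in app/services/llm.py: the function names, the fallback message, and the `[n]` citation marker format are all assumptions made for illustration. The language guard is left out because it depends on a language-detection method this plan does not specify.

```python
import re
from dataclasses import dataclass
from typing import Callable

CITATION_RE = re.compile(r"\[(\d+)\]")  # assumed marker format, e.g. "[2]"
UNCERTAIN_MESSAGE = "I do not have enough grounded context to answer this."  # placeholder


@dataclass
class ValidationReport:
    cited: set[str]    # citation IDs found in the answer
    unknown: set[str]  # cited IDs that are not in the available context

    @property
    def ok(self) -> bool:
        # Valid only if at least one citation exists and none are unknown.
        return bool(self.cited) and not self.unknown


def validate_citations(answer: str, available_ids: set[str]) -> ValidationReport:
    cited = set(CITATION_RE.findall(answer))
    return ValidationReport(cited=cited, unknown=cited - available_ids)


def grounded_answer(
    question: str,
    context: dict[str, str],
    generate: Callable[[str, dict[str, str]], str],
) -> str:
    # Short-circuit: with no usable context, return uncertainty without calling the model.
    usable = {cid: text for cid, text in context.items() if text.strip()}
    if not usable:
        return UNCERTAIN_MESSAGE
    answer = generate(question, usable)
    # Post-validate: reject answers that cite nothing or cite unknown IDs.
    if not validate_citations(answer, set(usable)).ok:
        return UNCERTAIN_MESSAGE
    return answer
```

Falling back to the uncertainty message on any failed validation is the strictest option. A softer policy, such as stripping unknown markers, would reduce the risk of rejecting helpful answers, at the cost of weaker guarantees.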

Risks

Guardrails that are too strict can reject helpful answers.
Streaming behavior may lag behind non-streaming validation guarantees.
Model changes can alter citation behavior or output formatting.

Open questions

How much post-validation should happen for streaming responses?
Should exercise-mode hinting stay in this issue or be separated into another planning track?
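
As input to the first question, one possible option (not a decision) is to stream tokens unmodified and emit a single validation event once the stream ends. The sketch below assumes `[n]` citation markers and a hypothetical event shape; neither comes from the current code.

```python
import re
from typing import Iterable, Iterator

CITATION_RE = re.compile(r"\[(\d+)\]")  # assumed marker format


def stream_with_validation(
    chunks: Iterable[str], available_ids: set[str]
) -> Iterator[dict]:
    """Yield token events as chunks arrive, then one trailing validation event."""
    buffer: list[str] = []
    for chunk in chunks:
        buffer.append(chunk)
        yield {"type": "token", "text": chunk}
    # Validate the joined text so markers split across chunks are still found.
    full_text = "".join(buffer)
    unknown = sorted(set(CITATION_RE.findall(full_text)) - available_ids)
    yield {"type": "validation", "ok": not unknown, "unknown_citations": unknown}
```

The trade-off is that a client may already have displayed text that later fails validation, which is the lag between streaming and non-streaming guarantees noted under Risks.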

Acceptance criteria

A plan doc exists for #31 under docs/plans/.
The doc lists grounding, citations, uncertainty fallback, and language safety as required behaviors.
The plan names the current synthesis service and tests.
The scope stays backend-only.

Files likely to change

docs/plans/gh-0031-grounded-reasoning-synthesis.md
app/services/llm.py
app/agents/teacher_agent.py
tests/services/test_llm.py
tests/agents/test_teacher_agent.py

#31 - [Backend][LLM] Reasoning-model grounded synthesis + safe multilingual output

Status

Backfilled planning stub