Plan: GH-0028 Multilingual Chat Input and Output

Context

This issue covers the multilingual boundary of the backend chat flow: detect incoming prompt language, translate when needed for retrieval, and return answers in the requested language.

Problem

Students can ask in Arabic, French, or English, but retrieval and generation quality degrade if input language, retrieval language, and response language are not handled consistently.

Current state in repo

  • app/services/retrieval_pipeline.py detects query language and can translate queries before retrieval.
  • app/services/llm.py includes translation and output-language guard behavior.
  • app/services/language_context.py holds lightweight language metadata logic.
  • app/agents/teacher_agent.py tracks prompt, retrieval, and response language fields.
  • Tests already cover translation and language-guard behavior.

Target state

  • Chat input language is detected reliably.
  • Retrieval uses one canonical query language, translating the query only when the corpus language requires it.
  • Final answers are returned in the user's requested language while preserving citations and math formatting.

Constraints

  • Backend-only scope.
  • Retrieval and output language handling must remain explicit and auditable.
  • Translation behavior must not break citation markers or grounded answers.
  • The solution must integrate with the existing agent and service boundaries.
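
One way to keep translation from breaking citation markers or math is to mask those spans before translating and restore them afterwards. The sketch below is illustrative only: it assumes citations look like [1] and math is delimited by $...$, and the names PROTECTED, mask, unmask, and translate_preserving are hypothetical, not functions from app/services/llm.py. The translate argument stands in for whatever translation call the service already uses.

```python
import re
from typing import Callable, Dict, Tuple

# Assumed formats: inline math as $...$ and numeric citation markers as [n].
PROTECTED = re.compile(r"\$[^$]+\$|\[\d+\]")


def mask(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace protected spans with opaque placeholders and remember the originals."""
    saved: Dict[str, str] = {}

    def _swap(match: "re.Match[str]") -> str:
        key = f"__KEEP_{len(saved)}__"
        saved[key] = match.group(0)
        return key

    return PROTECTED.sub(_swap, text), saved


def unmask(text: str, saved: Dict[str, str]) -> str:
    """Put the original spans back in place of their placeholders."""
    for key, original in saved.items():
        text = text.replace(key, original)
    return text


def translate_preserving(text: str, translate: Callable[[str], str]) -> str:
    """Translate text while keeping citations and math byte-for-byte intact."""
    masked, saved = mask(text)
    translated = translate(masked)
    # Fail loudly if the translator dropped or altered a placeholder.
    missing = [key for key in saved if key not in translated]
    if missing:
        raise ValueError(f"translation lost protected spans: {missing}")
    return unmask(translated, saved)
```

Raising on a lost placeholder keeps the failure explicit and auditable; a caller could instead fall back to the untranslated answer.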

Proposed approach

  1. Resolve prompt language at the start of the request.
  2. Translate only the retrieval query when corpus language requires it.
  3. Keep response language separate from retrieval language.
  4. Apply a final language guard after grounded synthesis so output stays in the requested language.
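
The four steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the existing code path: LanguagePlan and answer_in_requested_language are hypothetical names, and detect, translate, retrieve, and synthesize are injected stand-ins for the real services. It also assumes the response language defaults to the prompt language when the user requests none.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class LanguagePlan:
    """Hypothetical container that keeps the three languages separate."""
    prompt_language: str      # detected on the incoming prompt
    retrieval_language: str   # language the retrieval query is issued in
    response_language: str    # language the final answer must be in


def answer_in_requested_language(
    prompt: str,
    *,
    detect: Callable[[str], str],
    translate: Callable[[str, str], str],
    retrieve: Callable[[str], List[str]],
    synthesize: Callable[[str, List[str], str], str],
    corpus_language: str,
    requested_language: Optional[str] = None,
) -> Tuple[str, LanguagePlan]:
    # Step 1: resolve the prompt language once, at the start of the request.
    prompt_language = detect(prompt)
    plan = LanguagePlan(
        prompt_language=prompt_language,
        retrieval_language=corpus_language,
        # Assumption: fall back to the prompt language if none was requested.
        response_language=requested_language or prompt_language,
    )

    # Step 2: translate only the retrieval query, and only when needed.
    query = prompt
    if plan.prompt_language != plan.retrieval_language:
        query = translate(prompt, plan.retrieval_language)
    passages = retrieve(query)

    # Step 3: synthesis targets the response language, not the retrieval one.
    draft = synthesize(prompt, passages, plan.response_language)

    # Step 4: final guard, applied after grounded synthesis.
    if detect(draft) != plan.response_language:
        draft = translate(draft, plan.response_language)
    return draft, plan
```

Returning the plan alongside the answer keeps every language decision visible to the caller, which fits the constraint that language handling stay explicit and auditable.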

Risks

  • Extra translation steps can add latency and cost.
  • Language detection errors can degrade retrieval relevance.
  • Translation can distort student intent or citation formatting if guardrails are weak.
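
The detection-error risk could be reduced with a confidence fallback. The sketch below assumes the detector returns a (language, confidence) pair and that a session-level language may be available. The function name, the 0.8 threshold, and the English default are illustrative assumptions, not existing behavior in app/services/language_context.py.

```python
from typing import Optional, Tuple

# Languages named in the Problem section: Arabic, French, English.
SUPPORTED = ("ar", "fr", "en")


def resolve_prompt_language(
    detected: Tuple[str, float],
    session_language: Optional[str] = None,
    *,
    min_confidence: float = 0.8,  # assumed threshold, to be tuned
    default: str = "en",          # assumed last-resort language
) -> str:
    """Trust the detector only when it is confident about a supported language."""
    language, confidence = detected
    if language in SUPPORTED and confidence >= min_confidence:
        return language
    # Low confidence or unsupported language: prefer what the session already used.
    if session_language in SUPPORTED:
        return session_language
    return default
```

For example, resolve_prompt_language(("fr", 0.4), session_language="ar") returns "ar", because the detection falls below the threshold and the session language is supported.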

Open questions

  • Should prompt language detection happen in the router, in the agent, or only in the retrieval layer?
  • Which language transitions should be logged for debugging and analytics?
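
To make the logging question concrete, one possible shape for a transition record is sketched below. The logger name, the LanguageTransition fields, and the stage labels are all hypothetical; which transitions are actually worth logging remains open.

```python
import json
import logging
from dataclasses import asdict, dataclass

logger = logging.getLogger("chat.language")  # assumed logger name


@dataclass(frozen=True)
class LanguageTransition:
    request_id: str
    stage: str            # e.g. "retrieval_query" or "final_guard" (assumed labels)
    source_language: str
    target_language: str
    translated: bool      # True when the stage changed the language


def record_transition(
    request_id: str, stage: str, source_language: str, target_language: str
) -> LanguageTransition:
    """Build one structured record per language decision and log it as JSON."""
    event = LanguageTransition(
        request_id=request_id,
        stage=stage,
        source_language=source_language,
        target_language=target_language,
        translated=source_language != target_language,
    )
    logger.info(json.dumps(asdict(event), sort_keys=True))
    return event
```

Logging no-op transitions as well (translated is False) would let analytics separate requests that needed no translation from requests where a step was skipped by mistake.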

Acceptance criteria

  • A plan doc exists for #28 under docs/plans/.
  • The doc distinguishes prompt language, retrieval language, and response language.
  • The plan stays backend-only and grounded in current code paths.
  • The likely service and test files are identified.

Files likely to change

  • docs/plans/gh-0028-multilingual-chat-io.md
  • app/agents/teacher_agent.py
  • app/services/retrieval_pipeline.py
  • app/services/llm.py
  • app/services/language_context.py
  • tests/services/test_retrieval_pipeline.py
  • tests/services/test_llm.py

Related issue

  • #28 - [Backend][Chat] Language detect/translate in + translate out

Status

Backfilled planning stub