Plan: GH-0032 Release Verification and Observability
Context
This issue is the backend release gate for proving the scrape -> ingest -> retrieve -> answer pipeline is observable and stable enough to ship.
Problem
A backend release is risky without explicit verification gates, request tracing, metrics, and regression checks that cover the end-to-end path rather than only isolated units.
Current state in repo
- app/core/metrics.py, app/core/logging.py, and app/core/middleware.py provide metrics, structured logs, and request IDs.
- app/api/routers/metrics.py is registered in the API router and has tests in tests/routers/test_metrics.py.
- Promptfoo assets and CI workflow exist under promptfoo/ and .github/workflows/promptfoo-rag-eval.yml.
- Request correlation and hybrid retrieval logging already exist in runtime code.
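For readers unfamiliar with request correlation, the idea can be illustrated with a minimal standard-library sketch. It is not the contents of app/core/middleware.py or app/core/logging.py. The names `request_id_var`, `RequestIdFilter`, `JsonFormatter`, `build_logger`, and `handle_request` are hypothetical and exist only for this example.

```python
import contextvars
import json
import logging
import uuid

# Hypothetical context variable holding the ID of the request being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "request_id": getattr(record, "request_id", "-"),
        })


def build_logger(stream):
    """Create a logger that writes structured lines to the given stream."""
    logger = logging.getLogger("release_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(RequestIdFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, incoming_id=None):
    """Bind a request ID for one request, log with it, then reset it."""
    request_id = incoming_id or uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        logger.info("retrieval started")
        logger.info("retrieval finished")
    finally:
        request_id_var.reset(token)
    return request_id
```

Every line logged while a request is being handled carries the same `request_id`, so a production failure can be traced by filtering logs on that one value.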
Target state
- The backend has a documented release verification path with observable checkpoints.
- Request IDs, metrics, and retrieval logs are sufficient to trace production failures.
- Regression tooling covers both service-level and end-to-end backend behavior.
Constraints
- Backend-only scope.
- Verification should rely on code and CI artifacts in this repo, not a separate admin interface.
- Release evidence must be usable by both humans and AI agents.
- Observability additions must not materially degrade runtime performance.
Proposed approach
- Treat request ID propagation, metrics endpoints, and structured logs as required release signals.
- Use Promptfoo and targeted tests to cover retrieval and multilingual regressions.
- Define an end-to-end smoke path that touches scraping, ingestion, and chat where feasible.
- Record release-readiness evidence in docs and CI outputs rather than only in ad hoc notes.
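One possible shape for the end-to-end smoke path is sketched below, assuming each pipeline stage can be wrapped as a plain callable. The `fake_*` stages are stand-ins rather than the repo's real scrape, ingest, retrieve, or chat code. `Checkpoint`, `run_smoke_path`, `SMOKE_STAGES`, and `evidence_json` are hypothetical names introduced for illustration.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass
class Checkpoint:
    stage: str
    ok: bool
    duration_ms: float
    detail: str


def run_smoke_path(stages, payload):
    """Run stages in order, stop at the first failure, return checkpoints."""
    checkpoints = []
    for name, stage in stages:
        started = time.perf_counter()
        try:
            payload = stage(payload)
            ok, detail = True, ""
        except Exception as exc:  # record the failure as evidence
            ok, detail = False, f"{type(exc).__name__}: {exc}"
        duration_ms = round((time.perf_counter() - started) * 1000, 3)
        checkpoints.append(Checkpoint(name, ok, duration_ms, detail))
        if not ok:
            break
    return checkpoints


# Stand-in stages (assumptions): each takes the previous stage's output.
def fake_scrape(url):
    return {"url": url, "text": "Opening hours are 9 to 17."}


def fake_ingest(doc):
    return [doc["text"]]


def fake_retrieve(chunks):
    hits = [chunk for chunk in chunks if "hours" in chunk]
    if not hits:
        raise LookupError("no chunks retrieved")
    return hits


def fake_answer(hits):
    return f"Answer based on {len(hits)} chunk(s)."


SMOKE_STAGES = [
    ("scrape", fake_scrape),
    ("ingest", fake_ingest),
    ("retrieve", fake_retrieve),
    ("answer", fake_answer),
]


def evidence_json(checkpoints):
    """Serialize checkpoints into a machine-readable evidence record."""
    return json.dumps(
        {
            "passed": all(c.ok for c in checkpoints),
            "checkpoints": [asdict(c) for c in checkpoints],
        },
        indent=2,
    )
```

Calling `evidence_json(run_smoke_path(SMOKE_STAGES, "https://example.invalid/page"))` yields a JSON record that could be uploaded as a CI artifact. Keeping the record as plain JSON is one way to make the same evidence readable by both humans and AI agents.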
Risks
- Verification can become expensive or flaky if it depends too heavily on live external services.
- Metrics and logs can exist without answering the most important operational questions.
- End-to-end checks may lag behind real production data quality.
Open questions
- Which checks should block merge versus remain informational?
- Should release evidence live only in CI artifacts, or also be summarized in docs for each release cycle?
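Whichever way the first question is answered, the blocking versus informational split could be encoded explicitly so that CI applies it consistently. The sketch below is an assumption about how that might look, not a decision. `CheckResult`, `summarize_gate`, and the check names in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    blocking: bool  # True: a failure blocks merge; False: informational only


def summarize_gate(results):
    """Split failures into blocking failures and warnings."""
    blocking_failures = [r.name for r in results if r.blocking and not r.passed]
    warnings = [r.name for r in results if not r.blocking and not r.passed]
    return {
        "release_ok": not blocking_failures,
        "blocking_failures": blocking_failures,
        "warnings": warnings,
    }
```

For example, `summarize_gate([CheckResult("metrics_endpoint", True, True), CheckResult("multilingual_eval", False, False)])` reports `release_ok` as true with one warning, because the only failing check is informational.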
Acceptance criteria
- A plan doc exists for #32 under docs/plans/.
- The doc defines observability and end-to-end verification as the release gate.
- The doc names current metrics, logging, request ID, and Promptfoo touchpoints.
- The plan remains backend-only.
Files likely to change
- docs/plans/gh-0032-release-verification-observability.md
- app/core/metrics.py
- app/core/logging.py
- app/core/middleware.py
- app/api/routers/metrics.py
- tests/routers/test_metrics.py
- promptfoo/promptfooconfig.yaml
- .github/workflows/promptfoo-rag-eval.yml
Related issue
#32 - [Backend][Verify] Observability + end-to-end release verification
Status
Backfilled planning stub