Ingestion Handoff Orchestration (Issue #26)
What changed
When POST /scraping/{source}/sync persists references, the backend now immediately runs a reference_id handoff orchestration step:
- Load references touched by the current scrape_run_id.
- Create/update an ingestion_handoffs record per (reference_id, payload_hash).
- Queue ingestion via ingestion_jobs using reference_id.
- Track handoff lifecycle + reason codes for observability.
Lifecycle
ingestion_handoffs.status uses:
- queued
- running
- completed
- failed
reason_code captures why the transition happened (examples: queued_from_scrape, ingestion_job_queued, ingestion_job_already_active, retry_exhausted).
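
The lifecycle can be pictured as a small state machine. The sketch below is illustrative only: the four status values come from this document, but the allowed edges between them, the `HandoffStatus` enum, and the `transition` helper are assumptions, not the service's actual code.

```python
from enum import Enum


class HandoffStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition graph: the document lists the statuses, not the edges.
ALLOWED_TRANSITIONS = {
    HandoffStatus.QUEUED: {HandoffStatus.RUNNING, HandoffStatus.FAILED},
    HandoffStatus.RUNNING: {HandoffStatus.COMPLETED, HandoffStatus.FAILED},
    HandoffStatus.COMPLETED: set(),
    HandoffStatus.FAILED: set(),
}


def transition(current, target, reason_code):
    """Return the (status, reason_code) pair to persist, or raise on an illegal edge."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target, reason_code
```

Storing the reason_code together with each status change is what makes a transition explainable after the fact, for example `transition(HandoffStatus.QUEUED, HandoffStatus.RUNNING, "ingestion_job_queued")`.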
Retry strategy
- Max attempts: 3 (configurable at service init).
- Retry only on transient errors (timeouts, connection/rate-limit/503 patterns).
- Backoff is exponential (base * 2^(attempt-1)); default base is 0 in the current implementation to keep sync calls fast.
- On max-attempt exhaustion, the handoff is marked failed with reason_code=retry_exhausted.
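
The retry rules above can be sketched roughly as follows. This is a minimal illustration, not the service's implementation: `TransientError`, `backoff_delay`, and `run_with_retry` are hypothetical names, and the real code presumably classifies transient errors by inspecting timeouts, connection failures, rate limits, and 503 responses.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits, 503s)."""


def backoff_delay(base, attempt):
    """Delay before the next try: base * 2^(attempt-1), with attempt starting at 1."""
    return base * 2 ** (attempt - 1)


def run_with_retry(operation, max_attempts=3, base=0.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts transient failures occur.

    Returns (status, reason_code, attempts). Non-transient exceptions propagate
    unchanged, because only transient errors are retried.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            operation()
            return "completed", None, attempt
        except TransientError:
            if attempt == max_attempts:
                # Attempts are used up: mark the handoff as failed.
                return "failed", "retry_exhausted", attempt
            sleep(backoff_delay(base, attempt))
```

With the default base of 0 every delay is zero, so retries run back to back and the sync call is not slowed down. A non-zero base of 1 second would wait 1 s and then 2 s before the second and third attempts.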
Idempotency / dedupe
- DB-level dedupe: UNIQUE(reference_id, payload_hash) on ingestion_handoffs.
- Runtime dedupe: if the latest ingestion job for a reference is already active/ready, no duplicate job payload is inserted.
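
To illustrate the DB-level dedupe, here is a small self-contained sketch. It uses an in-memory SQLite database only so that it runs anywhere; the production database, the column types, the `payload_hash` hashing scheme (SHA-256 over canonical JSON), and the `record_handoff` helper are all assumptions made for the example.

```python
import hashlib
import json
import sqlite3


def payload_hash(payload):
    """Stable SHA-256 over canonical JSON (assumed hashing scheme)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ingestion_handoffs (
        id INTEGER PRIMARY KEY,
        reference_id INTEGER NOT NULL,
        payload_hash TEXT NOT NULL,
        status TEXT NOT NULL,
        UNIQUE (reference_id, payload_hash)
    )
    """
)


def record_handoff(reference_id, payload):
    """Insert a queued handoff; return False when the same pair already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO ingestion_handoffs "
        "(reference_id, payload_hash, status) VALUES (?, ?, 'queued')",
        (reference_id, payload_hash(payload)),
    )
    return cur.rowcount == 1
```

Because the hash is computed over key-sorted JSON, re-syncing the same reference with an identical payload hits the unique constraint and is ignored, while a changed payload produces a new hash and therefore a new handoff row.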
Queryability
- GET /scraping/{source}/handoffs lists handoff records.
- Supports filters: status, scrape_run_id, pagination (limit, offset).
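
As a usage sketch, a client could build the query URL like this. The `handoffs_url` helper, the base URL, and the source name are hypothetical; only the path and the filter names (status, scrape_run_id, limit, offset) are taken from this document.

```python
from urllib.parse import quote, urlencode


def handoffs_url(base_url, source, status=None, scrape_run_id=None,
                 limit=None, offset=None):
    """Build the GET /scraping/{source}/handoffs URL, omitting unset filters."""
    params = {
        "status": status,
        "scrape_run_id": scrape_run_id,
        "limit": limit,
        "offset": offset,
    }
    # Drop filters that were not supplied; offset=0 is kept because it is not None.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    path = f"{base_url.rstrip('/')}/scraping/{quote(source, safe='')}/handoffs"
    return f"{path}?{query}" if query else path
```

For example, `handoffs_url("http://localhost:8000", "example_source", status="failed", limit=20, offset=0)` asks for the first page of failed handoffs for that source.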
Sync response additions
POST /scraping/{source}/sync now includes:
- handoff_queued_count
- handoff_completed_count
- handoff_failed_count
- handoff_skipped_count
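
One way to picture how these counters relate to individual handoff outcomes is the tally below. It is a sketch under assumptions: the `summarize_handoffs` helper and the idea that each handoff yields exactly one of the outcome labels queued, completed, failed, or skipped are illustrative, not confirmed behavior of the endpoint.

```python
from collections import Counter

# Outcome labels assumed to map one-to-one onto the response fields.
OUTCOMES = ("queued", "completed", "failed", "skipped")


def summarize_handoffs(outcomes):
    """Tally per-handoff outcomes into the four handoff_*_count fields."""
    counts = Counter(outcomes)
    return {f"handoff_{name}_count": counts.get(name, 0) for name in OUTCOMES}
```

A run with two queued handoffs and one skipped would then report `handoff_queued_count` as 2, `handoff_skipped_count` as 1, and the other two fields as 0.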
Rollback plan (migration 029)
If rollback is required, treat this migration as data-destructive for handoff history only (does not delete references or ingestion_jobs rows).
DROP TRIGGER IF EXISTS trg_update_ingestion_handoff_timestamp ON ingestion_handoffs;
DROP TABLE IF EXISTS ingestion_handoffs;
DROP FUNCTION IF EXISTS update_ingestion_handoff_timestamp();
Post-rollback expectation:
- /scraping/{source}/sync still persists references, but handoff orchestration and the query endpoint (/handoffs) are unavailable until migration 029 is re-applied.