Repository Structure

/
├── .env.example                # Template for environment variables
├── .gitignore                  # Git ignore patterns
├── .pre-commit-config.yaml     # Pre-commit hooks configuration
├── .dockerignore               # Docker ignore patterns
├── requirements.txt            # Python dependencies
├── requirements-dev.txt        # Development dependencies
├── pytest.ini                  # Pytest configuration
├── ruff.toml                   # Ruff linter/formatter configuration
├── Dockerfile                  # Container build instructions
├── docker-compose.yml          # Local development stack
├── render.yaml                 # Render.com deployment configuration
├── mkdocs.yml                  # Documentation site configuration
├── openapi.json                # OpenAPI/Swagger specification
├── Readme.md                   # Project README
├── GEMINI.md                   # Gemini-specific documentation
├── app/                        # Main application code (56 Python files)
│   ├── main.py                 # Application entry point
│   ├── agents/                 # LangGraph state machines (2 files)
│   │   ├── __init__.py
│   │   └── teacher_agent.py    # Main conversational agent
│   ├── api/                    # FastAPI routing layer
│   │   ├── __init__.py
│   │   ├── router.py           # Main router with registered endpoints
│   │   └── routers/            # Individual route modules (9 files)
│   │       ├── __init__.py
│   │       ├── admin.py        # REGISTERED - Admin endpoints
│   │       ├── auth.py         # REGISTERED - Authentication
│   │       ├── chat.py         # REGISTERED - Chat/conversation
│   │       ├── curriculum.py   # REGISTERED - Curriculum browsing
│   │       ├── me.py           # REGISTERED - User profile
│   │       ├── quiz.py         # NOT REGISTERED - Stub implementation
│   │       ├── scraping.py     # REGISTERED - Web scraping
│   │       └── wallet.py       # REGISTERED - Billing/wallet
│   ├── core/                   # Core infrastructure (8 files)
│   │   ├── __init__.py
│   │   ├── auth.py             # JWT verification, admin auth
│   │   ├── config.py           # Settings via pydantic-settings
│   │   ├── dependencies.py     # Dependency injection singletons
│   │   ├── logger.py           # Logging configuration
│   │   ├── logging.py          # Additional logging utilities
│   │   ├── metrics.py          # Metrics collection
│   │   └── middleware.py       # Request/response middleware
│   ├── models/                 # Database models (3 files)
│   │   ├── __init__.py
│   │   ├── billing.py          # Wallet, reservations, transactions
│   │   └── ingestion.py        # Ingestion jobs, chunks
│   ├── schemas/                # Request/response schemas (8 files)
│   │   ├── __init__.py
│   │   ├── admin.py            # Admin request/response models
│   │   ├── auth.py             # Auth schemas
│   │   ├── chat.py             # Chat message schemas
│   │   ├── curriculum.py       # Curriculum schemas
│   │   ├── ingestion.py        # Ingestion request/response
│   │   ├── scraping.py         # Scraping schemas
│   │   └── user.py             # User profile schemas
│   └── services/               # Business logic (22 files)
│       ├── __init__.py
│       ├── cache.py            # Caching layer
│       ├── chunking.py         # Text chunking service
│       ├── circuit_breaker.py  # Circuit breaker pattern
│       ├── deduplication.py    # Chunk deduplication
│       ├── embedding_service.py # OpenAI embeddings + Pinecone upsert
│       ├── gpt_mini.py         # GPT-4o-mini service
│       ├── ingestion.py        # Document ingestion pipeline
│       ├── llm.py              # GPT-4o answer generation
│       ├── pdf_processor.py    # PDF processing
│       ├── pinecone_adapter.py # Pinecone client wrapper
│       ├── quality_checker.py  # Content quality validation
│       ├── quiz_generator.py   # Quiz generation logic
│       ├── retrieval_pipeline.py # RAG retrieval pipeline
│       ├── scraper_service.py  # Web scraping orchestration
│       ├── text_normalizer.py  # Text normalization
│       ├── tier_config.py      # Tier-based configuration
│       ├── upload.py           # File upload handling
│       ├── wallet_reservation.py # Atomic billing operations
│       └── scrapers/           # Source-specific scrapers (3 files)
│           ├── __init__.py
│           ├── base.py         # Base scraper class
│           └── koutoubi.py     # Koutoubi.ma scraper
├── tests/                      # Test suite (24 Python files)
│   ├── __init__.py
│   ├── conftest.py             # Shared fixtures and mocks
│   ├── test_health.py          # Health endpoint test
│   ├── agents/                 # Agent tests (2 files)
│   │   ├── __init__.py
│   │   └── test_teacher_agent.py
│   ├── routers/                # Router integration tests (10 files)
│   │   ├── __init__.py
│   │   ├── test_admin_documents.py
│   │   ├── test_admin_ingestion.py
│   │   ├── test_admin_users.py
│   │   ├── test_auth.py
│   │   ├── test_chat.py
│   │   ├── test_curriculum.py
│   │   ├── test_me.py
│   │   ├── test_scraping.py
│   │   └── test_wallet.py
│   └── services/               # Service unit tests (9 files)
│       ├── __init__.py
│       ├── test_cache.py
│       ├── test_deduplication.py
│       ├── test_llm.py
│       ├── test_pdf_processor.py
│       ├── test_quality_checker.py
│       ├── test_text_normalizer.py
│       ├── test_tier_config.py
│       └── test_wallet_reservation.py
├── db/                         # Database schemas and migrations
│   ├── bootstrap.sql           # Complete schema for new installations
│   └── migrations/             # Versioned SQL migrations (22 files)
│       ├── 20260216000001_curriculum_document_tracking.sql
│       ├── 20260216000002_usage_logs.sql
│       ├── 20260216000003_wallet_system.sql
│       ├── 20260216000004_create_scrape_runs.sql
│       ├── 20260216000005_create_references.sql
│       ├── 20260216000006_rls_phase_1.sql
│       ├── 20260216000007_profiles_trigger.sql
│       ├── 20260216000008_pedagogical_tables.sql
│       ├── 20260216000009_rls_phase_2_public_tables.sql
│       ├── 20260216000010_secure_functions.sql
│       ├── 20260216000011_indexes.sql
│       ├── 20260217000012_ingestion_jobs.sql
│       ├── 20260217000013_chunks_enhanced.sql
│       ├── 20260217000014_reservations.sql
│       ├── 20260217000015_embedding_refs.sql
│       ├── 20260217000016_rls_new_tables.sql
│       ├── 20260217000017_references_enhancements.sql
│       ├── 20260217000018_jwt_custom_claims_hook.sql
│       ├── 20260217000019_update_rls_for_jwt_claims.sql
│       ├── 20260217000020_transactions.sql
│       ├── 20260218000021_documents_enrichment.sql
│       └── 20260219000022_fix_rls_service_role.sql
├── docs/                       # MkDocs documentation site
│   ├── index.md
│   ├── 00_overview/
│   ├── 10_current_state/
│   ├── 20_runbooks/
│   ├── 30_design/
│   ├── 90_ops/
│   ├── 99_archive/
│   ├── Artifacts/
│   ├── plans/
│   └── Postman/
├── postman/                    # API testing collection
│   ├── collection.json
│   ├── environment_local.json
│   ├── environment_staging.json
│   └── environment_production.json
└── assets/                     # Static assets
    ├── cover.png
    └── logo.png

Key Files

| File | Purpose |
| --- | --- |
| app/main.py | FastAPI application entry point, CORS, middleware setup |
| app/core/config.py | Settings loaded from .env via pydantic-settings |
| app/core/auth.py | JWT verification, admin authorization, service client |
| app/core/dependencies.py | Singleton service instances for dependency injection |
| app/core/middleware.py | Request ID, logging, metrics middleware |
| app/agents/teacher_agent.py | LangGraph workflow: check_wallet → retrieve/clarify → finalize |
| app/services/llm.py | GPT-4o answer generation with exercise detection |
| app/services/wallet_reservation.py | Atomic reserve/finalize billing operations + top-up |
| app/services/retrieval_pipeline.py | Full RAG pipeline: embed → Pinecone search → rerank → fetch chunks |
| app/services/embedding_service.py | OpenAI embeddings with Pinecone upsert and reference tracking |
| app/services/pinecone_adapter.py | Pinecone client wrapper (lightweight metadata only) |
| app/services/scraper_service.py | Web scraping orchestration |
| tests/conftest.py | Library-level mocks for Supabase/OpenAI/Pinecone |
| db/bootstrap.sql | Complete database schema for fresh installations |
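
The reserve/finalize pattern listed for app/services/wallet_reservation.py can be illustrated with a small in-memory sketch. This is not the project's implementation: the class, method names, and integer units below are all hypothetical. The real service presumably does this atomically in the database, whereas this sketch only uses a thread lock to show the idea of holding funds up front and charging the actual cost later.

```python
import threading
import uuid


class InsufficientFunds(Exception):
    """Raised when the available balance cannot cover a new reservation."""


class InMemoryWallet:
    """Hypothetical stand-in for a wallet record; not the project's real model."""

    def __init__(self, balance: int) -> None:
        self.balance = balance              # amount in the smallest unit (assumed)
        self._held: dict[str, int] = {}     # reservation id -> held amount
        self._lock = threading.Lock()

    @property
    def available(self) -> int:
        # Funds that are neither spent nor held by an open reservation.
        return self.balance - sum(self._held.values())

    def reserve(self, amount: int) -> str:
        """Hold `amount` before doing billable work; returns a reservation id."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        with self._lock:
            if amount > self.available:
                raise InsufficientFunds(f"need {amount}, have {self.available}")
            reservation_id = uuid.uuid4().hex
            self._held[reservation_id] = amount
            return reservation_id

    def finalize(self, reservation_id: str, actual_cost: int) -> int:
        """Charge the actual cost (capped at the hold) and release the rest."""
        if actual_cost < 0:
            raise ValueError("actual_cost must not be negative")
        with self._lock:
            held = self._held.pop(reservation_id)  # KeyError if unknown or reused
            charged = min(actual_cost, held)
            self.balance -= charged
            return charged

    def top_up(self, amount: int) -> None:
        """Add funds to the wallet."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        with self._lock:
            self.balance += amount
```

Capping the charge at the held amount is a design choice of this sketch; whether the real service caps, rejects, or bills an overage is not stated in this page.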

Registered Routers (in app/api/router.py)

The following routers are actively registered and available:

| Router | Prefix | Domain | Status |
| --- | --- | --- | --- |
| auth.py | /auth | Authentication (signup, signin, logout) | ✅ ACTIVE |
| me.py | /me | User profile (GET/PATCH) | ✅ ACTIVE |
| chat.py | / | Chat endpoint with streaming | ✅ ACTIVE |
| curriculum.py | /curriculum | Subjects, textbooks, textbook detail | ✅ ACTIVE |
| admin.py | /admin | User management, ingestion, references, stats | ✅ ACTIVE |
| wallet.py | /wallet | Balance, reservations, top-up, transactions | ✅ ACTIVE |
| scraping.py | /scraping | Scraper sync, runs, references | ✅ ACTIVE |

Unregistered Routers

| Router | Reason | Implementation Status |
| --- | --- | --- |
| quiz.py | Not included in app/api/router.py | Stub only - returns mock data |

The quiz router exists with proper schemas and endpoint structure, but it:

  • Contains TODO comments for full implementation
  • Returns hardcoded stub responses
  • Is not wired into the main API router
  • Is missing integration with the actual quiz generation logic
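
A registered/unregistered split like this can drift as modules are added. The sketch below is one way to detect it, assuming that app/api/router.py imports each router module it registers. That is an assumption about the file, not something this page confirms. It uses only the standard library `ast` module and treats "imported" as a proxy for "registered". The function names and the example source string are hypothetical.

```python
import ast


def imported_names(source: str) -> set[str]:
    """Collect the last path component of every imported module or name."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            if node.module:  # None for "from . import x"
                names.add(node.module.rsplit(".", 1)[-1])
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.Import):
            names.update(alias.name.rsplit(".", 1)[-1] for alias in node.names)
    return names


def unreferenced_routers(router_source: str, module_names: list[str]) -> list[str]:
    """Return router modules that the given source never imports."""
    referenced = imported_names(router_source)
    return sorted(name for name in module_names if name not in referenced)


# Hypothetical router.py content; the real file may be organised differently.
example_source = """
from fastapi import APIRouter
from app.api.routers import auth, chat
from app.api.routers.me import router as me_router
"""

print(unreferenced_routers(example_source, ["auth", "chat", "me", "quiz"]))
# ['quiz']
```

The source is only parsed, never executed, so the check does not need FastAPI or the application's dependencies installed.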

Directory Summary

| Directory | File Count | Purpose |
| --- | --- | --- |
| app/ | 56 | Main application code |
| app/agents/ | 2 | LangGraph conversational agents |
| app/api/routers/ | 9 | FastAPI route handlers |
| app/core/ | 8 | Core infrastructure (auth, config, DI) |
| app/models/ | 3 | Database entity models |
| app/schemas/ | 8 | Pydantic request/response schemas |
| app/services/ | 22 | Business logic and integrations |
| app/services/scrapers/ | 3 | Source-specific web scrapers |
| tests/ | 24 | Test suite |
| db/migrations/ | 22 | Database migration scripts |
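
The counts above appear to be recursive: the 22 under app/services/ matches the tree only if the 3 files in scrapers/ are included. A minimal sketch for re-deriving the Python file counts is shown here. The function names are hypothetical, and it counts `*.py` files only, so it does not apply to the SQL files in db/migrations/.

```python
from pathlib import Path


def count_python_files(directory: Path) -> int:
    """Count .py files under `directory`, including all subdirectories."""
    return sum(1 for path in directory.rglob("*.py") if path.is_file())


def summarize(root: Path, relative_dirs: list[str]) -> dict[str, int]:
    """Map each directory (relative to `root`) to its recursive .py file count."""
    return {rel: count_python_files(root / rel) for rel in relative_dirs}
```

Calling `summarize(Path("."), ["app", "app/services", "tests"])` from the repository root would return a dictionary that can be compared against the table.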

Partially Implemented Features

Quiz Generation

  • Router: app/api/routers/quiz.py exists but is not registered
  • Service: app/services/quiz_generator.py exists and contains quiz generation logic
  • Status: Endpoint returns stub data with TODO comments
  • Missing: Router registration, dependency injection wiring

Ingestion Pipeline

  • Status: Core services implemented (chunking, deduplication, embedding)
  • Components: PDF processing, quality checking, text normalization
  • Integration: Admin endpoints for triggering ingestion exist
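
To make the chunking and deduplication steps concrete, here is a minimal sketch of one common approach: fixed-size character windows with overlap, followed by hash-based removal of near-identical chunks. It is not taken from app/services/chunking.py or app/services/deduplication.py. The function names, default sizes, and normalization rule are all assumptions chosen for illustration.

```python
import hashlib


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]


def dedupe_chunks(chunks: list[str]) -> list[str]:
    """Keep the first occurrence of each chunk, ignoring case and whitespace runs."""
    seen: set[str] = set()
    unique: list[str] = []
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique
```

A production pipeline would more likely split on sentence or token boundaries. Character windows keep the sketch short.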

Web Scraping

  • Status: Base scraper framework and Koutoubi scraper implemented
  • Router: Registered and active (/scraping)
  • Components: Scraper service orchestration, run tracking

Notes

  • No scripts/ directory exists in the repository
  • All 22 database migrations are present and sequentially numbered
  • Test coverage spans routers, services, and agents
  • Documentation uses MkDocs with organized sections
  • Postman collection includes local, staging, and production environments
  • Docker support via Dockerfile and docker-compose.yml
  • Deployment configured for Render.com via render.yaml
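
The claim that the migrations are sequentially numbered can be checked mechanically. The sketch below reads each filename as an 8-digit date, a 6-digit sequence number, an underscore, and a lowercase description. That layout is inferred from the filenames in the tree above, not from a documented convention. The function name is hypothetical.

```python
import re

# Assumed layout: YYYYMMDD + 6-digit sequence + "_" + description + ".sql"
MIGRATION_RE = re.compile(r"^(\d{8})(\d{6})_[a-z0-9_]+\.sql$")


def missing_sequence_numbers(filenames: list[str]) -> list[int]:
    """Return the sequence numbers absent from 1..N, where N = len(filenames)."""
    sequences = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match is None:
            raise ValueError(f"unexpected migration filename: {name}")
        sequences.append(int(match.group(2)))
    expected = set(range(1, len(sequences) + 1))
    return sorted(expected - set(sequences))
```

An empty result means the files cover 1 through N with no gaps. A duplicated sequence number also shows up as a gap, because it leaves some other number in the range uncovered.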