DocuGardener Docs

Architecture

DocuGardener is a two-plane system: a Python analysis engine and a Next.js control plane, backed by PostgreSQL and Redis.

Service Map

| Service   | Tech                  | Port | Role                                                     |
|-----------|-----------------------|------|----------------------------------------------------------|
| web       | Next.js 14 App Router | 3003 | Dashboard, auth (NextAuth), settings, billing, docs site |
| api       | Python FastAPI        | 8000 | Webhook handler, analysis API, health checks             |
| worker    | Python RQ             | -    | Async job processor for PR analysis and fix-PR creation  |
| scheduler | APScheduler           | -    | Periodic jobs: stale sweeper (60s), nightly rollup       |
| postgres  | PostgreSQL 15         | 5433 | Primary database for both planes (shared schema)         |
| redis     | Redis 7               | 6379 | RQ job queue and caching                                 |
| weaviate  | Weaviate              | 8080 | Vector DB for document embeddings (optional)             |

Analysis Pipeline

When a pull request is opened or updated in a monitored repository:

  1. Webhook received: GitHub sends a pull_request event to POST /webhooks/github. FastAPI validates the HMAC-SHA256 signature and returns 200 immediately.
  2. Job enqueued: An analyze_pr job is pushed to the RQ default queue. Quota is checked before enqueue; if the quota is exceeded, a QUOTA_EXCEEDED job is created and processing stops.
  3. Worker picks up job: process_pull_request() in src/pipeline/handler.py runs. It clones the PR branch (shallow fallback on network error), parses changed files, embeds documents into Weaviate, and calls the LLM for drift analysis.
  4. LLM analysis: src/agents/verifier.py constructs the prompt from the code diff plus semantic search results and calls the configured LLM provider (Gemini, OpenAI, Anthropic, or Ollama). The response is parsed into a structured DriftAnalysis object with per-file scores.
  5. Results stored: The Job record is updated in PostgreSQL with status=COMPLETED, drift score, reasons, and suggested fixes. The result JSON field stores the full analysis payload.
  6. GitHub check run posted: src/pipeline/reporter.py posts a check run on the PR with the drift score, per-file breakdown, and suggested fixes. This step always runs in a finally block, so the check run resolves even if analysis fails.
  7. AI Author Mode (optional): If enabled and drift is above the threshold, a fix-PR job is enqueued to the high priority queue.
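
Two of the steps above lend themselves to a short sketch: the signature check in step 1 and the finally-block reporting in step 6. The code below is a minimal illustration using only the standard library, not the project's actual handler. The function names verify_github_signature and analyze_and_report are hypothetical, as is the shape of the outcome dict. The only external fact assumed is that GitHub sends the signature in the form "sha256=<hex digest>".

```python
import hashlib
import hmac
from typing import Callable, Optional


def verify_github_signature(secret: bytes, body: bytes, header: Optional[str]) -> bool:
    """Return True if header equals 'sha256=' + HMAC-SHA256(secret, body)."""
    if not header or not header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so the match position is not leaked.
    return hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8"))


def analyze_and_report(analyze: Callable[[], object], report: Callable[[dict], None]) -> dict:
    """Run analyze() and always call report(), mirroring the finally block in step 6.

    The outcome dict layout is an assumption made for this sketch.
    """
    outcome = {"status": "FAILED", "result": None}
    try:
        outcome = {"status": "COMPLETED", "result": analyze()}
    finally:
        # Runs on success and on failure; an exception still propagates afterwards.
        report(outcome)
    return outcome
```

Starting from a FAILED outcome and overwriting it only on success means the reporter never sees a half-built result if analysis raises partway through.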

Database Schema (key models)

| Model      | Purpose                                                                                                                        |
|------------|--------------------------------------------------------------------------------------------------------------------------------|
| Tenant     | An organisation account. Holds plan, Stripe IDs, workflowConfig (feature grants, quota ceiling), llmConfig (BYOK keys).        |
| User       | A member of a tenant. Has role (OWNER/ADMIN/MEMBER/AUDITOR/BILLING_ADMIN).                                                     |
| Repository | A GitHub repository registered for monitoring. enabled flag controls whether events are processed.                             |
| Job        | One PR analysis run. status: PENDING → PROCESSING → COMPLETED / FAILED / QUOTA_EXCEEDED. result JSON holds full analysis payload. |
| AuditLog   | Tamper-evident audit chain. SHA-256 hash chains each entry to the previous one.                                                |

Full schema: web/prisma/schema.prisma
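
The hash chaining described for AuditLog can be illustrated with a small in-memory sketch. This is not the schema's actual field layout: the keys payload, prev_hash, and hash, the all-zero genesis value, and the canonical JSON encoding are all assumptions made for the example. The point it shows is why the chain is tamper-evident: changing any earlier entry invalidates every hash after it.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed predecessor value for the very first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a new entry that commits to the hash of the current last entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry fails."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Sorting keys before hashing keeps the digest stable regardless of the order in which payload fields were inserted.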

Multi-Tenancy

Each authenticated GitHub user is provisioned into exactly one Tenant. All database reads and writes in the Next.js API routes are scoped to the authenticated tenant via the NextAuth session. The FastAPI backend identifies the tenant from the GitHub App installation ID on the webhook event.

In self-hosted single-tenant mode, SINGLE_TENANT_ID pins all backend writes to one tenant, bypassing multi-tenant lookups.
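
The backend's tenant resolution described above might look roughly like the following. This is a sketch under assumptions: resolve_tenant_id is a hypothetical name, and the installations mapping stands in for whatever database lookup the backend really performs. Only SINGLE_TENANT_ID and the installation-ID lookup come from the description.

```python
import os
from typing import Mapping, Optional


def resolve_tenant_id(
    installation_id: int,
    installations: Mapping[int, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the tenant for a webhook event, or None if the installation is unknown.

    installations is a stand-in for a database lookup keyed by the
    GitHub App installation ID (an assumption for this sketch).
    """
    env = os.environ if env is None else env
    pinned = env.get("SINGLE_TENANT_ID")
    if pinned:
        # Single-tenant mode: skip the multi-tenant lookup entirely.
        return pinned
    return installations.get(installation_id)
```

Passing env explicitly keeps the function easy to exercise without mutating the process environment.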

LLM Routing

The LLM provider is configurable per-tenant via tenant.llmConfig. The src/agents/llm.py factory resolves the provider at job runtime:

  • Hosted — uses the bundled Gemini key from BUNDLED_GEMINI_KEY
  • BYOK Cloud — uses tenant-supplied key for Gemini, OpenAI, Anthropic, or Azure OpenAI
  • BYOK Local — routes to Ollama at the tenant-configured base URL
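
As a rough illustration of the three routing modes listed above, the sketch below resolves a provider from a tenant's llmConfig. The real structure of llmConfig is not documented here, so the keys mode, provider, api_key, and base_url, the mode names, and the ResolvedLLM type are all hypothetical. Only BUNDLED_GEMINI_KEY and the provider list come from the text.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Provider identifiers are assumed spellings for this sketch.
CLOUD_PROVIDERS = {"gemini", "openai", "anthropic", "azure_openai"}


@dataclass(frozen=True)
class ResolvedLLM:
    provider: str
    api_key: Optional[str] = None
    base_url: Optional[str] = None


def resolve_llm(
    llm_config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> ResolvedLLM:
    """Pick provider and credentials for one job from the tenant's config."""
    env = os.environ if env is None else env
    mode = llm_config.get("mode", "hosted")
    if mode == "hosted":
        # Hosted tenants share the bundled Gemini key.
        return ResolvedLLM("gemini", api_key=env["BUNDLED_GEMINI_KEY"])
    if mode == "byok_cloud":
        provider = llm_config["provider"]
        if provider not in CLOUD_PROVIDERS:
            raise ValueError(f"unsupported cloud provider: {provider!r}")
        return ResolvedLLM(provider, api_key=llm_config["api_key"])
    if mode == "byok_local":
        # Local mode needs no key, only the tenant's Ollama endpoint.
        return ResolvedLLM("ollama", base_url=llm_config["base_url"])
    raise ValueError(f"unknown LLM mode: {mode!r}")
```

Resolving at job runtime rather than at enqueue time means a tenant who rotates a key or switches provider is picked up by the next job without requeueing.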

All LLM calls go through _llm_call_with_retry(), which applies exponential backoff (max 3 attempts) on transient HTTP errors (429, 502, 503, 504, 529). Per-tenant token-bucket rate limiting (60 req/min) applies on top.
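
The retry and rate-limiting behaviour can be sketched as follows. This is not the body of _llm_call_with_retry(): HTTPStatusError, call_with_retry, and TokenBucket are hypothetical names, and the 1-second base delay and small jitter are assumed values. The attempt limit, status codes, and 60 req/min rate follow the description above.

```python
import random
import time

RETRYABLE_STATUS = {429, 502, 503, 504, 529}


class HTTPStatusError(Exception):
    """Stand-in for a provider client's HTTP error (hypothetical)."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call call(), retrying retryable statuses with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except HTTPStatusError as exc:
            if exc.status not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            # Delay doubles each attempt (1s, 2s, ...) plus a little jitter.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))


class TokenBucket:
    """Per-tenant limiter: holds up to rate_per_min tokens, refilled continuously."""

    def __init__(self, rate_per_min=60, clock=time.monotonic):
        self.capacity = float(rate_per_min)
        self.refill_per_sec = rate_per_min / 60.0
        self.clock = clock
        self.tokens = self.capacity
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a multi-tenant setup one bucket would be kept per tenant, for example in a dict keyed by tenant ID, and try_acquire() checked before each call_with_retry() invocation.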