Files
community-rule/docs/guides/ops-runbook.md
T
2026-05-24 16:00:11 -06:00

11 KiB

Steady-state operator runbook

Day-to-day deploy, rollback, and recovery for CommunityRule on MEDLab Cloudron. Assumes staging or production is already installed and smoke-tested.

First-time install, apex cutover, and legacy decommission live in ops-backend-deploy.md. Use this doc once an environment is already running.

1. Quick reference

Item Value
Cloudron dashboard https://my.medlab.host
Cloudron CLI login cloudron login my.medlab.host
Staging app staging.communityrule.info
Production app communityrule.info (after apex cutover)
Container image git.medlab.host/communityrule/community-rule:<tag>
Health check GET /api/health200 {"ok":true,"database":"connected"}
Manifest version CloudronManifest.json version field (must increase for each release)
Current manifest 0.1.8 at time of writing — always read the file before deploying

Replace <app> below with the Cloudron location (staging.communityrule.info or communityrule.info).

2. Prerequisites (one-time per operator)

  1. Cloudron access — admin login on my.medlab.host and a CLI API token (Profile → API Tokens on the dashboard). Save the token in 1Password.
  2. Cloudron CLI — logged in:
    cloudron login my.medlab.host
    
  3. Docker + buildx — Docker Desktop or equivalent with docker buildx.
  4. Gitea registry auth — personal access token on git.medlab.host with read:package + write:package; then:
    docker login git.medlab.host
    
  5. Repo checkout — clone CommunityRule/community-rule and work from a clean commit that matches the release you intend to ship.

Images are linux/amd64 only (Cloudron host is x86_64). On Apple Silicon, the release script still builds amd64 via buildx; a bare docker pull without --platform linux/amd64 failing on arm64 is expected.

3. Deploy a new version

Typical release flow: bump manifest → build/push image → cloudron update → smoke.

3.1 Build and push

  1. Check out the commit to release (main or a release branch).

  2. Bump CloudronManifest.json version (e.g. 0.1.80.1.9). Cloudron requires the manifest version to increase for cloudron update --image to be accepted.

  3. From the repo root, build and push (tag should match the manifest version for sanity):

    TAG=0.1.9 ./scripts/docker-release.sh
    # equivalent:
    TAG=0.1.9 npm run docker:release
    

    Omit TAG= to push git rev-parse --short HEAD instead — only do that for ad-hoc staging experiments; production releases should use semver tags aligned with the manifest.

  4. Verify anonymous pull (simulates Cloudron):

    docker logout git.medlab.host
    docker pull --platform linux/amd64 \
      git.medlab.host/communityrule/community-rule:0.1.9
    
  5. Commit the manifest bump in git alongside the code that shipped in this build.

Registry details and one-time Gitea setup: ops-backend-deploy.md §9.

3.2 Update Cloudron

cloudron update --app staging.communityrule.info \
  --image git.medlab.host/communityrule/community-rule:0.1.9

Use communityrule.info for production. Cloudron pulls the image (no registry credentials on the host), restarts the container, and runs scripts/start.sh, which:

  1. chowns /app/data (localstorage mount),
  2. runs prisma migrate deploy,
  3. execs the Next.js standalone server.

Watch the app Logs tab in the Cloudron dashboard for a clean migration and Listening on port 3000.

3.3 Migrations

Normal case: migrations apply automatically on container start — no separate step.

Manual re-run (only if debugging a failed deploy or verifying before traffic):

cloudron exec --app staging.communityrule.info -- npm run db:deploy

(npm run db:deployprisma migrate deploy.)

Policy: never run prisma migrate reset against staging or production. Never edit migration files already applied to a shared database. Fix schema drift by adding a new migration locally (prisma migrate dev) and deploying a new image. See backend-roadmap.md §8.

3.4 Seed data (not every deploy)

Template + facet seed (MethodFacet rows for create-flow “Recommended” tags) is not applied at boot. Run once per environment after first install, or when recommendations return all-zero scores:

cloudron exec --app staging.communityrule.info -- \
  node prisma/seed.bundle.cjs

Re-running is safe (idempotent upserts). JSON lives at /app/seed-data/ in the image — not under /app/data (Cloudron localstorage overwrites that mount).

3.5 Smoke after deploy

Automated (from your laptop, repo root):

./scripts/staging-smoke.sh staging.communityrule.info
# production:
./scripts/staging-smoke.sh communityrule.info

# optional — exercises magic-link request (check inbox manually):
EMAIL=you@example.com ./scripts/staging-smoke.sh staging.communityrule.info

Manual (still required for full acceptance):

  • Click a magic link → signed in → GET /api/auth/session returns a user.
  • Publish a rule end-to-end → public detail page loads.
  • Optional: Save & Exit draft sync; upload with UPLOAD_ROOT set.

Full checklist and failure table: ops-backend-deploy.md §10.

4. Roll back (code-only)

To revert application code without touching the database:

cloudron update --app staging.communityrule.info \
  --image git.medlab.host/communityrule/community-rule:<previous-tag>

Pick a tag you know was healthy (previous manifest version or git tag recorded at last good deploy).

Database implications:

  • Rolling back the image does not undo migrations already applied.
  • If the bad release added a migration, rolling back to an older image may leave the DB schema ahead of what that code expects — usually safe if the migration was additive (new nullable columns, new tables).
  • If the bad release broke because of a destructive or incompatible migration, do not reset production. Restore from a Cloudron backup (§5) or fix forward with a corrective migration.

Never prisma migrate reset on staging or production.

5. Restore drill (quarterly)

Verify Cloudron backups are restorable without touching the live app.

Cadence: at least once per quarter, or after any backup-policy change.

Steps:

  1. In the Cloudron dashboard, pick a recent automatic backup of <app> (Backups tab).
  2. Restore to a scratch location — e.g. restore-drill-YYYYMMDD.communityrule.info — not over the live app.
  3. After restore completes, confirm the container starts and migrations are current:
    curl -sS "https://restore-drill-YYYYMMDD.communityrule.info/api/health"
    
    Expect 200 with "database":"connected".
  4. Optional: cloudron exec --app restore-drill-YYYYMMDD.communityrule.info -- npm run db:deploy if logs show pending migrations on an older snapshot.
  5. Run ./scripts/staging-smoke.sh restore-drill-YYYYMMDD.communityrule.info.
  6. Uninstall the scratch app when done.

Record the drill date and outcome in your ops notes. Cloudron retains automatic backups per platform defaults; confirm retention in the dashboard.

6. Single-instance limitations

The current Cloudron deploy runs one container per environment. Do not scale to multiple app instances without addressing these per-process limits:

6.1 In-memory rate limiter

lib/server/rateLimit.ts stores windows in process memory. Limits apply per container, not globally across instances.

Route / action Key Min interval
Magic-link request per email 60 s
Magic-link request per IP 20 s
Email change request per email / IP / user 60 s
Organizer inquiry per email / IP 60 s / 20 s
Publish with stakeholder invites per IP 60 s
Stakeholder add / resend per IP / invite 60 s
File upload per user 5 s

Before horizontal scale-out, replace with a shared store (e.g. Redis) or edge rate limits. See backend-roadmap.md §5.

6.2 Web vitals storage

Production defaults to external mode: vitals are structured log lines, not written to Postgres or local files. Setting WEB_VITALS_STORAGE=local uses a per-process file store under .next/web-vitals — suitable for dev/admin only, not multi-instance. See backend-roadmap.md §7.

7. Environment variables (steady-state)

Cloudron auto-injects addon vars (CLOUDRON_POSTGRESQL_URL, CLOUDRON_MAIL_SMTP_*). Operators set these manually once per app; they persist across image updates unless changed:

Variable Purpose
SESSION_SECRET Session cookie signing (≥ 16 chars). Rotating logs everyone out.
SMTP_FROM Visible From on sign-in emails (e.g. Community Rule <hello@communityrule.info>).
NEXT_PUBLIC_ENABLE_BACKEND_SYNC true in staging/production — Postgres draft persistence.
UPLOAD_ROOT /app/data/uploads on Cloudron — required for file uploads.

Full detail: ops-backend-deploy.md §3.

8. Troubleshooting

Symptom Likely cause Action
Image pull error on update Private repo, wrong tag, or amd64 manifest missing Confirm repo is public; verify pull with --platform linux/amd64 (§3.1)
Health 503 / database: disconnected Postgres addon or CLOUDRON_POSTGRESQL_URL missing Cloudron app → Environment
Container crash on start Migration failure App logs around prisma migrate deploy; fix forward with new migration
Magic link not sent Mail addon or SMTP_FROM Cloudron mail logs; CLOUDRON_MAIL_SMTP_* vars
Upload server_misconfigured UPLOAD_ROOT unset cloudron env set --app <app> UPLOAD_ROOT=/app/data/uploads
No “Recommended” on method cards Seed not run §3.4 — node prisma/seed.bundle.cjs
Rate limit too aggressive after deploy Expected per §6.1 Single instance only; limits reset on container restart

App logs: Cloudron dashboard → Logs tab, or cloudron logs --app <app> -f.