Backend deploy — admin handoff + cutover plan

This doc captures everything needed to deploy the new CommunityRule (Next.js + Postgres) onto MEDLab's Cloudron and replace the legacy LAMP-packaged service at communityrule.info. Cloudron admin access has been granted, and CR-96 (Cloudron-native env vars) and CR-97 (container registry + first image push) are done; the remaining gate is CR-98 (staging install + smoke — §10).

For a plain-language summary to hand to MEDLab's Cloudron admin, see ../relaunch-brief.md. This doc is the technical version.

1. Context

  • This app fully replaces the existing communityrule.info service — both the marketing site and the backend API.
  • The existing service is a single Cloudron LAMP app (lamp.cloudronapp.php74@5.1.2, installed at the communityrule.info apex, 512 MiB) that hosts three things stuffed into one container under /app/data/public/:
    1. The static marketing site (HTML / CSS / images).
    2. The Express/MySQL backend at CommunityRule/CommunityRuleBackend, kept alive by a 30-min lsof-based run.sh watchdog on port 3000. MySQL is the LAMP package's bundled MySQL, persisted inside /app/data (not a Cloudron addon).
    3. A Flask chatbot at CommunityRule/CommunityRuleChatBot on port 5000, also watchdog-supervised; currently crash-looping with ModuleNotFoundError: No module named 'flask' and last touched in May 2024. Not migrated. Dies with the LAMP container at decommission.
  • The new app is a properly packaged Cloudron app (Docker image + CloudronManifest.json, postgresql + sendmail + localstorage addons). Cloudron's container supervisor replaces the watchdog.
  • Greenfield Postgres. No data migration from the LAMP container's internal MySQL. Old auth (4-digit OTP in email_otp) is replaced by hashed magic-link tokens. Old API and rules / version_history tables do not map to anything in the new app.

2. Access — granted

Cloudron admin login on my.medlab.host has been granted. (Note: this is the Cloudron dashboard, not cloud.medlab.host, which is MEDLab's Nextcloud file portal.) From the dashboard the deployer can self-serve:

  • Cloudron admin login (full admin on the MEDLab instance).
  • DNS for communityrule.info — domain is managed inside Cloudron, so new subdomains and TLS certs are one-click.
  • App log access — Cloudron web log viewer.
  • Read of legacy app config — visible in admin UI.
  • cloudron CLI token — generate at Profile → API Tokens before first install. Save in 1Password.

3. Environment variables

Cloudron auto-injects (provisioned by addons declared in CloudronManifest.json)

Cloudron addons are not "enabled" platform-wide; they are requested per-app in the manifest and provisioned at install time.

  • CLOUDRON_POSTGRESQL_URL — from the postgresql addon. The app reads this name directly (Prisma + lib/server/env.ts).
  • CLOUDRON_MAIL_SMTP_SERVER / _PORT / _USERNAME / _PASSWORD — from the sendmail addon. The platform Mail server is configured for communityrule.info with Amazon SES relay + "allow custom from address" on, so SMTP_FROM of our choice will deliver. The app assembles a Nodemailer transport URL from these four vars in lib/server/env.ts.

I set manually via cloudron env set --app <id/location>

  • SESSION_SECRET — long random (openssl rand -hex 32). Required, ≥ 16 chars. Rotating it logs everyone out.
  • SMTP_FROM — visible "From:" address on sign-in emails. Cloudron does not inject this. Use hello@communityrule.info (continuity with the legacy service; SES relay accepts it).
  • NEXT_PUBLIC_ENABLE_BACKEND_SYNC=true — turns on Postgres draft persistence for signed-in users. Required in production.
  • UPLOAD_ROOT — absolute path to a writable directory on the Cloudron localstorage mount for POST /api/uploads (community photo + custom-method attachments). Use /app/data/uploads on Cloudron (start.sh chowns /app/data for the node user). When unset, upload routes return server_misconfigured. See CONTRIBUTING.md API table.
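
The constraints in this section can be checked mechanically before the first smoke test. The following is a minimal sketch, not part of the repo: the function names `missing_vars` and `check_manual_env` are hypothetical, and it assumes it runs in a POSIX shell where the variables are visible (for example a shell inside the app container).

```shell
#!/bin/sh
# Illustrative pre-flight helpers (hypothetical names, not shipped in the repo).

# Print the name of every listed variable that is unset or empty.
missing_vars() (
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
)

# Exit 0 only when the manually set variables meet the rules in this section.
check_manual_env() (
  status=0
  if [ "${#SESSION_SECRET}" -lt 16 ]; then
    echo "SESSION_SECRET must be at least 16 characters" >&2
    status=1
  fi
  if [ -z "${SMTP_FROM:-}" ]; then
    echo "SMTP_FROM is not set" >&2
    status=1
  fi
  if [ "${NEXT_PUBLIC_ENABLE_BACKEND_SYNC:-}" != "true" ]; then
    echo "NEXT_PUBLIC_ENABLE_BACKEND_SYNC must be true in production" >&2
    status=1
  fi
  case "${UPLOAD_ROOT:-}" in
    /*) ;;  # absolute path, as required
    *)
      echo "UPLOAD_ROOT must be an absolute path" >&2
      status=1
      ;;
  esac
  exit "$status"
)
```

For the addon-injected variables, `missing_vars CLOUDRON_POSTGRESQL_URL CLOUDRON_MAIL_SMTP_SERVER CLOUDRON_MAIL_SMTP_PORT CLOUDRON_MAIL_SMTP_USERNAME CLOUDRON_MAIL_SMTP_PASSWORD` should print nothing on a correctly provisioned install.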

4. Platform settings

  • Container httpPort: 3000 (matches Dockerfile ENV PORT=3000).
  • Health-check path: /api/health (app/api/health/route.ts returns 200 {"ok":true,"database":"connected"} when healthy, 503 otherwise).
  • Memory limit: 768 MiB in CloudronManifest.json (memoryLimit: 805306368). The legacy LAMP app ran at 512 MiB; raise further only if Next.js standalone OOMs under load.
  • Backups: Cloudron's automatic backups are already on for the host (legacy app shows weekly snapshots ~451 MB each). Same default applies to new apps.
  • TLS / DNS / SPF / DKIM: handled by Cloudron for any subdomain of communityrule.info.
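
The manifest expresses memoryLimit in bytes, so the 768 MiB figure above is 768 × 1024 × 1024. A small sketch of the conversion, useful if the limit is ever raised (the helper name `mib_to_bytes` is hypothetical):

```shell
#!/bin/sh
# Convert a size in MiB to the byte value used for memoryLimit.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 768   # prints 805306368 (the current limit)
mib_to_bytes 512   # prints 536870912 (what the legacy LAMP app ran at)
```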

5. Cutover plan (side-by-side, never in-place)

The legacy app is at the apex communityrule.info and is still serving real traffic. Best practice is side-by-side cutover — new app gets validated at a fresh subdomain before any swap touches the apex.

Phases

  1. Staging install — from a checkout whose CloudronManifest.json version matches the pushed image tag, run:
    cloudron install --location staging.communityrule.info \
      --image git.medlab.host/communityrule/community-rule:<tag>
    
    Set manual env vars from §3. prisma migrate deploy runs automatically in scripts/start.sh on container start. Smoke per CR-98 (§10).
  2. Soft launch / acceptance — share the staging URL with a small group, exercise sign-in + publish + draft sync end-to-end. Hold here until confident.
  3. Apex cutover at a scheduled low-traffic window — this is the only step with brief downtime (~5–15 min). Sequence:
    1. Take one final manual backup of the legacy LAMP app (Cloudron Backups tab → Backup now).
    2. cloudron uninstall the legacy app at communityrule.info.
    3. cloudron configure --location communityrule.info to move the validated staging install to the apex (or cloudron install fresh at apex if cleaner).
    4. Re-run prisma migrate deploy, re-set production env vars if not preserved by the move, smoke again.
  4. Decommission — see CR-101. Hold the final LAMP backup ≥ 90 days for safety.

Why not in-place?

Uninstalling the legacy app and installing the new one at apex without a staging step means the live site is down for the entire duration of the first install — and the first install is exactly when all the env-var / addon / port surprises happen. Side-by-side keeps those surprises out of view.

6. Decisions — status

Product decisions (closed):

  1. Final URL — communityrule.info apex. New app fully replaces the legacy site, including the marketing surface. Brief cutover downtime (~5–15 min) is accepted.
  2. Legacy rules data — not migrated. No data moves into the new app's Postgres. A pre-cutover read-only export of the rules + version_history MySQL tables is under consideration; approach depends on the actual row count, which we'll pull as part of the CR-99 pre-cutover backup. Tracked in CR-102.

Infra decision (closed):

  3. Container registry — Gitea Container Registry on git.medlab.host. Same host as Cloudron (193.46.198.90). The CommunityRule/community-rule repo must be public so the container package inherits public visibility (Gitea does not expose per-package visibility toggles — visibility follows the owning repo). Public pull sidesteps the same-host docker-login "socket hangup" bug, so Cloudron pulls without credentials. Push auth from operator laptops uses a Gitea personal access token (read:package + write:package). Canonical image ref: git.medlab.host/communityrule/community-rule:<tag>. Images are built linux/amd64 only (Cloudron host is x86_64). Operator build/push workflow lives in §9. First verified image: …:0.1.0 (digest sha256:e652f9f4bfa4154412cc9d8b63d55c94a128e8935579d101b5ab8977e2080e52). Tracked in CR-97 (Done). Fallback if same-host pull ever breaks: install the Cloudron Container Registry app and re-tag against its hostname; no other changes required.

7. Old vs new deltas

So nothing surprises anyone at cutover:

  • Legacy is a LAMP package with bundled MySQL inside the container. New app uses the Cloudron postgresql + sendmail + localstorage addons — entirely different storage, no shared state.
  • Legacy stuffs three apps (marketing + Node backend + Python chatbot) into one container with a run.sh watchdog. New app is one Next.js process, supervised by Cloudron natively.
  • Old auth = plaintext 4-digit OTP. New auth = hashed magic link in email. If users report "I'm not getting a code," remind them to look for a link instead.
  • Old code hardcoded from: 'hello@communityrule.info' in controllers/emailController.js because Cloudron does not inject a MAIL_FROM. New app reads SMTP_FROM — see §3.
  • Old API surface (/api/send_otp, /api/publish_rule, etc.) and schema (rules + version_history tables, soft-delete via deleted column) do not overlap with the new app. No data migration.
  • The Flask chatbot at CommunityRule/CommunityRuleChatBot is currently crash-looping inside the LAMP container and is not being migrated — confirmed with admin. It dies when the LAMP container is uninstalled in CR-101.

8. Follow-up tickets

All filed in Linear, titled [Backend] …, assigned to me, in the Community-rule team, Backlog state.

  1. CR-96 [Backend] Cloudron-native env vars (Done — app reads CLOUDRON_POSTGRESQL_URL and CLOUDRON_MAIL_SMTP_* only).
  2. CR-97 [Backend] Container image registry: choose, build, push (Done). Registry decided (§6.3); packaging + build/push workflow shipped (§9). First image pushed and verified via anonymous docker pull (§9).
  3. CR-98 [Backend] Cloudron staging install + smoke at staging.communityrule.info. Next — checklist in §10. Requires Cloudron CLI token (§2) only; CR-96 and CR-97 are done.
  4. CR-99 [Backend] Cloudron production install + apex cutover. Side-by-side cutover at scheduled low-traffic window per §5. Blocked by CR-98 green + CR-102 resolved.
  5. CR-100 [Backend] Steady-state operator runbook. Blocked by CR-98 (write what we actually did).
  6. CR-101 [Backend] Decommission legacy CommunityRule LAMP app. Uninstall the entire LAMP slot (marketing + Express backend + chatbot in one go); preserve final backup ≥ 90 days. Blocked by CR-99 + sign-off window. Priority: Low.
  7. CR-102 [Backend] Decide fate of legacy rules table (read-only export?). Count rows + decide whether to publish a static archive before CR-99 uninstalls the legacy MySQL. Priority: Low.

9. Build and push image workflow

The repo is packaged as a Cloudron app via CloudronManifest.json, Dockerfile, scripts/start.sh, and scripts/docker-release.sh. The manifest declares httpPort 3000, healthCheckPath /api/health, memoryLimit 768 MiB, minBoxVersion 9.0.0, and the postgresql + sendmail + localstorage addons. The Dockerfile reuses the base image's node user (uid 1000), installs gosu for the privilege drop, and symlinks .next/cache → /tmp/next-cache so Next.js ISR works on Cloudron's read-only rootfs. start.sh runs as root to chown /app/data (localstorage mount), then drops to node:node, applies prisma migrate deploy, and execs the Next.js standalone server.

One-time setup (per operator)

  1. Generate a Gitea PAT. In Gitea web UI: avatar → Settings → Applications → Manage Access Tokens → Generate New Token. Check read:package and write:package. Save in 1Password.
  2. docker login git.medlab.host with your Gitea username and the PAT as password. Expect Login Succeeded.
  3. Confirm you have package-write rights on the CommunityRule org (you do if you can push commits to the repo).

Per-release workflow

  1. Bump the manifest version. Edit CloudronManifest.json:

    • increment version (e.g. 0.1.0 → 0.1.1) — Cloudron requires it to increase for cloudron update --image to be accepted.
  2. Run the release script from the repo root:

    ./scripts/docker-release.sh
    # or, equivalently:
    npm run docker:release
    

    Override the tag with TAG=0.1.1 ./scripts/docker-release.sh for semver releases. The script prints the exact cloudron install / cloudron update --image … commands to run next.

  3. First push only: confirm the CommunityRule/community-rule repo is Public (Settings → General). Gitea inherits container-package visibility from the repo — there is no per-package visibility toggle. Org-owner rights are not required if you have repo-admin rights on this repo.

  4. Verify the pull works without credentials (simulates Cloudron's anonymous pull):

    docker logout git.medlab.host
    # Image is linux/amd64 only. On Apple Silicon, add --platform:
    docker pull --platform linux/amd64 git.medlab.host/communityrule/community-rule:<tag>
    

    A bare docker pull on arm64 Macs fails with "no matching manifest for linux/arm64" — that is expected and does not indicate an auth problem. Cloudron (x86_64) pulls the amd64 manifest without --platform.

  5. Commit the manifest change alongside any code changes that shipped in this build, so the manifest and image stay in lockstep.
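
The lockstep rule in the workflow above (manifest version equals image tag, and each release increases the version) can be verified before pushing. This is a hedged sketch rather than what scripts/docker-release.sh actually does: the function names are hypothetical, the manifest is assumed to keep "version" on a single line, and `sort -V` assumes GNU coreutils or a compatible sort.

```shell
#!/bin/sh
# Read the "version" field from a Cloudron manifest without requiring jq.
# Assumption: the field appears on one line as "version": "x.y.z".
manifest_version() {
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Succeed only when the manifest version ($1 = manifest path) equals the tag ($2).
check_lockstep() {
  [ "$(manifest_version "$1")" = "$2" ]
}

# Succeed only when $2 is strictly newer than $1 in version order.
version_increased() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$2" ]
}
```

A possible use before a release: `check_lockstep CloudronManifest.json "$TAG" && version_increased 0.1.0 "$TAG"`, where the first argument to `version_increased` is the previously deployed version.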

Install / update on Cloudron

From the repo dir on the operator's machine, with cloudron CLI logged in to my.medlab.host:

# First install (staging):
cloudron install --location staging.communityrule.info \
  --image git.medlab.host/communityrule/community-rule:<tag>

# Subsequent updates:
cloudron update --app staging.communityrule.info \
  --image git.medlab.host/communityrule/community-rule:<tag>

Pass the registry image with --image; it is not a field in CloudronManifest.json.

CI — deferred (stretch goal)

CR-97 acceptance lists a stretch goal of building and pushing on merge to main via Gitea Actions. Deferred: no hosted runners are available today, and the manual workflow above is acceptable for v1 staging and production. Revisit when runners return or when release cadence justifies the runner cost.

10. Staging install + smoke (CR-98)

Goal: Install the pushed image at staging.communityrule.info, configure production env vars, and verify the vertical slice before apex cutover (CR-99).

Prerequisites (all satisfied unless noted):

  • CR-96 — app reads CLOUDRON_POSTGRESQL_URL and CLOUDRON_MAIL_SMTP_* only (lib/server/env.ts, prisma/schema.prisma). No DATABASE_URL / SMTP_URL shim.
  • CR-97 — image pushed to git.medlab.host/communityrule/community-rule:0.1.0 (or current tag in manifest); repo is public; anonymous amd64 pull verified (§9).
  • Cloudron CLI token — generate at Profile → API Tokens on my.medlab.host; save in 1Password (§2).
  • Cloudron admin login on my.medlab.host (§2).
  • DNS: communityrule.info is managed in Cloudron; the staging subdomain will be provisioned at install time.

Install steps:

  1. Check out a commit whose CloudronManifest.json version and memoryLimit match the image you intend to run (currently 0.1.1 → git.medlab.host/communityrule/community-rule:0.1.1).
  2. Log in to Cloudron CLI:
    cloudron login my.medlab.host
    
  3. Install or update from the repo root (manifest is read for addons; image comes from --image):
    cloudron update --app staging.communityrule.info \
      --image git.medlab.host/communityrule/community-rule:0.1.1
    
    (Use cloudron install --location … --image … only if staging is not already installed.) Cloudron provisions postgresql, sendmail, and localstorage addons from the manifest, pulls the image (no registry credentials needed), and starts the container. scripts/start.sh chowns /app/data, runs prisma migrate deploy, then execs the Next.js server.
  4. Set manual env vars (Cloudron does not inject these):
    cloudron env set --app staging.communityrule.info \
      SESSION_SECRET="$(openssl rand -hex 32)" \
      SMTP_FROM="Community Rule <hello@communityrule.info>" \
      NEXT_PUBLIC_ENABLE_BACKEND_SYNC=true \
      UPLOAD_ROOT=/app/data/uploads
    
    Rotating SESSION_SECRET logs everyone out. SMTP_FROM must be an address SES accepts on communityrule.info (platform mail addon is SES-relayed with custom-from allowed — §3).
  5. Confirm the app is running in the Cloudron dashboard (Logs tab). Look for a clean prisma migrate deploy and Next.js listening on port 3000.
  6. Seed facet data (one-time per environment) — templates + MethodFacet rows for create-flow "Recommended" tags are not applied at boot. After first install (or when recommendations return all-zero scores), run:
    cloudron exec --app staging.communityrule.info -- \
      node prisma/seed.bundle.cjs
    
    JSON lives at /app/seed-data/ (SEED_DATA_DIR); do not use /app/data (Cloudron localstorage overwrites it). Re-running the seed after a deploy is safe (idempotent upserts / per-section swaps).

Smoke checklist (acceptance):

Automated curl checks: ./scripts/staging-smoke.sh staging.communityrule.info (optional EMAIL=you@example.com to exercise magic-link request). Manual UI steps below are still required.

  • Health: curl -sS https://staging.communityrule.info/api/health returns 200 with {"ok":true,"database":"connected"}.
  • Magic link: request sign-in from the UI → email arrives at a real inbox → click link → land signed in → GET /api/auth/session returns a user. Confirm the link host matches staging.communityrule.info (reverse proxy / Host alignment).
  • Publish: complete create flow → publish a rule → public rule detail loads.
  • Draft sync (optional): signed-in Save & Exit persists to Postgres; resume works after re-login.
  • Upload (optional): with UPLOAD_ROOT set, attach a community photo in create flow and confirm it renders after publish.
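
The health item in the checklist above reduces to two conditions: HTTP 200 and a body reporting ok:true with database:connected. The sketch below shows one way to evaluate that. It is not the contents of scripts/staging-smoke.sh; the function names are hypothetical, and the body match is a plain substring check rather than real JSON parsing.

```shell
#!/bin/sh
# Decide whether a health response counts as healthy.
# $1 = HTTP status code, $2 = response body.
health_is_ok() {
  [ "$1" = "200" ] || return 1
  # Strip whitespace so formatting differences in the JSON do not matter.
  _compact=$(printf '%s' "$2" | tr -d '[:space:]')
  case "$_compact" in
    *'"ok":true'*) ;;
    *) return 1 ;;
  esac
  case "$_compact" in
    *'"database":"connected"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch /api/health from a host ($1) and evaluate it.
# Needs curl and network access; a failed request is treated as unhealthy.
check_health() {
  _body_file=$(mktemp)
  _code=$(curl -sS -o "$_body_file" -w '%{http_code}' "https://$1/api/health") || _code=000
  _body=$(cat "$_body_file")
  rm -f "$_body_file"
  health_is_ok "$_code" "$_body"
}
```

Running `check_health staging.communityrule.info && echo healthy` after an install gives a quick pass/fail signal; the manual UI steps still apply.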

If something fails:

| Symptom | Likely cause | Check |
| --- | --- | --- |
| Image pull error on install | Repo still private, or wrong tag in manifest | §6.3; docker pull --platform linux/amd64 … from laptop |
| Health 503 / database: disconnected | Postgres addon not provisioned or URL missing | Cloudron app → Environment; expect CLOUDRON_POSTGRESQL_URL |
| Magic link not sent | Mail addon or SMTP_FROM | Cloudron mail logs; CLOUDRON_MAIL_SMTP_* vars |
| Upload server_misconfigured | UPLOAD_ROOT unset | Set to /app/data/uploads (§3) |
| Container crash on start | Migration failure | App logs around prisma migrate deploy |
| No "Recommended" on method cards | MethodFacet not seeded | §10 step 6; API should return matches.score > 0 for some methods when facet.* set |
| seed.bundle.cjs ENOENT on /app/data/... | Old image without /app/seed-data | Deploy ≥ 0.1.8; JSON is at SEED_DATA_DIR=/app/seed-data |

Done when: all smoke checklist items pass. Then proceed to soft-launch (§5 phase 2) and, when ready, CR-99 apex cutover.

11. Rate limiting (single-instance deploys)

The app uses an in-memory rate limiter in lib/server/rateLimit.ts (magic-link requests, organizer inquiry, etc.). This is sufficient for the current single Cloudron container per environment.

Before horizontal scale-out (multiple app instances behind a load balancer), replace or back the limiter with a shared store (e.g. Redis) so per-IP / per-user windows apply across instances. Until then, document expected limits in the steady-state runbook (CR-100).