Update Docs
This commit is contained in:
@@ -0,0 +1,274 @@
|
||||
# Steady-state operator runbook
|
||||
|
||||
Day-to-day deploy, rollback, and recovery for CommunityRule on MEDLab
|
||||
Cloudron. Assumes staging or production is already installed and smoke-tested.
|
||||
|
||||
> **First-time install, apex cutover, and legacy decommission** live in
|
||||
> [`ops-backend-deploy.md`](ops-backend-deploy.md). Use this doc once an
|
||||
> environment is already running.
|
||||
|
||||
## 1. Quick reference
|
||||
|
||||
| Item | Value |
|
||||
| ---- | ----- |
|
||||
| Cloudron dashboard | `https://my.medlab.host` |
|
||||
| Cloudron CLI login | `cloudron login my.medlab.host` |
|
||||
| Staging app | `staging.communityrule.info` |
|
||||
| Production app | `communityrule.info` (after apex cutover) |
|
||||
| Container image | `git.medlab.host/communityrule/community-rule:<tag>` |
|
||||
| Health check | `GET /api/health` → `200 {"ok":true,"database":"connected"}` |
|
||||
| Manifest version | [`CloudronManifest.json`](../../CloudronManifest.json) `version` field (must increase for each release) |
|
||||
| Current manifest | `0.1.8` at time of writing — always read the file before deploying |
|
||||
|
||||
Replace `<app>` below with the Cloudron location (`staging.communityrule.info`
|
||||
or `communityrule.info`).
|
||||
|
||||
## 2. Prerequisites (one-time per operator)
|
||||
|
||||
1. **Cloudron access** — admin login on `my.medlab.host` and a CLI API token
|
||||
(*Profile → API Tokens* on the dashboard). Save the token in 1Password.
|
||||
2. **Cloudron CLI** — logged in:
|
||||
```bash
|
||||
cloudron login my.medlab.host
|
||||
```
|
||||
3. **Docker + buildx** — Docker Desktop or equivalent with `docker buildx`.
|
||||
4. **Gitea registry auth** — personal access token on `git.medlab.host` with
|
||||
`read:package` + `write:package`; then:
|
||||
```bash
|
||||
docker login git.medlab.host
|
||||
```
|
||||
5. **Repo checkout** — clone
|
||||
[`CommunityRule/community-rule`](https://git.medlab.host/CommunityRule/community-rule)
|
||||
and work from a clean commit that matches the release you intend to ship.
|
||||
|
||||
Images are **`linux/amd64` only** (Cloudron host is x86_64). On Apple
|
||||
Silicon, the release script still builds amd64 via buildx; a bare
|
||||
`docker pull` without `--platform linux/amd64` failing on arm64 is expected.
|
||||
|
||||
## 3. Deploy a new version
|
||||
|
||||
Typical release flow: bump manifest → build/push image → `cloudron update`
|
||||
→ smoke.
|
||||
|
||||
### 3.1 Build and push
|
||||
|
||||
1. Check out the commit to release (`main` or a release branch).
|
||||
2. **Bump** [`CloudronManifest.json`](../../CloudronManifest.json) `version`
|
||||
(e.g. `0.1.8` → `0.1.9`). Cloudron requires the manifest version to
|
||||
**increase** for `cloudron update --image` to be accepted.
|
||||
3. From the repo root, build and push (tag should match the manifest version
|
||||
for sanity):
|
||||
|
||||
```bash
|
||||
TAG=0.1.9 ./scripts/docker-release.sh
|
||||
# equivalent:
|
||||
TAG=0.1.9 npm run docker:release
|
||||
```
|
||||
|
||||
Omit `TAG=` to push `git rev-parse --short HEAD` instead — only do that
|
||||
for ad-hoc staging experiments; production releases should use semver tags
|
||||
aligned with the manifest.
|
||||
|
||||
4. **Verify anonymous pull** (simulates Cloudron):
|
||||
```bash
|
||||
docker logout git.medlab.host
|
||||
docker pull --platform linux/amd64 \
|
||||
git.medlab.host/communityrule/community-rule:0.1.9
|
||||
```
|
||||
|
||||
5. **Commit the manifest bump** in git alongside the code that shipped in
|
||||
this build.
|
||||
|
||||
Registry details and one-time Gitea setup: [`ops-backend-deploy.md` §9](ops-backend-deploy.md#9-build-and-push-image-workflow).
|
||||
|
||||
### 3.2 Update Cloudron
|
||||
|
||||
```bash
|
||||
cloudron update --app staging.communityrule.info \
|
||||
--image git.medlab.host/communityrule/community-rule:0.1.9
|
||||
```
|
||||
|
||||
Use `communityrule.info` for production. Cloudron pulls the image (no registry
|
||||
credentials on the host), restarts the container, and runs
|
||||
[`scripts/start.sh`](../../scripts/start.sh), which:
|
||||
|
||||
1. `chown`s `/app/data` (localstorage mount),
|
||||
2. runs **`prisma migrate deploy`**,
|
||||
3. execs the Next.js standalone server.
|
||||
|
||||
Watch the app **Logs** tab in the Cloudron dashboard for a clean migration and
|
||||
`Listening on port 3000`.
|
||||
|
||||
### 3.3 Migrations
|
||||
|
||||
**Normal case:** migrations apply automatically on container start — no
|
||||
separate step.
|
||||
|
||||
**Manual re-run** (only if debugging a failed deploy or verifying before
|
||||
traffic):
|
||||
|
||||
```bash
|
||||
cloudron exec --app staging.communityrule.info -- npm run db:deploy
|
||||
```
|
||||
|
||||
(`npm run db:deploy` → `prisma migrate deploy`.)
|
||||
|
||||
**Policy:** never run `prisma migrate reset` against staging or production.
|
||||
Never edit migration files already applied to a shared database. Fix schema
|
||||
drift by adding a **new** migration locally (`prisma migrate dev`) and
|
||||
deploying a new image. See [`backend-roadmap.md` §8](backend-roadmap.md#8-prisma-migrations-policy).
|
||||
|
||||
### 3.4 Seed data (not every deploy)
|
||||
|
||||
Template + facet seed (`MethodFacet` rows for create-flow “Recommended” tags)
|
||||
is **not** applied at boot. Run once per environment after first install, or
|
||||
when recommendations return all-zero scores:
|
||||
|
||||
```bash
|
||||
cloudron exec --app staging.communityrule.info -- \
|
||||
node prisma/seed.bundle.cjs
|
||||
```
|
||||
|
||||
Re-running is safe (idempotent upserts). JSON lives at `/app/seed-data/` in
|
||||
the image — not under `/app/data` (Cloudron localstorage overwrites that
|
||||
mount).
|
||||
|
||||
### 3.5 Smoke after deploy
|
||||
|
||||
**Automated** (from your laptop, repo root):
|
||||
|
||||
```bash
|
||||
./scripts/staging-smoke.sh staging.communityrule.info
|
||||
# production:
|
||||
./scripts/staging-smoke.sh communityrule.info
|
||||
|
||||
# optional — exercises magic-link request (check inbox manually):
|
||||
EMAIL=you@example.com ./scripts/staging-smoke.sh staging.communityrule.info
|
||||
```
|
||||
|
||||
**Manual** (still required for full acceptance):
|
||||
|
||||
- Click a magic link → signed in → `GET /api/auth/session` returns a user.
|
||||
- Publish a rule end-to-end → public detail page loads.
|
||||
- Optional: Save & Exit draft sync; upload with `UPLOAD_ROOT` set.
|
||||
|
||||
Full checklist and failure table: [`ops-backend-deploy.md` §10](ops-backend-deploy.md).
|
||||
|
||||
## 4. Roll back (code-only)
|
||||
|
||||
To revert application code without touching the database:
|
||||
|
||||
```bash
|
||||
cloudron update --app staging.communityrule.info \
|
||||
--image git.medlab.host/communityrule/community-rule:<previous-tag>
|
||||
```
|
||||
|
||||
Pick a tag you know was healthy (previous manifest version or git tag recorded
|
||||
at last good deploy).
|
||||
|
||||
**Database implications:**
|
||||
|
||||
- Rolling back the **image** does **not** undo migrations already applied.
|
||||
- If the bad release added a migration, rolling back to an older image may
|
||||
leave the DB schema **ahead** of what that code expects — usually safe if
|
||||
the migration was additive (new nullable columns, new tables).
|
||||
- If the bad release broke because of a **destructive or incompatible**
|
||||
migration, do **not** reset production. Restore from a Cloudron backup
|
||||
(§5) or fix forward with a corrective migration.
|
||||
|
||||
**Never** `prisma migrate reset` on staging or production.
|
||||
|
||||
## 5. Restore drill (quarterly)
|
||||
|
||||
Verify Cloudron backups are restorable without touching the live app.
|
||||
|
||||
**Cadence:** at least once per quarter, or after any backup-policy change.
|
||||
|
||||
**Steps:**
|
||||
|
||||
1. In the Cloudron dashboard, pick a recent automatic backup of
|
||||
`<app>` (*Backups* tab).
|
||||
2. **Restore to a scratch location** — e.g.
|
||||
`restore-drill-YYYYMMDD.communityrule.info` — not over the live app.
|
||||
3. After restore completes, confirm the container starts and migrations are
|
||||
current:
|
||||
```bash
|
||||
curl -sS "https://restore-drill-YYYYMMDD.communityrule.info/api/health"
|
||||
```
|
||||
Expect `200` with `"database":"connected"`.
|
||||
4. Optional: `cloudron exec --app restore-drill-YYYYMMDD.communityrule.info -- npm run db:deploy`
|
||||
if logs show pending migrations on an older snapshot.
|
||||
5. Run `./scripts/staging-smoke.sh restore-drill-YYYYMMDD.communityrule.info`.
|
||||
6. **Uninstall** the scratch app when done.
|
||||
|
||||
Record the drill date and outcome in your ops notes. Cloudron retains
|
||||
automatic backups per platform defaults; confirm retention in the dashboard.
|
||||
|
||||
## 6. Single-instance limitations
|
||||
|
||||
The current Cloudron deploy runs **one container per environment**. Do not
|
||||
scale to multiple app instances without addressing these per-process limits:
|
||||
|
||||
### 6.1 In-memory rate limiter
|
||||
|
||||
[`lib/server/rateLimit.ts`](../../lib/server/rateLimit.ts) stores windows in
|
||||
process memory. Limits apply **per container**, not globally across instances.
|
||||
|
||||
| Route / action | Key | Min interval |
|
||||
| -------------- | --- | ------------ |
|
||||
| Magic-link request | per email | 60 s |
|
||||
| Magic-link request | per IP | 20 s |
|
||||
| Email change request | per email / IP / user | 60 s |
|
||||
| Organizer inquiry | per email / IP | 60 s / 20 s |
|
||||
| Publish with stakeholder invites | per IP | 60 s |
|
||||
| Stakeholder add / resend | per IP / invite | 60 s |
|
||||
| File upload | per user | 5 s |
|
||||
|
||||
Before horizontal scale-out, replace with a shared store (e.g. Redis) or edge
|
||||
rate limits. See [`backend-roadmap.md` §5](backend-roadmap.md#5-session-and-authentication-v1).
|
||||
|
||||
### 6.2 Web vitals storage
|
||||
|
||||
Production defaults to **`external`** mode: vitals are structured log lines, not
|
||||
written to Postgres or local files. Setting `WEB_VITALS_STORAGE=local` uses a
|
||||
**per-process** file store under `.next/web-vitals` — suitable for dev/admin
|
||||
only, not multi-instance. See [`backend-roadmap.md` §7](backend-roadmap.md#7-api-responses-errors-and-observability).
|
||||
|
||||
## 7. Environment variables (steady-state)
|
||||
|
||||
Cloudron **auto-injects** addon vars (`CLOUDRON_POSTGRESQL_URL`,
|
||||
`CLOUDRON_MAIL_SMTP_*`). Operators set these manually once per app; they
|
||||
persist across image updates unless changed:
|
||||
|
||||
| Variable | Purpose |
|
||||
| -------- | ------- |
|
||||
| `SESSION_SECRET` | Session cookie signing (≥ 16 chars). Rotating logs everyone out. |
|
||||
| `SMTP_FROM` | Visible From on sign-in emails (e.g. `Community Rule <hello@communityrule.info>`). |
|
||||
| `NEXT_PUBLIC_ENABLE_BACKEND_SYNC` | `true` in staging/production — Postgres draft persistence. |
|
||||
| `UPLOAD_ROOT` | `/app/data/uploads` on Cloudron — required for file uploads. |
|
||||
|
||||
Full detail: [`ops-backend-deploy.md` §3](ops-backend-deploy.md#3-environment-variables).
|
||||
|
||||
## 8. Troubleshooting
|
||||
|
||||
| Symptom | Likely cause | Action |
|
||||
| ------- | ------------ | ------ |
|
||||
| Image pull error on update | Private repo, wrong tag, or amd64 manifest missing | Confirm repo is public; verify pull with `--platform linux/amd64` (§3.1) |
|
||||
| Health `503` / `database: disconnected` | Postgres addon or `CLOUDRON_POSTGRESQL_URL` missing | Cloudron app → Environment |
|
||||
| Container crash on start | Migration failure | App logs around `prisma migrate deploy`; fix forward with new migration |
|
||||
| Magic link not sent | Mail addon or `SMTP_FROM` | Cloudron mail logs; `CLOUDRON_MAIL_SMTP_*` vars |
|
||||
| Upload `server_misconfigured` | `UPLOAD_ROOT` unset | `cloudron env set --app <app> UPLOAD_ROOT=/app/data/uploads` |
|
||||
| No “Recommended” on method cards | Seed not run | §3.4 — `node prisma/seed.bundle.cjs` |
|
||||
| Rate limit too aggressive after deploy | Expected per §6.1 | Single instance only; limits reset on container restart |
|
||||
|
||||
App logs: Cloudron dashboard → *Logs* tab, or `cloudron logs --app <app> -f`.
|
||||
|
||||
## 9. Related docs
|
||||
|
||||
- [`ops-backend-deploy.md`](ops-backend-deploy.md) — first install, cutover
|
||||
plan, legacy rules archive, build/push deep dive.
|
||||
- [`backend-roadmap.md`](backend-roadmap.md) — migrations policy (§8),
|
||||
rate limiting (§5), environments (§11).
|
||||
- [`../relaunch-brief.md`](../relaunch-brief.md) — plain-language summary
|
||||
for MEDLab admin.
|
||||
- [`../../CONTRIBUTING.md`](../../CONTRIBUTING.md) — local dev setup.
|
||||
Reference in New Issue
Block a user