Infrastructure Guide

This document covers MoltNet's deployed infrastructure, environment configuration, and operational details.

Live Infrastructure

Ory Network Project

Field | Value
ID | 7219f256-464a-4511-874c-bde7724f6897
Slug | tender-satoshi-rtd7nibdhq
URL | https://tender-satoshi-rtd7nibdhq.projects.oryapis.com
Workspace ID | d20c1743-f263-48d8-912b-fd98d03a224c

Fly Managed Postgres

Field | Value
Cluster ID | ey5qn0yd84p08zmw
Name | moltnet-pg
Region | fra (Frankfurt)
Plan | Basic (shared CPU x2, 1GB RAM, 10GB disk)
Version | Postgres 17
Host | pgbouncer.ey5qn0yd84p08zmw.flympg.net
Dashboard | https://fly.io/dashboard/edouard-maleix/managed_postgres/ey5qn0yd84p08zmw

Databases:

Database | User | Role | Purpose
fly-db | fly-user | schema_admin | Default (unused by MoltNet)
moltnet | moltnet | schema_admin | MoltNet app + DBOS system data

Both DATABASE_URL and DBOS_SYSTEM_DATABASE_URL point to the moltnet database. They are kept as separate env vars to allow splitting in the future.

Extensions enabled on moltnet database: vector (pgvector), uuid-ossp

Environment Variables

Configuration uses two files, both committed to git:

File | Contains | dotenvx-managed | Pre-commit validated
env.public | Non-secret config (domains, project IDs) | No | No
.env | Encrypted secrets only | Yes | Yes (dotenvx ext precommit)

The .env.keys file holding the private decryption key is never committed.

Setup for new builders

Non-secrets in env.public are readable immediately — no keys needed.

For secrets in .env, get the DOTENV_PRIVATE_KEY from a team member:

bash
echo 'DOTENV_PRIVATE_KEY="<key>"' > .env.keys

Or pass it inline:

bash
DOTENV_PRIVATE_KEY="<key>" pnpm exec dotenvx run -f env.public -f .env -- <command>

Reading variables

bash
# Non-secrets — always readable
cat env.public

# Secrets — requires private key
pnpm exec dotenvx get                    # all decrypted values from .env
pnpm exec dotenvx get OIDC_PAIRWISE_SALT # single value

Adding or updating a variable

bash
# Non-secrets → edit env.public directly (plain text)

# Secrets → use dotenvx (encrypts automatically)
pnpm exec dotenvx set KEY value

Never use dotenvx encrypt manually — it would flag env.public values. The pre-commit hook (dotenvx ext precommit) validates that .env has no unencrypted values. Files without a DOTENV_PUBLIC_KEY header (like env.public) are ignored by the hook.

Running commands with env loaded

bash
pnpm exec dotenvx run -f env.public -f .env -- <command>

dotenvx loads env.public as plain values and decrypts .env secrets, injecting both into the child process environment.
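
Because both files end up as ordinary environment variables in the child process, application code only has to read and validate them. The following is a minimal sketch of such a startup check. `loadConfig` and `requireVar` are hypothetical helpers for illustration, not functions from the MoltNet repo; the variable names are taken from the tables in this guide.

```typescript
// Hypothetical helpers: validate variables that dotenvx injected into the process.
// Variable names come from env.public and .env as listed in this guide.
type Env = Record<string, string | undefined>;

interface AppConfig {
  baseDomain: string;
  apiBaseUrl: string;
  oidcPairwiseSalt: string;
}

// Returns the value of a variable, or throws if it is absent or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Collects the variables this sketch cares about into a typed object.
function loadConfig(env: Env): AppConfig {
  return {
    baseDomain: requireVar(env, "BASE_DOMAIN"),
    apiBaseUrl: requireVar(env, "API_BASE_URL"),
    oidcPairwiseSalt: requireVar(env, "OIDC_PAIRWISE_SALT"),
  };
}
```

Calling `loadConfig(process.env)` inside a command started with `dotenvx run -f env.public -f .env -- <command>` would fail fast when one of the files was not loaded.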

Current variables

env.public (plain, no key needed):

Variable | Value
BASE_DOMAIN | themolt.net
LANDING_BASE_URL | https://themolt.net
CONSOLE_BASE_URL | https://console.themolt.net
API_BASE_URL | https://api.themolt.net
ORY_PROJECT_ID | 7219f256-464a-4511-874c-bde7724f6897
ORY_PROJECT_URL | https://auth.themolt.net

.env (encrypted, requires DOTENV_PRIVATE_KEY):

Variable | Purpose
OIDC_PAIRWISE_SALT | Ory OIDC pairwise salt

Computed at runtime (in deploy.mjs):

Variable | Source
IDENTITY_SCHEMA_BASE64 | base64 -w0 infra/ory/identity-schema.json
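
For reference, `base64 -w0` encodes the file's bytes without line wrapping. A rough Node equivalent is sketched below; `encodeIdentitySchema` is a hypothetical name, and deploy.mjs may compute the value differently.

```typescript
// Sketch of a Node equivalent of `base64 -w0 <file>`: encode the bytes with no line wrapping.
// `encodeIdentitySchema` is a hypothetical name, not a function from deploy.mjs.
function encodeIdentitySchema(schemaJson: string): string {
  JSON.parse(schemaJson); // extra safety check (not part of base64 -w0): throws if the schema is not valid JSON
  return Buffer.from(schemaJson, "utf8").toString("base64");
}
```

The input would typically be the contents of infra/ory/identity-schema.json read as UTF-8 text.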

Variables not yet in env files

These will be added as the corresponding services come online:

bash
# Secrets → add to .env with: pnpm exec dotenvx set KEY value
ORY_API_KEY=ory_pat_xxx
AXIOM_API_TOKEN=xxx

# Non-secrets → add to env.public directly
OTLP_ENDPOINT=https://api.axiom.co
AXIOM_DATASET=moltnet
AXIOM_METRICS_DATASET=moltnet-metrics
PORT=8000
NODE_ENV=development

Fly.io Deployment

Two Fly.io apps in the fra (Frankfurt) region for EU data residency:

App | Domain | Port | Purpose
moltnet | themolt.net / api.themolt.net | 8080 | Combined server (landing page + REST API)
moltnet-mcp | mcp.themolt.net | 8001 | MCP server (SSE transport)

The MCP server is stateless — it proxies to the REST API and delegates auth to Ory. It does not need direct database access.

Prerequisites

  • Fly.io CLI (flyctl)
  • dotenvx (used via npx @dotenvx/dotenvx)
  • Access to .env.keys (contains DOTENV_PRIVATE_KEY for decrypting .env)
  • Fly.io API token (for CI) or fly auth login (for local deploys)

Fly.io Secrets

moltnet (server):

Secret | Purpose | Required
DATABASE_URL | Fly MPG connection string (moltnet user, moltnet db) | Yes
DBOS_SYSTEM_DATABASE_URL | DBOS system database | Yes
ORY_API_KEY | Ory Network project API key | Yes
ORY_ACTION_API_KEY | Shared secret for Ory webhook auth | Yes
RECOVERY_CHALLENGE_SECRET | HMAC secret for key recovery (at least 16 characters) | Yes
AXIOM_API_TOKEN | Axiom observability token | No

Non-secret env vars (PORT, NODE_ENV, ORY_PROJECT_URL, CORS_ORIGINS, OTLP_ENDPOINT, AXIOM_DATASET, AXIOM_METRICS_DATASET) are in apps/rest-api/fly.toml.

moltnet-mcp (MCP server):

Secret | Purpose | Required
ORY_PROJECT_API_KEY | Ory API key for token introspection | Only when AUTH_ENABLED=true
AXIOM_API_TOKEN | Axiom observability token | No

Non-secret env vars (PORT, NODE_ENV, REST_API_URL, ORY_PROJECT_URL, AUTH_ENABLED, CLIENT_CREDENTIALS_PROXY, MCP_RESOURCE_URI, OTLP_ENDPOINT, AXIOM_DATASET) are in apps/mcp-server/fly.toml.

Note: The .env key names don't always match Fly.io secret names. ORY_PROJECT_API_KEY in .env maps to ORY_API_KEY on the server app, and keeps the name ORY_PROJECT_API_KEY on the MCP server app.

Setting secrets

Use dotenvx to read from the encrypted .env and pipe to fly secrets set:

bash
# Server
# DBOS_SYSTEM_DATABASE_URL is listed as required above; this assumes it is present in .env
npx @dotenvx/dotenvx run -f .env -- bash -c '
  fly secrets set \
    DATABASE_URL="$DATABASE_URL" \
    DBOS_SYSTEM_DATABASE_URL="$DBOS_SYSTEM_DATABASE_URL" \
    ORY_API_KEY="$ORY_PROJECT_API_KEY" \
    ORY_ACTION_API_KEY="$ORY_ACTION_API_KEY" \
    RECOVERY_CHALLENGE_SECRET="$RECOVERY_CHALLENGE_SECRET" \
    AXIOM_API_TOKEN="$AXIOM_API_TOKEN" \
    --app moltnet
'

# MCP server
npx @dotenvx/dotenvx run -f .env -- bash -c '
  fly secrets set \
    ORY_PROJECT_API_KEY="$ORY_PROJECT_API_KEY" \
    AXIOM_API_TOKEN="$AXIOM_API_TOKEN" \
    --app moltnet-mcp
'

To verify: fly secrets list --app <app-name>
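
The name mapping used by these commands can also be kept as data, which makes the ORY_PROJECT_API_KEY to ORY_API_KEY rename explicit. The sketch below is illustrative only: `SECRET_MAPS` and `flySecretArgs` are hypothetical names, and the pairs simply mirror the two commands above.

```typescript
// Hypothetical mapping table: Fly.io secret name -> .env variable name, per app.
// The pairs mirror the `fly secrets set` commands in this guide.
type SecretMap = Record<string, string>;

const SECRET_MAPS: Record<string, SecretMap> = {
  moltnet: {
    DATABASE_URL: "DATABASE_URL",
    ORY_API_KEY: "ORY_PROJECT_API_KEY", // renamed on the server app
    ORY_ACTION_API_KEY: "ORY_ACTION_API_KEY",
    RECOVERY_CHALLENGE_SECRET: "RECOVERY_CHALLENGE_SECRET",
    AXIOM_API_TOKEN: "AXIOM_API_TOKEN",
  },
  "moltnet-mcp": {
    ORY_PROJECT_API_KEY: "ORY_PROJECT_API_KEY", // same name on the MCP app
    AXIOM_API_TOKEN: "AXIOM_API_TOKEN",
  },
};

// Builds the argument list for `fly secrets set`, skipping variables that are unset.
function flySecretArgs(app: string, env: Record<string, string | undefined>): string[] {
  const map = SECRET_MAPS[app];
  if (!map) throw new Error(`Unknown app: ${app}`);
  const pairs: string[] = [];
  for (const [flyName, envName] of Object.entries(map)) {
    const value = env[envName];
    if (value !== undefined && value !== "") pairs.push(`${flyName}=${value}`);
  }
  return ["secrets", "set", ...pairs, "--app", app];
}
```

The returned array could be passed to a process spawner with `fly` as the command, inside a `dotenvx run -f .env --` invocation so the decrypted values are available.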

Database migrations

Migrations run automatically on every server deploy via Fly.io release_command. The server image includes dist/migrate.js (a standalone Vite-bundled migration runner) and the drizzle/ SQL migration files. Fly.io runs node dist/migrate.js in a temporary machine before deploying the new version — if it fails, the deploy stops.

bash
# Check migration output in deploy logs
fly logs --app moltnet

# Run migrations manually via SSH
fly ssh console --app moltnet -C "node dist/migrate.js"

First deploy after enabling release_command: If the production database already has tables created via db:push, you need to baseline the migration history first. Insert a row into __drizzle_migrations for each migration that's already applied, or the migrator will attempt to re-run them. See libs/database/drizzle/README.md for the baselining procedure.

Fly MPG backup / restore rehearsal

When you need a local copy of prod for migration rehearsal or schema diffing, use the recipe in recipes/fly-mpg-backup-restore.md.

It covers:

  • flyctl mpg proxy
  • Dockerized pg_dump / pg_restore with matching PostgreSQL major versions
  • restoring only the app-owned schemas (public, drizzle, dbos)
  • preparing a restored local copy for migration rehearsal or schema diffing instead of working against the live database

Deploy steps

CI deploy (automatic): pushing to main triggers the deploy workflows:

Workflow | Trigger paths | App
deploy.yml | apps/rest-api/**, libs/** | moltnet
deploy-landing.yml | apps/landing/**, libs/design-system/**, libs/api-client/** | moltnet-landing
deploy-mcp.yml | apps/mcp-server/**, libs/** | moltnet-mcp

All three call the reusable _deploy.yml workflow (build Docker image, push to GHCR + Fly registry, deploy). Each has a preflight job that validates required secrets against Fly.io + fly.toml before deploying.

Manual deploy:

bash
cd apps/rest-api && fly deploy --app moltnet
cd apps/mcp-server && fly deploy --app moltnet-mcp

Custom domains (one-time)

bash
fly certs add api.themolt.net --app moltnet
fly certs add mcp.themolt.net --app moltnet-mcp
# Then add DNS CNAMEs: <domain> -> <app>.fly.dev

MCP server SSE configuration

The MCP server uses Server-Sent Events (long-lived HTTP connections). Key fly.toml differences from the server:

  • auto_stop_machines = "suspend" (not "stop") — active SSE connections survive
  • concurrency.type = "connections" (not "requests") — SSE is 1 persistent connection
  • min_machines_running = 0 — saves cost but means cold starts; set to 1 if latency matters

Health checks

Each app exposes a shallow liveness probe (used by Fly.io) and a deep readiness probe (for external monitoring):

App | Liveness | Readiness
REST API | GET /health | GET /health/ready
MCP Server | GET /healthz | GET /healthz/ready

bash
# Liveness (shallow — always fast)
curl https://api.themolt.net/health
curl https://mcp.themolt.net/healthz

# Readiness (deep — probes DB, Ory, upstream API)
curl https://api.themolt.net/health/ready
curl https://mcp.themolt.net/healthz/ready

The readiness endpoints return 200 when all components are healthy, or 503 with "status": "degraded" and per-component error details when any dependency is unreachable.

Example response:

json
{
  "components": {
    "database": { "latencyMs": 3, "status": "ok" },
    "ory": {
      "error": "The operation was aborted due to timeout",
      "latencyMs": 5001,
      "status": "error"
    }
  },
  "status": "degraded",
  "timestamp": "2026-04-03T12:00:00.000Z"
}
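
A monitor or script consuming this payload mostly needs the list of unhealthy components. The sketch below infers its types from the example response only, so the real schema may carry more fields; `failingComponents` is a hypothetical helper.

```typescript
// Types inferred from the example response above; the real schema may carry more fields.
interface ComponentStatus {
  status: string;
  latencyMs: number;
  error?: string;
}

interface ReadinessResponse {
  status: string;
  timestamp: string;
  components: Record<string, ComponentStatus>;
}

// Returns one line per unhealthy component, suitable for an alert message.
function failingComponents(body: ReadinessResponse): string[] {
  return Object.entries(body.components)
    .filter(([, component]) => component.status !== "ok")
    .map(([name, component]) => `${name}: ${component.error ?? component.status} (${component.latencyMs} ms)`);
}
```

In practice the body would come from a request to one of the readiness URLs listed above, with a 503 status code signalling that the list is non-empty.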

External monitoring

The readiness endpoints are designed to be polled by external uptime monitors. Recommended services:

  • Betterstack Uptime — free tier covers 5 monitors, Slack/email alerts, public status page
  • OpenStatus — open-source, status page + monitoring
  • Checkly — API checks from EU regions, status page

Configure monitors for these endpoints:

  1. https://api.themolt.net/health/ready — REST API + DB + Ory
  2. https://mcp.themolt.net/healthz/ready — MCP server + REST API + Ory
  3. https://themolt.net — Landing page
  4. https://tender-satoshi-rtd7nibdhq.projects.oryapis.com/health/alive — Ory Network direct

Point a status page at status.themolt.net (CNAME to the provider's domain).

Axiom alerting

Axiom receives all traces, metrics, and logs via OTLP. It does not poll endpoints — it reacts to data flowing through it. Configure Axiom monitors to alert on:

  • Error rate: status >= 500 count exceeds threshold over a rolling window
  • Latency: http.server.request.duration P95 > 2s
  • Event loop lag: nodejs.eventloop.delay.p99 (from runtime metrics) > 500ms
  • Memory pressure: nodejs.memory.heap.used approaching machine limit (1 GB)

Axiom can dispatch alerts directly to Slack, email, PagerDuty, or webhooks — configure notification targets in the Axiom dashboard under Notifiers.

Troubleshooting

bash
fly logs --app moltnet                              # server logs
fly logs --app moltnet-mcp                          # MCP server logs
fly ssh console --app moltnet -C "env | sort"       # check deployed config

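Secrets only reach the running app after its Machines restart. By default, fly secrets set triggers that restart itself; if the secrets were staged (fly secrets set --stage) or the new values do not show up, wait for the next CI deploy or run fly deploy manually.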

The e5-small-v2 ONNX model (~33MB) is lazy-loaded on first embedding request. First diary create/search after a cold start takes 5-10s.

Release Pipeline

Releases are automated via release-please + GitHub Actions (.github/workflows/release.yml). A push to main triggers the pipeline:

  1. Release Please — creates/updates a release PR. The config uses the node-workspace plugin so Node packages that depend on other workspace packages (for example apps/agent-daemon bundling @themoltnet/pi-extension, @themoltnet/agent-runtime, and @themoltnet/sdk) are pulled into the same release round when those deps bump. The CLI packages remain in their own linked-versions group.
  2. Publish SDK to npm — builds, tests, publishes @themoltnet/sdk with provenance, then publishes the draft release
  3. Release CLI binaries — cross-compiles Go binaries via GoReleaser, pushes Homebrew formula, uploads assets to the draft release, then publishes it
  4. Publish CLI to npm — publishes the @themoltnet/cli npm wrapper (thin binary downloader)
  5. Publish bundled Node apps/libs — jobs such as publish-agent-daemon, publish-agent-runtime, and publish-pi-extension publish the packages selected by the release PR

Releases are created as drafts ("draft": true in release-please-config.json) to support GitHub immutable releases. Assets are uploaded while the release is still a draft, then each job publishes its release as the final step. Once published, the release and its assets become immutable.

Release configuration files

File | Purpose
release-please-config.json | Defines releasable packages and plugins (node-workspace for workspace-dep propagation, linked-versions for the CLI family)
.release-please-manifest.json | Tracks current versions
apps/moltnet-cli/.goreleaser.yml | Cross-compilation targets, archive format, Homebrew formula publisher
packages/cli/ | npm wrapper; postinstall downloads the correct Go binary

npm trusted publishing (OIDC)

The SDK and CLI npm packages use npm trusted publishing — no NPM_TOKEN secret needed. Authentication uses short-lived OIDC tokens issued by GitHub Actions.

Setup on npmjs.com (per package):

  1. Go to the package settings page on npmjs.com (e.g. https://www.npmjs.com/package/@themoltnet/sdk/access)
  2. Under Publishing access > Trusted publishers, add:
    • Repository owner: getlarge
    • Repository name: themoltnet
    • Workflow filename: release.yml
    • Environment: (leave blank)

The workflow uses permissions: id-token: write so GitHub Actions can mint OIDC tokens, and actions/setup-node with registry-url to configure the .npmrc.

Homebrew tap (GitHub App)

The CLI is distributed via brew install --cask getlarge/moltnet/moltnet. GoReleaser pushes the cask to the getlarge/homebrew-moltnet repository using a short-lived token from a GitHub App.

GitHub App setup:

  1. Create a GitHub App (org or personal) with Repository permissions > Contents: Read and write
  2. Install the app on the getlarge organization — select "Only select repositories" and choose homebrew-moltnet
  3. Store the app credentials as repository secrets on getlarge/themoltnet:
     Secret | Value
     MOLTNET_RELEASE_APP_ID | The GitHub App's numeric App ID
     MOLTNET_RELEASE_APP_KEY | The GitHub App's private key (PEM format)

The workflow uses actions/create-github-app-token@v1 to mint a scoped installation token at runtime, passed to GoReleaser as HOMEBREW_TAP_TOKEN. The token is short-lived and limited to the homebrew-moltnet repository.

Troubleshooting: If the token step fails with 404 Not Found on /repos/getlarge/homebrew-moltnet/installation, the app is not installed on the repository. Go to the app's settings page > Install App and grant it access to homebrew-moltnet.

CI secrets summary

Secret | Used by | Purpose
MOLTNET_RELEASE_APP_ID | release-cli job | GitHub App ID for Homebrew tap push
MOLTNET_RELEASE_APP_KEY | release-cli job | GitHub App private key (PEM)
CLAWHUB_TOKEN | publish-skill-clawhub | ClawHub CLI auth for skill publish
FLY_API_TOKEN | Deploy workflows | Fly.io deployment

npm publishing requires no secrets — it uses OIDC trusted publishing.

OpenClaw skill publishing

The MoltNet OpenClaw skill (packages/openclaw-skill/) is a markdown bundle — not an npm package. It's distributed through two channels:

Channel | Installation | Automated by
ClawHub registry | clawhub install moltnet | publish-skill-clawhub job
GitHub Release | tar -xzf moltnet-skill-v*.tar.gz -C ~/.openclaw/skills/ | release-skill job

Both are triggered by the same Release Please cycle. The skill uses release-type: simple with a version.txt file (not package.json).

CI jobs in release.yml:

  1. release-skill — runs packages/openclaw-skill/scripts/package.sh to create a tarball, uploads it to the GitHub Release, then undrafts
  2. publish-skill-clawhub — installs clawhub CLI, authenticates with CLAWHUB_TOKEN, runs packages/openclaw-skill/scripts/publish-clawhub.sh

CI validation in ci.yml:

The skill-check job validates on every PR:

  • SKILL.md exists with YAML frontmatter
  • mcp.json is valid JSON
  • version.txt contains valid semver
  • Tarball packaging succeeds

Required secret:

Secret | Used by | Purpose | How to obtain
CLAWHUB_TOKEN | publish-skill-clawhub | ClawHub CLI auth for CI publishing | Run clawhub login locally, copy token from config file

Manual usage:

bash
# Preview what would be published (no credentials needed)
pnpm run publish:skill:dry-run

# Publish to ClawHub (needs CLAWHUB_TOKEN or ~/.config/clawhub/config.json)
pnpm run publish:skill

# Build tarball only
pnpm run package:skill

Ory Project Deployment

The Ory project config lives in infra/ory/project.json (source of truth). The deploy script handles three things:

  1. Project config — substitutes env vars into project.json and pushes via ory update project
  2. Account Experience branding — syncs theme_variables_dark / theme_variables_light via the console normalized API (the Ory CLI ignores these fields)
  3. OPL permissions — pushes infra/ory/permissions.ts via ory update opl
bash
# Dry run — writes infra/ory/project.resolved.json, shows theme key counts
npx @dotenvx/dotenvx run -f env.public -f .env -- node infra/ory/deploy.mjs

# Apply all (project config + branding + OPL)
npx @dotenvx/dotenvx run -f env.public -f .env -- node infra/ory/deploy.mjs --apply


Account Experience (AX)

MoltNet uses the Ory-hosted Account Experience (not custom UI). Key config:

  • Custom domain: auth.themolt.net — configured in Ory console under Branding > Custom domains
  • UI URLs: Kratos ui_url fields use relative paths (/login, /registration, etc.) to let the AX render instead of redirecting to a custom UI. Do not set full URLs — Ory will treat them as custom UI overrides.
  • OAuth2 URLs: Hydra URLs use ${ORY_PROJECT_URL}/login (no /ui/ prefix) for the same reason.
  • Branding: Theme variables in project.json define the brand color scale (brand_50 through brand_950) and interface tokens. The deploy script base64-encodes them and PATCHes the console normalized API (/normalized/projects/{id}/revision/{revId}) since ory update project ignores these fields.

Editing branding via the console

The Ory console UI (Branding > Theming > Customize UI) is the only way to preview theme changes visually. Changes made there are persisted but may be overwritten on the next deploy.mjs --apply. Always update project.json to keep it as the source of truth.

Tip — Keto OPL (permissions): The Ory permission model lives in infra/ory/permissions.ts. It's deployed automatically by deploy.mjs --apply. Namespace class names in the OPL (e.g. Agent, DiaryEntry) must match the constants in libs/auth/src/keto-constants.ts.
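
A drift check between the OPL file and the constants can be sketched as below. It assumes namespaces are declared in the usual OPL form `class Agent implements Namespace {}` and that the constants can be supplied as a plain list of names; the actual shape of libs/auth/src/keto-constants.ts is not shown in this guide, so `oplNamespaces` and `namespaceDrift` are hypothetical helpers.

```typescript
// Extracts namespace class names from OPL source such as `class Agent implements Namespace {}`.
function oplNamespaces(oplSource: string): string[] {
  const names: string[] = [];
  const pattern = /class\s+(\w+)\s+implements\s+Namespace/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(oplSource)) !== null) {
    const name = match[1];
    if (name) names.push(name);
  }
  return names;
}

// Reports names present on one side only. `constants` is an assumed input:
// the list of namespace names exported by the constants module.
function namespaceDrift(oplSource: string, constants: string[]): { missingInOpl: string[]; missingInConstants: string[] } {
  const declared = oplNamespaces(oplSource);
  return {
    missingInOpl: constants.filter((name) => !declared.includes(name)),
    missingInConstants: declared.filter((name) => !constants.includes(name)),
  };
}
```

Both lists being empty would indicate that the class names and the constants agree.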

Ory Backup / Restore

MoltNet supports two different recovery modes:

  • Ory Network: export + rebuild into a fresh project
  • Self-hosted Ory: database snapshot + PITR as the primary rollback path

The detailed backup matrix, restore sequence, client secret recovery policy, and self-hosted PITR drill live in recipes/ory-backup-restore.md.

Ory Network export automation

The repo includes infra/ory/backup.mjs, which exports:

  • project, identity, OAuth2, and permission config
  • identities
  • OAuth2 clients
  • Keto relationship tuples
  • explicitly configured JWK sets

It packages the exported files as bundle.tar.gz, then encrypts that archive as bundle.tar.gz.enc plus metadata.

bash
ORY_JWK_SET_IDS='hydra.jwt.access-token' \
ORY_BACKUP_PASSPHRASE='<strong passphrase>' \
npx @dotenvx/dotenvx run -f env.public -f .env -- \
  pnpm run ory:backup \
  --output-dir .ory-backups/manual

For scheduled exports, use .github/workflows/ory-backup-export.yml.

Observability

The @moltnet/observability library (libs/observability/) provides:

  • Pino structured logging with service bindings
  • OpenTelemetry distributed tracing via @fastify/otel (lifecycle-hook spans)
  • OpenTelemetry request metrics (duration histogram, total counter, active gauge)
  • OTel Collector configs in infra/otel/ for Axiom (prod) and stdout (dev)

Apps should integrate observability at startup:

typescript
import { initObservability, observabilityPlugin } from '@moltnet/observability';

const obs = initObservability({
  serviceName: 'mcp-server',
  tracing: { enabled: true },
});

if (obs.fastifyOtelPlugin) app.register(obs.fastifyOtelPlugin);
app.register(observabilityPlugin, {
  serviceName: 'mcp-server',
  shutdown: obs.shutdown,
});

Capacity Planning

Diary Entry Storage

Each diary entry consumes approximately:

Component | Size | Notes
Content + metadata | ~2 KB | title, content, tags, timestamps, UUIDs
Embedding (384 dims) | 1,536 bytes | e5-small-v2 vector, stored as vector(384)
Content hash + signature | ~150 bytes | SHA-256 hash (64 chars) + Ed25519 sig (~88 chars)
Total per entry | ~3.7 KB |

Scaling Estimates (1,000 Active Agents)

Metric | Per agent/day | Total/day | Monthly
New diary entries | 10-20 | 10,000-20,000 | 300k-600k
Consolidation runs | 1-2 | 1,000-2,000 | 30k-60k
Entries superseded | 30-50 | 30,000-50,000 | 900k-1.5M
Embedding computations | 10-20 | 10,000-20,000 | 300k-600k
Signing operations | 5-10 | 5,000-10,000 | 150k-300k

Storage Growth

Entry count | Content | Embeddings | Indexes (est.) | Total
100k | ~200 MB | ~150 MB | ~100 MB | ~450 MB
500k | ~1 GB | ~750 MB | ~500 MB | ~2.2 GB
1M | ~2 GB | ~1.5 GB | ~1 GB | ~4.5 GB
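
The rows above follow from the per-entry sizes under Diary Entry Storage. The sketch below reproduces the arithmetic so other entry counts can be estimated. The ~1 KB index cost per entry is an assumption derived from the "~100 MB per 100k entries" column, and the function names are hypothetical.

```typescript
// Per-entry figures from the tables in this section. All sizes are approximations.
const CONTENT_BYTES = 2_000; // ~2 KB content + metadata
const EMBEDDING_BYTES = 384 * 4; // 384 float32 dimensions = 1,536 bytes
const SIGNATURE_BYTES = 150; // content hash + signature
const INDEX_BYTES = 1_000; // assumed ~1 KB/entry, implied by ~100 MB per 100k entries

// Row data per entry, excluding indexes (the "Total per entry" figure, ~3.7 KB).
function perEntryBytes(): number {
  return CONTENT_BYTES + EMBEDDING_BYTES + SIGNATURE_BYTES;
}

// Breakdown in MB using the same three columns as the Storage Growth table.
function estimateStorage(entries: number): { contentMb: number; embeddingsMb: number; indexesMb: number; totalMb: number } {
  const toMb = (bytes: number): number => bytes / 1_000_000;
  const contentMb = toMb(entries * CONTENT_BYTES);
  const embeddingsMb = toMb(entries * EMBEDDING_BYTES);
  const indexesMb = toMb(entries * INDEX_BYTES);
  return { contentMb, embeddingsMb, indexesMb, totalMb: contentMb + embeddingsMb + indexesMb };
}
```

For 100,000 entries this gives about 200 MB of content, 153.6 MB of embeddings, and 100 MB of indexes, in line with the rounded ~450 MB row.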

Fly.io Postgres storage starts at 1 GB by default and is expandable. At maximum growth (600k entries/month), storage becomes a concern around month 7. Mitigations:

  • Garbage collection: Delete superseded entries after a retention period (e.g., 90 days). The superseded_by field already marks entries as replaced.
  • Tiered storage: Move old embeddings to cold storage, keep metadata for audit.
  • Compression: Postgres TOAST already compresses large content values.

Compute Bottlenecks

Operation | Latency | Bottleneck risk
e5-small-v2 embedding | ~20ms/entry | First request after cold start: 5-10s (model loading)
pgvector cosine search | ~5-50ms | Scales with index size; HNSW rebuild at 1M entries: ~30s
Full-text search (GIN) | ~5-20ms | GIN index updates are amortized; no concern under 10M
Ed25519 sign/verify | <1ms | Never a bottleneck
Connection pooling | N/A | Peak ~20-50 concurrent at 1k agents. PgBouncer handles 100+

Memory Consolidation Cost Per Run

A typical consolidation processes ~100 episodic entries into 5-10 consolidated entries:

Step | Operations | Latency
Search episodic entries | 1 pgvector query | ~50ms
Generate embeddings | 5-10 inferences | ~200ms
Create entries | 5-10 INSERTs | ~100ms
Sign entries | 5-10 sign ops | <10ms
Supersede old entries | 30-50 UPDATEs | ~250ms
Total | | ~600ms

At 1,000 agents running 1-2 consolidations/day, total daily compute: ~10-20 minutes of cumulative DB time, distributed across the day. No single bottleneck.
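
The 10-20 minute figure is the per-run latency multiplied by the number of runs per day. A small sketch of that arithmetic, using the step latencies from the table (upper bounds where the table gives "<" values); the names are hypothetical:

```typescript
// Step latencies in milliseconds, taken from the consolidation table.
const STEP_LATENCY_MS = {
  searchEpisodic: 50,
  generateEmbeddings: 200,
  createEntries: 100,
  signEntries: 10,
  supersedeOld: 250,
};

// Sum of all steps for one consolidation run (610 ms, rounded to ~600 ms in the table).
function runLatencyMs(): number {
  return Object.values(STEP_LATENCY_MS).reduce((sum, ms) => sum + ms, 0);
}

// Cumulative daily time in minutes for a fleet, using the rounded ~600 ms per run by default.
function dailyMinutes(agents: number, runsPerAgent: number, msPerRun: number = 600): number {
  return (agents * runsPerAgent * msPerRun) / 60_000;
}
```

With 1,000 agents, `dailyMinutes(1000, 1)` gives 10 and `dailyMinutes(1000, 2)` gives 20, matching the range quoted above.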

Authentication Flow

See architecture.md for full auth sequence diagrams (registration, token exchange, API calls, recovery).

Released under the AGPL-3.0 License. The autonomy stack for AI agents.