
Deploy SkillFlaw to Kubernetes for production

This guide covers the production-oriented Kubernetes shape for SkillFlaw: a stable backend API deployment with externalized state, explicit secrets handling, and optional UI/docs exposure.

Despite what older documentation may suggest, SkillFlaw does not ship a runtime Helm chart. The supported production baseline is built from the same images and environment variables used by the reference compose stack.

Production goals

In production, the backend is the critical path. It is responsible for:

  • flow execution through /api/v1/run/{flow_id_or_alias}
  • OpenAI-compatible access through /api/v1/responses
  • MCP serving through /api/v1/mcp/streamable
  • management APIs used by the frontend when the UI is enabled

You can deploy only the backend if your consumers are other services or SDKs. Add the frontend and docs only when the production environment truly needs browser access.

Prerequisites

  • a Kubernetes cluster
  • kubectl
  • access to ghcr.io/cwinux/* images
  • a production PostgreSQL deployment
  • a Redis deployment when using Redis-backed caching
  • persistent storage for SKILLFLAW_CONFIG_DIR
  • a secure way to mount the secret key file used by SKILLFLAW_SECRET_KEY_FILE

At minimum, production should include:

  • backend deployment
  • PostgreSQL
  • Redis
  • ingress or gateway
  • persistent volume claim for backend-managed files

Optional services:

  • frontend deployment for browser UI
  • docs deployment when docs need a dedicated production hostname
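All manifests in this guide assume a dedicated skillflaw namespace. A minimal sketch to create it:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: skillflaw
```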

1. Externalize state before scaling

Do not scale the backend first and figure out state later.

Before you run multiple replicas, make sure all three areas are explicit:

  • PostgreSQL connection
  • Redis connection
  • writable persistent storage for SKILLFLAW_CONFIG_DIR

Use the same environment contract already present in docker/docker-compose.yml:


```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: skillflaw-backend-config
  namespace: skillflaw
data:
  SKILLFLAW_CONFIG_DIR: /var/lib/skillflaw
  SKILLFLAW_DATABASE_URL: postgresql://skillflaw:skillflaw@postgresql:5432/skillflaw
  SKILLFLAW_CONFIG_MODEL: local
  SKILLFLAW_CACHE_TYPE: redis
  SKILLFLAW_REDIS_HOST: redis
  SKILLFLAW_REDIS_PORT: "6379"
  SKILLFLAW_HOST: 0.0.0.0
  SKILLFLAW_PORT: "7860"
  SKILLFLAW_OPEN_BROWSER: "false"
```

Store the secret key as a mounted file, not as a value baked into the image at build time.
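One way to provide that file is a Kubernetes Secret whose key matches the subPath mounted by the backend deployment. This is a sketch; the placeholder value is an assumption to replace via your own secret management:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: skillflaw-runtime-secrets
  namespace: skillflaw
type: Opaque
stringData:
  # Replace with a securely generated random value; never commit it to git.
  skillflaw_secret_key: change-me
```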

2. Deploy the backend with probes and storage

The backend image starts uvicorn --factory skillflaw.main:create_app and listens on port 7860.


```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skillflaw-backend
  namespace: skillflaw
spec:
  replicas: 2
  selector:
    matchLabels:
      app: skillflaw-backend
  template:
    metadata:
      labels:
        app: skillflaw-backend
    spec:
      containers:
        - name: backend
          image: ghcr.io/cwinux/skillflaw_backend:latest
          ports:
            - containerPort: 7860
          envFrom:
            - configMapRef:
                name: skillflaw-backend-config
          env:
            - name: SKILLFLAW_SECRET_KEY_FILE
              value: /run/secrets/skillflaw_secret_key
          volumeMounts:
            - name: backend-data
              mountPath: /var/lib/skillflaw
            - name: secret-key
              mountPath: /run/secrets/skillflaw_secret_key
              subPath: skillflaw_secret_key
              readOnly: true
          readinessProbe:
            httpGet:
              path: /health
              port: 7860
          livenessProbe:
            httpGet:
              path: /health
              port: 7860
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: "2"
              memory: 4Gi
      volumes:
        - name: backend-data
          persistentVolumeClaim:
            claimName: skillflaw-backend-data
        - name: secret-key
          secret:
            secretName: skillflaw-runtime-secrets
---
apiVersion: v1
kind: Service
metadata:
  name: skillflaw-backend
  namespace: skillflaw
spec:
  selector:
    app: skillflaw-backend
  ports:
    - name: http
      port: 7860
      targetPort: 7860
```

Adjust resource values only after load testing. Start conservative, measure, then scale.
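The backend deployment references a skillflaw-backend-data claim that you must create yourself. A sketch follows, with the caveat that sharing one volume across multiple replicas requires a storage class supporting ReadWriteMany; the size and access mode are assumptions to adjust for your cluster:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: skillflaw-backend-data
  namespace: skillflaw
spec:
  accessModes:
    - ReadWriteMany  # required if more than one backend replica mounts the volume
  resources:
    requests:
      storage: 10Gi
```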

3. Expose the API safely

At ingress or gateway level, decide which public surfaces are actually required:

  • /api/ if the environment serves application traffic
  • / only if you are also exposing the browser UI
  • a dedicated docs hostname only if documentation must be public

A simple production ingress shape is:

  • api.example.com → skillflaw-backend
  • app.example.com → skillflaw-frontend (optional)
  • docs.example.com → skillflaw-docs (optional)
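As an illustration, the api.example.com rule could look like the sketch below; the ingressClassName is an assumption to adapt to your controller, and the hostname to your DNS:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: skillflaw-api
  namespace: skillflaw
spec:
  ingressClassName: nginx  # assumption: adjust to your ingress controller
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: skillflaw-backend
                port:
                  number: 7860
```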

4. Add the frontend only when needed

If production users need the web UI, deploy the frontend image separately:


```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skillflaw-frontend
  namespace: skillflaw
spec:
  replicas: 2
  selector:
    matchLabels:
      app: skillflaw-frontend
  template:
    metadata:
      labels:
        app: skillflaw-frontend
    spec:
      containers:
        - name: frontend
          image: ghcr.io/cwinux/skillflaw_frontend:latest
          env:
            - name: BACKEND_URL
              value: https://api.example.com/
          ports:
            - containerPort: 80
```

The frontend image is stateless. Scale it independently from the backend.
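The frontend deployment has no Service of its own, and ingress traffic for app.example.com needs one. A minimal sketch:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: skillflaw-frontend
  namespace: skillflaw
spec:
  selector:
    app: skillflaw-frontend
  ports:
    - name: http
      port: 80
      targetPort: 80
```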

5. Validate the live deployment

Before declaring the deployment production-ready, validate all of the following:

  1. GET /health returns success through the service and through ingress
  2. API authentication works with real x-api-key headers
  3. at least one representative flow runs successfully through /api/v1/run/{flow_id}
  4. if your clients use the OpenAI-compatible path, verify /api/v1/responses
  5. if your clients use MCP tooling, verify /api/v1/mcp/streamable
  6. if the UI is enabled, confirm the frontend can authenticate and execute flows against the production backend

Security and rollout notes

  • rotate the mounted secret-key file through Kubernetes secrets or your external secret manager
  • keep PostgreSQL credentials and external API keys in secrets, not in image layers
  • treat docs exposure as deliberate scope, not as a default requirement
  • prefer rolling updates and keep old backend replicas available until new ones pass readiness checks
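For the last point, a rolling-update strategy with zero unavailable replicas keeps old pods serving until new ones pass readiness. A sketch of the fragment to add under the backend deployment's spec (the maxSurge value is a tunable assumption):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
```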

See also