# Deploy SkillFlaw to Kubernetes for production
This guide covers the production-oriented Kubernetes shape for SkillFlaw: a stable backend API deployment with externalized state, explicit secrets handling, and optional UI/docs exposure.
Unlike older documentation may suggest, SkillFlaw does not ship a runtime Helm chart. The supported production baseline is built from the same images and environment variables used by the reference compose stack.
## Production goals
In production, the backend is the critical path. It is responsible for:
- flow execution through `/api/v1/run/{flow_id_or_alias}`
- OpenAI-compatible access through `/api/v1/responses`
- MCP serving through `/api/v1/mcp/streamable`
- management APIs used by the frontend when the UI is enabled
You can deploy only the backend if your consumers are other services or SDKs. Add the frontend and docs only when the production environment truly needs browser access.
## Prerequisites
- a Kubernetes cluster and `kubectl`
- access to `ghcr.io/cwinux/*` images
- a production PostgreSQL deployment
- a Redis deployment when using Redis-backed caching
- persistent storage for `SKILLFLAW_CONFIG_DIR`
- a secure way to mount the secret key file used by `SKILLFLAW_SECRET_KEY_FILE`
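Every manifest in this guide places resources in a dedicated `skillflaw` namespace, so create it before applying anything else:

```yaml
# Dedicated namespace referenced by all SkillFlaw manifests below
apiVersion: v1
kind: Namespace
metadata:
  name: skillflaw
```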
## Recommended production topology
At minimum, production should include:
- backend deployment
- PostgreSQL
- Redis
- ingress or gateway
- persistent volume claim for backend-managed files
Optional services:
- frontend deployment for browser UI
- docs deployment when docs need a dedicated production hostname
## 1. Externalize state before scaling
Do not scale the backend first and figure out state later.
Before you run multiple replicas, make sure all three areas are explicit:
- PostgreSQL connection
- Redis connection
- writable persistent storage for `SKILLFLAW_CONFIG_DIR`
Use the same environment contract already present in `docker/docker-compose.yml`:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: skillflaw-backend-config
  namespace: skillflaw
data:
  SKILLFLAW_CONFIG_DIR: /var/lib/skillflaw
  SKILLFLAW_DATABASE_URL: postgresql://skillflaw:skillflaw@postgresql:5432/skillflaw
  SKILLFLAW_CONFIG_MODEL: local
  SKILLFLAW_CACHE_TYPE: redis
  SKILLFLAW_REDIS_HOST: redis
  SKILLFLAW_REDIS_PORT: "6379"
  SKILLFLAW_HOST: 0.0.0.0
  SKILLFLAW_PORT: "7860"
  SKILLFLAW_OPEN_BROWSER: "false"
```
Store the secret key as a mounted file, not as a value baked into the image or set inline in the pod spec.
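A plain Kubernetes Secret is the simplest way to provide that file. The secret name and key below match the volume definition used by the backend Deployment in the next step; the `stringData` value is a placeholder you must replace with a generated key (or let an external secret manager populate this object):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: skillflaw-runtime-secrets
  namespace: skillflaw
type: Opaque
stringData:
  # placeholder only; generate a strong random value for real deployments
  skillflaw_secret_key: change-me
```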
## 2. Deploy the backend with probes and storage
The backend image starts `uvicorn --factory skillflaw.main:create_app` and listens on port 7860.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skillflaw-backend
  namespace: skillflaw
spec:
  replicas: 2
  selector:
    matchLabels:
      app: skillflaw-backend
  template:
    metadata:
      labels:
        app: skillflaw-backend
    spec:
      containers:
        - name: backend
          image: ghcr.io/cwinux/skillflaw_backend:latest
          ports:
            - containerPort: 7860
          envFrom:
            - configMapRef:
                name: skillflaw-backend-config
          env:
            - name: SKILLFLAW_SECRET_KEY_FILE
              value: /run/secrets/skillflaw_secret_key
          volumeMounts:
            - name: backend-data
              mountPath: /var/lib/skillflaw
            - name: secret-key
              mountPath: /run/secrets/skillflaw_secret_key
              subPath: skillflaw_secret_key
              readOnly: true
          readinessProbe:
            httpGet:
              path: /health
              port: 7860
          livenessProbe:
            httpGet:
              path: /health
              port: 7860
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: "2"
              memory: 4Gi
      volumes:
        - name: backend-data
          persistentVolumeClaim:
            claimName: skillflaw-backend-data
        - name: secret-key
          secret:
            secretName: skillflaw-runtime-secrets
---
apiVersion: v1
kind: Service
metadata:
  name: skillflaw-backend
  namespace: skillflaw
spec:
  selector:
    app: skillflaw-backend
  ports:
    - name: http
      port: 7860
      targetPort: 7860
```
Adjust resource values only after load testing. Start conservative, measure, then scale.
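The Deployment references a claim named `skillflaw-backend-data` for `SKILLFLAW_CONFIG_DIR`. A minimal claim might look like the following; the access mode and size are assumptions you should adapt to your storage class:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: skillflaw-backend-data
  namespace: skillflaw
spec:
  accessModes:
    - ReadWriteMany   # assumption: needed once multiple replicas share the same volume
  resources:
    requests:
      storage: 10Gi   # assumption: size for your flows and backend-managed files
```

With `replicas: 2`, both pods mount this volume, so the backing storage class must support multi-node access; on single-writer storage, either use `ReadWriteOnce` with one replica or move to a shared filesystem.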
## 3. Expose the API safely
At ingress or gateway level, decide which public surfaces are actually required:
- `/api/` if the environment serves application traffic
- `/` only if you are also exposing the browser UI
- a dedicated docs hostname only if documentation must be public
A simple production ingress shape is:
- `api.example.com` → `skillflaw-backend`
- `app.example.com` → `skillflaw-frontend` (optional)
- `docs.example.com` → `skillflaw-docs` (optional)
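The API hostname above can be sketched as a standard Ingress; the class name is an assumption, and TLS configuration is omitted but should be added before real traffic:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: skillflaw-api
  namespace: skillflaw
spec:
  ingressClassName: nginx   # assumption: substitute your controller's class
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /api/     # expose only the API surface, not the UI root
            pathType: Prefix
            backend:
              service:
                name: skillflaw-backend
                port:
                  number: 7860
```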
## 4. Add the frontend only when needed
If production users need the web UI, deploy the frontend image separately:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skillflaw-frontend
  namespace: skillflaw
spec:
  replicas: 2
  selector:
    matchLabels:
      app: skillflaw-frontend
  template:
    metadata:
      labels:
        app: skillflaw-frontend
    spec:
      containers:
        - name: frontend
          image: ghcr.io/cwinux/skillflaw_frontend:latest
          env:
            - name: BACKEND_URL
              value: https://api.example.com/
          ports:
            - containerPort: 80
```
The frontend image is stateless. Scale it independently from the backend.
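Because the frontend is stateless, independent scaling can be automated. As one sketch, a HorizontalPodAutoscaler could drive replica count from CPU utilization; the bounds and threshold here are assumptions, and a metrics pipeline (such as metrics-server) must be installed for this to work:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: skillflaw-frontend
  namespace: skillflaw
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: skillflaw-frontend
  minReplicas: 2
  maxReplicas: 6            # assumption: tune to observed traffic
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```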
## 5. Validate the live deployment
Before declaring production ready, validate all of the following:
- `GET /health` returns success through the service and through ingress
- API authentication works with real `x-api-key` headers
- at least one representative flow runs successfully through `/api/v1/run/{flow_id}`
- if your clients use the OpenAI-compatible path, verify `/api/v1/responses`
- if your clients use MCP tooling, verify `/api/v1/mcp/streamable`
- if the UI is enabled, confirm the frontend can authenticate and execute flows against the production backend
## Security and rollout notes
- rotate the mounted secret-key file through Kubernetes secrets or your external secret manager
- keep PostgreSQL credentials and external API keys in secrets, not in image layers
- treat docs exposure as deliberate scope, not as a default requirement
- prefer rolling updates and keep old backend replicas available until new ones pass readiness checks
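The last rollout note can be made explicit in the backend Deployment's update strategy. This fragment is one way to guarantee that old replicas keep serving until new ones pass the `/health` readiness probe:

```yaml
# Fragment to merge into the backend Deployment's spec
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take an old replica down before a replacement is Ready
      maxSurge: 1         # roll one extra pod at a time
```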