
Hetzner Cloud

Set up Hetzner Cloud infrastructure for Ironflow, then deploy using deployment templates. This guide walks through every step from a blank Hetzner account to a running Ironflow instance.

For deploying Ironflow on an existing Kubernetes cluster (any provider), see Kubernetes Deployment.

Prerequisites

Install these tools before starting; all of them are used later in this guide:

  • ironflow CLI: provisioning (ironflow provision) and deployment (ironflow deploy)
  • hcloud: the Hetzner Cloud CLI, used to store the API token and for troubleshooting
  • kubectl: cluster access, secrets, and verification
  • Terraform: only if you run the Terraform configuration directly instead of ironflow provision
  • Helm: only for manual helm install deployments (e.g., multi-tenant)

Environment Variables

Set these environment variables before starting. The required variables are set in Step 1 and Step 2, but this table serves as a complete reference.

Required

| Variable | Purpose | Where to get it | Used in |
| --- | --- | --- | --- |
| HCLOUD_TOKEN | Hetzner Cloud API authentication | Hetzner Console → Security → API Tokens | Step 1, Step 4 (provisioning) |
| KUBECONFIG | Path to the cluster’s kubeconfig file | Generated by ironflow provision create | Steps 4-7 (all kubectl commands) |
| GITHUB_USERNAME | GitHub username for container registry | Your GitHub account | Step 5 (image pull secret) |
| GITHUB_PAT | GitHub Personal Access Token (read:packages scope) | GitHub Token Settings | Step 5 (image pull secret) |
| HETZNER_S3_ACCESS_KEY | Object storage authentication (access key) | Hetzner Console → Object Storage → Manage credentials | Step 2, Step 5 (S3 backup secret) |
| HETZNER_S3_SECRET_KEY | Object storage authentication (secret key) | Hetzner Console → Object Storage → Manage credentials | Step 2, Step 5 (S3 backup secret) |
| HETZNER_S3_ENDPOINT | Object storage endpoint URL | Hetzner Console → Object Storage → bucket details | Step 5 (S3 backup secret), Step 6 (deploy) |
| HETZNER_S3_BUCKET | Object storage bucket name | Hetzner Console → Object Storage | Step 6 (deploy, backup destination path) |

Optional (Terraform overrides)

These override values in terraform.tfvars. You generally don’t need them since the per-template tfvars files are provided, but they’re available for CI/CD or scripted provisioning:

| Variable | Purpose | Default |
| --- | --- | --- |
| TF_VAR_cluster_name | Kubernetes cluster name | "ironflow" |
| TF_VAR_location | Hetzner datacenter (fsn1, nbg1, hel1) | "fsn1" |
| TF_VAR_control_plane_type | Control plane server type | "cpx22" |
| TF_VAR_control_plane_count | Control plane node count (must be odd) | 3 |
| TF_VAR_worker_type | Worker node server type | "cpx32" |
| TF_VAR_worker_count | Worker node count | 2 |

HCLOUD_TOKEN is automatically passed to Terraform as TF_VAR_hcloud_token by the ironflow provision command. You don’t need to set both.
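
For example, a scripted provision that overrides the datacenter and worker count without editing any tfvars file (variable names from the table above; Terraform picks up TF_VAR_* values from the environment):

Terminal window
export TF_VAR_location=nbg1
export TF_VAR_worker_count=3
ironflow provision create --provider hetzner --template medium --name ironflow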

Step 1: Set Up Your Hetzner Project

Create a project in the Hetzner Cloud Console if you don’t have one, then generate an API token.

  1. Go to your project → Security → API Tokens
  2. Click Generate API Token with Read & Write permissions
  3. Save the token
Terminal window
export HCLOUD_TOKEN=your-token-here
hcloud context create ironflow # saves the token for hcloud CLI

Step 2: Create Backup Storage

Ironflow backs up PostgreSQL to S3-compatible object storage daily. Set this up before provisioning the cluster so everything is ready when you deploy.

Create a bucket

In the Hetzner Cloud Console:

  1. Go to Object Storage in the left sidebar
  2. Click Create Bucket
  3. Name: ironflow-backups
  4. Visibility: Private
  5. Click Create & Buy now

Hetzner Object Storage is not yet supported by the Terraform provider, so bucket creation is a manual step.

Generate S3 credentials

  1. Go to Object Storage → your bucket
  2. Click Manage credentials under S3 Credentials
  3. Click Generate credentials
  4. Note the endpoint URL from your bucket details page
  5. Export the credentials as environment variables:
Terminal window
export HETZNER_S3_ACCESS_KEY=your-access-key
export HETZNER_S3_SECRET_KEY=your-secret-key
export HETZNER_S3_ENDPOINT=https://fsn1.your-objectstorage.com # from bucket details
export HETZNER_S3_BUCKET=ironflow-backups
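
If you have the AWS CLI installed, a quick way to confirm the credentials and endpoint work against the new bucket (an empty listing with no error means they do):

Terminal window
AWS_ACCESS_KEY_ID=$HETZNER_S3_ACCESS_KEY AWS_SECRET_ACCESS_KEY=$HETZNER_S3_SECRET_KEY \
aws s3 ls "s3://$HETZNER_S3_BUCKET" --endpoint-url "$HETZNER_S3_ENDPOINT"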

Step 3: Choose Your Template and Node Sizing

There are two independent choices: your Ironflow deployment template (what Ironflow runs) and your Kubernetes cluster size (what hardware it runs on).

Deployment templates

Templates control the Ironflow application: replica count, PostgreSQL HA, NATS topology, and connection pooling. You select a template when you run ironflow deploy --template <name>.

| Template | Ironflow replicas | PostgreSQL | NATS | PgBouncer | Use case |
| --- | --- | --- | --- | --- | --- |
| Small | 1 | Bundled, 1 instance | Bundled, 1 node | No | Dev, staging, small teams |
| Medium | 3 (HA) | Bundled, 2 instances (HA) | Bundled, 3-node cluster | Yes (2 pods) | Production |
| Large | 2-10 (HPA) | External | External | No (BYO) | Enterprise with managed deps |

Kubernetes cluster sizing

Cluster sizing controls the Hetzner servers: how many nodes and how powerful. You configure this in deploy/terraform/hetzner/terraform.tfvars before provisioning. The cluster is provisioned once and you can deploy any template onto it (as long as the hardware has enough resources).

Minimum recommended cluster per template:

| Template | Min worker RAM | Min worker CPU | Recommended cluster | Est. server cost |
| --- | --- | --- | --- | --- |
| Small | 2 GB | 2 vCPU | 1 control + 1 worker (cpx22 + cpx32) | ~€15/month |
| Medium | 4 GB | 3 vCPU | 3 control + 2 workers (cpx22) | ~€38/month |
| Large | 8 GB+ | 4+ vCPU | 3 control + 2 workers (cpx32) | ~€52/month |

Server costs only. Additional costs apply for load balancer (~€6/month), volumes, Object Storage, and network traffic.

You can deploy the Small template on a Large cluster (safe, just overprovisioned), and you can move from Small to Medium on the same cluster without reprovisioning the Hetzner servers, as long as the hardware has enough resources. Note, however, that moving from Small to Medium means deleting and redeploying the Ironflow release rather than upgrading it in place, because the NATS topology change (1 node to a 3-node cluster) can’t be upgraded in place.

Configure node sizes

Pre-built Terraform variable files are provided for each template:

deploy/terraform/hetzner/
├── terraform.small.tfvars # 1 control + 1 worker
├── terraform.medium.tfvars # 3 control + 2 workers
├── terraform.large.tfvars # 3 control + 2 workers
└── terraform.tfvars.example # Reference with all options

The ironflow provision create command uses these files automatically via the --template flag. If using Terraform directly, copy the one that matches your template:

Terminal window
cd deploy/terraform/hetzner
cp terraform.small.tfvars terraform.tfvars
# Edit terraform.tfvars to customize cluster_name, location, etc.

Available locations: fsn1 (Falkenstein), nbg1 (Nuremberg), hel1 (Helsinki). Control plane count must be odd (1, 3, or 5) for etcd quorum. For higher throughput, edit the worker type or count in your terraform.tfvars.
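
If you would rather write the file from scratch than copy a preset, here is a sketch of a Medium-sized terraform.tfvars. The variable names mirror the TF_VAR_* overrides listed earlier; check terraform.tfvars.example for the authoritative list:

Terminal window
cd deploy/terraform/hetzner
cat > terraform.tfvars <<'EOF'
cluster_name        = "ironflow"
location            = "fsn1"
control_plane_type  = "cpx22"
control_plane_count = 3
worker_type         = "cpx32"
worker_count        = 2
EOF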

Step 4: Provision the Kubernetes Cluster

Terminal window
ironflow provision create --provider hetzner --template small --name ironflow

This runs Terraform to create the cluster (~5-8 minutes), then writes kubeconfig and talosconfig to deploy/terraform/hetzner/. A durable copy is saved to ~/.kube/clusters/hetzner-<name>.yaml.

Terminal window
export KUBECONFIG=~/.kube/clusters/hetzner-ironflow.yaml
kubectl get nodes
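
The provision step also writes a talosconfig alongside the kubeconfig. If you have talosctl installed, an OS-level health check looks roughly like this (substitute a control plane IP from the provision output):

Terminal window
talosctl --talosconfig deploy/terraform/hetzner/talosconfig health --nodes <control-plane-ip>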

Provision --name vs Deploy --name

The --name you use with ironflow provision is the cluster name (Hetzner servers, networks, firewalls are named after it). The --name you use with ironflow deploy is the Helm release name (an application install within a cluster). They serve different purposes and don’t need to match.

Deploy commands default to your current kubectl context. If you manage multiple clusters, always pass --kubeconfig to deploy commands to ensure you target the correct cluster:

Terminal window
ironflow deploy --template small --name dev \
--kubeconfig ~/.kube/clusters/hetzner-ironflow.yaml

Step 5: Create Kubernetes Secrets

With the cluster running, create the namespace and secrets that Ironflow needs.

Terminal window
export KUBECONFIG=~/.kube/clusters/hetzner-ironflow.yaml
# Create the namespace
kubectl create namespace ironflow

Image pull secret

Required if the Ironflow container image is in a private registry (e.g., private GHCR):

Terminal window
kubectl create secret docker-registry ghcr-pull-secret \
--namespace ironflow \
--docker-server=ghcr.io \
--docker-username=$GITHUB_USERNAME \
--docker-password=$GITHUB_PAT

GITHUB_PAT must be a GitHub Personal Access Token with the read:packages scope.

S3 backup credentials

Uses the environment variables from Step 2:

Terminal window
kubectl create secret generic ironflow-s3-creds -n ironflow \
--from-literal=ACCESS_KEY_ID="$HETZNER_S3_ACCESS_KEY" \
--from-literal=SECRET_ACCESS_KEY="$HETZNER_S3_SECRET_KEY"

The default Small and Medium values files reference this secret name (ironflow-s3-creds) and are pre-configured for Hetzner Object Storage. The S3 destination path is auto-derived from the Helm release name (s3://ironflow-backups/<release-name>), so each deployment gets an isolated backup path. The S3 endpoint URL is passed during deploy via --set (see Step 6).
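
A quick sanity check that both secrets landed in the ironflow namespace before deploying:

Terminal window
kubectl get secret ghcr-pull-secret ironflow-s3-creds -n ironflow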

Step 6: Deploy Ironflow

Terminal window
ironflow deploy --template small --name dev

The ironflow deploy command automatically:

  • Reads HETZNER_S3_ENDPOINT and HETZNER_S3_BUCKET from environment variables and configures the S3 backup destination

  • Installs these prerequisites on first deploy:

    • CloudNativePG operator — manages PostgreSQL clusters (Small and Medium only)
    • Barman Cloud Plugin — S3-compatible backups (Small and Medium only)
    • cert-manager — TLS certificate management (all templates)
    • kube-prometheus-stack — Prometheus, Grafana, and alerting (all templates)

If these are already installed, the command detects them and skips installation.
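
You can also confirm the shared prerequisites yourself. The CRD names below are the upstream defaults for CloudNativePG, cert-manager, and the Prometheus Operator; adjust them if your versions differ:

Terminal window
kubectl get crd clusters.postgresql.cnpg.io certificates.cert-manager.io servicemonitors.monitoring.coreos.com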

Or for Medium/Large:

Terminal window
# Medium — 3 replicas, NATS cluster, HA PostgreSQL
ironflow deploy --template medium --name staging
# Medium with Hetzner load balancer — adds Traefik ingress + LB optimizations
ironflow deploy --template medium --name prod --hetzner-location fsn1
# Large — HPA, external PostgreSQL + NATS
ironflow deploy --template large --name prod \
--set externalDatabase.url=postgres://user:pass@host:5432/ironflow \
--set externalNats.url=nats://nats-1:4222,nats://nats-2:4222

The --hetzner-location flag installs Traefik as the ingress controller with Hetzner-optimized load balancer settings (proxy protocol, private network routing, health checks). Match the location to your cluster’s datacenter (fsn1, nbg1, or hel1). See Step 8 for enabling Ingress after deploy.
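
To confirm Traefik and the load balancer came up after a --hetzner-location deploy (the traefik namespace and service name are the same ones used in Step 8):

Terminal window
kubectl get pods -n traefik
kubectl get svc -n traefik traefik # EXTERNAL-IP is the Hetzner Load Balancer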

If you are deploying with helm install directly (instead of the ironflow deploy CLI), you must install the prerequisites manually. See Kubernetes Deployment for manual installation commands.

For detailed deploy options (CLI vs Helm, customization, upgrades), see Kubernetes Deployment.

Step 7: Verify

Terminal window
# Check Ironflow pods
ironflow deploy status --name dev
# Check all pods are running
kubectl get pods -n ironflow
# Check PostgreSQL cluster health
kubectl get cluster -n ironflow
# Check backups are scheduled
kubectl get scheduledbackups -n ironflow
# Verify health endpoints
kubectl port-forward svc/dev-ironflow -n ironflow 9123:9123 &
curl -s http://localhost:9123/health
curl -s http://localhost:9123/ready
# Open the dashboard at http://localhost:9123

Retrieve the admin API key from the first-boot logs:

Terminal window
kubectl logs -n ironflow $(kubectl get pods -n ironflow \
-l app.kubernetes.io/component=server -o name | head -1) | grep -A8 "Admin API Key"

Verify monitoring

Terminal window
# Check CNPG PodMonitor
kubectl get podmonitors -n ironflow
# Check Ironflow ServiceMonitor
kubectl get servicemonitors -n ironflow
# Check PostgreSQL alert rules
kubectl get prometheusrules -n ironflow
# Verify Ironflow exposes metrics
curl -s http://localhost:9123/metrics | head -5

Step 8: External Access via Load Balancer

By default, Ironflow is only accessible inside the cluster (ClusterIP). If you need external access for push-mode webhooks, the dashboard, or API clients, set up a load balancer. Skip this step if port-forward is sufficient (dev/staging) or if your cluster is only accessed via VPN.

When do you need a load balancer?

  • Yes: Push-mode functions (external services POST to Ironflow), dashboard access for teams outside the cluster, HA failover across nodes.
  • No: Dev/staging accessed via kubectl port-forward, pull-mode only (workers connect outbound), single-team with VPN.

Option A: Traefik Ingress

If you deployed with --hetzner-location in Step 6, Traefik and a Hetzner Load Balancer are already installed. If not, re-run deploy with the flag:

Terminal window
ironflow deploy upgrade --template medium --name prod --hetzner-location fsn1

Enable Ingress

Once the load balancer has an external IP (shown during deploy), enable Ingress with your domain:

Terminal window
ironflow deploy upgrade --template medium --name prod \
--set ingress.enabled=true \
--set ingress.host=ironflow.example.com

Point your DNS A record to the load balancer IP (shown during deploy).
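
Before expecting a certificate, you can confirm the record has propagated (replace the hostname with your ingress.host):

Terminal window
dig +short ironflow.example.com # should print the load balancer IP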

TLS certificates are automatically issued by cert-manager via Let’s Encrypt.

Verify

Terminal window
# Check load balancer IP
kubectl get svc -n traefik traefik
# Check Ingress
kubectl get ingress -n ironflow
# Test access
curl -k https://ironflow.example.com/health

Option B: Direct LoadBalancer Service (simple alternative)

For simple deployments without Ingress routing, you can expose the Ironflow service directly:

Terminal window
ironflow deploy upgrade --template medium --name prod \
--set service.type=LoadBalancer \
--set service.annotations."load-balancer\.hetzner\.cloud/location"=fsn1

This creates a dedicated Hetzner LB for the Ironflow service. No hostname routing, no TLS termination at the LB level.

Load Balancer Costs

| Resource | Cost |
| --- | --- |
| Hetzner LB11 | ~€6/month |
| Additional bandwidth | Included (30 TB/month) |

Load Balancer Troubleshooting

Load balancer stuck in <pending>:

  • Check Hetzner CCM is running: kubectl get pods -n kube-system -l app.kubernetes.io/name=hcloud-cloud-controller-manager
  • Check HCLOUD_TOKEN is set in the CCM deployment
  • Check Hetzner API status: hcloud load-balancer list

All requests return 400 Bad Request:

  • Proxy protocol mismatch. Both sides must be enabled or both disabled.
  • Check Traefik args: kubectl get deploy -n traefik traefik -o yaml | grep proxyProtocol
  • Check LB annotation: kubectl get svc -n traefik traefik -o yaml | grep proxyprotocol

TLS certificate not issuing:

  • Check cert-manager: kubectl get certificate -n ironflow
  • Check ClusterIssuer: kubectl get clusterissuer
  • DNS must point to the LB IP for ACME HTTP-01 challenge to work
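
If the Certificate stays stuck, cert-manager’s ACME Orders and Challenges usually show why; these are standard cert-manager resources:

Terminal window
kubectl get orders,challenges -n ironflow
kubectl describe challenge -n ironflow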

Multi-Tenant Load Balancing

With Option A (Traefik Ingress), a single Hetzner Load Balancer serves all tenants on the cluster. Traefik reads Ingress resources across all namespaces and routes traffic by hostname.

Internet → Hetzner LB (one, ~€6/mo)
→ Traefik pods (NodePort, private network)
→ Ingress: acme.ironflow.example.com → tenant-acme/acme-ironflow
→ Ingress: globex.ironflow.example.com → tenant-globex/globex-ironflow
→ Ingress: ironflow.example.com → ironflow/prod-ironflow

Install Traefik once per cluster (see Option A above), then deploy each tenant with Ingress enabled:

Terminal window
# Install Traefik with Hetzner LB (once per cluster)
# See Option A in the section above for Traefik installation
# First tenant
helm install acme ./deploy/helm/ironflow \
-n tenant-acme --create-namespace \
-f deploy/helm/ironflow/values-multi-tenant.yaml \
--set ingress.enabled=true \
--set ingress.host=acme.ironflow.example.com \
--set ironflow.masterKey=$(openssl rand -hex 32)
# Additional tenants — reuse the existing LB
helm install globex ./deploy/helm/ironflow \
-n tenant-globex --create-namespace \
-f deploy/helm/ironflow/values-multi-tenant.yaml \
--set ingress.enabled=true \
--set ingress.host=globex.ironflow.example.com \
--set ironflow.masterKey=$(openssl rand -hex 32)

Each tenant gets its own TLS certificate (auto-issued by cert-manager) and is network-isolated via NetworkPolicy (defaultDeny: true in values-multi-tenant.yaml). The Traefik namespace is in allowNamespaces so ingress traffic can reach tenant pods.

Avoid Option B for multi-tenant

With Option B (direct LoadBalancer service), each tenant with service.type=LoadBalancer creates a separate Hetzner LB (~€6/mo each). At 10 tenants that’s ~€60/mo in LBs alone, with no hostname routing or shared TLS. Use Option A for multi-tenant deployments.

DNS Configuration

Point your domain to the load balancer IP so Traefik can route traffic and cert-manager can issue TLS certificates.

Find the load balancer IP

Terminal window
# From kubectl
kubectl get svc -n traefik traefik -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# Or from hcloud CLI
hcloud load-balancer list -o columns=name,ipv4

Option 1: Wildcard DNS (simplest for multi-tenant)

Create a single wildcard A record and all tenant subdomains resolve automatically:

*.ironflow.example.com → A <LB_IP>

New tenants work immediately with --set ingress.host=<tenant>.ironflow.example.com — no DNS changes needed per tenant.

Option 2: Per-tenant DNS records

Create individual A records for each tenant:

acme.ironflow.example.com → A <LB_IP>
globex.ironflow.example.com → A <LB_IP>
ironflow.example.com → A <LB_IP>

This gives you explicit control but requires a DNS change for each new tenant.

Hetzner DNS

If your domain uses Hetzner DNS, create records in the Hetzner DNS Console or via the API:

Terminal window
# Wildcard for all tenants
# Hetzner DNS Console → your zone → Add Record → Type: A, Name: *, Value: <LB_IP>
# Or per-tenant
# Type: A, Name: acme.ironflow, Value: <LB_IP>
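
If you prefer the API over the console, the Hetzner DNS API can create the record. A sketch, assuming a Hetzner DNS API token in HETZNER_DNS_TOKEN (separate from HCLOUD_TOKEN) and that you look up the zone ID first:

Terminal window
# Find the zone ID for your domain
curl -s -H "Auth-API-Token: $HETZNER_DNS_TOKEN" \
"https://dns.hetzner.com/api/v1/zones?name=example.com"
# Create the wildcard A record pointing at the LB IP
curl -s -X POST "https://dns.hetzner.com/api/v1/records" \
-H "Auth-API-Token: $HETZNER_DNS_TOKEN" \
-H "Content-Type: application/json" \
-d '{"zone_id":"<ZONE_ID>","type":"A","name":"*.ironflow","value":"<LB_IP>","ttl":300}'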

External DNS providers

For Cloudflare, Route53, Google Cloud DNS, or other providers, create A records pointing to the LB IP using your provider’s dashboard or CLI.

Cloudflare

If using Cloudflare, disable the proxy (orange cloud → grey cloud) for the initial setup so cert-manager’s HTTP-01 ACME challenge can reach the LB directly. You can re-enable the proxy after certificates are issued if you switch to DNS-01 challenges.

Automatic DNS with external-dns (optional)

external-dns can auto-create DNS records from Ingress resources. When a new tenant is deployed with ingress.host=acme.ironflow.example.com, external-dns automatically creates the A record at your DNS provider.

Terminal window
# Install external-dns (example for Hetzner DNS)
helm repo add external-dns https://kubernetes-sigs.github.io/external-dns
helm install external-dns external-dns/external-dns \
-n external-dns --create-namespace \
--set provider.name=hetzner \
--set env[0].name=HETZNER_DNS_API_TOKEN \
--set env[0].value=$HETZNER_DNS_TOKEN

external-dns supports Hetzner DNS, Cloudflare, Route53, Google Cloud DNS, and many others.

TLS certificates

TLS is handled automatically. The Helm chart sets cert-manager.io/cluster-issuer: letsencrypt-prod on every Ingress resource when tls: true (the default). Once DNS points to the LB IP:

  1. cert-manager detects the new Ingress with TLS enabled
  2. Requests a Let’s Encrypt certificate via HTTP-01 challenge
  3. Stores the certificate as a Secret (<release>-ironflow-tls) in the tenant’s namespace
  4. Traefik serves HTTPS automatically

Each tenant gets its own TLS certificate. Check certificate status with:

Terminal window
kubectl get certificate -n tenant-acme
kubectl describe certificate -n tenant-acme

Cluster Management

Check status

Terminal window
ironflow provision status --provider hetzner --name ironflow

Upgrade Ironflow

Terminal window
ironflow deploy upgrade --template small --name dev

Tear down

Terminal window
ironflow provision destroy --provider hetzner --name ironflow

File Structure

deploy/terraform/hetzner/
├── main.tf # Cluster module + providers
├── variables.tf # Input variables (token, cluster name, node sizes)
├── outputs.tf # kubeconfig path, talosconfig path, cluster info
├── terraform.tfvars.example # Reference with all options
├── terraform.small.tfvars # Small cluster: 1 control + 1 worker
├── terraform.medium.tfvars # Medium cluster: 3 control + 2 workers
├── terraform.large.tfvars # Large cluster: 3 control + 2 workers
├── .terraform.lock.hcl # Provider lock file (committed for reproducibility)
├── teardown.sh # Clean destroy with hcloud CLI fallback
└── .gitignore # Ignores state files, kubeconfig, talosconfig

Troubleshooting

Placement Groups Already Exist

If terraform apply fails with placement_group not unique, leftover resources from a previous run exist:

Terminal window
hcloud placement-group list
hcloud placement-group delete <id>

Terraform State Issues

If Terraform state gets out of sync, run ironflow provision destroy --provider hetzner --name ironflow (or ./teardown.sh directly) to force-clean all resources, then start fresh.

Node Not Ready

Talos Linux nodes take 1-2 minutes after provisioning to register with the Kubernetes API. If kubectl get nodes shows NotReady, wait and retry.
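
To watch the nodes flip to Ready instead of polling manually:

Terminal window
kubectl get nodes -w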

ImagePullBackOff

The container image is in a private registry and the cluster has no credentials to pull it. Create the pull secret as described in Step 5.

Firewall Blocks API Access

If kubectl times out connecting to the cluster, your IP may have changed since provisioning. The Hetzner firewall restricts port 6443 to the IP that ran Terraform. Update it:

Terminal window
# Find your current IP
curl -s ifconfig.me
# Update the firewall rules
hcloud firewall describe ironflow
# Update Source IPs for the "Allow Incoming Requests to Kube API" rule
# with your current IP via the Hetzner Cloud Console or hcloud CLI
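
One way to fix it from the CLI, assuming the firewall is named ironflow and the rule description matches what hcloud firewall describe prints (delete-rule only removes a rule whose fields match exactly, so copy the old source IP and description from that output):

Terminal window
MY_IP=$(curl -s ifconfig.me)
# Remove the stale rule, matching its existing fields
hcloud firewall delete-rule ironflow --direction in --protocol tcp --port 6443 \
--source-ips <old-ip>/32 --description "Allow Incoming Requests to Kube API"
# Re-add it for your current IP
hcloud firewall add-rule ironflow --direction in --protocol tcp --port 6443 \
--source-ips ${MY_IP}/32 --description "Allow Incoming Requests to Kube API"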