Node Autoscaling

Automatic horizontal and vertical scaling of Hetzner worker nodes via the Kubernetes Cluster Autoscaler with the Hetzner Cloud provider.

KSail natively installs and manages the Cluster Autoscaler when spec.cluster.autoscaler.node.enabled: true is set in the ksail config.


Architecture

KSail (static baseline)
├── 3 control planes (cx33, 4 vCPU / 8 GB, never autoscaled)
└── 3 static workers (cx33, 4 vCPU / 8 GB, guaranteed minimum, Longhorn storage nodes)

Cluster Autoscaler (dynamic workers, managed by KSail)
├── Pool: autoscale-cx23 → 0-4 × CX23 (2 vCPU, 4 GB)
├── Pool: autoscale-cx33 → 0-4 × CX33 (4 vCPU, 8 GB)
├── Pool: autoscale-cx43 → 0-4 × CX43 (8 vCPU, 16 GB)
├── Pool: autoscale-cx53 → 0-4 × CX53 (16 vCPU, 32 GB)
├── maxNodesTotal: 10 (total cluster nodes incl. baseline: 6 static + 4 autoscaler)
└── Expander: [LeastNodes, LeastWaste] (priority chain)
  • Horizontal scaling: the autoscaler adds workers when pods are Pending due to insufficient resources, and removes underutilized workers after a configurable cooldown.
  • Vertical scaling: multiple node pools with different server types. The expander is an ordered priority chain, [LeastNodes, LeastWaste] (upstream --expander=least-nodes,least-waste). LeastNodes runs first and keeps the pools that satisfy the pending pods with the fewest total new nodes, which favors the largest adequate type so a burst consolidates onto fewer, bigger servers. LeastWaste then breaks any tie by least idle CPU/memory. Price is not supported on Hetzner: the cluster-autoscaler hcloud provider implements no pricing API and the autoscaler would crash on startup, so KSail rejects it. See the cluster-autoscaler FAQ.
  • KSail integration: KSail installs the Cluster Autoscaler Helm chart, generates the worker config secret (cluster-autoscaler-config), and manages the Talos snapshot lifecycle. Node pool configuration lives in ksail.prod.yaml, not in Flux manifests.
  • Storage architecture: autoscaler nodes are compute-only (no Hetzner volume, no Longhorn storage). Static KSail workers have dedicated Hetzner volumes and serve as Longhorn storage nodes. Pods on autoscaler nodes access Longhorn PVCs over the network via the CSI driver. The Hetzner Cluster Autoscaler does not support volume attachment.
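
The priority chain described above can be sketched roughly as follows. This is a simplified illustration, not the upstream expander implementation: it assumes each node offers its full advertised vCPU/GB (no system reservations), treats the pending pods as one divisible total request, and ignores per-pool max limits. The function names are hypothetical.

```python
import math

# Pool capacities from the architecture above: name -> (vCPU, memory in GB).
POOLS = {
    "autoscale-cx23": (2, 4),
    "autoscale-cx33": (4, 8),
    "autoscale-cx43": (8, 16),
    "autoscale-cx53": (16, 32),
}


def nodes_needed(cpu, mem_gb, capacity):
    """Nodes of one type needed to cover the total request (simplified)."""
    node_cpu, node_mem = capacity
    return max(math.ceil(cpu / node_cpu), math.ceil(mem_gb / node_mem))


def waste(cpu, mem_gb, capacity):
    """Sum of the idle CPU and idle memory fractions after scaling up."""
    node_cpu, node_mem = capacity
    n = nodes_needed(cpu, mem_gb, capacity)
    return (1 - cpu / (n * node_cpu)) + (1 - mem_gb / (n * node_mem))


def pick_pool(cpu, mem_gb):
    """Apply a LeastNodes step first, then LeastWaste among the survivors."""
    counts = {name: nodes_needed(cpu, mem_gb, cap) for name, cap in POOLS.items()}
    fewest = min(counts.values())
    survivors = [name for name, n in counts.items() if n == fewest]
    return min(survivors, key=lambda name: waste(cpu, mem_gb, POOLS[name]))


# Hypothetical burst: pending pods requesting 6 vCPU and 12 GB in total.
print(pick_pool(6, 12))  # autoscale-cx43
```

In this example cx43 and cx53 both cover the burst with a single node, so both survive the LeastNodes step, and the LeastWaste step then picks cx43 because it leaves less capacity idle.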

How new nodes join

  1. Cluster Autoscaler detects Pending pods with unmet resource requests.
  2. It calls the Hetzner API to create a new server using:
    • HCLOUD_IMAGE — a Talos snapshot managed by KSail.
    • HCLOUD_CLOUD_INIT — base64-encoded Talos worker machine config generated by KSail.
  3. The server boots Talos, applies the machine config, and joins the cluster.
  4. Once the node is Ready, pending pods are scheduled.
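
When debugging step 2 it can help to confirm that a machine config survives the base64 round trip. The sketch below assumes standard base64 encoding; the helper names and the tiny sample config are hypothetical stand-ins, and a real Talos worker config generated by KSail is much larger.

```python
import base64


def encode_cloud_init(machine_config: str) -> str:
    """Encode a machine config the way HCLOUD_CLOUD_INIT expects (base64)."""
    return base64.b64encode(machine_config.encode("utf-8")).decode("ascii")


def decode_cloud_init(value: str) -> str:
    """Decode a base64 value back into readable YAML for inspection."""
    return base64.b64decode(value).decode("utf-8")


# Hypothetical, heavily truncated stand-in for a Talos worker machine config.
sample = "version: v1alpha1\nmachine:\n  type: worker\n"

encoded = encode_cloud_init(sample)
# The decoded value must match the original config byte for byte.
assert decode_cloud_init(encoded) == sample
print(encoded)
```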

Configuration

All autoscaler configuration lives in ksail.prod.yaml under spec.cluster.autoscaler.node:

spec:
  cluster:
    autoscaler:
      node:
        enabled: true
        expander: [LeastNodes, LeastWaste]
        maxNodesTotal: 10
        scaleDownUnneededTime: "10m"
        pools:
          - name: autoscale-cx23
            serverType: cx23
            location: fsn1
            min: 0
            max: 4
          - name: autoscale-cx33
            serverType: cx33
            location: fsn1
            min: 0
            max: 4
          - name: autoscale-cx43
            serverType: cx43
            location: fsn1
            min: 0
            max: 4
          - name: autoscale-cx53
            serverType: cx53
            location: fsn1
            min: 0
            max: 4

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable/disable node autoscaling |
| expander | LeastWaste | Node selection strategy: LeastWaste, LeastNodes, Random (Price is unsupported on Hetzner: no pricing API). Accepts a single value or an ordered priority chain, e.g. [LeastNodes, LeastWaste] (requires KSail expander-list support; scalar-only up to v7.57.0) |
| maxNodesTotal | 0 (unlimited) | Hard ceiling on total cluster nodes, including the static baseline (see ksail#5017) |
| scaleDownUnneededTime | 10m | Time before an underutilized node is eligible for removal |
| pools[].name | | DNS-1123 pool identifier |
| pools[].serverType | | Hetzner server type (e.g., cx23, cx33) |
| pools[].location | | Hetzner datacenter (e.g., fsn1) |
| pools[].min | | Minimum nodes in pool |
| pools[].max | | Maximum nodes in pool |
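
A pool entry can be sanity-checked before running ksail cluster update. The sketch below is a hypothetical local pre-check, not KSail's own validation: it assumes the pool name must be a DNS-1123 label (lowercase alphanumerics and hyphens, at most 63 characters) and that min must not exceed max.

```python
import re

# DNS-1123 label: lowercase alphanumerics and '-', alphanumeric at both ends.
DNS_1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def check_pool(pool: dict) -> list:
    """Return a list of human-readable problems found in one pool entry."""
    problems = []
    name = pool.get("name", "")
    if len(name) > 63 or not DNS_1123_LABEL.fullmatch(name):
        problems.append(f"name {name!r} is not a DNS-1123 label")
    low, high = pool.get("min", 0), pool.get("max", 0)
    if low < 0:
        problems.append("min must not be negative")
    if low > high:
        problems.append(f"min ({low}) exceeds max ({high})")
    return problems


# Entry copied from the configuration above: no problems expected.
print(check_pool({"name": "autoscale-cx23", "serverType": "cx23",
                  "location": "fsn1", "min": 0, "max": 4}))  # []
```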

Cost guardrails

  • Hard max per pool: pools[].max caps each pool independently. It is set to 4 so any single CX type can serve a full burst (e.g. 4× cx23) instead of forcing larger types. The maxNodesTotal ceiling still caps the autoscaler at 4 nodes (10 − 6 baseline), so this changes only the type distribution, never the node count.
  • Total node ceiling: maxNodesTotal caps the total cluster node count, including the static baseline. It is set to 10 (6 static + up to 4 autoscaler) and is passed straight to cluster-autoscaler's --max-nodes-total, so the runtime already enforces this total.
  • serverLimit (spec.provider.hetzner.serverLimit): the Hetzner account hard cap (10). Under the in-progress KSail change (ksail#5017) the autoscaler validation becomes maxNodesTotal ≤ serverLimit (10 ≤ 10). Until it ships, the old validation sums CP + workers + min(maxNodesTotal, Σ pool.max), which here comes to 3 + 3 + min(10, 16) = 16, above the cap of 10, so it rejects this config. The KSail change must land first.
  • Expander: [LeastNodes, LeastWaste] (current) is an ordered priority chain. LeastNodes runs first and keeps the pools that scale up with the fewest total new nodes (preferring the largest adequate type to keep the node count down); LeastWaste then breaks any remaining tie by least idle CPU/memory. The list form needs KSail expander-list support: releases up to v7.57.0 are scalar-only and reject a list, so the pinned KSAIL_VERSION must first be bumped to a release that ships it. (Price is unsupported on Hetzner.)
  • Scale-down: underutilized nodes are removed after 10 minutes (scaleDownUnneededTime).
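
The arithmetic behind these guardrails can be worked through in a few lines. The numbers come from the configuration in this document; the function names are hypothetical, and the sketch assumes a non-zero maxNodesTotal (0 means unlimited and is not modeled here).

```python
# Values copied from the configuration in this document.
CONTROL_PLANES = 3
STATIC_WORKERS = 3
MAX_NODES_TOTAL = 10
SERVER_LIMIT = 10
POOL_MAX = [4, 4, 4, 4]  # autoscale-cx23, -cx33, -cx43, -cx53


def autoscaler_headroom(max_nodes_total, static_nodes, pool_max):
    """Autoscaler nodes that can exist at once: the tighter of the two caps."""
    return min(max_nodes_total - static_nodes, sum(pool_max))


def old_validation_total(control_planes, workers, max_nodes_total, pool_max):
    """Node count the old validation compares against serverLimit."""
    return control_planes + workers + min(max_nodes_total, sum(pool_max))


static_nodes = CONTROL_PLANES + STATIC_WORKERS
old_total = old_validation_total(
    CONTROL_PLANES, STATIC_WORKERS, MAX_NODES_TOTAL, POOL_MAX
)

print(autoscaler_headroom(MAX_NODES_TOTAL, static_nodes, POOL_MAX))  # 4
print(old_total, old_total <= SERVER_LIMIT)  # 16 False (old check rejects)
print(MAX_NODES_TOTAL <= SERVER_LIMIT)  # True (new check passes)
```

Raising pools[].max changes only the second argument of the min() in the headroom calculation, which is why it affects the type distribution but not the node count while maxNodesTotal stays at 10.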

Adding more pools

Add a new entry to the pools list in ksail.prod.yaml and run ksail cluster update. KSail updates the Helm release automatically.


Troubleshooting

Autoscaler not scaling up

# Check autoscaler logs
kubectl -n kube-system logs -l app.kubernetes.io/name=cluster-autoscaler --tail=100

# Check for unschedulable pods
kubectl get pods -A --field-selector=status.phase=Pending

# Check autoscaler status ConfigMap
kubectl -n kube-system get cm cluster-autoscaler-status -o yaml

Autoscaler nodes not joining

# Check if the Hetzner server was created
hcloud server list --selector cluster.autoscaler.nodeGroupLabel

# Check Talos bootstrap status (if server IP is known)
talosctl -n <node-ip> health

# Verify the machine config is valid
talosctl validate --config worker.yaml --mode cloud

Cluster rebuild

After a full cluster rebuild (ksail cluster delete + create):

  1. KSail regenerates the worker config secret automatically.
  2. KSail manages the Talos snapshot lifecycle — no manual snapshot creation is needed.

Maintenance

Talos version upgrades

When the Talos (or Kubernetes) version is bumped in ksail.prod.yaml, the ksail cluster update run by the deploy brings both node classes, static and autoscaler, onto the new baseline:

  1. KSail's snapshot lifecycle manager creates or updates the Talos snapshot automatically during cluster update, and the worker machine config is regenerated to match, so new autoscaler nodes boot the new version.
  2. The static control planes and workers are upgraded in place (rolling).
  3. Existing autoscaler nodes are recycled automatically so they follow the new baseline instead of drifting on the old version. After the refreshed cluster-autoscaler is ready, KSail cordons and drains each autoscaler node one at a time (via the Kubernetes eviction API, honoring PodDisruptionBudgets) and deletes its Hetzner server; the autoscaler then re-provisions any still-needed capacity from the new snapshot on demand. This runs only when the version actually changes: a no-op cluster update leaves autoscaler nodes untouched. See the KSail Autoscaler Node Upgrades docs.

A strict PodDisruptionBudget on a workload running on autoscaler nodes can slow or block the drain (the update fails rather than force-evicting), so keep PDBs realistic for compute-only/burstable workloads.
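
One way to spot PDBs that could stall a drain ahead of an upgrade is to look for budgets that currently allow zero disruptions. The sketch below is a hypothetical helper that works on data shaped like the output of kubectl get pdb -A -o json; the sample payload is made up, and the check covers all PDBs rather than only workloads running on autoscaler nodes.

```python
import json


def blocking_pdbs(pdb_list: dict) -> list:
    """Return namespace/name of PDBs that currently allow zero disruptions."""
    blocked = []
    for item in pdb_list.get("items", []):
        meta = item.get("metadata", {})
        # status.disruptionsAllowed is how many pods may be evicted right now.
        allowed = item.get("status", {}).get("disruptionsAllowed", 0)
        if allowed == 0:
            blocked.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return blocked


# Hypothetical sample, trimmed to the fields the helper reads.
sample = json.loads("""
{"items": [
  {"metadata": {"namespace": "apps", "name": "web"},
   "status": {"disruptionsAllowed": 0}},
  {"metadata": {"namespace": "apps", "name": "worker"},
   "status": {"disruptionsAllowed": 1}}
]}
""")

print(blocking_pdbs(sample))  # ['apps/web']
```

Against a live cluster, the same function could be fed with json.load(sys.stdin) after piping the kubectl output into the script.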

Hetzner server type changes

Hetzner periodically renames or retires server types. Check the Hetzner Cloud changelog and update the pools[].serverType values in ksail.prod.yaml.