Engineering Blog

Published on 12 June 2026

Provisioning Kubernetes clusters on cloudscale with Cluster API

With the new Cluster API infrastructure provider for cloudscale (CAPCS), Kubernetes clusters become declarative resources you can provision, scale, upgrade, and delete through reconciliation. We cover the architecture and a hands-on walkthrough, from setup to Day 2 operations.

If you are operating Kubernetes clusters, you probably know the burden of managing multiple tools and manual steps spread across the different lifecycle operations. Cluster API (CAPI) solves this issue by managing clusters in Kubernetes itself. Instead of managing clusters through ad-hoc scripts and infrastructure tooling, clusters themselves become Kubernetes resources that can be created, upgraded, scaled, and deleted declaratively. This is done using Kubernetes-style controllers which continuously reconcile the desired cluster state against the actual infrastructure state.

This post assumes working knowledge of Kubernetes, in particular controllers, custom resources, and the reconciliation model. You won't need prior Cluster API experience; we'll introduce its components as we go.

With the cloudscale Cluster API infrastructure provider (CAPCS), it is now possible to provision Cluster API clusters on cloudscale infrastructure. CAPCS complements the existing cloudscale CCM and CSI integrations by extending cloudscale support into cluster lifecycle management itself.

Cluster API provides the following:

a unified declarative way to bootstrap and manage Kubernetes clusters
an abstraction over infrastructure providers such as cloudscale, AWS, Google, Azure, and OpenStack

In practice, Cluster API allows platform teams to manage Kubernetes clusters similarly to how Kubernetes manages applications.

CAPCS plugs into the Cluster API controller stack and manages cloudscale resources. Green marks cloudscale-maintained components.

Key Terms:

a management cluster owns all cluster custom resources and it's where the CAPI controllers run
one or more workload clusters are managed by the management cluster and are where the workload applications run

You may wonder why Kubernetes is required to create Kubernetes. Cluster API controllers run inside a Kubernetes cluster themselves, which is why a management cluster is required. However, management clusters are not special: a cluster can start out as a workload cluster and later be promoted to manage itself.

Cluster API is intentionally modular. Different providers can implement infrastructure provisioning, node bootstrapping, and control plane management independently:

CAPI core: Core lifecycle management controllers
Infrastructure Providers: Responsible for creating and managing infrastructure resources such as virtual machines, networks, and load balancers. For example: cloudscale (CAPCS), Azure (CAPZ)
Bootstrap Providers: Generate the configuration required for machines to join a Kubernetes cluster as nodes. For example: Kubeadm (CABPK), MicroK8s, Talos.
Control Plane Providers: Manage the control plane components of Kubernetes clusters and handle their lifecycle operations, such as upgrades and scaling. For example: Kubeadm (KCP), Talos.

Until now, we described cloudscale as "Kubernetes ready" because we provide CCM and CSI integrations, but customers had to set up the whole stack and its lifecycle management themselves. Now, the cloudscale Cluster API infrastructure provider (CAPCS) enables you to provision and manage clusters on our infrastructure declaratively. Follow the tutorial below to try it out!

Tutorial

Prerequisites

An existing management cluster. We'll use kind.
kubectl
clusterctl version >= 1.13.0
A read/write cloudscale API token in the project where clusters will live. You can create this in the Control Panel.
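The clusterctl version requirement above can be checked from a script by comparing version strings. The following is a minimal sketch: `version_ge` is a hypothetical helper (not part of clusterctl), it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS), and it does not handle pre-release suffixes such as `-rc.0`.

```shell
# version_ge HAVE WANT
# Succeeds if version HAVE is greater than or equal to version WANT.
# Hypothetical helper: strips a leading "v" and relies on `sort -V`.
version_ge() {
  have="${1#v}"
  want="${2#v}"
  # The smaller of the two versions sorts first; HAVE is new enough
  # exactly when that smaller version is WANT.
  lowest="$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
}
```

A possible use, assuming `clusterctl version -o short` prints a bare version such as `v1.13.0`: `version_ge "$(clusterctl version -o short)" 1.13.0 || echo "please upgrade clusterctl"`.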

We'll be setting up a kubeadm-based cluster.

1. Build and import a custom OS image

The node image must already be kubeadm-ready (kubelet, kubeadm, containerd, ...). At the time of writing, CAPCS expects users to build a compatible image on their own. You can build one with image-builder for OpenStack by following their Quickstart tutorial. Once you've built an image, import it under "Custom Images" in the cloudscale Control Panel. Select "QCOW2" as the source format and "Passthrough" for user data handling. "BIOS" can be used for the Firmware Type.

Test Images

If you'd rather skip the image build for a quick try, we host pre-built test images until 2026-07-31.

⚠️ Not suitable for production. These images are provided as-is, and will stop being served after the date above.

To use one, go to the Control Panel → Custom Images → Import a Custom Image, paste the URL for your desired Kubernetes version below, and apply the same settings as in the previous step (QCOW2 source format, Passthrough user data, BIOS firmware).

Kubernetes version	Image URL (paste into Control Panel)
`v1.34.8`	`https://capcs-test-images.objects.lpg.cloudscale.ch/ubuntu-2404-kube-v1.34.8`
`v1.35.5`	`https://capcs-test-images.objects.lpg.cloudscale.ch/ubuntu-2404-kube-v1.35.5`
`v1.36.1`	`https://capcs-test-images.objects.lpg.cloudscale.ch/ubuntu-2404-kube-v1.36.1`

2. Initialize the management cluster

Create a local kind cluster, if you don't have an existing cluster yet, and wait until it's ready:

$ kind create cluster
# wait a couple of seconds/minutes until cluster-info returns a healthy endpoint
$ kubectl cluster-info

Make sure CLOUDSCALE_API_TOKEN is exported in your environment, or configure the variable in the clusterctl configuration file at $XDG_CONFIG_HOME/cluster-api/clusterctl.yaml. Then initialize the management cluster:

# export CLOUDSCALE_API_TOKEN if necessary
export CLOUDSCALE_API_TOKEN="REDACTED"

# initialize cluster api components
clusterctl init --infrastructure cloudscale-ch-cloudscale

This will set up several components. Feel free to explore them! Interesting namespaces:

capi-system - CAPI core
capcs-system - CAPCS
capi-kubeadm-bootstrap-system - Kubeadm bootstrap provider
capi-kubeadm-control-plane-system - Kubeadm control plane provider

3. Create your first workload cluster

Set up the required configuration:

# SSH public key added to nodes
export CLOUDSCALE_SSH_PUBLIC_KEY="ssh-ed25519 AAAA..."
# cloudscale.ch region
export CLOUDSCALE_REGION="lpg"
# Server image for nodes
export CLOUDSCALE_MACHINE_IMAGE="custom:ubuntu-2404-kube-v1.34.8"
# Flavor for control plane nodes
export CLOUDSCALE_CONTROL_PLANE_MACHINE_FLAVOR="flex-4-2"
# Flavor for worker nodes
export CLOUDSCALE_WORKER_MACHINE_FLAVOR="flex-4-2"
# Root volume size in GB
export CLOUDSCALE_ROOT_VOLUME_SIZE="50"

Ensure that the part of CLOUDSCALE_MACHINE_IMAGE after the custom: prefix matches the name of your imported custom image.
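Before generating the cluster YAML, it can help to verify that none of these variables was forgotten. The sketch below is a hypothetical helper, not part of CAPCS or clusterctl. Its variable list is simply copied from the exports in this tutorial, and it assumes you keep the values in the environment rather than in clusterctl.yaml.

```shell
# check_capcs_env
# Succeeds only if every variable used in this tutorial is set and non-empty,
# and CLOUDSCALE_MACHINE_IMAGE carries the "custom:" prefix.
# Hypothetical helper; the variable list is copied from the tutorial's exports.
check_capcs_env() {
  missing=0
  for name in CLOUDSCALE_API_TOKEN CLOUDSCALE_SSH_PUBLIC_KEY CLOUDSCALE_REGION \
              CLOUDSCALE_MACHINE_IMAGE CLOUDSCALE_CONTROL_PLANE_MACHINE_FLAVOR \
              CLOUDSCALE_WORKER_MACHINE_FLAVOR CLOUDSCALE_ROOT_VOLUME_SIZE; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  case "${CLOUDSCALE_MACHINE_IMAGE:-}" in
    custom:?*) ;;
    *) echo "CLOUDSCALE_MACHINE_IMAGE should look like custom:<image name>" >&2
       missing=1 ;;
  esac
  [ "$missing" -eq 0 ]
}
```

The authoritative list of variables is whatever the template expects; `clusterctl generate cluster quickstart --list-variables` prints it.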

Generate cluster YAML:

clusterctl generate cluster quickstart \
  --kubernetes-version v1.34.8 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > quickstart.yaml

We provision three control plane nodes to ensure a highly available setup. Because the API server requires a single, stable endpoint regardless of which control plane machines are currently active, CAPCS automatically provisions a Load Balancer (see the CloudscaleCluster resource below) with a dedicated IP address. This IP serves as the primary API endpoint.

Inspect the generated YAML to learn more about how the custom resources define a cluster together. Then, run the following command to apply the manifest:

kubectl apply -f quickstart.yaml

What actually gets added?

The cluster consists of a number of subresources. You can show all of them and their status with the following command:

kubectl get cluster,cloudscalecluster,kubeadmcontrolplane,cloudscalemachinetemplate,machinedeployment,machineset,kubeadmconfigtemplate,machine,cloudscalemachine

Resource explanation

Cluster: Workload cluster definition
CloudscaleCluster: Cloudscale-specific setup (Networks, LoadBalancer, FloatingIP API, ...)
KubeadmControlPlane: Kubeadm configuration for control plane nodes
CloudscaleMachineTemplate: Cloudscale server specifics: Flavor, Image, ...
MachineDeployment: Manages a set of Machines (akin to Kubernetes Deployment)
MachineSet: Similar to a Kubernetes ReplicaSet for Machines
Machine: Represents the lifecycle of a cluster node and its backing infrastructure
CloudscaleMachine: Represents the actual cloudscale server resource backing the Machine
KubeadmConfigTemplate: Kubeadm configuration for worker nodes

The API server endpoint is managed entirely by the CloudscaleCluster. CAPCS allocates a Load Balancer and uses it as the control plane endpoint, ensuring stable connectivity even when underlying control plane machines are replaced (for instance, during an upgrade).

4. Make the cluster ready

The cluster will now start provisioning. You can observe the cluster status as follows:

kubectl get cluster quickstart
clusterctl describe cluster quickstart

As soon as the PHASE column from kubectl get cluster shows Provisioned, you can fetch the kubeconfig for accessing the workload cluster:

clusterctl get kubeconfig quickstart > ~/.kube/quickstart.yaml
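The wait for the Provisioned phase described above can also be scripted. The following is a minimal sketch of a polling loop: `wait_for_cluster_phase` is a hypothetical helper, and it must run with the kubeconfig of the management cluster, since that is where the Cluster object lives.

```shell
# wait_for_cluster_phase NAME PHASE [ATTEMPTS] [INTERVAL_SECONDS]
# Polls the Cluster object until .status.phase equals PHASE.
# Hypothetical helper; defaults to 60 attempts, 10 seconds apart.
wait_for_cluster_phase() {
  name="$1"
  want="$2"
  attempts="${3:-60}"
  interval="${4:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # A failing kubectl call is treated like an unknown phase and retried.
    phase="$(kubectl get cluster "$name" -o jsonpath='{.status.phase}' 2>/dev/null || true)"
    if [ "$phase" = "$want" ]; then
      echo "cluster $name is $want"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for cluster $name to reach $want (last phase: ${phase:-unknown})" >&2
  return 1
}
```

For example: `wait_for_cluster_phase quickstart Provisioned && clusterctl get kubeconfig quickstart > ~/.kube/quickstart.yaml`. Recent kubectl versions can express a similar wait directly with `kubectl wait --for=jsonpath='{.status.phase}'=Provisioned cluster/quickstart --timeout=30m`.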

In another terminal, switch to the new kubeconfig:

# or use kubie/kubectx etc.
export KUBECONFIG=~/.kube/quickstart.yaml

If you inspect the workload cluster nodes, you'll notice that they are in status "NotReady":

$ kubectl get nodes
NAME                             STATUS     ROLES           AGE   VERSION
quickstart-control-plane-jncqv   NotReady   control-plane   99m   v1.34.8
quickstart-control-plane-7p4mx   NotReady   control-plane   99m   v1.34.8
quickstart-control-plane-q2lnd   NotReady   control-plane   98m   v1.34.8
quickstart-md-0-tzcm9-mlvkj      NotReady   <none>          98m   v1.34.8
quickstart-md-0-tzcm9-8s4rk      NotReady   <none>          98m   v1.34.8
quickstart-md-0-tzcm9-dwn2c      NotReady   <none>          97m   v1.34.8

Cluster API provisions the cluster infrastructure and bootstrap process, but cluster add-ons such as CCM, CNI, and CSI are intentionally left configurable.

The provisioning lifecycle — everything is reconciled automatically except the add-on installation.

The Cluster add-ons

CCM (cloud controller manager) connects Kubernetes to the cloud provider: it sets provider IDs and node addresses, and provisions load balancers for Services. cloudscale ships its own CCM.
CNI (container network interface) implements pod-to-pod networking. Until a CNI is running, nodes stay NotReady. CNI plugins are cloud-agnostic; we'll use Cilium.
CSI (container storage interface) dynamically provisions and attaches volumes for PersistentVolumeClaims. cloudscale ships its own CSI driver. Optional for this tutorial.

CAPI's job ends at provisioned, bootstrapped machines; making them Ready is the job of the cluster add-ons. We need the CCM and a CNI for that, so let's install them in order.

Important: Make sure to execute the following commands in your workload cluster!

Install CCM

# Make sure to have the right kubectl config context
export KUBECONFIG=~/.kube/quickstart.yaml

# setup secret, you may want to use a different api token to be able to distinguish the actors in the audit log
kubectl create secret generic cloudscale \
  --from-literal=access-token='...' \
  --namespace kube-system
# apply CCM
kubectl apply -f https://github.com/cloudscale-ch/cloudscale-cloud-controller-manager/releases/latest/download/config.yml

The CCM configures cloud-specific node information such as provider IDs:

$ kubectl get nodes -o custom-columns='NAME:.metadata.name,PROVIDER_ID:.spec.providerID,READY:.status.conditions[?(@.type=="Ready")].status'
NAME                             PROVIDER_ID                                         READY
quickstart-control-plane-jncqv   cloudscale://2d3f1a8c-9b4e-4c7a-8f12-6e0a9d3b5c41   False
quickstart-control-plane-7p4mx   cloudscale://4e6b2c91-7d83-4a05-b2f1-8c3d0e5a7b62   False
quickstart-control-plane-q2lnd   cloudscale://9f0a3d12-5c47-4e89-a1b6-2d8f4c70e391   False
quickstart-md-0-tzcm9-mlvkj      cloudscale://7a1b9e0d-3c46-4f82-91ad-b5e2f4c80a96   False
quickstart-md-0-tzcm9-8s4rk      cloudscale://1c5d7e23-8a94-4b60-9f3e-6b0a2d4c8f57   False
quickstart-md-0-tzcm9-dwn2c      cloudscale://6b3f9a40-2e51-4d78-83c2-9a7e1f5b0d34   False

Install CNI

We'll use Cilium in this example. Follow the official documentation to install the Cilium CLI, then run:

# Make sure to have the right kubectl config context
export KUBECONFIG=~/.kube/quickstart.yaml

cilium install

After a short while (typically less than a minute), the nodes become Ready:

$ kubectl get nodes
NAME                             STATUS   ROLES           AGE    VERSION
quickstart-control-plane-jncqv   Ready    control-plane   109m   v1.34.8
quickstart-control-plane-7p4mx   Ready    control-plane   109m   v1.34.8
quickstart-control-plane-q2lnd   Ready    control-plane   108m   v1.34.8
quickstart-md-0-tzcm9-mlvkj      Ready    <none>          107m   v1.34.8
quickstart-md-0-tzcm9-8s4rk      Ready    <none>          107m   v1.34.8
quickstart-md-0-tzcm9-dwn2c      Ready    <none>          106m   v1.34.8

Congratulations! You now have a fully working cluster provisioned using Cluster API.

Install CSI (optional)

To be able to dynamically provision volumes from cloudscale, follow the installation steps for csi-cloudscale.

Day 2 Operations

Unlike infrastructure provisioning tools which are typically executed externally, Cluster API continuously reconciles cluster state from inside Kubernetes itself.

Healthchecking

The provided templates do not configure health checks for your Machines. With a MachineHealthCheck, you can define the conditions under which a Machine is considered unhealthy and have Cluster API remediate such Machines automatically.
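As an illustration, the snippet below writes a MachineHealthCheck manifest for the workers of the quickstart cluster to a file. Treat it as a sketch under assumptions: the field names follow the v1beta2 MachineHealthCheck schema as documented upstream, while the file name, the object name, the timeouts, and the 40% threshold are arbitrary example values. Verify the fields with `kubectl explain machinehealthcheck.spec` against your installed Cluster API version before applying.

```shell
# Write an example MachineHealthCheck for the quickstart worker Machines.
# All names and thresholds below are illustrative, not CAPCS defaults.
cat > quickstart-mhc.yaml <<'EOF'
apiVersion: cluster.x-k8s.io/v1beta2
kind: MachineHealthCheck
metadata:
  name: quickstart-md-0-unhealthy
spec:
  clusterName: quickstart
  # Select the Machines that belong to the quickstart-md-0 MachineDeployment.
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: quickstart-md-0
  checks:
    # Machines whose Node never joins within 10 minutes count as unhealthy.
    nodeStartupTimeoutSeconds: 600
    # Nodes that are not Ready for 5 minutes count as unhealthy.
    unhealthyNodeConditions:
    - type: Ready
      status: Unknown
      timeoutSeconds: 300
    - type: Ready
      status: "False"
      timeoutSeconds: 300
  remediation:
    # Only remediate while at most 40% of the selected Machines are unhealthy.
    triggerIf:
      unhealthyLessThanOrEqualTo: 40%
EOF
```

Apply the file in the management cluster with `kubectl apply -f quickstart-mhc.yaml`. The remediation threshold guards against replacing a large share of the Machines at once when the cause is an outage rather than individual broken nodes.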

Scaling Workers

Scaling workers is as simple as scaling the MachineDeployment:

kubectl scale machinedeployment quickstart-md-0 --replicas=5
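What happens after the scale command is plain reconciliation: the controller compares the desired replica count with the Machines that actually exist and acts on the difference. The toy model below illustrates that comparison only. `plan_replicas` is a hypothetical function, and real controllers act on Machine objects rather than on counters.

```shell
# plan_replicas DESIRED ACTUAL
# Prints the action a reconciler would take to converge ACTUAL towards DESIRED.
# Toy model only: real controllers create and delete Machine objects.
plan_replicas() {
  desired="$1"
  actual="$2"
  if [ "$actual" -lt "$desired" ]; then
    echo "create $((desired - actual))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "delete $((actual - desired))"
  else
    echo "noop"
  fi
}

plan_replicas 5 3   # prints "create 2"
plan_replicas 5 5   # prints "noop"
plan_replicas 3 5   # prints "delete 2"
```

Scaling the three quickstart workers to five corresponds to the first call: two Machines are missing, so two are created.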

With the Cluster API provider for cluster-autoscaler, it is also straightforward to enable autoscaling for your workloads. As a nice addition, CAPCS supports scale from zero natively. Follow the README to install it.

Upgrading Kubernetes Version

As mentioned in the introduction, Cluster API supports the full lifecycle of Kubernetes cluster management. This includes upgrading the Kubernetes version. To upgrade a cluster, first build an OS image with the target Kubernetes version (see above and the image-builder repository for guidelines) and import it as a custom image in the cloudscale Control Panel. The upgrade then proceeds in two phases: the control plane machines first, the worker machines second.

1. Copy the CloudscaleMachineTemplate

Copy the existing CloudscaleMachineTemplate, give it a new name, and set the updated image.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: CloudscaleMachineTemplate
metadata:
  name: "quickstart-control-plane-v1.35"
  namespace: "${NAMESPACE}"
spec:
  template:
    spec:
      image: "custom:ubuntu-2404-kube-v1.35.5"
      [...]

Make sure to apply the copied template.

2. Modify the existing KubeadmControlPlane

apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: KubeadmControlPlane
metadata:
  name: "quickstart-control-plane"
  namespace: "${NAMESPACE}"
spec:
  version: "v1.35.5"                               # << updated
  machineTemplate:
    spec:
      infrastructureRef:
        name: "quickstart-control-plane-v1.35"     # << updated
  [...]

Cluster API performs upgrades through rolling machine replacements: new machines are created with the updated configuration before old machines are removed. As soon as the KubeadmControlPlane has been updated, you can see this in action:

$ kubectl get machine
NAME                             CLUSTER      NODE NAME                        FAILURE DOMAIN   READY   AVAILABLE   UP-TO-DATE   PHASE      AGE     VERSION
quickstart-control-plane-69cnk   quickstart   quickstart-control-plane-69cnk                    False   False       True         Running    104s    v1.35.5
quickstart-control-plane-jncqv   quickstart   quickstart-control-plane-jncqv                    True    True        False        Deleting   3h26m   v1.34.8
quickstart-control-plane-7p4mx   quickstart   quickstart-control-plane-7p4mx                    True    True        False        Running    3h26m   v1.34.8
quickstart-control-plane-q2lnd   quickstart   quickstart-control-plane-q2lnd                    True    True        False        Running    3h25m   v1.34.8

Cluster API replaces control plane machines one at a time, keeping a quorum of healthy nodes throughout: here a new v1.35.5 machine is Running, the first old v1.34.8 machine is Deleting, and the remaining two are still Running until their turn.
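The quorum arithmetic behind this can be worked out quickly. The sketch below assumes the kubeadm default of one etcd member per control plane machine; `quorum` and `tolerated_failures` are hypothetical helper names used only for illustration.

```shell
# quorum MEMBERS
# Smallest majority of an etcd cluster with MEMBERS members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures MEMBERS
# Number of members that may fail while a majority still survives.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

quorum 3               # prints 2
tolerated_failures 3   # prints 1
quorum 4               # prints 3
tolerated_failures 4   # prints 1
```

With three control plane machines, one may be unavailable at any time. While a fourth machine is temporarily present, as in the output above, the quorum rises to three, but one failure is still tolerated, so replacing one machine at a time never costs the majority.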

Once the new control plane is ready, you can continue with replacing the worker nodes. The adjustments to the MachineDeployment md-0 are left as an exercise to the reader, but they work almost the same way: copy the CloudscaleMachineTemplate and adjust its name and image, then edit the MachineDeployment and adjust its version and the referenced infrastructureRef.
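One possible way to approach the MachineDeployment part of that exercise is a JSON merge patch, sketched below. `upgrade_machinedeployment` is a hypothetical helper, and the template name used in the usage note is an assumption: it presumes you have already applied a copied worker CloudscaleMachineTemplate under that name.

```shell
# upgrade_machinedeployment NAME VERSION TEMPLATE
# Points a MachineDeployment at a new Kubernetes version and machine template.
# Hypothetical helper; a JSON merge patch only changes the fields listed here,
# so the other fields of infrastructureRef stay as they are.
upgrade_machinedeployment() {
  md="$1"
  version="$2"
  template="$3"
  patch="$(printf '{"spec":{"template":{"spec":{"version":"%s","infrastructureRef":{"name":"%s"}}}}}' \
    "$version" "$template")"
  kubectl patch machinedeployment "$md" --type merge --patch "$patch"
}
```

For example, run `upgrade_machinedeployment quickstart-md-0 v1.35.5 quickstart-md-0-v1.35` against the management cluster, then watch `kubectl get machine` to follow the rollout.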

Once the cluster is upgraded, the old CloudscaleMachineTemplates can safely be removed.

Promote a workload cluster to be the management cluster

A regular workload cluster can be promoted to become the management cluster. Often, the initial cluster of a Cluster API setup is referred to as a "seed" cluster and is just used for bootstrapping Cluster API. It is later abandoned and the first workload cluster becomes the management cluster.

For example, if you started with kind as your seed cluster, you may want to promote the first workload cluster to be your management cluster, where it manages itself and potentially other workload clusters. This complex-sounding operation is relatively simple and takes two steps:

Initialize the target cluster using clusterctl init --infrastructure cloudscale-ch-cloudscale.
Move the Cluster API objects defined in the current namespace to the target cluster: clusterctl move --to-kubeconfig="path-to-target-kubeconfig.yaml"

Once this is done, you should be able to run all operations from the new management cluster and you can tear down the initial cluster.

Deleting workload clusters

Warning! Destructive Actions ahead!

Deleting a workload cluster is straightforward:

kubectl delete cluster <name>

Important: Always delete the Cluster object itself and never any of its downstream resources. Deleting downstream resources directly can leave the cluster in a state that the controllers can no longer process and may require manual cleanup (e.g. if you delete the API token secret before the cluster has been deleted). This is especially important to note when using GitOps tools for managing clusters.

Going further

The process outlined in this tutorial is meant to familiarize you with the basic operations available for Cluster API clusters. For production setups you may want to automate most if not all steps. GitOps can help tremendously, and some additional Cluster API concepts are well worth looking into:

Health checking, already mentioned above, is a crucial aspect of production clusters.
ClusterResourceSet lets you label clusters and automatically apply a set of resources to the matching workload clusters. CAPCS maintains a couple of examples in the addons templates.
ClusterClass becomes especially useful once you're operating many clusters with largely identical configurations.
Cluster API Operator is an operator that manages Cluster API components in a management cluster, which lets you use GitOps mechanisms for component upgrades and configuration.
Cluster API Visualizer provides a visual overview of workload clusters.

What did we learn?

Cluster API turns Kubernetes cluster management itself into a declarative Kubernetes workflow. Provisioning, scaling, upgrades, and teardown are all handled through Kubernetes-style reconciliation instead of custom automation scripts.

With CAPCS, cloudscale infrastructure becomes part of the broader Cluster API ecosystem, complementing the existing CCM and CSI integrations. Together, these components cover infrastructure provisioning, Kubernetes cloud integration, and persistent storage management using Kubernetes-native APIs and tooling.

While a dedicated management cluster introduces additional operational overhead, Cluster API becomes especially powerful for ephemeral environments such as CI clusters, preview environments, or temporary testing infrastructure where entire Kubernetes clusters can be created and removed declaratively.

If you'd like to send us comments or corrections, you can reach our engineers at engineering-blog@cloudscale.ch.
