Raw Block Volumes via CSI Driver

If applications are to maintain persistent data in a Kubernetes setup, CSI is what you are looking for. The Container Storage Interface makes it possible to automatically provision persistent volumes on the correct node so that they can be mounted into the desired pod. Some applications, however, cannot store their data as files in a mounted file system and require direct disk access instead. Our CSI driver now supports so-called "raw block volumes" for such use cases.

Principle and advantages of CSI

Container setups offer numerous advantages. Among other things, container orchestrators such as Kubernetes can ensure that the required containers are running at all times. If a node leaves the cluster (whether for maintenance or for other reasons), Kubernetes can restart the affected containers elsewhere in the cluster, thus re-establishing the target state. In such a case, CSI allows the persistent volumes (PVs) to be immediately available again as well: an appropriate CSI driver not only prepares the required volumes on the underlying cloud infrastructure initially, but also attaches them to the correct node afterwards so that the container can access the volume and its data.

To allow a container to use the storage space of a PV in the first place, our CSI driver normally formats the volume with ext4 at the time of creation and then mounts it in the file system of the node from where it is made available to the container. Some workloads, however, are unable to store their data in a file system, but only on entire block devices. Rook is an example of this: In a Kubernetes setup, the Cloud Native Computing Foundation project can install, among other things, a storage cluster with Ceph and provide a storage service for other applications. When operating on physical hardware, Ceph would write directly to the hard disks for actual data storage and use its own optimized "BlueStore" format rather than partitions and file systems. In a Kubernetes setup, the physical disks are replaced by PVs that need to be available as raw block volumes.
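To illustrate, Rook can request such raw block PVs through its CephCluster resource. The following fragment is a sketch based on Rook's `storageClassDeviceSets` mechanism; the set name, count and volume size are placeholders chosen for this example:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  storage:
    storageClassDeviceSets:
      - name: set1
        count: 3                     # three OSDs, each backed by its own PV
        volumeClaimTemplates:
          - spec:
              storageClassName: cloudscale-volume-ssd
              volumeMode: Block      # raw block volume instead of a file system
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 100Gi
```

Rook then hands each of these raw block volumes to a Ceph OSD, which writes its BlueStore format directly onto the device.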

Using CSI with raw block volumes

Raw block volumes are officially supported in Kubernetes from version 1.18 onwards. We introduced this option with version 3.1.0 of our CSI driver. To create a raw block volume, simply add volumeMode: Block in the persistent volume claim (PVC) (otherwise the default Filesystem will be used), as shown in the following example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pod-pvc-raw-block
spec:
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: cloudscale-volume-ssd

In the pod definition please indicate the desired device path by using volumeDevices (instead of the mount point with volumeMounts):

apiVersion: v1
kind: Pod
metadata:
  name: my-csi-app-raw-block
spec:
  containers:
    - name: my-frontend
      image: busybox
      volumeDevices:
        - devicePath: /dev/xvda
          name: my-cloudscale-volume
      command: [ "sleep", "1000000" ]
  volumes:
    - name: my-cloudscale-volume
      persistentVolumeClaim:
        claimName: csi-pod-pvc-raw-block

The volume, which is thus created on our cloud infrastructure and is also visible in our cloud control panel or via API, is passed on "one to one" to the relevant pod and is accessible from the first to the last byte. It goes without saying that it is also possible to encrypt volumes of this kind within the pod, e.g. with LUKS.

As is the case for all our volumes, raw block volumes can be stored either on NVMe SSDs or on our bulk storage. You can also, as usual, scale up such volumes during live operation. Please be aware, however, that the CSI driver only increases the size of the block device; adjustments of partitions and file systems (if any) need to be performed in the relevant pod.
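Resizing is done by simply raising the storage request in the PVC. As a sketch (assuming the storage class permits volume expansion), the claim from the example above could be re-applied with a larger size:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pod-pvc-raw-block
spec:
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi   # increased from 5Gi; the block device grows live
  storageClassName: cloudscale-volume-ssd
```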

Different approaches for different use cases

Support for raw block volumes means that a Kubernetes/CSI setup is now also suitable for use cases that cannot be covered with purely file-based persistent storage. If you are interested in Ceph and would like to experiment with it, the above-mentioned Rook is a good starting point. There are, of course, also reasons for using Rook with Ceph in production, e.g. if you require a CephFS backend for another application. Make sure you avoid potential single points of failure in this case: where several pods (e.g. "mon" or "osd" in Ceph) provide redundancy, place them on Kubernetes nodes with anti-affinity to one another, so that an isolated hardware issue on one of the physical compute servers does not affect several of these pods at the same time.
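Such anti-affinity can be expressed directly in the pod (or deployment) spec. As a sketch, assuming the redundant pods carry the label app: rook-ceph-mon:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: rook-ceph-mon
        topologyKey: kubernetes.io/hostname   # at most one mon pod per node
```

With `requiredDuringSchedulingIgnoredDuringExecution`, the scheduler refuses to place two such pods on the same node; the softer `preferredDuringSchedulingIgnoredDuringExecution` variant would merely try to spread them.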

Ceph is often used with a replication factor of 3, as is the case for our own Ceph clusters on which the NVMe-SSD and bulk volumes as well as our object storage are based. If you run your own Ceph setup on top of this infrastructure, this typically means that every piece of data stored in it is physically written nine times. Evaluate for your individual use case how much (additional) redundancy you require and what level of overhead you are willing to accept. Depending on your needs, an NFS or database server of your own or, alternatively, our object storage may also be suitable as central data storage for your applications.

Cloud infrastructure and container setups are the method of choice for an increasing number of use cases as they offer maximum flexibility and scalability. Whether for a short test or for a productive HA setup, you can now make the most of these advantages in an even more versatile manner thanks to support for raw block volumes in our CSI driver. If you have any feedback or suggestions for improvement, please contact us via GitHub or directly.

Volumes to your taste, whether ready-formatted or raw!
Your team
