The state of stateful applications on Kubernetes

Aug 7, 2023

Ben Hirschberg
CTO & Co-founder

Kubernetes has become one of the most popular platforms for running cloud-native applications. This popularity is due to several factors, including its ease of use and ability to handle stateless applications. However, whether stateful applications, such as databases and storage systems, belong on Kubernetes clusters is still debated. In other words, do Kubernetes and its containerized ecosystem provide a solid and reliable infrastructure to run such critical applications?

Stateful applications require a different set of capabilities than stateless applications. But despite these differences, there has been a growing trend to run stateful applications on Kubernetes because Kubernetes and its cloud-native infrastructure provide cutting-edge capabilities to support these apps.

In this blog post, we’ll start with the early challenges of running stateful applications on Kubernetes, as well as the improvements and advancements that have been made over time. We’ll also cover current trends, including some of the challenges and best practices when running stateful applications on Kubernetes. Finally, we’ll look ahead to the future of stateful applications on the platform, including emerging trends and technologies shaping this space.

The history of Kubernetes for stateful applications

Kubernetes has been a disruptive technology in the cloud scene, adopted by many organizations and cloud providers. Each organization has its own business requirements and technical limitations when moving its applications to Kubernetes. As with any new technology, some initial challenges were to be expected, but over time Kubernetes has evolved and matured to handle them better.

Early challenges

When Google first introduced Kubernetes, it was primarily designed to support stateless containerized applications. The main idea was to streamline the management of containers through a scalable and easy-to-use API. As a result, running stateful applications on the platform raised technical challenges related to storage, networking, and scaling. Additionally, there were few capabilities for managing these applications’ state, making it difficult to ensure data consistency and availability.

In other words, running a mission-critical, production-grade database system was not the forte of Kubernetes.

Evolution of Kubernetes

Over time, however, Kubernetes evolved to better support stateful applications. One of the most significant changes was the introduction of StatefulSets, which provide a way to manage stateful applications on the platform. This was followed by the introduction of Container Storage Interface (CSI) integration and volume snapshots, which improved storage management for these applications.

Kubernetes also made improvements to networking support, making it easier to connect and manage stateful apps.

The current state

Running stateful applications on Kubernetes is much easier today than in the early days. There has been broad adoption of such apps on the platform, and the capabilities of Kubernetes to support them have continued to improve.

The standardization of storage and network infrastructure through CSI (Container Storage Interface) and CNI (Container Network Interface) has been especially important for infrastructure-level reliability. CSI provides a standard interface for connecting and managing storage systems in a containerized environment, while CNI offers a standard interface for connecting containers to networks.

These standards allow for the integration of various storage and network solutions with the Kubernetes ecosystem, letting users choose the best solution for their needs and helping to ensure compatibility between different components. They are also essential for running stateful applications on Kubernetes, because such applications require stable and persistent storage and network connections. As a result, many cutting-edge stateful apps are running successfully on Kubernetes, including databases, caches, message brokers, and storage systems.

Although these days Kubernetes is mature enough to run stateful applications, there are still challenges related to storage and networking, scaling, and state management.

Challenges and best practices for running stateful applications on Kubernetes

Kubernetes creates a consistent, reproducible, and resilient abstraction of the underlying infrastructure to run stateful applications. Because of this, Kubernetes users rely on its design principles and features to manage their apps. Due to both Kubernetes’ high adoption rate and its unique design characteristics, the current challenges related to running stateful apps on Kubernetes are well known, and widely accepted best practices exist to help deal with them.

Storage and Networking

Managing persistent storage is one of the biggest challenges when running stateful applications on Kubernetes. The Kubernetes persistent volume (PV) and persistent volume claim (PVC) abstractions provide a way to manage storage resources in a cluster, but they still require careful planning and configuration.

When running stateful applications, it’s essential to consider factors such as storage access modes, storage performance, and storage capacity. For example, when running databases, it’s crucial to ensure that the storage is configured for high performance since databases often need fast read and write access to data. When running message brokers, it’s essential to consider the durability of the data and ensure that the storage can survive node failures.
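As a rough illustration of how access mode, capacity, and storage class come together in a persistent volume claim, here is a minimal Python sketch that builds a PVC manifest as a plain dictionary. The helper `build_pvc`, the claim name `db-data`, the size, and the `fast-ssd` storage class are all hypothetical names chosen for this example; the field names follow the core `v1` PersistentVolumeClaim API.

```python
import json

# Access modes defined by the Kubernetes PersistentVolume API.
VALID_ACCESS_MODES = {
    "ReadWriteOnce",
    "ReadOnlyMany",
    "ReadWriteMany",
    "ReadWriteOncePod",
}


def build_pvc(name, size, access_mode="ReadWriteOnce", storage_class=None):
    """Return a PersistentVolumeClaim manifest as a plain dict (illustrative helper)."""
    if access_mode not in VALID_ACCESS_MODES:
        raise ValueError(f"unknown access mode: {access_mode}")
    spec = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": size}},
    }
    # Omitting storageClassName lets the cluster's default class apply.
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical example: a 100Gi claim for a database on an assumed "fast-ssd" class.
pvc = build_pvc("db-data", "100Gi", storage_class="fast-ssd")
print(json.dumps(pvc, indent=2))
```

Because kubectl accepts JSON manifests as well as YAML, the printed output could be saved to a file and applied to a cluster, provided a storage class with that name actually exists there.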

Another key factor is networking. Stateful applications often require stable network identities (e.g., IP or DNS name), low latency, and reliable communication. Kubernetes provides several networking primitives out of the box, like Services and pod networking, to ensure that instances of stateful applications can communicate. However, it’s critical to carefully plan the network topology and security to ensure that the network is configured correctly for stateful apps.

For example, when running databases, the network must provide low latency and high bandwidth communication between nodes to ensure fast data access. When running message brokers, you have to make sure the network is appropriately secured to protect against unauthorized access to sensitive data.
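To make the idea of a stable network identity more concrete, the sketch below lists the DNS names that the pods of a StatefulSet receive when they are governed by a headless Service. The naming pattern (`<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster domain>`) is the standard Kubernetes convention; the function name and the `postgres`, `postgres-headless`, and `databases` values are hypothetical, and `cluster.local` is only the common default cluster domain.

```python
def stateful_pod_dns_names(statefulset, service, namespace, replicas,
                           cluster_domain="cluster.local"):
    """List the stable DNS names of a StatefulSet's pods behind a headless Service."""
    return [
        # Each pod is named <statefulset>-<ordinal>, with ordinals starting at 0.
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical three-replica database in a "databases" namespace.
names = stateful_pod_dns_names("postgres", "postgres-headless", "databases", 3)
for name in names:
    print(name)
```

These names stay the same when a pod is rescheduled to another node, which is what lets replicas and clients address a specific instance by name rather than by a changing IP.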

Scaling and State Management

Scaling stateful applications while preserving data consistency is a significant challenge for distributed data-sensitive applications. Stateful apps must maintain the state of their data, even as they scale, and ensure that data is correctly replicated across nodes. Additionally, it’s essential to manage the state of the applications themselves, including their configuration and data, as they scale.

Kubernetes provides the StatefulSet abstraction to scale stateful applications with ordered, graceful, and automated rolling updates. When using this abstraction, you must carefully plan the scaling strategy to make sure data is appropriately replicated and the applications can continue to operate as they scale. For example, when scaling databases, you need to determine how to store data, such as using a shared file system or a shared database, and how to manage data replication as the databases scale up or down.
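The ordering guarantee can be modeled in a few lines. The sketch below is a simplified model of the default `OrderedReady` pod management policy, not the controller's actual implementation: pods are created in increasing ordinal order and removed in decreasing ordinal order. The function name `scaling_plan` is hypothetical.

```python
def scaling_plan(current, desired):
    """Return ordered (action, ordinal) steps to move from current to desired replicas."""
    if current < 0 or desired < 0:
        raise ValueError("replica counts must be non-negative")
    if desired > current:
        # Scale up: create the lowest missing ordinal first.
        return [("create", ordinal) for ordinal in range(current, desired)]
    # Scale down: remove the highest ordinal first.
    return [("delete", ordinal) for ordinal in range(current - 1, desired - 1, -1)]


print(scaling_plan(3, 5))  # [('create', 3), ('create', 4)]
print(scaling_plan(5, 3))  # [('delete', 4), ('delete', 3)]
```

In a real cluster each step also waits for the previous pod to be Running and Ready (or fully terminated) before the next one starts, which is why a replication strategy has to tolerate replicas joining and leaving one at a time.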

Another important consideration is the performance characteristics of the underlying storage and the types of data that need to be stored. For example, when storing large amounts of data, it may be necessary to use a distributed storage system, such as GlusterFS or Ceph, for storage that is fast and scalable.

Monitoring and Logging

Monitoring the state of stateful applications and the underlying infrastructure is crucial for ensuring that applications run smoothly. This includes monitoring the state of the storage and network, as well as the applications themselves. Additionally, collecting logs for troubleshooting and analysis is vital for identifying and fixing issues as they arise.

Kubernetes provides several tools and abstractions to help monitor and log stateful applications, including the Kubernetes API, the kubectl command line tool, and the Kubernetes Dashboard. Additionally, many third-party tools, such as Prometheus and Grafana, are available for collecting and visualizing metrics from stateful apps.
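As a small example of monitoring through the Kubernetes API, the sketch below takes a pod list in the JSON shape that `kubectl get pods -o json` returns and reports the pods whose `Ready` condition is not `True`. The function name and the inline sample data (pods `db-0` and `db-1`) are hypothetical; a real check would read live output instead.

```python
import json


def not_ready_pods(pod_list):
    """Return the names of pods that do not report a Ready condition of "True"."""
    names = []
    for pod in pod_list.get("items", []):
        conditions = pod.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(pod["metadata"]["name"])
    return names


# Hypothetical sample mimicking the structure of a pod list response.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "db-0"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "db-1"},
     "status": {"conditions": [{"type": "Ready", "status": "False"}]}}
  ]
}
""")
print(not_ready_pods(sample))  # ['db-1']
```

A pod with no conditions at all, for instance one that has not been scheduled yet, is also reported as not ready, which is usually the safer default for a stateful workload.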

Conclusion

Kubernetes originated as a stateless application orchestrator and evolved to run stateful applications as well. The journey has been a challenging one. Fortunately, the introduction of StatefulSets, persistent volumes and claims, improved networking support, and better state and data management have made it possible to run these applications on Kubernetes with high availability, data durability, and performance.

With the continued evolution of Kubernetes and its available support, it has become easier to manage, scale, and monitor stateful applications. Challenges such as storage and networking, scaling and state management, and monitoring and logging still exist, but best practices have been developed to overcome these difficulties.

Emerging trends such as cloud-native databases, edge computing, and AI-powered storage systems will continue to shape and improve the future of stateful applications on Kubernetes.