How to validate Kubernetes YAML files?

Oct 20, 2022

Bezalel
Brandwine

Kubernetes has taken center stage in how we now manage our containerized applications. As a result, many conventions to define our Kubernetes apps exist, including structures such as YAML, JSON, INI, and more.

This leaves us to consider what is the best strategy to follow for our applications. Additionally, we must then also ask how we can validate our application configurations depending on the path we’ve chosen in terms of file structure and especially security.

In this piece, we will explore defining Kubernetes applications using YAML configs and the various steps we can take to effectively validate these config definitions.

Kubernetes Config Definitions in YAML

YAML, compared to JSON and INI, is much more compact and readable. For example, if we were to define a pod that can be reached on port 80, then the configuration in YAML, JSON, and INI would be as shown in the table below.

YAML	JSON	INI
apiVersion: v1 kind: Pod metadata: name: spring-pod spec: containers: – image:armo/springapp:example name: spring-app ports: – containerPort: 80 protocol: TCP	{ “apiVersion”: “v1”, “kind”: “Pod”, “metadata”: { “name”: “spring-pod” }, “spec”: { “containers”: [ “image:armo/springapp:example\nname: spring-app\nports:\n- containerPort: 80\nprotocol: TCP” ] } }	apiVersion=v1 kind=Pod [metadata] name=spring-pod [spec] containers[]=”image:armo/springapp:example\nname: spring-app\nports:\n- containerPort: 80\nprotocol: TCP”

It is clearly visible that YAML simplifies how we can define our Kubernetes applications, especially considering that a normal application can involve dozens of config files. Additionally, the compact nature of YAML allows you to group objects together, reducing the number of files required.

However, there are major challenges with defining our Kubernetes config files, particularly when attempting to embed constraints and relationships between manifest files. For example, how can we ensure that the memory limits are configured to follow best practices?

Not only can the lack of validation result in unexpected behavior from our applications when edge cases are met, but it can also expose major security loopholes. Hence, it is necessary that we consider validation strategies for our YAML-based config files, and that’s what we will dive into in the following sections.

Various Aspects of Validation

There are three levels of validation that should be performed on our YAML files. These levels ensure that validation is performed in terms of the actual validity of the YAML file all the way to whether or not security practices have been met.

The first level is structural validation, which is the highest level of validation performed on Kubernetes configuration files. It involves simply validating the YAML file to ensure there are no syntax errors when writing it. This is something that can be validated by IDEs used when writing the config files.

The second level is semantic validation. This ensures that the content of the YAML file translates to the Kubernetes resources desired, hence validating the Kubernetes application itself.

Finally, the third and deepest level of validation is security validation, to ensure that the Kubernetes application defined does not have any vulnerabilities. We may have been successful in writing our YAML config successfully to achieve the required Kubernetes resources and connections, but this does not ensure our Kubernetes application is well-secured and follows best practices. The last two validations above are both Kubernetes configuration validations, and not only in the aspect of YAML format validation. Since they are application-specific validations, a dedicated treatment is necessary. Deep and professional knowledge of the Kubernetes domain is required to perform such validations, and we will take a short look at how to handle them easily using tools developed by Kubernetes domain experts.

For example, locking down hostPath mount permissions makes sure a container in the cluster with a writable hostPath volume is not accessed by attackers, as they may then gain persistence on the underlying host. This does not follow security best practices, and to avoid the issue, we should always ensure that the readOnly section under the hostPath attribute is set to true.

Another example is granting hostNetwork access to pods only when it is necessary. All pods that have access should also be whitelisted. Unnecessary access to hostNetwork increases the potential attack surface area.

So, it can be seen from the two examples above that even after our config files have passed the structural and semantic validations, leading to our Kubernetes resources successfully being orchestrated, security and functional vulnerabilities may still exist. Hence, we must consider how best to capture these vulnerabilities before being alerted to the consequences in production. Security validation is the method to do this.

Get your Kubernetes Security Checklist now

Validation and Best Practices

Considering that structural validation is fairly simple, it can be expected that the tooling available is now readily available in your IDEs. Kubernetes semantics and security validations require a bit more effort, especially in terms of strategy and tooling.

Let’s consider some of the best practices and strategies these entail to achieve overall validation of our YAML files.

Always Shift Left

We may always validate our K8s configurations, even though we already deployed them in the cluster, using tools like Kubescape, but it’s always better to “shift left”. Shift left is a term that has blown up in the past five or six years, and rightly so. You want to bring validation closer to the delivery and build stage of your applications.

For example, semantic validation is something that Kubernetes itself can handle. However, by the time you leave it to Kubernetes to handle, it’s too late.

You could perform a dry run (kubectl apply -f --dry-run='server’') to validate the semantic structure, but this still is an additional step that could slow down your overall velocity. However, a dry run requires you to have access to the Kubernetes cluster.

An alternative to this approach is Kubeval, which is an amazing tool that can be used to validate your config file semantics to ensure they meet Kubernetes’ object definition requirements. It can be part of your CI process and perform scanning locally, thereby ensuring that you semantically validate your config files before you go to production.

Security validation can also be achieved in the CI using Kubescape. This is an open-source tool that ensures your Kubernetes application definitions follow multiple security frameworks such as NSA-CISA or MITRE ATT&CK®. By using the Kubescape CLI, you can scan all of your YAML files for security vulnerabilities and even get a risk score and risk trends. It acts as a YAML validator with its main value being security validation.

DevOps to DevSecOps

You can run a combination of tools already discussed in your CI pipeline to achieve structural, semantic, and security validation. However, just utilizing these tools and their predefined checks is not enough.

One thing we’ve learned from DevOps is that there is always a cycle of improvement that can be adopted. That is why as your application evolves, and your security needs change, you should constantly review your security controls.

New security controls are something you should incorporate into your security validation steps. Kubescape, an open-source platform by AMRO, allows you to define your own framework of controls. Even though the out-of-the-box framework is robust, you will see your policy controls take shape per the exact needs of your business and Kubernetes resources.

Such a practice can only be achieved if security concerns are embedded into the culture of how you’re building apps. Thanks to open-source tools such as Kubeval and Kubescape, the barriers to your development team continuously thinking about validation, and especially security validation, has been lowered.

Conclusion

YAML config files have made building Kubernetes applications quite simple. However, YAML does have its limitations in terms of validation. It is therefore necessary that we are aware of all validation strategies to ensure that we build our Kubernetes applications to be healthy and secure.

Validating the structure of a YAML file is fairly simple with YAML tests in any IDE, but validating the correctness of Kubernetes resource object definitions and the security measures around them is difficult. Luckily, tools such as Kubescape close the gap, allowing you to shit left and constantly think about security across your entire application lifecycle.

With security being one of the prime concerns of building containerized applications today, the validation strategies discussed here are one step in the right direction.