How to Use Kubernetes Audit Logs to Identify Potential Security Issues?

Dec 11, 2021

Amir Kaushansky
VP Product

Audit logging involves recording transactions and system events, making it an invaluable tool for regulatory compliance, digital forensics, and information security. In a typical Kubernetes ecosystem, auditing involves providing chronological, activity-relevant records that document events and actions in a cluster. Modern logging tools come with aggregation and analytical functionalities so that teams can use log data to mitigate security threats.

In this post, I’ll explain the importance of Kubernetes audit logging in cloud-native security. I’ll also cover best practices for maintaining secure audit files.

Introduction to Audit Logs

An audit trail is a time-stamped record of events and system changes that provides a comprehensive history of activities performed by users, workloads, and cluster services. Audit logs are crucial for Kubernetes security because they document the activities that might affect an application’s behavior, including when an operation took place, which components were called, and which users were responsible. By developing effective audit logging, you establish a foundation for accountability, security, and compliance.

Purpose of Audit Logs in Administering Security

While audit logging can be used for analysis and for identifying trends over time, organizations most commonly use it to monitor Kubernetes cluster performance and enforce security.

Audit logs fundamentally help in the following areas:

Security Threat Detection

Audit logs record each activity that occurs in the cluster. For each activity, they add metadata such as the IP address from which the request originated, the user agent, and more. Solutions that consume the audit log and its metadata can look for indicators of attack (IoA) and enforce policies. For example, you can create a policy that allows changes to the production cluster only from the organization’s approved IP addresses; any action from outside this approved list raises an alert.
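The approved-IP policy described above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the allowlist ranges and the function names are hypothetical, and the only audit event fields it relies on are verb and sourceIPs.

```python
import ipaddress

# Hypothetical allowlist; replace with your organization's approved ranges.
APPROVED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("203.0.113.0/24"),
]

# Verbs that change cluster state.
WRITE_VERBS = {"create", "update", "patch", "delete", "deletecollection"}


def is_approved(ip: str) -> bool:
    """Return True if the address falls inside any approved network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in APPROVED_NETWORKS)


def violates_ip_policy(event: dict) -> bool:
    """Return True if a write request came from an unapproved source IP."""
    if event.get("verb") not in WRITE_VERBS:
        return False
    source_ips = event.get("sourceIPs", [])
    # An event without any recorded source IP is treated as a violation.
    return not source_ips or not all(is_approved(ip) for ip in source_ips)
```

Requiring every address in sourceIPs to be approved is the stricter choice, because the list can also contain intermediate proxies. Depending on your network layout, you may prefer to check only the first entry.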

Security Incident Response and Investigation

Audit logging provides deep insight into a cluster’s actions and events, so it’s easy to reconstruct a problem if there’s a security incident. Teams can utilize audit trails to understand why, when, and how components of a cluster underperformed during operations. By understanding the conditions that lead to a security incident, security professionals can create enhanced monitoring, damage assessment, and remediation strategies.

Monitoring Compliance and Policy Violations

Organizations can use audit logs to stay in compliance with regulations, such as PCI DSS, SOC 2, HIPAA, or GDPR. Since the audit trail serves as an official record of system activity, organizations can take necessary actions to remove gaps based on this information, or they can share these records with security researchers and auditors for deeper analysis. Some regulatory bodies also accept audit logs as proof of compliance.

Abnormal Activity

Through forensic analysis and real-time alerts, log files help system administrators and security professionals identify malicious user actions and behavior. Audit trails can also flag unusual user and bot activity as it occurs, which helps with intrusion detection. Some solutions use UEBA (user and entity behavior analytics) to identify abnormal activity, such as a new user creating a large number of objects or the DevOps manager logging in to the system from an unusual location.
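As a rough illustration of the "user creating a lot of objects" case, the sketch below counts completed create requests per user and reports the users above a fixed threshold. A real UEBA product builds a behavioral baseline per user; the fixed threshold, its default value, and the function name here are simplifying assumptions.

```python
from collections import Counter


def users_exceeding_create_threshold(events, threshold=50):
    """Map each user with more than `threshold` completed creates to the count."""
    creates = Counter(
        event.get("user", {}).get("username", "<unknown>")
        for event in events
        # Count only the final stage so each request is counted once.
        if event.get("verb") == "create"
        and event.get("stage") == "ResponseComplete"
    )
    return {user: count for user, count in creates.items() if count > threshold}
```

In practice, you would run a check like this over a sliding time window rather than over the whole log.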

Types of Information Logged

Kubernetes audit records are generated by the kube-apiserver component. Every client request generates an audit event, which is processed using an audit policy before being written to the backend. Below is an outline of important fields covered in the audit log.

Client Requests and Server Responses

The audit log primarily records transactions between the Kubernetes API server and end users. As the server processes client requests, it writes certain information to the log file, including:

Username
Group
Source IP
Time of the request
Decision (allow/deny)
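The fields listed above can be pulled out of a single audit event as follows. The sample event is hypothetical and trimmed to the relevant fields; the sketch assumes the audit.k8s.io/v1 event layout, in which the authorization decision is stored under the authorization.k8s.io/decision annotation.

```python
# Hypothetical, trimmed audit event used only for illustration.
SAMPLE_EVENT = {
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "Metadata",
    "stage": "ResponseComplete",
    "verb": "list",
    "user": {"username": "jane", "groups": ["system:authenticated"]},
    "sourceIPs": ["10.1.2.3"],
    "requestReceivedTimestamp": "2021-12-11T10:00:00.000000Z",
    "annotations": {"authorization.k8s.io/decision": "allow"},
}


def summarize_event(event: dict) -> dict:
    """Extract who made the request, from where, when, and the decision."""
    user = event.get("user", {})
    annotations = event.get("annotations", {})
    return {
        "username": user.get("username"),
        "groups": user.get("groups", []),
        "source_ips": event.get("sourceIPs", []),
        "time": event.get("requestReceivedTimestamp"),
        "decision": annotations.get("authorization.k8s.io/decision"),
    }
```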

Account Activities

Audit logs capture important account activities and information, such as:

Successful authentication attempts
Failed login attempts
Use of application privileges
Changes to the account (e.g., deletion, creation, and privilege escalation)
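Two of these account activities are straightforward to filter for. The sketch below assumes that rejected requests appear with an HTTP 401 or 403 status in responseStatus.code, and that privilege changes appear as write requests against the rbac.authorization.k8s.io API group. The function names are illustrative.

```python
WRITE_VERBS = {"create", "update", "patch", "delete", "deletecollection"}


def rejected_requests(events, codes=(401, 403)):
    """Return events whose response was Unauthorized or Forbidden."""
    return [
        event for event in events
        if event.get("responseStatus", {}).get("code") in codes
    ]


def rbac_changes(events):
    """Return write requests that target RBAC objects (roles and bindings)."""
    return [
        event for event in events
        if event.get("verb") in WRITE_VERBS
        and event.get("objectRef", {}).get("apiGroup") == "rbac.authorization.k8s.io"
    ]
```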

Usage Information

The audit log records the original request that the user was asking the API server to perform.

Audit Policy

In Kubernetes, you need to pass the --audit-policy-file flag to the API server for an audit policy to take effect. The policy is an object that defines rules about which events should be logged and what data the records should include. When an event is processed, Kubernetes compares it against the list of rules in order, and the first matching rule sets the audit level of the event. A sample audit policy specification would look similar to the following:

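apiVersion: audit.k8s.io/v1
kind: Policy
# Don't generate audit events for any request in the RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at the RequestResponse level.
  - level: RequestResponse
    resources:
    - group: ""
      resources: ["pods"]
  # Log "pods/log" and "pods/status" at the Metadata level.
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader".
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by "system:kube-proxy" on endpoints or services.
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: ""
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*"
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: ""
      resources: ["configmaps"]
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: ""
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: ""
    - group: "extensions"

  # Catch-all rule: log all other requests at the Metadata level.
  - level: Metadata
    omitStages:
      - "RequestReceived"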
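Because the first matching rule wins, the order of the rules matters. The following Python sketch imitates that first-match evaluation for a few rules from the sample policy. It is a simplified model, not the API server's implementation: it only checks users, verbs, namespaces, and resources, and it ignores userGroups, nonResourceURLs, resourceNames, and wildcards. The request dictionary shape is an assumption made for this example.

```python
# A subset of the sample policy rules, written as Python dictionaries.
RULES = [
    {"level": "None", "users": ["system:kube-proxy"], "verbs": ["watch"],
     "resources": [{"group": "", "resources": ["endpoints", "services"]}]},
    {"level": "Request", "namespaces": ["kube-system"],
     "resources": [{"group": "", "resources": ["configmaps"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["secrets", "configmaps"]}]},
    {"level": "Metadata"},  # catch-all
]


def rule_matches(rule: dict, req: dict) -> bool:
    """Check one rule; an absent or empty field in the rule matches anything."""
    if rule.get("users") and req.get("user") not in rule["users"]:
        return False
    if rule.get("verbs") and req.get("verb") not in rule["verbs"]:
        return False
    if rule.get("namespaces") and req.get("namespace") not in rule["namespaces"]:
        return False
    if rule.get("resources"):
        return any(
            group.get("group", "") == req.get("group", "")
            and (not group.get("resources") or req.get("resource") in group["resources"])
            for group in rule["resources"]
        )
    return True


def audit_level(rules: list, req: dict, default: str = "None") -> str:
    """Return the level of the first matching rule, or the default if none match."""
    for rule in rules:
        if rule_matches(rule, req):
            return rule["level"]
    return default
```

With these rules, a configmap update in kube-system is logged at the Request level, while the same update in another namespace falls through to the Metadata rule. Swapping the two rules would make the kube-system rule unreachable.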

Audit Levels

None: Don’t log events that match this rule.
Metadata: Log request metadata (requesting user, timestamp, resource, verb, etc.) but not the request or response body.
Request: Log event metadata and the request body but not the response body. This does not apply to non-resource requests.
RequestResponse: Log event metadata, request, and response bodies. This does not apply to non-resource requests.

Storing Audit Logs

Kubernetes provides two options for saving the audit log:

Filesystem
Webhook (sends events to a third party using HTTP)

Filesystem

If you are saving the audit log to a local filesystem, you need to pass the following flags to the API server:

--audit-log-path specifies the log file path that the log backend uses to write audit events. Not specifying this flag disables the log backend. "-" means standard output.
--audit-log-maxage defines the maximum number of days to retain old audit log files.
--audit-log-maxbackup defines the maximum number of audit log files to retain.
--audit-log-maxsize defines the maximum size in megabytes of the audit log file before it gets rotated.
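The log backend writes one JSON-encoded event per line, so the resulting file can be read with a short script. The following is a minimal sketch; the function name is illustrative, and the path you pass in is whatever you set in the audit log path flag.

```python
import json
from typing import Iterator


def read_audit_log(path: str) -> Iterator[dict]:
    """Yield audit events from a JSON-lines audit log file."""
    with open(path, encoding="utf-8") as log_file:
        for line in log_file:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # Skip a truncated line, e.g. one still being written.
                continue
```

Because the function is a generator, it processes large log files without loading them fully into memory.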

Third-Party Location

If you are sending the audit logs to a third-party system, you need to pass the following flags to the API server:

--audit-webhook-config-file specifies the path to a file with a webhook configuration. The webhook configuration is effectively a specialized kubeconfig (see the K8s documentation for more details).
--audit-webhook-initial-backoff specifies the amount of time to wait after the first failed request before retrying. Subsequent requests are retried with exponential backoff.
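On the receiving side, the third-party system gets the events as JSON over HTTP. The sketch below shows only the payload handling and leaves out the HTTP server and authentication. It assumes that each request body is an audit.k8s.io/v1 EventList whose items field holds the individual events; the function name is hypothetical.

```python
import json


def events_from_webhook_body(body: bytes) -> list:
    """Parse a webhook request body and return the audit events it carries."""
    payload = json.loads(body)
    if payload.get("kind") != "EventList":
        raise ValueError(f"unexpected payload kind: {payload.get('kind')!r}")
    return payload.get("items", [])
```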

Optimizing Audit Logs

You can configure the K8s API server to buffer audit events and save or stream them in batches. You can also define the buffer size, the maximum batch size, the maximum time the API server waits before sending a batch that is not yet full, the number of batches per second, and, in the case of a third-party system, the throttling burst (the number of batches generated at the same moment).

Your API server might receive many requests per second and need to process and save or transmit a large number of audit records. If the batching parameters are set too low for that load, a burst of requests that the API server can’t handle will cause audit records to be dropped. The API server provides metrics that measure how often this happens. You can use these metrics to set the parameters correctly.
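A back-of-the-envelope calculation helps when choosing these parameters. The sketch below estimates how many events per second a batching backend can sustain and how long a burst can last before the buffer fills. The model is deliberately simplified (it ignores the throttling burst and the maximum wait time), and the numbers in the usage note are illustrative values, not recommendations.

```python
def sustained_events_per_second(batch_max_size: int, throttle_qps: float) -> float:
    """Upper bound on events drained per second: full batches times batches/s."""
    return batch_max_size * throttle_qps


def seconds_until_buffer_full(
    buffer_size: int,
    incoming_events_per_second: float,
    batch_max_size: int,
    throttle_qps: float,
) -> float:
    """Time an empty buffer survives a constant load before events are dropped."""
    drain_rate = sustained_events_per_second(batch_max_size, throttle_qps)
    surplus = incoming_events_per_second - drain_rate
    if surplus <= 0:
        return float("inf")  # The backend keeps up; the buffer never fills.
    return buffer_size / surplus
```

For example, with a buffer of 10,000 events, batches of up to 400 events, and 10 batches per second, the backend drains at most 4,000 events per second. A sustained load of 5,000 events per second would fill the buffer in 10 seconds, after which events start to be dropped.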

For more information, read: https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#batching

Protect Audit Logs

Logs are only helpful if they are secure and untampered. A Kubernetes audit log becomes less effective if the information it records can be deleted or altered. Because logs are essentially JSON files, they are commonly susceptible to theft, alteration, or corruption. Some practices that organizations can embrace to protect log files include:

Log file encryption
Setting specific authorization requirements/permissions for log file access
Exporting and backing up logs to external systems
Access control for administrators
Alerts for log deletion, shutdown, and alteration (monitoring the policy file)
Journaling and archiving
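One simple way to detect alteration of an archived (rotated) log file is to record a cryptographic digest when the file is archived and compare it later. The sketch below uses SHA-256 from the Python standard library. The function names are illustrative, and the approach only helps if the recorded digest is stored somewhere an attacker cannot change it.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as log_file:
        for chunk in iter(lambda: log_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: str, expected_digest: str) -> bool:
    """Return True if the file still matches the digest recorded earlier."""
    return file_sha256(path) == expected_digest
```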

Avoid Saving Logs Locally

Attackers target log files to keep their activities undetected. As a best practice, record logs on a remote server so that they are harder for attackers to access. Use the webhook option to stream the audit log records to a third-party solution that not only stores the records remotely, as required by some compliance frameworks, but also protects them with additional security capabilities such as policy enforcement, threat detection, abnormal activity detection, and incident response.

Summary

A Kubernetes API server can audit all the requests it gets. Audit logging helps organizations gain visibility into these ecosystems, enabling regulatory compliance and security. You can also use it as another security layer, as it is unintrusive and, with a well-tuned audit policy, has little effect on the performance of your cluster and applications.