Jul 1, 2024
A high-severity remote code execution (RCE) vulnerability in OpenSSH’s server (CVE-2024-6387, dubbed regreSSHion) has been found by the Qualys research team. The issue is especially concerning because it reintroduces a problem that was originally fixed in 2006, showing that even one of the most widely used pieces of security software can still harbor hidden bugs. The discovery follows another major vulnerability found in the XZ Utils library just a few months earlier, highlighting ongoing security challenges. Although CVE-2024-6387 is a serious flaw, it is very hard to exploit in practice, and at the time of writing there are no reports of it being used to attack remote machines in the wild. Even so, it is crucial to understand the risk and take steps to protect your systems. SSH is widely used to access Kubernetes nodes and is sometimes also run inside Kubernetes workloads, so the flaw has a significant impact on cloud systems.
A new vulnerability has been discovered in OpenSSH’s server (sshd): a signal handler race condition. It is triggered when a client fails to authenticate within the LoginGraceTime period, which defaults to 120 seconds in recent versions and 600 seconds in older ones. When this timeout expires, sshd’s SIGALRM handler runs asynchronously. The problem is that this handler calls functions, such as syslog(), that are not async-signal-safe, meaning they are not safe to call from within a signal handler. The race condition affects sshd in its default configuration.
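What "safe to call from within a signal handler" means in practice can be sketched in C. The code below is a minimal illustration and not sshd’s actual code: the function names and the log message are hypothetical. The handler limits itself to write() and _exit(), both of which POSIX lists as async-signal-safe, and avoids syslog(), malloc(), free() and printf(), which are not.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical timeout handler: only async-signal-safe calls are used. */
static void grace_timeout_handler(int signo)
{
    static const char msg[] = "login grace time exceeded\n";
    (void)signo;
    if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
        /* Nothing safe to do about a failed write here. */
    }
    _exit(1); /* _exit() skips atexit handlers and stdio flushing */
}

/* Arm a SIGALRM-based deadline. Returns 0 on success, -1 on error. */
int arm_grace_timer(unsigned int seconds)
{
    struct sigaction sa;

    assert(seconds > 0); /* alarm(0) would cancel instead of arming */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = grace_timeout_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;
    alarm(seconds);
    return 0;
}

/* Cancel the deadline, for example once authentication succeeds.
 * Returns the number of seconds that were still left on the timer. */
unsigned int disarm_grace_timer(void)
{
    return alarm(0);
}
```

A caller would arm the timer when a connection is accepted and disarm it after a successful login. The key point is that nothing reachable from the handler touches the heap or any lock that the interrupted code might hold.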
Interestingly, this vulnerability is not entirely new. It is a regression of a previously identified issue, CVE-2006-5051, reported by Mark Dowd in 2006. That earlier vulnerability also involved a signal handler race condition in OpenSSH versions prior to 4.4, which could lead to a denial of service or potentially allow remote code execution.
The regression was introduced in October 2020 by a commit that shipped in OpenSSH 8.5p1. During a rework of the logging infrastructure, a crucial directive (#ifdef DO_LOG_SAFE_IN_SIGHAND) was inadvertently removed from the sigdie() function. This function, which is called directly by sshd’s SIGALRM handler, thereby became unsafe again. To break it down:
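OpenSSH before 4.4p1: vulnerable to this signal handler race condition, unless patched for CVE-2006-5051.
OpenSSH from 4.4p1 up to, but not including, 8.5p1: not vulnerable, thanks to the #ifdef DO_LOG_SAFE_IN_SIGHAND, which made sigdie() call _exit(1) safely.
OpenSSH from 8.5p1 up to, but not including, 9.8p1: vulnerable again, due to the accidental removal of the #ifdef DO_LOG_SAFE_IN_SIGHAND directive.
The implications of this vulnerability are particularly severe on glibc-based Linux systems, where syslog() itself may invoke other unsafe functions like malloc() and free(). This creates a scenario where an attacker can potentially execute arbitrary code as root without needing to authenticate, because the affected sshd code runs with full privileges and is not sandboxed.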
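For triage, the affected ranges can be encoded in a small helper. This is a sketch under stated assumptions: it uses the upstream boundaries from the Qualys advisory (before 4.4p1 vulnerable unless patched, 4.4p1 up to 8.5p1 safe, 8.5p1 up to 9.8p1 vulnerable, 9.8p1 and later fixed), the type and function names are hypothetical, and it ignores the portable "p" suffix. Distribution packages often backport fixes without changing the upstream version number, so a result of "affected" means "check the vendor advisory", not "certainly exploitable".

```c
#include <assert.h>
#include <stdbool.h>

enum regresshion_status {
    SSH_NOT_AFFECTED,        /* 4.4 <= version < 8.5, or version >= 9.8 */
    SSH_AFFECTED_LEGACY,     /* version < 4.4 (original CVE-2006-5051) */
    SSH_AFFECTED_REGRESSION  /* 8.5 <= version < 9.8 (CVE-2024-6387) */
};

/* True if major.minor sorts strictly before ref_major.ref_minor. */
static bool version_before(int major, int minor, int ref_major, int ref_minor)
{
    return major < ref_major || (major == ref_major && minor < ref_minor);
}

/* Classify an upstream OpenSSH version, e.g. 9.2p1 is (9, 2). */
enum regresshion_status classify_openssh(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    if (version_before(major, minor, 4, 4))
        return SSH_AFFECTED_LEGACY;
    if (version_before(major, minor, 8, 5))
        return SSH_NOT_AFFECTED;
    if (version_before(major, minor, 9, 8))
        return SSH_AFFECTED_REGRESSION;
    return SSH_NOT_AFFECTED;
}
```

For example, classify_openssh(9, 2) reports the regression, while classify_openssh(8, 4) and classify_openssh(9, 8) report an unaffected version.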
Exploiting the signal handler race condition in OpenSSH requires a deep understanding of timing attacks and memory manipulation. The following section explains the steps an attacker would take, along with example code that illustrates the process.
First, the attacker opens multiple connections to the target OpenSSH server and deliberately lets each one reach the LoginGraceTime limit without completing authentication. This causes the server to raise the SIGALRM signal. Exploitation hinges on the signal arriving at the exact moment the server is in the middle of a non-async-signal-safe operation, so that the handler’s own call to a function such as syslog() operates on inconsistent state. To set this up, the attacker sends specially crafted inputs that shape the server’s memory layout, ultimately leading to heap corruption.
By manipulating the server’s memory, the attacker can leave the heap in an inconsistent state. This is achieved by having the SIGALRM signal fire while the server is inside a memory allocation or deallocation function such as malloc() or free(). Exploiting this vulnerability is not straightforward and requires around 10,000 attempts on average. Each new connection starts a fresh LoginGraceTime timer, giving the attacker another window in which to win the race.
During the exploitation process, the attacker adjusts the timing of their inputs based on feedback from previous attempts, fine-tuning it until the signal lands at the critical moment. Modern systems have defenses such as Address Space Layout Randomization (ASLR) and No-eXecute (NX) to hinder such exploits. The attacker relies on predictable memory patterns and precise timing to get around these protections. Successful exploitation allows the attacker to overwrite critical memory structures, leading to arbitrary code execution and, ultimately, remote control of the server with root privileges.
Below is an example, taken from a GitHub repository, that outlines the steps an attacker might take to exploit this vulnerability:
int perform_exploit(const char *ip, int port)
{
    int success = 0;
    double parsing_time = 0;
    double timing_adjustment = 0;

    for (int base_idx = 0; base_idx < NUM_GLIBC_BASES && !success; base_idx++) {
        uint64_t glibc_base = GLIBC_BASES[base_idx];
        printf("Attempting exploitation with glibc base: 0x%lx\n", glibc_base);

        for (int attempt = 0; attempt < 10000 && !success; attempt++) {
            if (attempt % 1000 == 0) {
                printf("Attempt %d of 10000\n", attempt);
            }

            int sock = setup_connection(ip, port);
            if (sock < 0) {
                fprintf(stderr, "Failed to establish connection, attempt %d\n", attempt);
                continue;
            }

            if (perform_ssh_handshake(sock) < 0) {
                fprintf(stderr, "SSH handshake failed, attempt %d\n", attempt);
                close(sock);
                continue;
            }

            prepare_heap(sock);
            time_final_packet(sock, &parsing_time);

            // Implement feedback-based timing strategy
            parsing_time += timing_adjustment;

            if (attempt_race_condition(sock, parsing_time, glibc_base)) {
                printf("Possible exploitation success on attempt %d with glibc base 0x%lx!\n",
                       attempt, glibc_base);
                success = 1;
                // In a real exploit, we would now attempt to interact with the shell
            } else {
                // Adjust timing based on feedback
                timing_adjustment += 0.00001; // Small incremental adjustment
            }

            close(sock);
            usleep(100000); // 100ms delay between attempts
        }
    }
    return success;
}
The following practical examples are taken from the research paper published by Qualys and illustrate how this vulnerability can be exploited in different OpenSSH versions:
In OpenSSH 3.4p1 on Debian, the exploit interrupts a free() call with SIGALRM during public-key parsing. This leaves the heap in an inconsistent state, which is then exploited in another free() call inside the SIGALRM handler. Approximately 10,000 attempts are required on average; with 10 connections accepted every 600 seconds, that takes about a week.
In OpenSSH 4.2p1 on Ubuntu, the exploit interrupts a pam_start() call with SIGALRM, leaving PAM’s structures in an inconsistent state. This state is then exploited during a pam_end() call inside the SIGALRM handler. About 10,000 attempts are needed on average; with 10 connections accepted every 120 seconds, that takes around 1-2 days.
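In OpenSSH 9.2p1 on Debian, the exploit interrupts a malloc() call with SIGALRM during public-key parsing, leaving the heap in an inconsistent state. This is exploited during another malloc() call inside the SIGALRM handler, more precisely inside syslog(). Approximately 10,000 attempts are required on average to win the race. With 100 connections accepted every 120 seconds, that takes around 3-4 hours, and obtaining a root shell takes around 6-8 hours, because ASLR means the attacker’s guess of the glibc address is correct only about half of the time.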
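The durations quoted for the three cases follow from simple arithmetic: the server only accepts a limited number of unauthenticated connections per LoginGraceTime window, so the number of windows needed is the number of attempts divided by the connections per window. The helper below is a back-of-the-envelope sketch with a hypothetical function name. The figure of 100 connections per 120 seconds for the 9.2p1 case comes from the Qualys advisory.

```c
#include <assert.h>

/* Average wall-clock seconds needed to make `attempts` tries when the
 * server accepts `conns_per_window` unauthenticated connections every
 * `window_seconds` (the LoginGraceTime). */
double expected_attack_seconds(double attempts,
                               double conns_per_window,
                               double window_seconds)
{
    assert(conns_per_window > 0);
    return attempts / conns_per_window * window_seconds;
}
```

With 10,000 attempts, this gives 600,000 seconds (about 6.9 days) for 10 connections per 600 seconds, 120,000 seconds (about 1.4 days) for 10 connections per 120 seconds, and 12,000 seconds (about 3.3 hours) for 100 connections per 120 seconds. The last figure roughly doubles once the attacker also has to guess the glibc address.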
The researchers worked primarily against virtual machines under mostly stable network conditions. Although significant progress has been made, further improvements are expected, particularly for newer amd64 systems, where stronger ASLR makes exploitation harder. The discovery of a related bug report led the researchers to contact the OpenSSH developers immediately, highlighting the importance of swift action in addressing such vulnerabilities.
To address this vulnerability, upgrade to OpenSSH 9.8p1 or later, or to a distribution package that includes the backported fix. Keeping your OpenSSH installation current is essential for protection against known vulnerabilities.
If an upgrade is not immediately possible, you can mitigate the risk by setting the LoginGraceTime parameter to 0 in the OpenSSH configuration file. This disables the login timeout, so the vulnerable SIGALRM handler is never invoked. The trade-off is that unauthenticated connections are then never timed out, so an attacker can occupy all available connection slots and cause a denial of service.
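As a configuration fragment, the mitigation is a single directive. The path shown is the common default location and is an assumption; your distribution may keep the file elsewhere or use drop-in files.

```
# /etc/ssh/sshd_config (assumed default path)
# Disable the login grace timeout so the SIGALRM handler never runs.
LoginGraceTime 0
```

sshd has to be reloaded or restarted for the change to take effect. Running sshd -T prints the effective configuration and can be used to confirm the value.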
Using seccomp (secure computing mode) adds a further layer of defense. Seccomp restricts the system calls that the sshd process can make, limiting the attack surface and reducing what an attacker can do after exploiting unsafe functions like syslog(). With seccomp in place, an attacker’s ability to run arbitrary code is significantly constrained even if the vulnerability is triggered.
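To show the mechanism rather than a production profile, the following Linux-only sketch installs a tiny seccomp BPF filter in a child process and confirms that the kernel kills the child when it makes a blocked system call. The function names are hypothetical, and the filter is deliberately simplistic: it blocks a single harmless call (getppid) and allows everything else. A real profile would be an allowlist and would also validate the architecture field of seccomp_data.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Install a demo filter: kill on getppid(2), allow all other calls. */
static int install_demo_filter(void)
{
    struct sock_filter insns[] = {
        /* Load the system call number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* If it is getppid, fall through to KILL; otherwise skip to ALLOW. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Returns 1 if the filtered child was killed with SIGSYS on the blocked
 * call, 0 if it survived, and -1 if the sandbox could not be set up. */
int blocked_syscall_kills_child(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_demo_filter() != 0)
            _exit(2);            /* filter could not be installed */
        syscall(SYS_getppid);    /* blocked: the kernel kills the child */
        _exit(0);                /* reached only if the filter did not fire */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status) || WIFSIGNALED(status));
    if (WIFSIGNALED(status))
        return WTERMSIG(status) == SIGSYS ? 1 : 0;
    return WEXITSTATUS(status) == 2 ? -1 : 0;
}
```

The same principle applies to sshd: once a filter is installed, a hijacked process cannot make calls outside the permitted set, whatever code the attacker manages to run.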
By following these mitigation strategies, you can secure your systems against this vulnerability and enhance the overall security of your OpenSSH configuration.
For Kubernetes users, the discovery of the regreSSHion vulnerability in OpenSSH’s server underscores potential risks within containerized environments. While Kubernetes itself isn’t directly affected, containers running OpenSSH servers could be vulnerable if not properly secured. To minimize the risk, Kubernetes users should consider implementing several key strategies. Firstly, deploying strict seccomp profiles can restrict the system calls available to containerized processes, thereby limiting the attack surface in case of exploitation attempts.
Additionally, enforcing comprehensive network policies within Kubernetes can help control traffic flow and prevent unauthorized access to vulnerable services. Regularly updating OpenSSH to the latest patched version is critical, as is closely monitoring security advisories and promptly applying fixes. Implementing least privilege principles for container permissions and ensuring strong authentication mechanisms are also essential steps to enhance overall security posture in Kubernetes environments. By adopting these proactive measures, Kubernetes users can effectively mitigate the potential impact of vulnerabilities like regreSSHion and maintain a secure container orchestration environment.
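As a starting point for the seccomp recommendation, a Pod can opt in to the container runtime’s default seccomp profile through its security context. The fragment below is a minimal sketch: the Pod name and image are hypothetical placeholders, and a stricter custom profile may be preferable for workloads that run sshd.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssh-bastion                      # hypothetical name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault               # use the runtime's default profile
  containers:
    - name: sshd
      image: registry.example.com/sshd   # hypothetical image
```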
Vulnerability management tools such as ARMO Platform or our open-source Kubescape project can significantly aid in detecting and managing vulnerabilities within Kubernetes clusters. ARMO Platform also proactively notifies users and offers additional filtering, such as the Exploit Prediction Scoring System (EPSS), which prioritizes vulnerabilities like regreSSHion based on their likelihood of exploitation. This ensures that critical vulnerabilities are addressed promptly while cutting noise and focusing resources on the most pressing security threats, which saves you time.
In conclusion, the discovery of the regreSSHion vulnerability in OpenSSH’s server highlights ongoing challenges in maintaining secure software environments. Despite its potential severity as a remote code execution flaw, exploiting it remains exceedingly difficult in practical scenarios.
System administrators are encouraged to stay vigilant, apply patches promptly, and implement additional safeguards to mitigate risks posed by such vulnerabilities. By staying informed and proactive, organizations can significantly enhance their resilience against emerging cybersecurity threats.