< back to blog

Agentic AI Tooling: Why Runtime Security Is the Missing Layer

Alejandro Magallon
Agentic AI Tooling: Why Runtime Security Is the Missing Layer
Published by:
Alejandro Magallon
Agentic AI Tooling: Why Runtime Security Is the Missing Layer
Published:
May 19, 2026
falco feeds by sysdig

Falco Feeds extends the power of Falco by giving open source-focused companies access to expert-written rules that are continuously updated as new threats are discovered.

learn more
Green background with a circular icon on the left and three bullet points listing: Automatically detect threats, Eliminate rule maintenance, Stay compliant, with three black and white cursor arrows pointing at the text.

A developer asks an AI coding agent to refactor a microservice. Within seconds, the agent opens source files, executes shell commands, calls external APIs, and modifies cloud configuration. No human approves each step. No security tool monitors what the agent does between receiving the prompt and delivering the result.

This is not theoretical. Over the past year, researchers and security teams have disclosed a growing set of real-world attacks against agent infrastructure (from MCP tool poisoning to credential theft through coding agents) and industry surveys consistently find that most organizations have little visibility into the machine-to-machine traffic their agents generate. The tooling ecosystem powering these agents is maturing fast. The security controls governing them are not.

The stack at a glance

Agent infrastructure spans five layers, each introducing distinct attack surface:

  • Model Context Protocol (MCP): Standard interface for agents to discover and interact with external systems, each exposing a set of tools (callable functions the agent can invoke). Every MCP server an agent can reach is part of its effective attack surface.
  • Skills: Handbooks that shape agent behavior (methodology, tool ordering, escalation paths). Effectively policy-as-code. Tampering with skills changes what the agent does, not just what it can do.
  • Agent SDKs (Anthropic, LangChain, CrewAI): Frameworks where guardrails live or are delegated entirely to the developer. The SDK determines what's enforced by default.
  • Managed Agent Platforms (Claude Managed Agents, Amazon Bedrock Agents): Solve orchestration and sandboxing but not behavioral monitoring. Sandbox boundaries are not visibility boundaries.
  • Orchestration Layers: Coordinate multi-agent workflows. Delegation chains create lateral movement paths invisible to traditional network monitoring.

What this ecosystem does not solve is visibility into what agents do after they start executing.

How agents get compromised

The attack surface is not theoretical. MITRE ATLAS, extended through a 2025 collaboration with Zenity Labs, now catalogues a set of agent-specific techniques (AML.T0080–T0086), and the 2025 OWASP Top 10 for LLM Applications dedicates entry LLM06 to Excessive Agency, the risks of granting LLMs unchecked autonomy and tool access.

MCP tool poisoning

Invariant Labs disclosed this class of attack in April 2025. MCP tool descriptions are free-text fields that the LLM reads to understand how to use a tool. A malicious server can embed adversarial instructions directly in those descriptions, invisible to the user, processed by the model:

The user sees weather_lookup and approves. The LLM reads the full description, including the hidden payload, silently reads the SSH key, and passes it to the attacker-controlled server. Tool descriptions are fetched dynamically: they can change after user approval. A related variant, tool shadowing, injects descriptions that modify agent behavior with respect to trusted tools on other connected servers: cross-server exfiltration through the agent's own trust chain.

Indirect prompt injection via tool responses

An agent calls a legitimate tool, like a web browser, a file reader or a database query. The returned content contains hidden adversarial instructions:

Tool outputs and instructions share the same context window. The agent then uses a legitimate tool (email, HTTP request, Slack message) to exfiltrate data. This turns prompt injection from an information leak into an action execution vulnerability.

The EchoLeak incident, disclosed in June 2025 by Aim Security as CVE-2025-32711, demonstrated exactly this pattern: a crafted email injected instructions into Microsoft 365 Copilot, which then exfiltrated sensitive data via markdown rendering to an attacker-controlled server, bypassing Content Security Policy through a Microsoft-approved domain.

Credential theft via coding agents

AI coding agents in CI/CD pipelines routinely have shell access and run with the credentials of the CI system. A malicious code comment in a pull request can exploit this:

The agent, following its training to be helpful, executes the command. The script reads environment variables, harvests credentials from .git/config, exfiltrates everything, and cleans up.

This pattern was publicly demonstrated in July 2025 by researchers at General Analysis against a Supabase + Cursor setup: a prompt injection embedded in a support ticket caused the Cursor agent (running with Supabase service_role privileges) to read private tables and exfiltrate integration tokens back through the same support channel.

Threat model: Attack vectors by ecosystem layer

Every layer of the agentic stack introduces specific attack vectors. Mapping them to detection strategies is the first step toward a security posture that matches the architecture.

 

Ecosystem Layer

Attack Vector

MITRE ATLAS

Detection Strategy

MCP

Tool poisoning via hidden description payloads

AML.T0081

Monitor for sensitive file reads unrelated to task; audit tool descriptions at registration

MCP

Cross-server exfiltration through tool shadowing

AML.T0086

Detect outbound connections to unexpected endpoints from agent containers

Skills

Skill manipulation to alter agent behavioral policy

AML.T0081

Version-control skills; detect runtime modification of skill files

Agent SDKs

Prompt injection exploiting framework-level trust

AML.T0051

Tool-call auditing at SDK layer; input/output sanitization

Managed Platforms

Anomalous behavior within sandbox boundaries

AML.T0055

Runtime syscall monitoring inside sandbox; capability-scoped containers

Orchestration

Privilege escalation via agent delegation chains

AML.T0086

Monitor agent-to-agent credential passing; enforce delegation scope limits

Orchestration

Memory poisoning for persistent manipulation

AML.T0080

Audit agent memory stores; detect unexpected memory writes between sessions

 

Why baselines fail (and what works instead)

There is a hard truth most vendor content avoids: traditional behavioral baselines do not work for AI agents.

Container security relies on deterministic profiling. A web server always calls the same syscalls, reads the same files and connects to the same hosts. Deviations from baseline are anomalies. This is the foundation of drift detection and most runtime security products, including Sysdig's, which is why we’re extending the model for agentic workloads.

AI agents break this model. The same agent given the same task takes different action sequences each time. Temperature settings introduce randomness. User-driven prompts create unbounded variability. And the very actions you want to detect, like shell execution, file reads or outbound connections, are the agent's normal operating behavior. Worse: unlike traditional code, the decision-making lives in model weights and a context window you cannot inspect. There is no static analysis path, no policy you can write upfront that anticipates every input. What the agent does at runtime is the only ground truth.

So, agent behavior is nondeterministic, and traditional controls assume predictable execution. But this doesn't mean runtime security is useless for agents. It means the approach must be layered:

  • Known-bad detection: An agent reading SSH keys, connecting to known-malicious IPs, or executing post-exploitation tools is suspicious regardless of non-determinism. Syscall-level rules catch these reliably.
  • Capability scoping: Restricting agent containers with seccomp, AppArmor, network policies and read-only filesystems means detection monitors sandbox violations rather than behavioral deviations. The baseline becomes the sandbox boundary itself.
  • Tool-call-level auditing: Monitoring at the agent framework layer (which tools were called, with what arguments, in response to what prompt) provides the semantic context that syscall monitoring lacks.

Here's what known-bad detection looks like in practice using Falco, the open-source runtime security engine. An agent process reads cloud credentials unrelated to its task, a pattern consistent with MCP tool poisoning or prompt injection:

- rule: AI Agent Reads Sensitive Credentials
  desc: AI agent process reading cloud credentials, SSH keys, or API tokens.
  condition: >
    evt.type in (open, openat, openat2) and evt.dir = <
    and container.image.repository contains "ai-agent"
    and (fd.name startswith /root/.aws
         or fd.name startswith /root/.ssh
         or fd.name endswith .env
         or fd.name contains credentials)
  output: >
    AI agent read sensitive file
    (file=%fd.name command=%proc.cmdline container=%container.name)
  priority: CRITICAL
  tags: [container, ai, mitre_credential_access, AML.T0055]

Similar rules catch unexpected outbound connections from agent containers and writes outside the designated workspace, each mapping to the attack vectors in the table above. Agent behavior might be non-deterministic at the LLM layer, but every agent action ultimately produces deterministic infrastructure artifacts, like syscalls, file operations or network connections. Runtime security operates at that deterministic layer.

What security teams should do now

  • Instrument the runtime, especially inside sandboxes. Deploy syscall-level detection on every environment where agents operate. Managed platforms don't solve this for you.
  • Audit your MCP surface. Inventory every MCP server your agents can reach. Treat tool descriptions as untrusted input. Monitor for description changes after initial approval.
  • Scope capabilities aggressively. Use seccomp, AppArmor, network policies, and read-only filesystems to constrain agent containers. When baselines are unreliable, the sandbox boundary becomes the baseline.
  • Treat agent delegation as lateral movement. Multi-agent architectures create trust chains invisible to network monitoring. Log what agents delegate, to whom, and whether delegated tasks exceed the original scope.

Operationalizing this at scale with Sysdig

The Falco rule shown above is illustrative, a starting point, not a production ruleset. Running detection across a real agent fleet means keeping rules current as attack patterns evolve (new MCP servers, new SDK releases, fresh CVEs), mapping each alert to the right MITRE ATLAS technique, and correlating runtime signals with the rest of your cloud security posture. This is where Sysdig AI Workload Security extends what open-source Falco provides.

  • Managed detection rules for agentic workloads. Sysdig's Threat Research Team curates Falco Feeds, which is a continuously updated ruleset mapped to MITRE ATLAS  (including the AML.T0080–T0086 agent-specific techniques). Rules tuned for MCP tool poisoning, prompt injection exfiltration, and coding-agent credential theft arrive pre-packaged, pre-tested, and kept up to date.
  • Agent runtime inventory. Sysdig AI Workload Security shows where AI engines and packages are running in your environment (across providers like OpenAI, Anthropic, and Bedrock). Each workload is correlated with its in-use AI packages, their exploitable vulnerabilities, public-exposure status, and the runtime events firing on it.
  • Response integrated with your existing workflow. Detections route through the same incident pipeline your team already uses: isolate the container, revoke the agent's credentials, page the right on-call, or trigger an automated playbook.

Open-source Falco remains the foundation: the detection layer, the rule language and the community. What Sysdig adds is the scale, curation, and integration needed to run it as a production-grade program rather than a project.

The layer where anomalies become visible

Every other layer of your stack assumes deterministic behavior. Agents break that assumption. Security tooling has to break its assumptions too: not by chasing baselines that no longer hold, but by moving detection to the layer where agents, opaque at the LLM level but observable at the system level, still produce observable artifacts.

See how Sysdig applies runtime detection to AI agent workloads. Request a demo.

About the author

Cloud detection & response
Cloud Security
featured resources

Test drive the right way to defend the cloud
with a security expert