Vulnerability Analysis
Published: May 3, 2026

Securing MCP Tool-Calling

“The most critical vulnerability in autonomous agents is not the model itself, but the authority granted to its tool-calling interface.”

Executive Summary

The Model Context Protocol (MCP) has rapidly become a common standard for connecting AI agents to local and remote resources. However, the protocol exposes tools through raw JSON-RPC calls, which creates a significant attack surface. Without a dedicated security layer such as McpVanguard, agents are susceptible to prompt injection, Server-Side Request Forgery (SSRF), and path traversal.

Prompt injection is listed as LLM01 in the OWASP Top 10 for LLM Applications and as a primary risk vector in the NIST AI Risk Management Framework 1.0. In the context of MCP deployments, the attack surface extends beyond the model itself to the entire tool-calling interface.

The Injection Vector

Prompt injection occurs when an attacker manipulates the agent's context to execute unauthorized tool calls. In an MCP environment, this often manifests as:

  • 01. Unsanitized user input being passed directly to `read_file` or `run_command` arguments.
  • 02. Agents being manipulated into unsafe tool sequences or induced to call sensitive tools outside of policy.
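
To make the first case concrete, the following is a minimal sketch of a check that could run before a `read_file` argument reaches the file system. The sandbox root `ALLOWED_ROOT` and the helper `is_path_allowed` are hypothetical names chosen for illustration; they are not part of MCP or McpVanguard.

```python
from pathlib import Path

# Hypothetical sandbox root (an assumption); a real deployment would configure this.
ALLOWED_ROOT = Path("/srv/agent-workspace")


def is_path_allowed(requested: str, root: Path = ALLOWED_ROOT) -> bool:
    """Return True only if `requested` resolves to a location inside `root`."""
    root_resolved = root.resolve()
    # Joining an absolute path replaces the root entirely, and resolve()
    # collapses ".." segments, so both escape routes end up outside the root.
    candidate = (root_resolved / requested).resolve()
    return candidate == root_resolved or root_resolved in candidate.parents


if __name__ == "__main__":
    for arg in ("notes/todo.txt", "../../etc/passwd", "/etc/passwd"):
        print(arg, "->", "allow" if is_path_allowed(arg) else "deny")
```

Resolving the path before comparing it matters here: a plain string-prefix check on the raw argument would accept `notes/../../etc/passwd`.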

Mitigation Strategies

To establish a zero-trust posture, we advocate for a three-layer defense-in-depth strategy:

Signature Filtering

Deterministic blocking of known malicious patterns using YAML signatures.
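
As a rough illustration of this layer, the sketch below matches tool-call arguments against a small set of regular-expression signatures. The signature ids and patterns are invented for this example and are written as the Python structure a YAML file might be loaded into; the actual signature schema is not specified in this paper.

```python
import json
import re

# Hypothetical signatures (assumed for illustration), shaped like the
# list of mappings a YAML signature file could deserialize into.
SIGNATURES = [
    {"id": "path-traversal", "pattern": r"\.\./"},
    {"id": "cloud-metadata-ssrf", "pattern": r"169\.254\.169\.254"},
    {"id": "shell-chaining", "pattern": r"[;&|]\s*(rm|curl|wget)\b"},
]
COMPILED = [(s["id"], re.compile(s["pattern"])) for s in SIGNATURES]


def match_signatures(arguments: dict) -> list[str]:
    """Return the ids of every signature that matches the serialized arguments."""
    # Serializing the whole argument object lets one pass cover nested values.
    blob = json.dumps(arguments)
    return [sig_id for sig_id, rx in COMPILED if rx.search(blob)]


if __name__ == "__main__":
    print(match_signatures({"path": "../../etc/passwd"}))
    print(match_signatures({"path": "notes/todo.txt"}))
```

Because the match is purely pattern-based, the outcome for a given input is always the same, which is what makes this layer deterministic and easy to audit.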

Semantic Intent

Heuristic scoring of tool-call arguments to detect adversarial intent.
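
One simple way to picture heuristic scoring is a weighted sum over suspicious indicators, capped at 1.0. The indicator phrases, weights, and threshold below are assumptions made up for this sketch; a production scorer would likely use richer features and tuned weights.

```python
# Hypothetical indicators and weights (assumptions, not a published rule set).
INDICATORS = {
    "ignore previous instructions": 0.6,
    "/etc/passwd": 0.5,
    ".ssh/": 0.5,
    "base64": 0.2,
    "| sh": 0.4,
}
BLOCK_THRESHOLD = 0.5  # assumed cut-off above which a call is held for review


def intent_score(arguments: dict) -> float:
    """Score tool-call arguments in [0, 1]; higher means more suspicious."""
    text = " ".join(str(value) for value in arguments.values()).lower()
    score = sum(weight for phrase, weight in INDICATORS.items() if phrase in text)
    return min(float(score), 1.0)


if __name__ == "__main__":
    call = {"command": "cat ~/.ssh/id_rsa | base64"}
    score = intent_score(call)
    print(score, "block" if score >= BLOCK_THRESHOLD else "allow")
```

Unlike a signature match, a score of this kind can combine several weak signals that would each pass on their own, at the cost of occasional false positives.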

Hardened Gateway

A hardened gateway layer that mediates all tool access, reducing implicit trust between agents and MCP servers.
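
The mediation idea can be sketched as a function that sits between the agent and the MCP server and forwards a `tools/call` request only when policy allows it. The `POLICY` table, the agent id, the `mediate` helper, and the error code are all hypothetical; the code `-32001` is simply an arbitrary value from the range JSON-RPC 2.0 reserves for implementation-defined server errors.

```python
from typing import Any, Callable

# Hypothetical per-agent allowlist (assumption): tools each agent may call.
POLICY = {"research-agent": {"read_file", "search_docs"}}


def rpc_error(req_id: Any, message: str) -> dict:
    """Build a JSON-RPC 2.0 error response for a denied request."""
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": -32001, "message": message}}


def mediate(agent_id: str, request: dict,
            forward: Callable[[dict], dict]) -> dict:
    """Forward a tools/call request only if the agent's policy permits the tool."""
    req_id = request.get("id")
    if request.get("method") != "tools/call":
        return rpc_error(req_id, "only tools/call is mediated in this sketch")
    tool = request.get("params", {}).get("name")
    # Default deny: unknown agents get an empty allowlist.
    if tool not in POLICY.get(agent_id, set()):
        return rpc_error(req_id, f"tool {tool!r} denied for {agent_id!r}")
    return forward(request)


if __name__ == "__main__":
    def fake_server(req: dict) -> dict:  # stand-in for a real MCP server
        return {"jsonrpc": "2.0", "id": req["id"], "result": {"content": []}}

    call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
            "params": {"name": "run_command", "arguments": {"command": "ls"}}}
    print(mediate("research-agent", call, fake_server))
```

The design choice worth noting is the default-deny lookup: an agent or tool that is missing from the policy is refused rather than passed through.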

Secure your agents today.

McpVanguard is our open-source reference implementation of the security layers discussed in this paper.