Unified tool gateway
Route all agent tool calls through a central gateway that handles authentication, rate limiting, logging, and policy enforcement, so tools register once and agents call uniformly.
TL;DR
- All agent tool calls route through a single gateway that handles authentication, rate limiting, logging, and policy enforcement. No agent ever holds raw API credentials.
- Tools register their schema and endpoint with the gateway once. The gateway generates LLM-facing tool descriptions, injects credentials per-tool per-user, and tracks all costs in one place.
- Hot-swap capability: change a tool's backend implementation (swap providers, upgrade versions) without modifying any agent code.
- Natural alignment with MCP (Model Context Protocol): the gateway IS the MCP server, exposing tools via a standard protocol.
- Limitation: single point of failure. Gateway downtime blocks all tool access. Multi-region deployment and circuit breakers are not optional.
The Problem It Solves
Your company runs 12 agents, each accessing a mix of 8 external APIs: Slack, GitHub, Jira, SendGrid, Stripe, Twilio, Google Drive, and a custom analytics service. Each agent integration manages its own API keys, retry logic, rate limiting, and error handling. That's up to 96 agent-to-API integration points (12 agents × 8 APIs), each with its own credential storage, its own retry policy, and its own logging format.
When the Slack API changes its rate limiting rules, you patch 12 agents. When you need to rotate the GitHub token, you update 12 configuration files. When finance asks "how much did our agents spend on external APIs last month?" you scrape logs from 12 different systems and hope the formats are compatible.
I've seen this exact scenario at two different companies. Both eventually built a gateway after the third incident where a stale API key took down multiple agents simultaneously. The pattern is predictable: scattered integration works until it doesn't, and the failure mode is always "nobody knew which agents were affected."
The fundamental issue: cross-cutting concerns (auth, rate limiting, logging, cost tracking) are duplicated across every agent-tool integration. This violates DRY at the infrastructure level and creates a maintenance burden that scales with agents × tools.
What Is It?
A unified tool gateway is a single service that sits between all agents and all tool providers. Agents make tool calls to the gateway. The gateway handles authentication, routing, rate limiting, logging, and policy enforcement, then forwards the call to the actual tool provider. Every tool registers once with the gateway; every agent connects to one endpoint.
Think of it as a hotel concierge desk. Guests (agents) don't call restaurants, taxis, and theaters directly. They tell the concierge (gateway) what they need, and the concierge handles the logistics: knows the phone numbers (credentials), manages reservations (rate limiting), keeps a log of all requests (audit trail), and can substitute providers without the guest knowing (hot-swap).
How It Works
The four gateway layers
The gateway is structured as a pipeline of four layers, each handling one cross-cutting concern. Every tool call passes through all four layers in sequence.
Layer 1: Authentication. The agent sends a tool call with its session token, not with the tool's API key. The gateway looks up the tool's credentials in a secret store (HashiCorp Vault, AWS Secrets Manager, or, for simple deployments, environment variables), injects them into the outbound request, and strips them from all logs and responses. The LLM prompt never contains API keys.
This is the highest-value layer. I've audited agent systems where API keys were embedded in system prompts. One prompt injection and every key leaks. The gateway eliminates this entire threat class.
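A minimal sketch of the authentication layer, assuming an in-memory dictionary in place of a real secret store and a hypothetical session table. The names `SECRETS`, `SESSIONS`, `OutboundRequest`, `authenticate`, `inject_credentials`, and `redact` are illustrative and not part of any specific gateway product.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a secret store such as Vault (assumption).
SECRETS = {"secrets/slack/bot-token": "xoxb-example-not-real"}
# Hypothetical session table: session token -> agent identity (assumption).
SESSIONS = {"sess-123": "support-agent"}


@dataclass
class OutboundRequest:
    url: str
    body: dict
    headers: dict = field(default_factory=dict)


def authenticate(session_token: str) -> str:
    """Map the agent's session token to an agent ID, or reject the call."""
    agent_id = SESSIONS.get(session_token)
    if agent_id is None:
        raise PermissionError("unknown session token")
    return agent_id


def inject_credentials(request: OutboundRequest, vault_path: str) -> OutboundRequest:
    """Attach the tool's bearer token just before the call leaves the gateway."""
    token = SECRETS[vault_path]
    headers = {**request.headers, "Authorization": f"Bearer {token}"}
    return OutboundRequest(request.url, request.body, headers)


def redact(headers: dict) -> dict:
    """Return a copy of the headers that is safe to write to logs."""
    return {
        name: "[REDACTED]" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }
```

The agent only ever presents `sess-123`; the bearer token exists solely on the outbound request, and anything written to logs passes through `redact` first.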
Layer 2: Rate limiting. The gateway enforces per-tool, per-user, and per-agent rate limits. Suppose Slack allows 50 requests/minute for a given method (actual limits vary by method): the gateway enforces that budget across all agents collectively, not per agent. Per-user limits prevent a single user's agent from monopolizing shared API quotas.
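One common way to implement shared limits is a token bucket per key, sketched below. The token bucket is a choice made for this illustration, not something the gateway pattern mandates, and the `TokenBucket` and `allow` names and the per-user numbers are assumptions. A call is admitted only if every applicable bucket (tool-wide, per-user) still has a token.

```python
import time


class TokenBucket:
    """Token bucket: refills at a steady rate and permits short bursts."""

    def __init__(self, requests_per_minute: int, burst: int, clock=time.monotonic):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def has_token(self) -> bool:
        # Refill based on elapsed time, capped at the burst capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        return self.tokens >= 1

    def take(self) -> None:
        self.tokens -= 1


def allow(buckets: list[TokenBucket]) -> bool:
    """Admit a call only if every applicable bucket has a token to spend."""
    if all(bucket.has_token() for bucket in buckets):
        for bucket in buckets:
            bucket.take()
        return True
    return False
```

Because the tool-wide bucket is a single object shared by every agent, twelve agents calling Slack draw from one budget. A denied call consumes nothing, so a throttled user does not eat into the shared quota.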
Layer 3: Structured logging. Every tool call generates a structured log entry: tool name, agent ID, user ID, request timestamp, response latency, HTTP status, token cost (for LLM-based tools), and error details. This single log stream powers dashboards, alerting, cost attribution, and compliance audits.
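As a rough illustration, the log entry can be modeled as a small record serialized to one JSON line per call. The field names below follow the list above, but the exact names, the `ToolCallLog` type, and the helper functions are assumptions for this sketch.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ToolCallLog:
    tool: str
    agent_id: str
    user_id: str
    timestamp: float            # request time, seconds since the epoch
    latency_ms: float
    http_status: int
    token_cost: Optional[int] = None   # only set for LLM-based tools
    error: Optional[str] = None


def to_log_line(entry: ToolCallLog) -> str:
    """Serialize one tool call as a single JSON line."""
    return json.dumps(asdict(entry), sort_keys=True)


def calls_per_tool(lines: list[str]) -> dict[str, int]:
    """Count calls per tool from the log stream, e.g. for cost attribution."""
    counts: dict[str, int] = {}
    for line in lines:
        tool = json.loads(line)["tool"]
        counts[tool] = counts.get(tool, 0) + 1
    return counts
```

Since every call produces the same shape, questions like "which agents called Stripe last month" become a filter over one stream rather than a merge of twelve formats.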
Layer 4: Policy enforcement. Role-based access control at the tool level. The support agent can call Slack and Jira but not GitHub or deployment tools. The coding agent can call GitHub but not Stripe. Policies are defined centrally and enforced at the gateway, not scattered across agent configurations.
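A minimal sketch of a central policy check, assuming a simple role-to-tools mapping. Apart from `slack_post_message`, the tool names here are hypothetical, as are `POLICIES`, `is_allowed`, and `enforce`.

```python
# Central policy table: role -> tools that role may call (illustrative values).
POLICIES = {
    "support-agent": {"slack_post_message", "jira_create_issue"},
    "coding-agent": {"slack_post_message", "github_create_pr"},
}


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool in POLICIES.get(role, frozenset())


def enforce(role: str, tool: str) -> None:
    """Raise before the call is routed if the role may not use the tool."""
    if not is_allowed(role, tool):
        raise PermissionError(f"{role} may not call {tool}")
```

The deny-by-default lookup means a newly registered tool is unreachable until someone grants it to a role explicitly.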
Tool registration
Tools register with the gateway by providing their schema (parameters, types, descriptions), their endpoint URL, and their authentication requirements. The gateway uses this registration to:
- Generate LLM-facing tool descriptions (the simplified schema the agent sees)
- Configure credential injection (which vault path, which auth method)
- Set up rate limiting rules (per the tool provider's API limits)
- Build the routing table (tool name to endpoint URL)
Registration is typically a config file or API call:
# Gateway tool registration
tools:
  slack_post_message:
    endpoint: "https://slack.com/api/chat.postMessage"
    method: POST
    auth:
      type: bearer
      vault_path: "secrets/slack/bot-token"
    rate_limit:
      requests_per_minute: 50
      burst: 10
    schema:
      parameters:
        channel: { type: string, required: true }
        text: { type: string, required: true }
      description: "Post a message to a Slack channel"
    allowed_roles: ["support-agent", "coding-agent", "ops-agent"]
Gotcha: tool description quality matters at registration
The gateway generates LLM-facing tool descriptions from the registration schema. If your description field is vague ("Does Slack stuff"), the agent will misuse the tool. Write descriptions from the agent's perspective: "Post a text message to a named Slack channel."
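To make the generation step concrete, here is a sketch that turns a registration entry into the simplified description an agent would see. The output shape (a JSON-Schema-style `parameters` object) is an assumption, since LLM tool formats differ between providers, and `to_llm_tool` is a hypothetical helper. The input mirrors the Slack registration, assuming `description` sits under `schema`.

```python
REGISTRATION = {
    "endpoint": "https://slack.com/api/chat.postMessage",
    "auth": {"type": "bearer", "vault_path": "secrets/slack/bot-token"},
    "schema": {
        "parameters": {
            "channel": {"type": "string", "required": True},
            "text": {"type": "string", "required": True},
        },
        "description": "Post a text message to a named Slack channel",
    },
}


def to_llm_tool(name: str, registration: dict) -> dict:
    """Build the agent-facing view: name, description, parameters only."""
    schema = registration["schema"]
    params = schema["parameters"]
    return {
        "name": name,
        "description": schema["description"],
        "parameters": {
            "type": "object",
            "properties": {p: {"type": spec["type"]} for p, spec in params.items()},
            "required": [p for p, spec in params.items() if spec.get("required")],
        },
    }
```

The endpoint and vault path never appear in the generated description, so the description text is the only signal the agent gets about when to use the tool.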
Hot-swap and provider abstraction
One of the gateway's strongest capabilities is provider abstraction. The agent calls gateway.search_web(query), and the gateway routes to Bing, Google, or a custom search engine depending on configuration. Swap the search provider by changing one line in the gateway config. No agent code changes.
This matters in production. When Google Search API has an outage, switch to Bing in 30 seconds. When a cheaper email provider launches, migrate from SendGrid to Resend without touching any agent. The gateway is the abstraction layer that decouples agents from providers.
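The routing idea can be sketched in a few lines, assuming stub provider functions in place of real search clients. `search_google`, `search_bing`, `PROVIDERS`, and `GATEWAY_CONFIG` are hypothetical names for this illustration.

```python
from typing import Callable


def search_google(query: str) -> list[str]:
    # Stub standing in for a real Google client (assumption).
    return [f"google result for {query}"]


def search_bing(query: str) -> list[str]:
    # Stub standing in for a real Bing client (assumption).
    return [f"bing result for {query}"]


PROVIDERS: dict[str, Callable[[str], list[str]]] = {
    "google": search_google,
    "bing": search_bing,
}

# The one line of gateway config that selects the backend.
GATEWAY_CONFIG = {"search_web": "google"}


def search_web(query: str) -> list[str]:
    """What agents call; the provider is resolved at call time from config."""
    provider = PROVIDERS[GATEWAY_CONFIG["search_web"]]
    return provider(query)
```

Setting `GATEWAY_CONFIG["search_web"] = "bing"` reroutes every subsequent call, while agents keep calling `search_web` unchanged. This only works if both providers return results in the same shape, so in practice each adapter also normalizes its provider's response.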