Most AI agent failures in production are not model failures. They are integration failures. A webhook arrives twice and the agent acts twice. An OAuth refresh fails silently and the agent runs unauthenticated. A polling job stalls and a critical event is missed for six hours. This guide is the integration playbook: the five patterns agents use to reach the rest of the world, the trade-offs of each, and the rules for combining them.
If you want a higher-level view of how agents talk to tools, see AI agent tool use explained and how to give an agent multiple tools. This piece sits one layer below: the protocols and patterns those tools ride on.
The five patterns
Production agents combine these patterns; few use only one. The shape of your integrations should follow the shape of the external systems you depend on, not a single template.
| Pattern | Direction | Latency | Best for |
|---|---|---|---|
| Webhook | Inbound (event) | Sub-second | Event triggers from systems that emit them |
| OAuth | Outbound (auth) | n/a | Accessing user accounts on third-party SaaS |
| MCP | Both | Sub-second | Tool exposure with standard schema |
| Polling | Outbound (read) | Seconds to minutes | Sources without webhooks |
| Queue | Internal + outbound | Async | Durable retries, fan-out, rate-limit smoothing |
Webhook
The simplest event channel. The source system makes an HTTP POST to a URL the agent host exposes. The body contains the event payload. The agent processes it and returns 2xx.
Three rules separate a working webhook from a broken one.
- Verify signatures. Major providers (Stripe, GitHub, Slack, Shopify) support signed webhook payloads. Verify the signature on every request and reject anything that fails verification.
- Reply 2xx fast. The handler should enqueue the event and return immediately. Doing real work inline causes timeouts, and timeouts trigger retries.
- Be idempotent. Assume at-least-once delivery. Stripe explicitly documents that webhooks can be delivered more than once and recommends guarding against duplicates by recording the event IDs you have already processed (Stripe webhooks documentation). Use the event ID as a deduplication key with at least 24-hour retention.
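The three webhook rules can be sketched in a few lines. This is a minimal illustration, not any provider's exact scheme: it assumes a plain hex-encoded HMAC-SHA256 over the raw body, while real providers each define their own header names, prefixes, and signed-payload formats. `SeenEvents`, `handle_webhook`, and the in-memory list standing in for a queue are hypothetical names for this example.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


class SeenEvents:
    """In-memory dedup store (assumption: production would use a shared store with TTL)."""

    def __init__(self, retention_seconds: float = 24 * 3600):
        self.retention = retention_seconds
        self._seen = {}  # event_id -> time first seen

    def first_time(self, event_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Drop entries older than the retention window (O(n), fine for a sketch).
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.retention}
        if event_id in self._seen:
            return False
        self._seen[event_id] = now
        return True


def handle_webhook(secret, body, signature_hex, event_id, seen, queue) -> int:
    """Return an HTTP status code; the real work happens later, off the queue."""
    if not verify_signature(secret, body, signature_hex):
        return 401
    if seen.first_time(event_id):
        queue.append(body)  # enqueue and return immediately
    return 200  # duplicates are acknowledged but not enqueued again
```

Acknowledging a duplicate with 200 rather than an error is deliberate in this sketch: an error response would typically prompt the sender to retry an event that has already been accepted.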
OAuth
Most agent integrations need to act on behalf of a user: read their email, post to their Slack, list their Salesforce records. OAuth 2.0 is the standard authorization framework for these flows (RFC 6749).
The authorization-code flow with refresh tokens is the right choice for agents. The agent host stores the refresh token; access tokens are short-lived (typically 1 hour) and minted on demand. Three controls matter.
- Encrypt refresh tokens at rest with a customer-scoped key.
- Rotate refresh tokens on use where the provider supports it.
- Scope narrowly: request only the scopes the agent actually needs. Over-scoping is the most common audit finding in agent OAuth implementations.
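To make "minted on demand" and "rotate on use" concrete, here is a small sketch of a token holder. It assumes a hypothetical `refresh_fn` that wraps the provider's token endpoint and returns the new access token, its lifetime in seconds, and a rotated refresh token; the name, the return shape, and the 60-second skew are illustrative choices, and encryption at rest is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Tuple

# Hypothetical callable: refresh_token -> (access_token, expires_in_seconds, new_refresh_token)
RefreshFn = Callable[[str], Tuple[str, float, str]]


@dataclass
class TokenManager:
    refresh_token: str
    refresh_fn: RefreshFn
    clock: Callable[[], float] = time.time
    skew_seconds: float = 60.0  # refresh slightly before the stated expiry
    _access_token: str = field(default="", init=False)
    _expires_at: float = field(default=0.0, init=False)

    def access_token(self) -> str:
        """Return a usable access token, calling refresh_fn only when needed."""
        if self.clock() >= self._expires_at - self.skew_seconds:
            access, expires_in, new_refresh = self.refresh_fn(self.refresh_token)
            self._access_token = access
            self._expires_at = self.clock() + expires_in
            # Rotation on use: the old refresh token is replaced.
            # A real host would encrypt and persist the new one at this point.
            self.refresh_token = new_refresh
        return self._access_token
```

If `refresh_fn` raises, the exception propagates to the caller in this sketch, so a failed refresh surfaces as an error instead of the agent quietly continuing without valid credentials.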
For per-user agent boundaries, see how to give an agent access to email safely.
MCP (Model Context Protocol)
The Model Context Protocol is an open protocol introduced by Anthropic in November 2024 that standardizes how AI applications expose tools, data sources, and prompts to language models (Anthropic, MCP announcement, 2024). It uses JSON-RPC 2.0 over a transport (stdio or HTTP). An MCP server advertises its tools, resources, and prompts; an MCP client (the agent runtime) discovers them after connecting by sending list requests.
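As a rough picture of what travels over the wire, the sketch below builds two JSON-RPC 2.0 requests using the MCP method names `tools/list` and `tools/call`, and reads tool names out of a list response. It is not a client: a real session starts with an initialization handshake and runs over a transport, both omitted here. The tool name `search_kb` and its arguments are made up for illustration.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request ids, unique within this process


def jsonrpc_request(method, params=None) -> str:
    """Serialize a JSON-RPC 2.0 request; params is omitted when not given."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)


def tool_names(list_response: str) -> list:
    """Extract tool names from a tools/list response body."""
    result = json.loads(list_response)["result"]
    return [tool["name"] for tool in result["tools"]]


# Discovery: ask the server which tools it offers.
list_req = jsonrpc_request("tools/list")

# Invocation: call one tool by name with structured arguments.
# "search_kb" is a hypothetical tool, not part of the protocol.
call_req = jsonrpc_request(
    "tools/call",
    {"name": "search_kb", "arguments": {"query": "refund policy"}},
)
```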
What MCP solves. Before MCP, every agent framework had its own way of declaring tools. Adding a new SaaS integration meant writing an adapter for each framework. With MCP, an integration author writes one MCP server, and any MCP-aware agent runtime can use it. The protocol is now supported by Claude Desktop, multiple agent IDEs, and a growing ecosystem of third-party MCP servers (modelcontextprotocol/servers).
When to use MCP. New integrations that you want portable across runtimes. Internal company tools you want to expose to multiple agents. When NOT to use MCP. For a single integration that ships as a built-in function call against your own runtime, MCP adds protocol overhead with no portability benefit.
Polling
The fallback when webhooks are unavailable or unreliable. The agent host makes a recurring API call to check for new state.
Three things separate efficient polling from a billing surprise.
- Use change tokens or cursors where the API supports them. Fetch only what changed since the last poll.
- Back off on empty. When nothing new appears, double the poll interval up to a cap.
- Respect rate limits. Read the provider docs for rate-limit headers and back off proactively when you cross 70 percent of the budget.
For deeper handling, see how to handle agent rate limits.
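The polling rules reduce to a little arithmetic. The sketch below assumes a hypothetical `fetch(cursor)` callable that returns `(items, next_cursor)`, a 60-second base interval, and a 15-minute cap; only the doubling rule and the 70 percent threshold come from the text above.

```python
def next_interval(current: float, got_items: bool,
                  base: float = 60.0, cap: float = 900.0) -> float:
    """Reset to the base interval on new data; otherwise double, up to the cap."""
    if got_items:
        return base
    return min(current * 2, cap)


def should_throttle(used: int, limit: int, threshold: float = 0.7) -> bool:
    """True once the consumed share of the rate-limit budget reaches the threshold."""
    return limit > 0 and used / limit >= threshold


def poll_once(fetch, cursor, interval: float):
    """Run one poll. fetch is a hypothetical callable: cursor -> (items, next_cursor)."""
    items, new_cursor = fetch(cursor)  # the cursor means only changes are fetched
    return items, new_cursor, next_interval(interval, bool(items))
```

A caller would sleep for the returned interval between polls and widen it further whenever `should_throttle` reports that the budget is running low.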
The most common polling pattern in production agents: webhook primary, 60-second poll as a sanity-check fallback. The poll catches webhooks that were dropped or delayed beyond their retry window. This is the belt-and-braces approach to integration.
Queue
A message queue between the agent and downstream systems gives you durable retries, rate-limit smoothing, and ordered processing within a partition or message group. Common implementations: Cloudflare Queues, AWS SQS, RabbitMQ, Kafka, Redis Streams. The choice depends on your hosting environment; the pattern is the same.
When to put a queue in front of an integration.
- Flaky write target. The downstream API has uptime issues. The queue absorbs retries so a failed deploy on the other side does not cascade.
- Fan-out. One trigger needs to produce many outputs (for example, notifying 1,000 subscribers). Push the fan-out into the queue rather than holding the agent open.
- Burst rate limits. A burst of events exceeds the downstream API's rate limit. The queue paces the work.
When NOT to use a queue. Simple synchronous reads. Adding a queue here adds latency and a failure mode (the queue itself) for no resilience gain.
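For the flaky-write case, the retry behaviour a queue consumer provides can be sketched as follows. Managed queues usually implement redelivery themselves, so this only illustrates the logic. `TransientError`, `deliver`, and the `send` callable are hypothetical names, and the jitter is a common addition that the text above does not prescribe.

```python
import random
import time


class TransientError(Exception):
    """Raised by send for retryable failures such as a 503 (hypothetical type)."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random.random) -> float:
    """Exponential backoff with full jitter: a random share of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def deliver(message, send, max_attempts: int = 5, sleep=time.sleep) -> bool:
    """Try send(message) up to max_attempts times, sleeping between failures."""
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except TransientError:
            if attempt + 1 < max_attempts:
                sleep(backoff_delay(attempt))
    return False  # attempts exhausted; the caller might route to a dead-letter queue
```

Passing `sleep` and `rng` as parameters keeps the function testable without real waiting.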
Combining patterns
A realistic production agent looks like this.
- Inbound: a webhook from Slack (primary) plus a 60-second poll of the Slack messages API (fallback for missed webhooks).
- Auth: OAuth 2.0 to Slack and Google Workspace, with refresh tokens encrypted at rest with per-tenant keys.
- Tool exposure: MCP for internal database queries that the agent runs against the company knowledge base.
- Outbound: writes to Salesforce go through a queue with exponential backoff, because Salesforce's API has bursty rate limits and occasional 503s.
The pattern: webhooks for events, OAuth for auth, MCP for tools, polling as a safety net, queue in front of every flaky write. None of these are exclusive; they layer.
| Failure mode | Pattern that catches it |
|---|---|
| Source provider drops a webhook | Polling fallback |
| Duplicate webhook delivery | Idempotency key on event ID |
| OAuth refresh token leak | Refresh-token rotation, narrow scope |
| Downstream API returns 503 | Queue + exponential backoff |
| Burst exceeds rate limit | Queue + token-bucket pacer |
| Tool schema changes underneath the agent | MCP discovery on connection |
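The token-bucket pacer named in the table can be sketched as below. The class and parameter names are illustrative; the caller supplies timestamps so the behaviour is deterministic, and a queue consumer would check `try_acquire` before each downstream call, leaving the message queued when it returns False.

```python
class TokenBucket:
    """Allow bursts up to capacity while sustaining at most rate tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.updated = now

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```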
Field notes from production
Three integration failures recur often enough to mention explicitly.
Webhook secret rotation. Many teams set the signing secret at integration time and never rotate it. When a developer leaves or a vendor is breached, the secret should rotate within hours. Build the rotation flow before you need it; the middle of an incident is the worst time to design it.
OAuth scope creep. The integration was scoped narrowly on day one. Two years and three feature releases later, the scope has expanded to "everything." Re-review OAuth scopes annually and prune anything the agent does not currently exercise; the smaller scope reduces breach impact and simplifies the SOC 2 review.
Queue ordering assumptions. Production agents that depend on FIFO ordering across a queue learn the hard way that most managed queue services provide ordering only within a partition or message group. If your agent must process events in a strict order, pick a queue with explicit FIFO guarantees and define the partition key, or build ordering at the application layer.
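One way to build ordering at the application layer is a resequencing buffer. The sketch below assumes the producer stamps each event with a partition key and a per-key sequence number starting at 1, which is a requirement of this example and not something every source provides. `Resequencer` is a hypothetical name.

```python
from collections import defaultdict


class Resequencer:
    """Release events per key in sequence order, buffering any that arrive early."""

    def __init__(self, first_seq: int = 1):
        self._next = defaultdict(lambda: first_seq)  # key -> next expected sequence
        self._pending = defaultdict(dict)            # key -> {seq: event}

    def accept(self, key, seq: int, event) -> list:
        """Return the events for key that are now deliverable, in order."""
        if seq < self._next[key]:
            return []  # already delivered: treat as a duplicate
        self._pending[key][seq] = event
        ready = []
        while self._next[key] in self._pending[key]:
            ready.append(self._pending[key].pop(self._next[key]))
            self._next[key] += 1
        return ready
```

Keys are independent, so a gap in one key's sequence never blocks another key. A production version would also need a timeout or alert for gaps that never fill.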
Frequently asked questions
What are the main integration patterns for AI agents?
Webhook, OAuth, MCP, polling, queue. Most production agents combine three or more.
When should I use webhooks vs polling for an AI agent?
Webhooks when supported, polling when not. In production, run both: webhooks as primary, a 60-second poll as fallback for missed deliveries.
What is the Model Context Protocol (MCP) and why does it matter for agents?
An open protocol from Anthropic (November 2024) that standardizes tool and resource exposure to language models. It removes per-framework adapter work.
How do AI agents handle OAuth for accessing user accounts?
Standard authorization-code flow with refresh tokens. Encrypt at rest, rotate on use, scope narrowly.
When do AI agents need a message queue?
For flaky write targets, fan-out work, and rate-limit smoothing. Skip the queue for simple synchronous reads.
Three things to ship this week
- Add idempotency keys to every webhook handler using the provider's event ID.
- Rotate refresh tokens on use for every OAuth integration that supports it.
- Add a 60-second polling fallback behind your primary webhook channel.
Sources
- Anthropic, "Introducing the Model Context Protocol", November 2024, anthropic.com
- Model Context Protocol specification, modelcontextprotocol.io
- Stripe, "Receive Stripe events in your webhook endpoint", docs.stripe.com
- IETF, "The OAuth 2.0 Authorization Framework", RFC 6749, datatracker.ietf.org
- GitHub, "Webhook events and payloads", docs.github.com