To set up AI agent notifications well, decide four things: which events deserve an alert, how urgent each one is, which channel each event goes to, and who needs to act on it. The events worth notifying on fall into four classes: completion, failure, approval-needed, and anomaly or threshold. Route each one by severity so that approvals and failures interrupt you immediately while routine completions batch into a quiet digest. That single discipline, severity-based routing, is what separates a notification setup you trust from one you mute after a week.
This guide covers each decision in order: the events, the channels, severity and batching, routing approvals to a human, and finally quiet hours and escalation so urgent alerts still reach someone when you are away. The aim is a setup where every interruption is one you would have wanted.
The short answer
A good agent notification setup answers four questions before any alert ever fires:
- What happened? The event class: did the agent finish, fail, need a decision, or cross a threshold you set?
- How urgent is it? The severity: does this need attention now, today, or never (just logged)?
- Where does it go? The channel: email, a chat tool like Slack, a webhook into another system, or some combination.
- Who acts on it? The recipient: yourself, a teammate, an on-call rotation, or an automated downstream system.
Get those four right and you have a notification layer that tells you what you need to know and stays silent the rest of the time. Get them wrong and you either miss the alert that mattered or drown in ones that did not. The rest of this guide works through each in practical terms.
Which events to notify on
Not every moment in an agent run deserves a notification. Pick from four event classes, and be deliberate about which ones you actually want pushed to you versus simply recorded in a log you can check later.
- Completion: the agent finished its run and produced a result. Useful when you are waiting on an output, but for scheduled or routine work it is often better batched into a daily summary than sent one at a time.
- Failure: the run errored, timed out, or could not finish. This almost always deserves an alert, because something you expected to happen did not. Pair failure notifications with your error handling and rollback rules so you know whether the agent already retried or rolled back before it told you.
- Approval-needed: the agent reached a step you marked as requiring a human decision and paused. This is the highest-priority class, because the agent is blocked and waiting on you. It is the notification side of a human-in-the-loop design.
- Anomaly or threshold: a value crossed a limit you defined. Examples: cost for a run exceeded a budget, the volume of items processed was far above or below normal, or an output looked unusual against a baseline. These catch problems that are not outright failures but still warrant a look.
A reliable default is to alert on failure and approval-needed every time, batch completion into a digest, and tune anomaly thresholds over the first few weeks until they fire only on genuine outliers. If you are still deciding which steps even need a pause, the safety and guardrails guide covers where to place the checkpoints that generate approval notifications in the first place.
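As a rough illustration of the anomaly or threshold class, the two checks below flag a run that went over budget and a run whose volume is far from normal. The function names, the `tolerance` parameter, and its 50% default are assumptions made for this sketch, not settings from any particular platform; the tolerance is the kind of number you would tune over the first few weeks.

```python
def over_budget(run_cost: float, budget: float) -> bool:
    """Cost threshold: fire when a run costs more than its budget."""
    return run_cost > budget


def volume_is_unusual(items: int, baseline: float, tolerance: float = 0.5) -> bool:
    """Volume threshold: fire when the count is far above or below normal.

    tolerance=0.5 (an assumed default) means "more than 50% away from the
    baseline". Raise it if the check fires on ordinary variation.
    """
    if baseline <= 0:
        # No meaningful baseline yet: treat any activity as worth a look.
        return items > 0
    return abs(items - baseline) / baseline > tolerance


if __name__ == "__main__":
    print(over_budget(run_cost=1.20, budget=1.00))
    print(volume_is_unusual(items=160, baseline=100))
    print(volume_is_unusual(items=140, baseline=100))
```

A relative tolerance is used here instead of a fixed count so the same rule works for agents that process ten items and agents that process ten thousand.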
Choosing channels: email, Slack, webhook
The channel should match how fast you need to act and who needs to see the alert. The three common channels each have a clear best use.
- Email suits low-frequency, low-urgency notifications you want a durable record of: daily or weekly summaries, completion receipts, and reports. Email is easy to file and search but slow to act on, so it is the wrong choice for anything time-sensitive.
- Slack or team chat suits notifications a person needs to see and respond to quickly. Threads keep context together, and you can act in place, for example approving a request with a reply or a button. This is the right home for approvals and failures that a human owns.
- Webhook suits cases where another system, not a person, should react. A webhook can open a ticket in your tracker, page an on-call engineer through an incident tool, post to a status dashboard, or trigger a downstream agent. Use it whenever the response should be automated rather than read by a human.
Most setups use more than one. A common pattern: completions go to email as a digest, approvals and failures go to a Slack channel where the responsible person can act, and critical failures additionally fire a webhook into an on-call system so they reach someone even outside chat. The point is to match each event's severity to a channel that fits its urgency, rather than sending everything everywhere.
Whatever channels you pick, the notification itself should carry enough context to act on without opening another tool: what the agent was doing, what happened, and a link to the full run. That link matters most when you pair notifications with monitoring and observability, so a single click takes you from the alert to the complete trace of the run.
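The common pattern and the context rule above can be sketched in a few lines. Everything named here is hypothetical: the `Notification` fields, the channel labels such as `oncall_webhook`, and the example URL are placeholders for illustration. Sending anomalies to chat is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Notification:
    event: str      # "completion", "failure", "approval_needed", or "anomaly"
    critical: bool  # True when the event should also reach on-call
    task: str       # what the agent was doing
    summary: str    # what happened
    run_url: str    # link to the full run


def channels_for(n: Notification) -> List[str]:
    """Pick channels by event: digest for completions, chat for the rest."""
    if n.event == "completion":
        return ["email_digest"]
    channels = ["slack"]
    if n.event == "failure" and n.critical:
        # Critical failures additionally fire a webhook into on-call.
        channels.append("oncall_webhook")
    return channels


def render(n: Notification) -> str:
    """Carry enough context to act on without opening another tool."""
    return f"[{n.event}] {n.task}: {n.summary}\nFull run: {n.run_url}"


if __name__ == "__main__":
    alert = Notification(
        event="failure",
        critical=True,
        task="Nightly invoice sync",
        summary="timed out after 3 retries",
        run_url="https://example.com/runs/123",  # placeholder URL
    )
    print(channels_for(alert))
    print(render(alert))
```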
Severity and batching to avoid alert fatigue
Alert fatigue is the failure mode that quietly breaks most notification setups. When every event sends a push, people start ignoring all of them, including the one that mattered. The fix is to treat severity as a first-class property of every notification and route by it.
A workable three-tier scheme:
- Critical: interrupt immediately, on every channel that reaches a human fast. Approval-needed and hard failures belong here. These should be rare enough that each one earns the interruption.
- Warning: deliver promptly but do not interrupt. Anomalies, soft failures the agent already retried, and threshold crossings go here. A single channel, checked regularly, is enough.
- Info: do not push at all; batch into a digest or write only to the log. Routine completions and successful runs belong here.
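One way to encode the three tiers is a small classifier that maps each event to a severity, plus a table that maps each severity to a delivery mode. This is a minimal sketch: the event strings, the `recovered_by_retry` flag, and the delivery labels are names invented for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0
    WARNING = 1
    CRITICAL = 2


def classify(event: str, recovered_by_retry: bool = False) -> Severity:
    """Map an event class to a tier, following the scheme above."""
    if event == "approval_needed":
        return Severity.CRITICAL
    if event == "failure":
        # A failure the agent already retried is a soft failure.
        return Severity.WARNING if recovered_by_retry else Severity.CRITICAL
    if event == "anomaly":
        return Severity.WARNING
    # Routine completions and successful runs.
    return Severity.INFO


# How each tier is delivered (labels are illustrative).
DELIVERY = {
    Severity.CRITICAL: "interrupt_now",
    Severity.WARNING: "single_channel",
    Severity.INFO: "digest_or_log",
}


if __name__ == "__main__":
    for event in ("approval_needed", "failure", "anomaly", "completion"):
        tier = classify(event)
        print(event, tier.name, DELIVERY[tier])
```

Using an ordered enum means later rules can compare tiers directly, for example holding anything below `Severity.CRITICAL`.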
Two techniques cut volume further, so the alerts that do get through stay credible. The first is batching: instead of one notification per completed run, collect them and send a single digest on a schedule, for example a morning summary of everything that ran overnight. The second is deduplication: if the same failure happens fifty times in an hour, send one alert that says so, not fifty. Both reduce volume without hiding signal.
The test to apply to any notification rule is simple: would you want to be interrupted for this, every time it fires? If the honest answer is no, demote it from critical to warning or info. A notification layer earns trust by being right about urgency, and trust is what makes you actually read the next alert instead of swiping it away.
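Batching and deduplication are both small amounts of code. The sketch below assumes failures are identified by a short string key and completions by a run name; both helpers and their output formats are illustrative, not a prescribed design.

```python
from collections import Counter
from typing import List


def deduplicate(failure_keys: List[str]) -> List[str]:
    """Collapse repeated failures into one alert each, with a count."""
    counts = Counter(failure_keys)  # keeps first-seen order
    return [f"{key} (x{n})" if n > 1 else key for key, n in counts.items()]


def build_digest(completed_runs: List[str]) -> str:
    """Batch completions into a single summary message."""
    lines = [f"{len(completed_runs)} runs completed:"]
    lines += [f"- {name}" for name in completed_runs]
    return "\n".join(lines)


if __name__ == "__main__":
    hour_of_failures = ["timeout: crm sync"] * 50 + ["auth error: mail"]
    print(deduplicate(hour_of_failures))
    print(build_digest(["weekly report", "inbox triage"]))
```

Fifty identical failures become one alert that carries the count, so the volume drops while the signal that something is failing repeatedly stays visible.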
Routing approval requests to a human
Approval requests are a special kind of notification, because the agent is not just informing you, it is waiting on you. The run is paused at a checkpoint you defined, and it cannot continue until a person approves or rejects the proposed action. Getting this routing right is what makes an agent safe to hand real authority to.
A well-formed approval request includes:
- The action the agent wants to take, stated plainly: send this email, publish this update, charge this amount, delete these records.
- The context behind it: what the agent found, why it concluded this action is right, and what it used to decide.
- A clear way to respond: approve or reject in place, ideally without leaving the channel. A reply, a button, or a link to a confirmation view.
- A timeout and a default: how long the agent waits, and what happens if no one answers in that window. Common defaults are to cancel the action, to hold indefinitely, or to escalate to a backup approver.
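The four parts of an approval request map naturally onto a small record plus one decision function. This is a sketch under assumptions: the field names, the `OnTimeout` options, and the returned step labels are made up for illustration, and a real system would also handle delivery and persistence.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OnTimeout(Enum):
    CANCEL = "cancel"      # drop the proposed action
    HOLD = "hold"          # keep waiting indefinitely
    ESCALATE = "escalate"  # hand off to a backup approver


@dataclass
class ApprovalRequest:
    action: str       # what the agent wants to do, stated plainly
    context: str      # what it found and why it proposes this
    approver: str     # named owner or shared channel
    timeout_s: float  # how long the agent waits
    on_timeout: OnTimeout = OnTimeout.CANCEL


def next_step(req: ApprovalRequest, decision: Optional[bool], waited_s: float) -> str:
    """Decide what the paused agent does next.

    decision is True (approved), False (rejected), or None (no answer yet).
    """
    if decision is True:
        return "proceed"
    if decision is False:
        return "stop"
    if waited_s < req.timeout_s:
        return "wait"
    return req.on_timeout.value


if __name__ == "__main__":
    req = ApprovalRequest(
        action="Send renewal email to 42 clients",
        context="Contracts expire within 30 days",
        approver="account-owner",
        timeout_s=4 * 3600,
        on_timeout=OnTimeout.ESCALATE,
    )
    print(next_step(req, decision=None, waited_s=600))
    print(next_step(req, decision=None, waited_s=5 * 3600))
```

Making the timeout default an explicit field forces the choice between cancel, hold, and escalate to be made when the checkpoint is defined, not discovered when nobody answers.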
You decide who approves which actions. High-stakes steps, such as anything involving money, external communication, or irreversible deletion, should route to a named owner, not a shared channel where everyone assumes someone else will handle it. Lower-stakes approvals can go to a team channel where the first available person responds. This routing is the operational half of a human-in-the-loop design; the human-in-the-loop guide covers where to place the checkpoints, and notifications are how the agent reaches the human when it hits one.
One practical note: keep the approval window honest. If you set a one-minute timeout but your approver is in meetings all afternoon, the agent will keep canceling actions or escalating unnecessarily. Match the timeout to how quickly the responsible person can realistically respond, and use escalation, covered next, for the cases where they cannot.
Quiet hours and escalation
Two settings keep a notification layer humane and reliable: quiet hours stop it from waking you for routine events, and escalation makes sure urgent ones still reach someone when the first person does not respond.
Quiet hours define a window, typically overnight and on weekends, when non-critical notifications are held and delivered later rather than pushed immediately. A completion at 2 a.m. does not need to light up your phone; it can wait for the morning digest. Quiet hours apply to the info and warning tiers, not to critical ones. The distinction matters: you want to silence the noise without silencing a genuine emergency. Define which severities respect quiet hours and which override them.
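A quiet-hours check mostly comes down to handling a window that crosses midnight. The sketch below assumes a 22:00 to 07:00 window and plain string severities; both are placeholders. Weekend handling is left out to keep it short.

```python
from datetime import time


def in_quiet_hours(now: time, start: time = time(22, 0), end: time = time(7, 0)) -> bool:
    """True when `now` falls inside the quiet window [start, end)."""
    if start <= end:
        # Window within a single day, e.g. 13:00 to 15:00.
        return start <= now < end
    # Window that wraps past midnight, e.g. 22:00 to 07:00.
    return now >= start or now < end


def should_hold(severity: str, now: time) -> bool:
    """Hold non-critical notifications during quiet hours; critical overrides."""
    return severity != "critical" and in_quiet_hours(now)


if __name__ == "__main__":
    two_am = time(2, 0)
    print(should_hold("info", two_am))      # held for the morning digest
    print(should_hold("critical", two_am))  # delivered immediately
```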
Escalation is the fallback for notifications that need action but get no response. The pattern is a ladder: send the alert to the primary owner; if it is not acknowledged within a set time, re-send it, then notify a backup person, then raise it to a louder channel or page an on-call rotation. Escalation is what prevents an approval request from sitting unread while the agent stays blocked, or a critical failure from going unnoticed because the one person who saw it was offline.
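The ladder can be written as an ordered list of steps, each with the number of minutes an alert may stay unacknowledged before that step applies. The timings and target names below are invented for the example; pick values that match how quickly your own team can respond.

```python
from typing import List, Tuple

# (minutes without acknowledgement, who gets notified). Illustrative values.
LADDER: List[Tuple[int, str]] = [
    (0, "primary owner"),
    (10, "primary owner (re-send)"),
    (20, "backup person"),
    (30, "on-call rotation"),
]


def escalation_target(minutes_unacked: float, ladder: List[Tuple[int, str]] = LADDER) -> str:
    """Return the highest rung whose delay has already elapsed."""
    target = ladder[0][1]
    for after_minutes, who in ladder:
        if minutes_unacked >= after_minutes:
            target = who
    return target


if __name__ == "__main__":
    for waited in (0, 12, 25, 90):
        print(waited, "->", escalation_target(waited))
```

Acknowledging the alert at any rung should stop the climb; that state tracking is omitted here.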
Combine the two and the behavior is what you would want from a good assistant: it does not bother you with trivia at night, but if something genuinely needs a human and the first human is unreachable, it keeps working up the chain until someone answers. For agents that run unattended on a schedule, quiet hours and escalation are not optional polish; they are what makes overnight autonomy trustworthy. If you are setting up agents that hand off to each other, the same routing logic applies across fallback chains, where one agent's failure notification can trigger the next.
A setup checklist
Pulling the decisions together, here is the order to configure them in:
- List the events your agent can produce: completion, failure, approval-needed, and any anomaly or threshold checks that matter for your workflow.
- Assign a severity to each: critical, warning, or info. Be strict; most events are not critical.
- Map severity to channel: critical to a fast human channel plus an optional webhook, warning to a single chat channel, info to an email digest or the log.
- Name the recipients, especially for approvals. High-stakes actions get a named owner; lower-stakes ones can go to a shared channel.
- Set timeouts and defaults for approval requests, matched to how fast the owner can realistically respond.
- Configure quiet hours and decide which severities override them.
- Build the escalation ladder for unacknowledged critical alerts.
- Tune over the first weeks: demote anything that fires too often, deduplicate repeated alerts, and confirm every critical notification still earns its interruption.
If you are configuring your very first agent and notifications are one piece of that, the guide to setting up your first AI agent walks through the full process from plain-language description to a running workflow, with notifications as one of the settings you choose along the way. For the broader vocabulary, the glossary defines the terms, and the explainer on what an AI agent is describes why an agent, unlike a fixed script, can decide when an event is worth telling you about. When a notification surfaces a tool error you need to chase down, the guide to debugging agent tool errors covers what to do next.
How Gravity handles agent notifications
Gravity is an AI agent platform. You describe what you want to be told about in plain words: "let me know when the report is ready, ask me before sending anything to a client, and page me if a run fails twice." An expert-built agent applies that as its notification policy without you wiring up channels or writing alerting rules.
Completions arrive as a clean summary rather than a stream of pings. Approval requests reach you with the action and its context, and you approve or reject in place; the agent waits, then resumes or stops based on your answer and the timeout you set. Failures and threshold crossings route by the severity you chose, with quiet hours and escalation handled for you so overnight runs do not wake you for routine events but still reach someone if something genuinely needs a person. Because builders maintain these agents for Gravity, the notification behavior is tuned to be useful by default rather than left as a blank configuration screen.
You pay per use: $1 equals 1,000 credits, and you only pay when the agent runs. Notifications themselves are part of how the agent works, not a separate product to assemble. If your agents start handing work to each other, the same routing carries across fallback chains, and if you run agents with a team, sharing an agent with your team covers who gets notified about what.
FAQ
What events should an AI agent notify me about?
Notify on four event classes: completion (the agent finished and produced a result), failure (the run errored or could not finish), approval-needed (the agent paused and is waiting for a human decision before continuing), and anomaly or threshold (a value crossed a limit you set, such as cost, volume, or an unusual output). Approval-needed and failure are the highest priority because they block work or require action. Completion can often be batched or silenced for routine runs.
Which channel is best for agent notifications: email, Slack, or webhook?
Use email for low-frequency, record-keeping notifications like daily summaries and completion receipts. Use Slack or a team chat for time-sensitive alerts that a person needs to see and act on quickly, such as approvals and failures, because it supports threads and quick replies. Use a webhook when another system needs to react programmatically, for example to open a ticket, page an on-call engineer, or update a dashboard. Most setups combine more than one channel and route by severity.
How do I stop AI agent notifications from becoming noise?
Assign a severity to every event, then route by severity: critical events interrupt immediately, warnings arrive promptly without interrupting, and informational events batch into a digest. Silence routine completions, deduplicate repeated failures into a single alert, and set quiet hours so non-critical notifications hold until morning. The goal is that every notification that interrupts you is one you would want to be interrupted for.
How does an agent route an approval request to a human?
When the agent hits a step you marked as requiring approval, it pauses and sends an approval request to the channel and person you designated, including the action it wants to take and the context behind it. The person approves or rejects, and the agent resumes or stops accordingly. You define who approves which actions, how long the agent waits, and what happens if no one responds within that window.
What are quiet hours and escalation for agent alerts?
Quiet hours are a time window when non-critical notifications are held and delivered later instead of interrupting you. Escalation is a fallback rule: if a notification that needs action is not acknowledged within a set time, the agent re-sends it, notifies a backup person, or raises it to a louder channel. Together they make sure you are not woken for routine events while genuinely urgent ones still reach someone.