Mid-year review season arrives and the calendar fills with the same quiet dread. For every direct report, a manager has to find the goals set in January, remember what has shipped since, collect feedback from a handful of colleagues, and write it all up fairly. The thinking is the hard part. The collation is just hours of digging through documents, chat threads, and old notes. An AI agent can take that digging off your plate, gather the inputs you need, and hand you a structured draft, so the time you have goes to judgment and the conversation rather than the search.
This guide walks through running a mid-year review with an agent in five steps. It builds on the basics in how to set up your first AI agent, and it leans hard on one rule throughout: the agent prepares, the human decides.
What a review agent actually does
A mid-year review agent is an assistant for the manager, not a judge of the employee. It gathers the inputs a review needs (the goals set at the start of the year, completed work, peer feedback, and prior review notes), then assembles a structured draft per report and tracks who still owes feedback. The value is in the collation, not the verdict: the agent prepares, the manager decides.
That distinction matters more here than almost anywhere else an agent gets used. A review touches someone's pay, growth, and standing. So the agent stays firmly on the preparation side of the line. It does not assign a rating, rank people against each other, or write the final word. It reads the same sources you would, lays them out cleanly, and leaves every judgment to you. If you are still deciding whether an agent or a simpler assistant fits, the difference between an AI agent, a chatbot, and an assistant is a useful primer, as is the broader explainer on what an AI agent is.
1. Define the outcome: a fair packet per person
Start by writing down what a finished run produces, in one sentence. The most reliable agent runs begin with a clear outcome rather than a list of steps. For a mid-year review, the outcome is simple: one fair, evidence-backed packet per direct report, with each goal, the work behind it, the feedback received, and a neutral draft summary the manager can edit.
Why the outcome comes first
Naming the outcome fixes the standard before any work starts. It tells the agent what "done" looks like and gives you the final check: a packet either has every goal backed by evidence, or it does not. Skip this and you get a pile of raw notes with no shape. A defined outcome also keeps scope honest. The packet is the deliverable; a rating is not, because that decision is yours, not the agent's.
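The "done" check described above can be written down as a small data structure. This is a minimal sketch, not how any particular product stores a packet: the class names, field names, and the example URL are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    description: str
    source_url: str  # link back to the original item (hypothetical field)


@dataclass
class GoalSection:
    goal: str
    evidence: list[Evidence] = field(default_factory=list)
    feedback: list[str] = field(default_factory=list)


@dataclass
class ReviewPacket:
    report_name: str
    goals: list[GoalSection]
    draft_summary: str = ""
    # Deliberately no rating field: the rating is the manager's decision.

    def unsupported_goals(self) -> list[str]:
        """Return the goals that have no evidence attached."""
        return [g.goal for g in self.goals if not g.evidence]

    def is_complete(self) -> bool:
        """A packet is done only when every goal is backed by evidence."""
        return bool(self.goals) and not self.unsupported_goals()


# Illustrative data only.
packet = ReviewPacket(
    report_name="Sam",
    goals=[
        GoalSection(
            "Migrate billing",
            [Evidence("Shipped billing migration", "https://example.com/tickets/1")],
        ),
        GoalSection("Mentor two new hires"),
    ],
)
```

Leaving the rating out of the structure entirely is one way to keep scope honest: there is simply nowhere for the agent to put a verdict.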
2. Collect the inputs from your tools
With the outcome set, the agent gathers the raw material. Much of a review's prep time goes to hunting across scattered systems rather than to thinking. The agent connects to the tools where the evidence lives and pulls four things: the goals set at the start of the year, completed work, prior review notes, and any self-assessment the report has written.
Where the evidence lives
Goals usually sit in an HR system or a shared doc from January. Completed work shows up in project trackers, shipped tickets, closed deals, or a portfolio. Prior notes live in last cycle's review file. The agent reads each source, attributes every item to the goal it supports, and keeps a link back to the original so you can verify any claim in a click. Nothing gets invented; if a goal has no work attached, the agent says so rather than filling the gap. The mechanics of giving an agent the right access mirror the patterns in the AI agent contract review workflow.
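The attribution step can be sketched in a few lines. The sketch assumes each work item arrives as a dictionary that already carries a `goal` tag and a `source` link; real trackers will differ, and the function name and keys here are hypothetical.

```python
def attribute_work(goals, work_items):
    """Group work items under the goal they support, keeping source links.

    Returns (by_goal, gaps, unassigned):
      by_goal    - goal -> list of {"title", "source"} entries
      gaps       - goals with no work attached, reported rather than filled
      unassigned - items whose goal tag matches none of the listed goals
    """
    by_goal = {goal: [] for goal in goals}
    unassigned = []
    for item in work_items:
        goal = item.get("goal")
        if goal in by_goal:
            by_goal[goal].append({"title": item["title"], "source": item["source"]})
        else:
            unassigned.append(item)
    gaps = [goal for goal, items in by_goal.items() if not items]
    return by_goal, gaps, unassigned


# Illustrative data only.
goals = ["Migrate billing", "Reduce support backlog"]
work = [
    {"title": "Billing migration shipped", "goal": "Migrate billing",
     "source": "https://example.com/tickets/1"},
    {"title": "Conference talk", "goal": None,
     "source": "https://example.com/talks/7"},
]
by_goal, gaps, unassigned = attribute_work(goals, work)
```

Items that match no goal are returned separately rather than forced under the nearest one, so the manager decides where they belong.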
3. Request and chase peer feedback
Peer feedback is where reviews usually stall, because chasing people is tedious and easy to drop. The follow-up, not the asking, is what tends to eat a manager's time. You tell the agent who should give feedback for each report. It sends each reviewer a short, consistent request with a clear deadline, then tracks every response.
Tracking who still owes input
The agent keeps a live list of who has responded and who has not, and sends polite reminders as the deadline nears. You see at a glance that four of six reviewers are in and two are outstanding, without opening your inbox. Asking every reviewer the same questions also keeps the input consistent, which matters for fairness. The agent collects and organizes the responses; it does not weigh them or decide whose view counts. That weighting is your call when you read the packet.
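A live list like this needs very little machinery. The following is a rough sketch under assumed names (`FeedbackRequest`, `status`, `due_reminders`) and an assumed two-day reminder window; it only tracks and reminds, and it does nothing to weigh responses.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FeedbackRequest:
    reviewer: str
    deadline: date
    responded: bool = False


def status(requests):
    """Summarize who is in and list who is still outstanding."""
    done = sum(r.responded for r in requests)
    outstanding = [r.reviewer for r in requests if not r.responded]
    return f"{done} of {len(requests)} reviewers are in", outstanding


def due_reminders(requests, today, window_days=2):
    """Reviewers who have not responded and whose deadline is near or past."""
    window = timedelta(days=window_days)
    return [
        r.reviewer
        for r in requests
        if not r.responded and today >= r.deadline - window
    ]


# Illustrative data only: six reviewers, four already in.
deadline = date(2026, 6, 30)
requests = [
    FeedbackRequest(name, deadline, responded=(i < 4))
    for i, name in enumerate(["Ana", "Ben", "Caz", "Dee", "Eli", "Fay"])
]
summary, outstanding = status(requests)
```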
4. Compile against goals and draft a summary
Now the agent assembles everything into the packet. A packet structured goal by goal is faster to review and harder to bias than a loose narrative. For each goal, the agent lays out the work that supports it and the relevant feedback side by side, then writes a neutral summary the manager can edit, accept, or rewrite.
What "neutral draft" means
A neutral draft sticks to evidence and plain language. It says "shipped the billing migration in March, on the Q1 goal" rather than "did great work". It cites the source behind every point, avoids loaded adjectives, and never proposes a rating or a verdict. Think of it as a well-organized starting document, the kind of structured preparation explored in how to build a multi-step agent workflow. The manager reads it, corrects anything off, adds the context only a human has, and forms the actual assessment. The draft saves typing, not thinking.
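The neutral-draft rules lend themselves to a simple automated check before the draft reaches the manager. This is a deliberately crude sketch: the word lists are made-up examples rather than a vetted vocabulary, and `lint_draft_line` is a hypothetical helper, not part of any tool named in this guide.

```python
import re

# Example word lists only; a real list would need care and review.
LOADED_WORDS = {"great", "amazing", "excellent", "poor", "terrible", "weak"}
VERDICT_WORDS = {"rating", "exceeds", "underperforms"}


def lint_draft_line(text, source):
    """Return a list of neutrality issues for one line of the draft."""
    issues = []
    words = set(re.findall(r"[a-z']+", text.lower()))
    loaded = sorted(words & LOADED_WORDS)
    if loaded:
        issues.append(f"loaded wording: {', '.join(loaded)}")
    if words & VERDICT_WORDS:
        issues.append("proposes a verdict")
    if not source:
        issues.append("no source cited")
    return issues


# Illustrative inputs, using the two phrasings contrasted above.
neutral = lint_draft_line(
    "Shipped the billing migration in March, on the Q1 goal",
    "https://example.com/tickets/1",
)
loaded = lint_draft_line("Did great work", None)
```

A check like this catches only surface wording. It cannot tell whether a plainly worded line is fair or complete, which is why the manager still reads and corrects the draft.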
5. Keep a human in the loop and keep it fair
This is the step that protects everyone, so it is not optional. Any agent that touches people-data must assist the manager, never score or decide on a person. The agent gathers inputs and drafts; the human reads the evidence, removes bias, makes the call, and owns the result. A review with no human judgment in it is not a review.
Practical guardrails
Keep the evidence visible so every point in the packet traces back to a source you can check. Have the agent ask the same questions of every reviewer and draft in neutral language, which reduces the chance that wording nudges your read. Then do the part only you can do: weigh context, account for circumstances, and decide. Anthropic's guidance in Building Effective Agents makes the same point for high-stakes work: keep a person in control where the decision carries consequences. Mid-year reviews are exactly that kind of work.
One more practical note before the run. If you want to budget the credits a full review cycle will use across a team, how to estimate agent cost before deploying walks through the math, and for the wider picture of where agents are landing in real workflows this year, see the state of AI agents in mid-2026.
Frequently asked questions
Can an AI agent run performance reviews?
An agent can run the preparation, not the judgment. It gathers goals, completed work, and peer feedback, then assembles a structured packet per report. The manager still reads the evidence, forms the assessment, and holds the conversation. The agent does the collation so the manager spends time on what matters.
What can an AI agent do for a performance review?
It pulls the goals set at the start of the year, lists completed work from your tools, sends and chases peer feedback requests, and compiles everything against each goal. It then drafts a neutral summary for the manager to edit and flags any gaps in the evidence, like a goal with no supporting work attached.
Should an AI agent decide employee ratings?
No. An agent should never score people or set ratings. Those are judgments with real consequences for someone's pay and career, and they belong to a human manager. The agent prepares the evidence and a draft summary. The person reads it, corrects it, decides the rating, and owns the outcome.
How does a review agent gather feedback?
You tell the agent who should give feedback for each report. It sends each reviewer a short, consistent request with a clear deadline, then tracks who has responded and who still owes input. It chases the stragglers with reminders and shows the manager a live picture of what is in and what is missing.
How do I keep an AI review agent fair?
Keep a human in the loop and make the evidence visible. Have the agent draft in neutral language, cite the work behind each point, and ask the same questions of every reviewer. The manager checks the draft against the goals, removes bias, and decides. Fairness comes from a person reviewing the evidence, not from the tool.
Describe the outcome and let the agent prepare
You do not configure any of this step by step. On Gravity you describe the outcome in plain words: "prepare a mid-year review packet for each of my reports, pull their goals and shipped work, request feedback from these colleagues, and draft a neutral summary I can edit". The right expert-built agent then runs it. Setup takes about 60 seconds; the agent collates the inputs, chases feedback over the days that follow, and hands you packets ready to judge. You pay only when it runs, at one dollar per thousand credits. The agent does the digging. You do the deciding, every time.
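At one dollar per thousand credits, a rough budget for a cycle is a single multiplication. The guide does not say how many credits a packet consumes, so `credits_per_packet` below is a placeholder you would replace with a figure observed from a real run.

```python
CREDITS_PER_DOLLAR = 1000  # from the pricing stated above


def estimate_cost_usd(reports, credits_per_packet):
    """Estimate the cost of one review cycle in US dollars.

    credits_per_packet is an assumed input, not a published number.
    """
    total_credits = reports * credits_per_packet
    return total_credits / CREDITS_PER_DOLLAR


# Hypothetical example: 8 direct reports at an assumed 500 credits each.
cost = estimate_cost_usd(reports=8, credits_per_packet=500)
```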
Sources
- Anthropic, "Building Effective Agents", 2024, anthropic.com/engineering/building-effective-agents
- Gravity internal notes, 2026. Retrieved 2026-06-14.