Automating PR Review with AI: Guide (2026)

Automating pull request review accelerates feedback loops and reduces reviewer bottlenecks. Instead of waiting for a human to find time, an AI reviewer analyzes every PR as soon as it is opened, posts findings as comments, and then approves the PR, requests changes, or escalates to a human. Teams report that AI-automated PR review reduces time-to-merge by 40–60% and eliminates the "review fatigue" that causes humans to miss bugs. The key is routing feedback correctly: trivial style issues are auto-fixed, security issues require human review, and logic bugs are flagged for discussion.

Architecture 1: GitHub Actions + Claude API

The simplest setup uses a GitHub Actions workflow that calls the Claude API for every PR:

# .github/workflows/ai-review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

# The default GITHUB_TOKEN needs write access to post reviews
permissions:
  contents: read
  pull-requests: write

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR diff
        id: diff
        env:
          BASE_REF: ${{ github.base_ref }}
        run: |
          # Three-dot diff against the PR's actual base branch (not a hard-coded main)
          git diff "origin/$BASE_REF...HEAD" > pr.diff
          echo "diff_lines=$(wc -l < pr.diff)" >> "$GITHUB_OUTPUT"

      - name: Run AI review
        id: ai_review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          # Assumes the review script's only dependency is the anthropic SDK
          pip install anthropic
          python scripts/ai_review.py \
            --pr-diff pr.diff \
            --pr-number ${{ github.event.number }} \
            --output findings.json

      - name: Post findings as PR comments
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const findings = JSON.parse(fs.readFileSync('findings.json', 'utf8'));
            for (const finding of findings.issues) {
              // await each call so API errors fail the step instead of being dropped
              await github.rest.pulls.createReview({
                owner: context.repo.owner,
                repo: context.repo.repo,
                pull_number: context.issue.number,
                event: finding.severity === 'critical' ? 'REQUEST_CHANGES' : 'COMMENT',
                body: finding.message
              });
            }

This workflow triggers whenever a PR is opened or updated with new commits, computes the diff against the base branch, sends it to Claude, and posts each finding as a PR review.
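
The workflow passes the whole diff to the review script, which can exceed the model's context window or inflate cost on large PRs. A minimal sketch of one way to handle this, assuming the script splits the `git diff` output by file and skips files beyond a size budget. The helper names `split_diff_by_file` and `select_within_budget` and the 20,000-character default are illustrative choices, not part of the workflow above.

```python
import re

# Each file section in `git diff` output starts with a
# "diff --git a/<path> b/<path>" header line.
FILE_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$", re.MULTILINE)


def split_diff_by_file(diff_text: str) -> dict[str, str]:
    """Split a git diff into {new_path: diff_section}."""
    matches = list(FILE_HEADER.finditer(diff_text))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(diff_text)
        sections[match.group(2)] = diff_text[match.start():end]
    return sections


def select_within_budget(sections: dict[str, str], max_chars: int = 20_000):
    """Keep file sections, smallest first, until the character budget is used up.

    Returns (kept_sections, skipped_paths) so skipped files can be reported
    in the review instead of being silently ignored.
    """
    kept, skipped, used = {}, [], 0
    for path, section in sorted(sections.items(), key=lambda item: len(item[1])):
        if used + len(section) <= max_chars:
            kept[path] = section
            used += len(section)
        else:
            skipped.append(path)
    return kept, skipped
```

Listing the skipped paths in the posted review tells the author which files still need a human look.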

Architecture 2: Centralized Review Service

For larger teams, a dedicated service handles reviews:

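# services/ai_review_service.py
import json
import os

from anthropic import Anthropic
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class AIReviewService:
    def __init__(self):
        self.client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

    def review_pr(self, pr_number: int, diff: str, pr_title: str, pr_body: str) -> dict:
        """Review a PR and return findings."""
        # Literal braces are doubled ({{ }}) so the f-string does not
        # try to evaluate them as expressions.
        prompt = f"""
Review this PR for bugs, security issues, style violations, and test coverage.

PR Title: {pr_title}
PR Description: {pr_body}

Diff:
{diff}

Return only a JSON object with:
- issues: list of {{severity, location, message, suggestion}}
- approved: boolean
- blocking: list of critical issues
- suggestions: list of improvements
"""

        response = self.client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}],
        )

        # Raises json.JSONDecodeError if the model returns anything but JSON
        return json.loads(response.content[0].text)


class ReviewRequest(BaseModel):
    pr_number: int
    diff: str
    title: str
    body: str = ""


# FastAPI endpoint. Declared with plain `def` so FastAPI runs the blocking
# Anthropic call in a worker thread; the diff arrives in the JSON body
# rather than as a query parameter.
# `db` and `post_pr_comments` are application-specific helpers assumed
# to be defined elsewhere.
@app.post("/review")
def create_review(request: ReviewRequest):
    service = AIReviewService()
    findings = service.review_pr(
        request.pr_number, request.diff, request.title, request.body
    )

    # Store findings in DB for audit trail
    db.store_findings(request.pr_number, findings)

    # Post to GitHub
    post_pr_comments(request.pr_number, findings)

    return findings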

This service can be called from GitHub Actions, Gitea, or any Git platform, making it reusable.
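
Parsing the model's reply with a bare `json.loads` fails if the model wraps its JSON in prose or omits a key. A minimal sketch of a more tolerant parser, assuming the response schema requested in the prompt (`issues`, `approved`, `blocking`, `suggestions`) and the four severity labels used later in this guide. The name `parse_findings` and the fallback rules are assumptions, not part of the service above.

```python
import json

ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}


def parse_findings(model_text: str) -> dict:
    """Extract the JSON object from model output and normalize its fields."""
    start = model_text.find("{")
    end = model_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(model_text[start:end + 1])

    issues = []
    for issue in data.get("issues", []):
        severity = str(issue.get("severity", "medium")).lower()
        if severity not in ALLOWED_SEVERITIES:
            # Assumption: unknown labels are kept as "medium" rather than dropped
            severity = "medium"
        issues.append({**issue, "severity": severity})

    blocking = [issue for issue in issues if issue["severity"] == "critical"]
    return {
        "issues": issues,
        # Do not trust "approved": true when a critical finding is present
        "approved": bool(data.get("approved", False)) and not blocking,
        "blocking": blocking,
        "suggestions": list(data.get("suggestions", [])),
    }
```

Deriving `blocking` and `approved` from the issue list, rather than from what the model claims, keeps the merge decision consistent with the findings that actually get posted.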

Architecture 3: Inline Bot with Contextual Comments

A more sophisticated setup posts inline comments on specific lines of code:

def post_inline_comments(pr_number: int, findings: list[dict]):
    """Post AI findings as inline PR comments."""
    # `gh` is a GitHub client wrapper assumed to be defined elsewhere
    for finding in findings:
        if finding.get("line"):
            body = f"**{finding['severity'].upper()}**: {finding['message']}"
            if finding.get("suggestion"):
                body += f"\n\n{finding['suggestion']}"
            # Post comment on specific line
            gh.create_pr_review_comment(
                pr_number,
                body=body,
                commit_id=finding.get("commit"),
                path=finding.get("file"),
                line=finding.get("line"),
            )
        else:
            # Post top-level PR comment
            gh.create_pr_comment(pr_number, body=finding["message"])

Inline comments are easier to address because developers see the issue in context.
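
GitHub generally rejects inline review comments on lines that are not part of the PR diff, so a finding whose line number falls outside the changed hunks needs to fall back to a top-level comment. A sketch of how the valid line numbers could be computed from `git diff` output. `commentable_lines` is a hypothetical helper, and it tracks only the new-file (right) side of the diff.

```python
import re

# Hunk header, e.g. "@@ -10,3 +10,4 @@"; group 1 is the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def commentable_lines(diff_text: str) -> dict[str, set[int]]:
    """Map each changed file to the new-file line numbers that appear in the diff."""
    lines_by_file: dict[str, set[int]] = {}
    current_file = None
    new_line = None  # next new-file line number; None until a hunk header is seen

    for line in diff_text.splitlines():
        hunk = HUNK_HEADER.match(line)
        if line.startswith("diff --git "):
            current_file, new_line = None, None
        elif hunk:
            new_line = int(hunk.group(1))
        elif new_line is None:
            # Still in the per-file header: pick up the new path
            if line.startswith("+++ "):
                path = line[4:]
                current_file = None if path == "/dev/null" else path.removeprefix("b/")
        elif current_file is not None:
            # Added ("+") and context (" ") lines exist in the new file;
            # removed ("-") lines do not advance the new-file counter.
            if line.startswith("+") or line.startswith(" "):
                lines_by_file.setdefault(current_file, set()).add(new_line)
                new_line += 1

    return lines_by_file
```

A caller can then check `finding["line"] in commentable_lines(diff).get(finding["file"], set())` before choosing between an inline and a top-level comment.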

Handling Different Severity Levels

Design your workflow to route findings by severity:

| Severity | Action | Owner | Added time to merge |
|---|---|---|---|
| Critical (security, data loss) | Block merge, require fix | Security team | +2 days |
| High (logic error, major bug) | Request changes, require re-review | Code owner | +1 day |
| Medium (style, edge cases) | Comment, can defer with ticket | Author can ignore | No delay |
| Low (suggestion) | Post as comment, FYI | Author can ignore | No delay |
| Approved | Post approval reaction | Auto-approve trivial PRs | No delay |
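
The routing rules can be encoded as a small lookup so the policy lives in one place. A minimal sketch. The `Route` fields, the choice of GitHub review event per severity, and treating "request changes" as merge-blocking are assumptions layered on the table, not a fixed policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    review_event: str   # GitHub review event: REQUEST_CHANGES or COMMENT
    blocks_merge: bool  # whether the PR should wait for a fix or re-review
    owner: str          # who is expected to act on the finding


ROUTES = {
    "critical": Route("REQUEST_CHANGES", True, "security team"),
    "high": Route("REQUEST_CHANGES", True, "code owner"),
    "medium": Route("COMMENT", False, "author"),
    "low": Route("COMMENT", False, "author"),
}

SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def route_review(findings: list[dict]) -> Optional[Route]:
    """Return the route for the most severe finding, or None if there are none."""
    present = {finding.get("severity") for finding in findings}
    for severity in SEVERITY_ORDER:
        if severity in present:
            return ROUTES[severity]
    return None
```

A `None` result corresponds to the "Approved" row: nothing was found, so the PR can follow the approval path.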

Auto-Fixing Trivial Issues

For style issues (formatting, import order, naming conventions), some teams configure the AI to generate fixes:

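import subprocess


def auto_fix_pr(pr_number: int, findings: list[dict]):
    """Auto-fix trivial issues like import order, trailing spaces."""
    # `apply_fix` and `gh` are helpers assumed to be defined elsewhere
    fixes = [f for f in findings if f["severity"] == "low" and f.get("auto_fixable")]

    if not fixes:
        return

    # Apply each fix, then stage only the files that were touched
    for fix in fixes:
        apply_fix(fix["file"], fix["fix_code"])
    fixed_files = sorted({fix["file"] for fix in fixes})

    # check=True stops at the first failing git command
    subprocess.run(["git", "add", "--", *fixed_files], check=True)
    subprocess.run(["git", "commit", "-m", "chore: auto-fix style issues"], check=True)
    subprocess.run(["git", "push"], check=True)

    # Notify author
    gh.create_pr_comment(pr_number, "Auto-fixed style issues; commit pushed.")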

This accelerates reviews by addressing trivial issues automatically.
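
The `apply_fix` helper is not defined in the snippet. One possible shape, assuming `fix_code` holds the full corrected file content and that a fix must never touch files outside the repository checkout. The `repo_root` parameter is added here for illustration only.

```python
from pathlib import Path


def apply_fix(file: str, fix_code: str, repo_root: str = ".") -> Path:
    """Overwrite `file` with `fix_code`, refusing paths outside `repo_root`."""
    root = Path(repo_root).resolve()
    target = (root / file).resolve()
    # Model output is untrusted: reject "../" tricks and absolute paths
    if not target.is_relative_to(root):
        raise ValueError(f"refusing to write outside the repository: {file}")
    # Only edit files that already exist; creating files is not a style fix
    if not target.is_file():
        raise FileNotFoundError(f"auto-fix only edits existing files: {file}")
    target.write_text(fix_code, encoding="utf-8")
    return target
```

Because the file path and content both come from model output, validating the path before writing is worth the few extra lines.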

Measuring PR Review Automation Impact

Track metrics to justify the investment:

  • Time to first review: Before (24 hours human review), After (5 min AI review)
  • Time to merge: Before (2 days), After (1 day)
  • Bugs caught before merge: Before (3/100 PRs), After (8/100 PRs)
  • False positive rate: Track how many AI findings are incorrect (target <10%)
  • Manual review reduction: How many PRs need zero human review (target 30–50%)
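
These metrics can be computed from a simple log of findings and PRs. A sketch under the assumption that each finding records whether a reviewer later marked it as incorrect, and each PR records whether a human reviewed it and how long the first review took. The record field names are hypothetical.

```python
from statistics import median


def false_positive_rate(findings: list[dict]) -> float:
    """Share of AI findings that reviewers marked as incorrect."""
    if not findings:
        return 0.0
    wrong = sum(1 for finding in findings if finding["marked_incorrect"])
    return wrong / len(findings)


def zero_human_review_share(prs: list[dict]) -> float:
    """Share of PRs merged without any human review."""
    if not prs:
        return 0.0
    untouched = sum(1 for pr in prs if not pr["human_reviewed"])
    return untouched / len(prs)


def median_hours_to_first_review(prs: list[dict]) -> float:
    """Median wait, in hours, between PR creation and the first review."""
    return median(pr["hours_to_first_review"] for pr in prs)
```

The median is used for time to first review because a handful of PRs left open over a weekend would otherwise dominate an average.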

Key Takeaways

  • Automate diff review via GitHub Actions or a dedicated service. Run AI review on every PR; post findings as comments.
  • Route findings by severity. Critical issues block merge; style suggestions don't delay.
  • Post inline comments. Line-specific feedback is easier to act on than top-level comments.
  • Measure impact. Track time-to-merge, bugs caught, and review volume to justify automation spend.

Frequently Asked Questions

What if the AI review posts irrelevant findings?

Refine your review prompt. Include examples of what is NOT a bug (legitimate patterns, domain-specific code). Track false positives and adjust how strict the prompt is: a stricter prompt produces fewer false positives but also misses more real issues.

Can I approve PRs automatically if AI review passes?

Only for low-risk changes (deps/docs/tests). For app code, require human approval. In GitHub's branch protection rules, enable "Require review from Code Owners" and add the AI review job as a required status check, so a PR needs both before it can merge.

How do I handle PR review for private repositories?

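Ensure the AI review service has access to the private repo. For GitHub Actions, use the default GITHUB_TOKEN and grant it write permission on pull requests. For a custom service, authenticate as a GitHub App or with a fine-grained personal access token; SSH keys cover Git operations only, not the REST API calls needed to post comments.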

What is the cost of automating PR review?

Using the Claude API: roughly USD 0.02–0.05 per PR (for a 2 KB diff). At 50 PRs/day, that's USD 1–2.50/day, or USD 30–75/month. Compare that with a human reviewer costing USD 10,000/month, whose time the faster turnaround frees up.
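
The arithmetic behind that estimate, as a small calculator. The per-PR cost range and PR volume are the figures quoted above; the 30-day month is an assumption.

```python
def monthly_cost(cost_per_pr: float, prs_per_day: int, days_per_month: int = 30) -> float:
    """Estimated monthly API spend in USD."""
    return cost_per_pr * prs_per_day * days_per_month


low = monthly_cost(0.02, 50)
high = monthly_cost(0.05, 50)
print(f"USD {low:.2f} to {high:.2f} per month")  # USD 30.00 to 75.00 per month
```

Per-PR cost scales with diff size, so teams with large PRs should rerun the estimate with their own average.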

Can I use the AI review results to automatically merge PRs?

Yes, but carefully. Only auto-merge if: (1) AI approval, (2) all automated tests pass, (3) no human requested changes, (4) change is low-risk (docs, non-critical tests). High-risk changes require human approval.
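
The four conditions translate directly into a gate function. A sketch in which "low-risk" is decided by a path allowlist. The patterns in `LOW_RISK_PATTERNS` and the parameter names are placeholders to adapt per repository.

```python
from fnmatch import fnmatchcase

# Placeholder allowlist: documentation and test files only
LOW_RISK_PATTERNS = ["docs/*", "*.md", "tests/*"]


def is_low_risk(changed_files: list[str]) -> bool:
    """True only if every changed file matches a low-risk pattern."""
    return bool(changed_files) and all(
        any(fnmatchcase(path, pattern) for pattern in LOW_RISK_PATTERNS)
        for path in changed_files
    )


def can_auto_merge(
    ai_approved: bool,
    tests_passed: bool,
    human_requested_changes: bool,
    changed_files: list[str],
) -> bool:
    """Apply all four auto-merge conditions; any failure means a human decides."""
    return (
        ai_approved
        and tests_passed
        and not human_requested_changes
        and is_low_risk(changed_files)
    )
```

A single file outside the allowlist makes the whole PR ineligible, which keeps mixed docs-plus-code changes in front of a human reviewer.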

Further Reading