EvalView

BTR ACTIVE STEADY

Regression testing for AI agents. Snapshot behavior, diff tool calls, catch regressions in CI. Works with LangGraph, CrewAI, OpenAI, Anthropic.

agent-benchmark
agent-evaluation
ai-agents
crewai
evaluation
langgraph

stars 118

last activity 17d ago

open issues 6

language Python

license Apache-2.0

latest release v0.8.0

momentum · per month since covered +8/mo (+7%/mo) · +4 total since PR#36

metrics as of today

star history

Exact curve on star-history.com ↗

PR#36 114★ 2026-06-17
now 118★ +4 since first covered

curve is sampled from GitHub's star history; the dashed stretch is before we first covered it, the solid line since. figures at coverage are the numbers we printed then (approx.); the current count is live.

covered in

PR#36 2026-06-17 below the radar

A behavior regression gate for agents, the Playwright for AI agents
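
To make "snapshot behavior, diff tool calls" concrete, here is a minimal sketch in plain Python. It does not use EvalView's actual API, which this page does not show; `ToolCall`, `call`, and `diff_tool_calls` are hypothetical names, and the positional step-by-step comparison is an assumed strategy, not necessarily what the project does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """One recorded tool invocation (hypothetical shape, not EvalView's)."""
    name: str
    args: tuple  # sorted (key, value) pairs, so calls compare by value


def call(name, **args):
    # Normalise keyword arguments into a sorted tuple for stable comparison.
    return ToolCall(name, tuple(sorted(args.items())))


def diff_tool_calls(baseline, current):
    """Compare two tool-call sequences position by position."""
    diffs = []
    for i in range(max(len(baseline), len(current))):
        old = baseline[i] if i < len(baseline) else None
        new = current[i] if i < len(current) else None
        if old == new:
            continue
        if old is None:
            diffs.append(f"step {i}: unexpected extra call {new.name}")
        elif new is None:
            diffs.append(f"step {i}: missing call {old.name}")
        elif old.name != new.name:
            diffs.append(f"step {i}: tool changed {old.name} -> {new.name}")
        else:
            diffs.append(f"step {i}: arguments changed for {old.name}")
    return diffs


# Baseline snapshot from a known-good run vs. the calls from a new run.
baseline = [call("search", query="refund policy"), call("send_email", to="user")]
current = [call("search", query="refund policy"), call("delete_account", id=7)]

regressions = diff_tool_calls(baseline, current)
for line in regressions:
    print(line)
```

In a CI job, a non-empty `regressions` list would typically be turned into a non-zero exit code so the build fails when agent behavior drifts from the snapshot.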
