EvalView
BTR ACTIVE STEADY
hidai25/eval-view · homepage ↗
Regression testing for AI agents. Snapshot behavior, diff tool calls, catch regressions in CI. Works with LangGraph, CrewAI, OpenAI, Anthropic.
- agent-benchmark
- agent-evaluation
- ai-agents
- crewai
- evaluation
- langgraph
stars 118
last activity 17d ago
open issues 6
language Python
license Apache-2.0
latest release v0.8.0
momentum · per month since covered: +8/mo (+7%/mo) · +4 total since PR#36
metrics as of today
star history
Exact curve on star-history.com ↗
- PR#36 114★ 2026-06-17
- now 118★ (+4 since first covered)
curve is sampled from GitHub's star history; the dashed stretch is before we first covered it, the solid line since. figures at coverage are the numbers we printed then (approx.); the current count is live.
covered in
- A behavior regression gate for agents, the Playwright for AI agents