PaddleOCR
ATR ACTIVE STEADY
PaddlePaddle/PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
- ocr
- chineseocr
- pdf2markdown
- pp-ocr
- pp-structure
- document-parsing
stars 84k
last activity 6d ago
open issues 150
language Python
license Apache-2.0
latest release v3.7.0
momentum · per month since covered: +3k/mo (+4%/mo) · +21k total since PR#21
metrics as of today
star history
Exact curve on star-history.com
- PR#21 64k★ 2025-11-12
- now 84k★ (+21k since first covered)
curve is sampled from GitHub's star history; the dashed stretch is before we first covered it, the solid line is since. figures at coverage are the numbers we printed then (approx.); the current count is live.
covered in
- Industry-grade OCR & document AI toolkit