Repository Radar

PaddleOCR

ATR ACTIVE STEADY

PaddlePaddle/PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

  • ocr
  • chineseocr
  • pdf2markdown
  • pp-ocr
  • pp-structure
  • document-parsing
stars 84k
last activity 6d ago
open issues 150
language Python
license Apache-2.0
latest release v3.7.0
momentum +3k/mo (+4%/mo) since first covered · +21k total since PR#21

metrics as of today

star history

[chart: star history, May 2020 to Jul 2026 · PR#21 64k★ · now 84k★ · exact curve on star-history.com]
  1. PR#21 64k★ 2025-11-12
  2. now 84k★ (+21k since first covered)

curve is sampled from GitHub's star history; the dashed stretch is before we first covered it, the solid line since. figures at coverage are the numbers we printed then (approx.), current count is live.

covered in

  • PR#24 2025-12-22
  • PR#21 2025-11-12 above the radar

    Industry-grade OCR & document AI toolkit
