GitStar

tesseract-ocr

@tesseract-ocr • Tesseract OCR. Use this page to separate flagship concentration from portfolio breadth before treating a publisher as broadly strong.

Portfolio concentration

95%

Top three share

Shows whether the organization is driven by one breakout repo or several visible projects.

Breadth

14 repos

Visible snapshot

2 repositories updated in the last 90 days.

Leading language

Unknown

Portfolio mix

Unknown (7), HTML (2), C++ (1)

Average size

6.3K

Stars per repository

Useful for distinguishing one flagship-heavy publisher from a repeatable portfolio.

Updated: 2026-04-19 (38d ago) • Source: GitHub API fallback • 14 repositories

Portfolio Shape

95%

of the visible star count comes from this organization's top three repositories.

Average Repository Size

6.3K

stars per repository in this same snapshot.

Current Mix

Unknown

is the most common language here, with 2 repositories updated in the last 90 days.

Why this rank

This organization stands out because one flagship repo drives 84% of its visible star count.

Flagship share: 84% • Breakout repo: tesseract

Organization pages work best when you separate portfolio breadth from flagship concentration. In tesseract-ocr's case, the visible top three repositories account for about 95% of total stars in this snapshot, which helps explain whether the organization is known for one breakout project or for a broader repeatable portfolio.
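The concentration figures can be reproduced from the star counts in the Top Repositories table. The following is a minimal sketch, assuming the displayed counts (for example "74.3K" read as 74,300) are close enough to the exact values; the names `STARS` and `top_k_share` are illustrative and not part of GitStar.

```python
# Star counts as displayed in the Top Repositories table. Values shown
# as "74.3K" are rounded, so the computed shares are approximate.
STARS = {
    "tesseract": 74_300,
    "tessdata": 7_500,
    "tessdoc": 2_400,
    "tessdata_best": 1_600,
    "langdata": 868,
    "tesstrain": 720,
    "tessdata_fast": 601,
    "docs": 267,
    "langdata_lstm": 126,
    "tesseract-ocr.github.io": 75,
    "tessconfigs": 36,
    "test": 35,
    "tessdata_contrib": 31,
    "tessapi": 13,
}


def top_k_share(star_counts, k):
    """Fraction of all stars held by the k most-starred repositories."""
    ordered = sorted(star_counts, reverse=True)
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


counts = list(STARS.values())
flagship_share = top_k_share(counts, 1)
top_three_share = top_k_share(counts, 3)
average_stars = sum(counts) / len(counts)

print(f"flagship share:  {flagship_share:.0%}")   # 84%
print(f"top-three share: {top_three_share:.0%}")  # 95%
print(f"average stars:   {average_stars / 1000:.1f}K")  # 6.3K
```

A large gap between the flagship share and the top-three share would suggest several visible projects; here the two figures are only about eleven points apart, which is consistent with one breakout repository.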

The three most common language labels here are Unknown (7), HTML (2), and C++ (1), where "Unknown" covers repositories with no language listed in the table below. That makes this page useful not just for popularity checks, but also for seeing what technical shape an organization's public ecosystem actually has.
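The language tally can be checked the same way. This sketch assumes that a row with no language shown in the Top Repositories table is what the page counts as "Unknown"; `LANGUAGES` and `language_mix` are illustrative names.

```python
from collections import Counter

# Language column of the Top Repositories table, in rank order (1 to 14).
# None marks a row where no language is shown.
LANGUAGES = [
    "C++", None, "HTML", None, None, "Python", None,
    None, None, "Ruby", "Makefile", "Shell", None, "HTML",
]


def language_mix(languages, unknown_label="Unknown"):
    """Count repositories per language, grouping missing values together."""
    return Counter(lang or unknown_label for lang in languages)


mix = language_mix(LANGUAGES)
leading_language, leading_count = mix.most_common(1)[0]

print(leading_language, leading_count)
print(dict(mix))
```

Note that C++ shares its count of one with several other languages in the table, so its place in the summary depends on how ties are ordered.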

Source: GitHub API fallback. This is the same cache-first snapshot used by the organization ranking list, so the summary view and the detail view should stay aligned.

Top Repositories

| # | Repository | Description | Language | ⭐ Stars |
|---|---|---|---|---|
| 1 | tesseract-ocr/tesseract | Tesseract Open Source OCR Engine (main repository) | C++ | 74.3K |
| 2 | tesseract-ocr/tessdata | Trained models with fast variant of the "best" LSTM models + legacy models | Unknown | 7.5K |
| 3 | tesseract-ocr/tessdoc | Tesseract documentation | HTML | 2.4K |
| 4 | tesseract-ocr/tessdata_best | Best (most accurate) trained LSTM models. | Unknown | 1.6K |
| 5 | tesseract-ocr/langdata | Source training data for Tesseract for lots of languages | Unknown | 868 |
| 6 | tesseract-ocr/tesstrain | Train Tesseract LSTM with make | Python | 720 |
| 7 | tesseract-ocr/tessdata_fast | Fast integer versions of trained LSTM models | Unknown | 601 |
| 8 | tesseract-ocr/docs | Various documents related to Tesseract OCR | Unknown | 267 |
| 9 | tesseract-ocr/langdata_lstm | Data used for LSTM model training | Unknown | 126 |
| 10 | tesseract-ocr/tesseract-ocr.github.io | Tesseract documentation | Ruby | 75 |
| 11 | tesseract-ocr/tessconfigs | Tesseract Config files | Makefile | 36 |
| 12 | tesseract-ocr/test | Repository for tesseract testing | Shell | 35 |
| 13 | tesseract-ocr/tessdata_contrib | User contributed (non Google) OCR models for Tesseract | Unknown | 31 |
| 14 | tesseract-ocr/tessapi | Tesseract source code and API documentation | HTML | 13 |

Next step after the organization read

Open a flagship repository, compare a couple of portfolio leaders, or return to the organization map when you want a broader concentration read.

Learn and methodology

Keep trust-building context reachable, but behind the first data read instead of ahead of it.

How to read this organization snapshot

Total stars are useful as a discovery signal, but they do not tell you whether a team maintains every repository equally. Pair this page with release cadence, maintainer activity, and the flagship concentration shown above before making adoption decisions.
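One simple activity signal to pair with star counts is how many repositories changed within a recent window, similar to the 90-day figure quoted in this snapshot. The sketch below is only an illustration: how GitStar defines "updated" is not stated here, the function name `recently_updated` is hypothetical, and the repository names and dates are made-up sample values, not data from this page.

```python
from datetime import date, timedelta


def recently_updated(last_updated, as_of, window_days=90):
    """Return the names of repositories updated within the window ending at as_of."""
    cutoff = as_of - timedelta(days=window_days)
    return sorted(
        name for name, day in last_updated.items() if cutoff <= day <= as_of
    )


# Hypothetical dates for illustration only; not taken from the snapshot.
sample = {
    "example/repo-a": date(2026, 4, 1),
    "example/repo-b": date(2026, 2, 10),
    "example/repo-c": date(2025, 11, 30),
}

active = recently_updated(sample, as_of=date(2026, 4, 19))
print(f"{len(active)} of {len(sample)} repositories updated in the last 90 days")
```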

For broader background on GitStar's ranking logic and editorial guidance, see Methodology & Editorial Standards.