Finalize humatheque extraction prompt and task constants by gegedenice · Pull Request #19 · davanstrien/ocr-bench

gegedenice · 2026-04-14T16:12:03Z

No description provided.

davanstrien · 2026-06-29T11:41:36Z

Hi @gegedenice thanks for this, and sorry it sat so long!

The benchmark core has moved quite a bit since April (a few merged PRs touching run.py/cli.py), so this branch now conflicts and wouldn't apply cleanly as-is.

The bigger question is that the PR is closely tailored to the Humathèque thesis task, whereas ocr-bench core tries to stay task-general. So I don't think it should merge into the core modules in this shape.

IMO there's a really good idea in here I'd love to keep: the deterministic, reference-based scorer (exact / list / fuzzy fields) as an alternative to the LLM-as-judge path. That generalises well and it's something the benchmark doesn't have yet.
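As a rough illustration of the exact / list / fuzzy idea mentioned above, a reference-based scorer could look something like the sketch below. This is not the code from this PR or from ocr-bench: the function names, the normalization, the set-based F1 for list fields, and the use of `difflib.SequenceMatcher` for fuzzy fields are all assumptions, and the field names in the usage note are hypothetical.

```python
from difflib import SequenceMatcher


def _norm(value) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count."""
    text = "" if value is None else str(value)
    return " ".join(text.lower().split())


def score_exact(pred, ref) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return 1.0 if _norm(pred) == _norm(ref) else 0.0


def score_list(pred, ref) -> float:
    """Set-based F1 over normalized items; order and duplicates are ignored."""
    p = {_norm(x) for x in (pred or [])}
    r = {_norm(x) for x in (ref or [])}
    if not p and not r:
        return 1.0  # both empty: treat as agreement
    overlap = len(p & r)
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def score_fuzzy(pred, ref) -> float:
    """Similarity ratio in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, _norm(pred), _norm(ref)).ratio()


# Hypothetical mapping from scoring kind to scorer function.
SCORERS = {"exact": score_exact, "list": score_list, "fuzzy": score_fuzzy}


def score_record(pred: dict, ref: dict, field_kinds: dict) -> dict:
    """Score one predicted record against its reference, field by field."""
    return {
        name: SCORERS[kind](pred.get(name), ref.get(name))
        for name, kind in field_kinds.items()
    }
```

For example, `score_record(pred, ref, {"year": "exact", "keywords": "list", "title": "fuzzy"})` would return one score per field. The same inputs always give the same scores, which is the property that distinguishes this path from an LLM-as-judge evaluation.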

Two ways forward, whichever suits you:

- a smaller, config-driven version where the task specifics (dataset, fields, vocab) live in config rather than core, plus a short note on what it does and how you checked it; or
- keep the Humathèque setup as a worked example/recipe rather than in core.
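To make the first option a bit more concrete, here is a minimal sketch of what "task specifics live in config" might mean. Everything in it is an assumption for illustration: the `TaskConfig` class, the `load_task_config` helper, the JSON layout, the dataset id, and the field names are hypothetical and are not part of ocr-bench or of this PR.

```python
import json
from dataclasses import dataclass, field

# Scoring kinds named in the review comment (exact / list / fuzzy).
ALLOWED_KINDS = {"exact", "list", "fuzzy"}


@dataclass(frozen=True)
class TaskConfig:
    dataset: str                                # dataset identifier (placeholder in the example)
    fields: dict                                # field name -> scoring kind
    vocab: dict = field(default_factory=dict)   # field name -> allowed values


def load_task_config(text: str) -> TaskConfig:
    """Parse a JSON task description and reject unknown scoring kinds."""
    raw = json.loads(text)
    unknown = {kind for kind in raw["fields"].values() if kind not in ALLOWED_KINDS}
    if unknown:
        raise ValueError(f"unknown scoring kinds: {sorted(unknown)}")
    return TaskConfig(
        dataset=raw["dataset"],
        fields=dict(raw["fields"]),
        vocab={name: list(values) for name, values in raw.get("vocab", {}).items()},
    )


# Hypothetical task description; the dataset id and field names are placeholders.
EXAMPLE_CONFIG = """
{
  "dataset": "example-org/example-metadata",
  "fields": {"title": "fuzzy", "year": "exact", "keywords": "list"},
  "vocab": {"language": ["fr", "en"]}
}
"""

config = load_task_config(EXAMPLE_CONFIG)
```

With a layout like this, the core modules would only need to know the three scoring kinds, and a new task could be added by writing a new config file rather than editing task constants in core.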

Let me know what you think.

gegedenice added 5 commits April 14, 2026 17:48

50b7918 Adapt benchmark defaults for humatheque metadata extraction
e573dcc Merge pull request #1 from gegedenice/codex/adapt-code-for-metadata-extraction-and-evaluation: Adapt benchmark setup for humatheque metadata extraction workflow
4f69bb0 Finalize humatheque extraction prompt and task constants
d3621a6 Implement field-level standard metadata evaluation metrics
3954964 Merge branch 'main' into codex/adapt-code-for-metadata-extraction-and-evaluation-w92eo2

gegedenice wants to merge 5 commits into davanstrien:main from gegedenice:codex/adapt-code-for-metadata-extraction-and-evaluation-w92eo2
