Small Models for GLAM

Most of what gets done in libraries, archives and museums runs on a long tail of small, repetitive jobs: backlogs to clear, scans to make searchable, metadata to tidy. A good chunk of that work can be handled by small, task-specific models, and the people best placed to say what those tasks are work in those institutions.

This org is a place to put the models that come out of that work, so the next institution facing the same problem doesn't start from scratch.

Each model here builds on something. Most are fine-tunes of open foundation models — YOLO, DETR, BERT, Qwen-VL — trained on community datasets, often from BigLAM or contributed by individual institutions. Several extend existing community-trained models for new collections rather than starting over: index-card-detector-v5 takes the National Library of Scotland's archival card detector and extends it to three additional archives. That extension pattern matters — it's how this kind of work gets cheaper for everyone over time.

Recipes for most of the models live in AI Patterns for GLAM. The Case for Boring AI and Beyond Chatbots set out the why.

How the models get built

Mostly with agentic workflows: an agent handles the data prep, training, and packaging; a human stays in the loop for the parts that matter — label review, evaluation, deciding whether something is good enough to release.
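As a rough illustration of that division of labour, the sketch below models a pipeline in which automated steps run unattended and human checkpoints gate whether it continues. It is not the tooling used to build the models here: `Step`, `run_pipeline`, `mark`, the step names, their ordering, and the `approve` callback are all hypothetical, and the steps are placeholders that only record that they ran.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    run: Callable[[Dict], Dict]   # takes the pipeline state, returns a new state
    needs_human: bool = False     # True when a person must sign off before continuing


def run_pipeline(steps: List[Step], approve: Callable[[str, Dict], bool]) -> Dict:
    """Run steps in order, stopping at the first human checkpoint that is rejected."""
    state: Dict = {"log": [], "released": False}
    for step in steps:
        state = step.run(state)
        state["log"].append(step.name)
        if step.needs_human and not approve(step.name, state):
            state["stopped_at"] = step.name
            return state
    state["released"] = True
    return state


def mark(key: str) -> Callable[[Dict], Dict]:
    """Build a placeholder step that only records that it ran (no real work)."""
    def _run(state: Dict) -> Dict:
        return {**state, key: True}
    return _run


# Hypothetical ordering: the agent does data prep, training and packaging;
# a human reviews labels, checks the evaluation, and makes the release call.
STEPS = [
    Step("data_prep", mark("data_ready")),
    Step("label_review", mark("labels_checked"), needs_human=True),
    Step("training", mark("model_trained")),
    Step("evaluation", mark("metrics_computed"), needs_human=True),
    Step("packaging", mark("model_packaged")),
    Step("release_decision", mark("release_proposed"), needs_human=True),
]

if __name__ == "__main__":
    # A reviewer who rejects the evaluation halts the run before packaging.
    result = run_pipeline(STEPS, approve=lambda name, state: name != "evaluation")
    print(result["log"], result["released"])
```

In practice the `approve` callback would be whatever review surface the team already uses (a prompt, a pull request, a discussion thread); the point of the sketch is only that nothing is released unless every human checkpoint passes.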

Share a model, or suggest one

If you've trained a small task-specific model for your own collection, share it in Discussions and we'll add good ones to a curated collection so other institutions can find them. Suggestions for tasks you'd like to see covered are welcome there too.

Maintained by Daniel van Strien and William Mattingly, with contributions and datasets from across the GLAM ML community.