Three pioneering Yiddish handwriting‑recognition models – including our flagship model, Mame Loshn Maven – are helping the L’Dor V’Dor Foundation’s AI Lab unlock Jewish records at scale and make Jewish memory more accessible.
Why Bubbe’s stories matter for AI
If you’ve ever sat at a family table while your Bubbe launches into a story in Yiddish, you know the feeling: everyone smiles, laughs in the right places… and half the people around the table don’t actually understand what she’s saying.
The same thing is happening on paper. Across archives, libraries, and family attics there are millions of handwritten Yiddish letters, pinkasim, community records, and memoirs that are “speaking” to us, but for most of us they might as well be a speech bubble full of squiggles.
Our flagship Yiddish handwriting model, Mame Loshn Maven (Mame Loshn for short), was built to turn those squiggles into words. As part of the L’Dor V’Dor Foundation’s AI Lab, Mame Loshn and her two sister models were trained on historical Yiddish sources – e.g. family letters, JDC relief requests, 19th‑century metrical records from the Russian Empire, and more – and can be used to transcribe virtually any handwritten Yiddish document into machine‑readable, searchable text. Because they run on Transkribus, anyone with a web browser can try them; no special infrastructure required.
Why handwritten Yiddish matters – for everyone
For genealogists, historians, researchers, and anyone who wants to discover their ancestors and the stories of their lives, handwritten Yiddish documents are a goldmine. They contain names and relationships, addresses, occupations, migration stories, and the daily texture of Jewish life – the details that help us understand who and what came before us, and how that shapes who we are today.
The L’Dor V’Dor Foundation’s AI Lab exists to tackle the challenge of unlocking these documents. Our mission is to use AI to automate the hardest, most time‑consuming parts of discovery – wading through huge collections, spotting what’s relevant for Jewish lives, and turning it into searchable, accessible Jewish memory for projects like the Documentation of Jewish Records Worldwide (DoJR) and its catalog, JCat. As we make more records discoverable, we strengthen personal and Jewish identity: the future belongs to those who know the past.
To do that at scale, we need machines that can read the kinds of handwritten Yiddish documents that people actually encounter in the wild – not just clean, classroom examples.
Three Yiddish handwriting models, built for historical records
In 2025, we released three pioneering Yiddish handwriting‑recognition (HTR) models on the Transkribus platform. Each is trained on a different type of historical real‑world record, so together they cover a broad range of handwriting styles, genres, and dialects.
Mame Loshn Maven – all‑purpose Yiddish handwriting model
Mame Loshn Maven is our all‑purpose Yiddish handwriting model. It merges the training sets of Civil Records Reader and Letter Reader to deliver robust performance across a variety of genres and dialects. Use it when you don’t know what you’ll encounter – or as a springboard to create your own specialist models.
- Trained on: the combined training data behind Civil Records Reader and Letter Reader (including civil records, family letters, relief requests, and more).
- Ideal for: anyone who deals with handwritten Yiddish documents on diverse topics that require a broad vocabulary.
Letter Reader – early 20th‑century Yiddish in everyday life
Letter Reader focuses on early 20th‑century Yiddish by authors with various levels of education and exposure to Yiddish. It handles informal orthography and a wide range of regional vocabulary.
- Trained on: family letters; letters to relief societies such as the Grajewo Relief Society and the American Jewish Joint Distribution Committee (JDC); minutes from Lithuanian Jewish community groups; and handwritten plays.
- Ideal for: social historians, family researchers working with private papers, and archives with handwritten correspondence.
Civil Records Reader – high‑fidelity model for 19th‑century records
Civil Records Reader is a high‑fidelity Transkribus model for 19th‑century Yiddish handwriting, especially for the birth and death records from Russian Empire civil record books. Trained on hundreds of volumes from civil records of the Pale of Settlement, it delivers reliable, hallucination‑free accuracy on the documents that Jewish genealogists consult most.
- Trained on: hundreds of 19th‑century civil record books from the Pale of Settlement.
- Ideal for: Jewish genealogists (professional and amateur), historians of the Pale of Settlement, archivists, and anyone processing large collections of vital records.
Because all three models run on Transkribus, anyone with a browser – not just people with high‑performance computing – can upload images and get machine‑readable Yiddish they can search, analyze, and translate.
Watch: “Reading Bubbe’s Letters with AI: Unlocking Yiddish & Jewish History”
To mark this milestone, our AI Lab team presented a deep‑dive talk, “Reading Bubbe’s Letters with AI: Unlocking Yiddish & Jewish History,” explaining how we built the models, the challenges of messy handwriting, and how the work connects to DoJR and JCat.
In the talk, we walk through:
- the scale of the problem (those “millions of pages” of handwritten Yiddish)
- examples of truly oy vey penmanship and how the models cope with it
- how we curate and annotate training data from letters, relief requests, metrical records, and more
- and how volunteers, scholars, technologists, and curious families can join the effort
What makes these models different?
Our contribution is not just “Yiddish + AI.” It’s the combination of three things that sets this work apart:
- Trained on diverse historical records
The models are trained on family letters, JDC relief requests, 19th‑century metrical records from the Russian Empire, and more – the messy, inconsistent documents people actually work with, not only clean manuscripts.
- Designed for Jewish memory and identity
We focus on materials that support Jewish memory work and the search for one’s ancestors and their stories. That means prioritizing sources rich in names, relationships, locations, dates, and the stories of Jewish lives, and connecting the outputs to DoJR and JCat so that discoveries strengthen identity for individuals, families, and communities.
- Accessible to anyone via Transkribus
By delivering the models through an easy‑to‑use platform, we remove the need for archives, families, and other researchers to set up their own AI infrastructure. If you can use a browser, you can use our models.
In short: we build Yiddish HTR models trained on diverse, historical, real‑world records and deliver them through an accessible platform, so anyone – from professional historians to curious grandchildren – can use AI to reconnect with Yiddish and Jewish history.
How to try the models
For archives and libraries
If you steward collections of handwritten Yiddish documents, our models can help you triage and describe material faster. Use Transkribus to run sample batches, compare results across the three models, and identify which ones perform best on your handwriting styles. Reach out to the AI Lab if you’d like guidance on workflows or quality checking.
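One common way to compare models on a sample batch is character error rate (CER): the number of character edits needed to turn a model's output into a manually checked transcription, divided by the length of that checked transcription. The sketch below is only an illustration, not the AI Lab's own evaluation workflow. It assumes you have copied each model's output for the same page as plain text and have a hand-corrected reference for that page. The function names and the romanized sample strings are hypothetical placeholders; real Yiddish transcriptions would be in Hebrew script, which is why the text is Unicode-normalized before comparison.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of a model output against a checked reference."""
    # Normalize so the same letter-plus-diacritic is always encoded one way.
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def rank_models(reference: str, outputs: dict[str, str]) -> list[tuple[str, float]]:
    """Return (model name, CER) pairs sorted from lowest to highest error."""
    scored = [(name, cer(reference, text)) for name, text in outputs.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical sample: a hand-checked line and three made-up model outputs.
reference_line = "mame loshn"
model_outputs = {
    "Mame Loshn Maven": "mame loshn",
    "Letter Reader": "mame losn",
    "Civil Records Reader": "name lshn",
}

for model_name, error in rank_models(reference_line, model_outputs):
    print(f"{model_name}: CER = {error:.2f}")
```

A lower CER means fewer corrections per character. Scoring a handful of representative pages per handwriting style is usually more informative than scoring a single line.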
For family researchers and anyone curious about their roots
You don’t need to be a technologist to use these tools. Scan your family letters or other Yiddish documents, upload the images to Transkribus, and select the appropriate L’Dor V’Dor Yiddish model (including Mame Loshn Maven). The platform will return searchable transcriptions that you can copy, analyze, and translate – making it much easier to spot names, places, and stories that echo through your family.
For volunteers and collaborators
We’d love you to join the team. As the final slide of our social media carousel says, “Help us teach AI to read Jewish history.” You can watch the talk, try the public Yiddish models, or volunteer your time and expertise. We need Yiddish readers, data annotators, technologists, genealogists, historians, and community advocates. Visit ldvdf.org/ai to learn more and get involved.
What’s next: more languages of the Diaspora
While this post focuses on Yiddish, our AI Lab has already developed handwriting models for Polish, Russian, Hebrew, and Arabic, and we are now working on Ladino. Over time, we will continue developing models for the languages of the Diaspora, so that records from many communities and continents can be discovered, read, and woven back into Jewish memory.
As these models mature, they’ll help us scale discovery across diverse collections and feed more data into DoJR and JCat – not just for “genealogists,” but for anyone who wants to know who and what came before them, and to carry that knowledge forward from generation to generation.
###
About L’Dor V’Dor Foundation
The L’Dor V’Dor Foundation (LDVDF) rescues Jewish memory and makes it accessible to everyone. Through its flagship Documentation of Jewish Records Worldwide (DoJR) project, LDVDF is building JCat, a massive, free, online catalog of historical documents of Jewish lives – Ashkenazi, Sephardi, Mizrahi, Crypto‑Jewish, Rabbinic, and more. By discovering and describing every record collection we can find, LDVDF helps ensure that Jewish heritage can be found, studied, and passed from generation to generation.
Rescuing our lost history and changing lives — from generation to generation.


