University Library

Project OCR-BW

Digitized reproductions accompanied by full texts are a valuable source of material for researchers and can serve as a basis for applications in data science or the digital humanities. The OCR-BW center of excellence, set up in 2019 as a service of the Tübingen and Mannheim university libraries, uses automatic text recognition software to assist researchers, libraries and archives in Baden-Württemberg.

Mannheim University Library provides several open source software products, such as Tesseract OCR, as user-friendly applications for historic prints that work on all systems. Tübingen University Library is testing how the transcription platform Transkribus can be used for the automatic recognition of manuscripts and prints. Together, the two libraries can cover a broad spectrum of materials and scripts. The center of excellence builds up and shares extensive expertise in the field of automatic text recognition.

If you want to process large quantities of scans of manuscripts, archival records or old prints for your research or project, and are looking for a way to make them readable and searchable more quickly and efficiently, we are happy to advise you on the potential use of Transkribus and Tesseract.

Through the project, we are steadily transforming university library and archive holdings into digital reproductions with full text and making them available on our presentation platform:

  • Journals of the Tendaguru expedition by Tübingen geologist and paleontologist Edwin Hennig (1909–1911)
  • Journals of Tübingen classicist Martin Kraus (Crusius) (1573–1605)
  • Transcripts of Greek homilies by Martin Kraus (Crusius) (1563–1604)
  • Selected volumes of judicial consultations (1602–1879)
  • Selected volumes of minutes of the senate (1524–1912)
  • Selected incunabula
  • Selected medieval manuscripts
  • Manuscripts and prints in Malayalam

You can find more information on our website, at our information events and training courses, and via our mailing list.
