Project description: Difference between revisions
From Kallimachos
| Line 57: | Line 57: | ||
===Printshop-specific character inventories=== | ===Printshop-specific character inventories=== | ||
[[File:CollageOCR.png|thumbnail|Creation of type tables, using the subproject [[Narragonien]] as an example.| link=http://kallimachos.de/kallimachos/images/kallimachos/0/03/CollageOCR.png | alt=collage of different letter inventories]] | [[File:CollageOCR.png|thumbnail|Creation of type tables, using the subproject [[Narragonien]] as an example.| link=http://kallimachos.de/kallimachos/images/kallimachos/0/03/CollageOCR.png | alt=collage of different letter inventories]] | ||
The OCR team at Würzburg University's central library accompanies and evaluates the development process at the DFKI with existing tools from the EMOP project (''Franken+, Gamera, Tesseract''). Using our purpose-built tool ''Glyph Miner'', we compile letter inventories specific to historic printers and publishers and link them to a digital MUFI font. These inventories make it possible to create printer-specific OCR training data, which can then be reused to capture further texts set in the same letters. With this | The OCR team at Würzburg University's central library accompanies and evaluates the development process at the DFKI with existing tools from the EMOP project (''Franken+, Gamera, Tesseract''). Using our purpose-built tool ''Glyph Miner'', we compile letter inventories specific to historic printers and publishers and link them to a digital MUFI font. These inventories make it possible to create printer-specific OCR training data, which can then be reused to capture further texts set in the same letters. With this printshop-specific approach, we already reach recognition rates of 93% and higher, a level not previously achieved on comparable texts. | ||
<br clear=all> | <br clear=all> | ||