Tutorial anonymus translators (en): Difference between revisions
From Kallimachos
| Line 5: | Line 5: | ||
===Composition of a text corpus=== | ===Composition of a text corpus=== | ||
The aim of | The aim of research in '''[[Identifikation_von_Übersetzern:Main |the project]]''' is the identification of anonymous Arabic-Latin translations of the Middle Ages by means of philological and computer-aided methods of style analysis. | ||
For this purpose, a corpus of electronic Latin texts must be constructed. It's advisable to restrict | For this purpose, a corpus of electronic Latin texts must be constructed. It's advisable to restrict the corpus to a certain Arabic author, e.g. Averroes, or to a technical discipline, e.g. philosophy, astronomy/astrology, medicine, mathematics, alchemy/magic/prophecy or religion. | ||
However, this is only possible if the corpus is large enough. At Wuerzburg University, an Averroes-based corpus (Hasse 2010) and two corpora of philosophical and astronomical/astrological translations of the 12th century were built and used (Hasse 2016 and Hasse-Büttner, in print). Here we were able to benefit from a list of philosophical Arabic-Latin translations provided by Burnett in 2005, and from Carmody's 1956 list of astronomical/astrological translations (which is imprecise and obsolete, though). In other branches of science, such lists have yet to be created. | |||
Translations are available in very different forms: some are critically edited, others are available only in early printed editions or only in medieval manuscripts. OCR of modern editions is largely unproblematic. Reliable OCR of early printed editions, where the computer has to "learn" the typeface of the printing office (officina), is currently a subject of research at the University of Wuerzburg and DFKI Kaiserslautern. At present, it is still advisable to transcribe early printed editions manually. For manuscripts, manual transcription will remain the only viable option for a long time. | |||
A preferred textual witness should be chosen, ideally one that provides a complete and unrevised text (Latin authors of early printed editions are listed in Hasse, Success and Suppression, 2016, pp. 317-407). | |||
It is highly recommended to systematically separate and index the scans and the files produced during further processing. This can be done simply with separate subfolders and a separately maintained spreadsheet, or by means of a wiki. This step may seem self-evident, but it is easily overlooked. The following aspects should always be distinguished: | |||
# the bibliographic mark of origin | |||
# the scan | |||
# the fully searchable and quotable scan | |||
# a text cleaned of all non-textual features (page numbers, critical apparatus etc.) | |||
# a normalized orthographic text made for stylometry (e.g. as a simple text file) | |||
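The separation into subfolders plus a separately managed spreadsheet could be set up along the following lines. This is only a minimal sketch, not the procedure used in the project: the stage folder names, the file name `index.csv` and the functions `init_corpus` and `write_index` are assumptions made for illustration. It creates one subfolder for each of the five aspects listed above and writes a small index table recording, for every text, which stages already contain a file.

```python
import csv
from pathlib import Path

# Hypothetical folder names, one per aspect in the list above.
STAGES = [
    "1_source",            # bibliographic mark of origin
    "2_scan",              # the scan
    "3_searchable_scan",   # the fully searchable and quotable scan
    "4_clean_text",        # text without page numbers, apparatus etc.
    "5_normalized_text",   # orthographically normalized text for stylometry
]


def init_corpus(root):
    """Create one subfolder per processing stage and return their paths."""
    root = Path(root)
    paths = {}
    for stage in STAGES:
        folder = root / stage
        folder.mkdir(parents=True, exist_ok=True)
        paths[stage] = folder
    return paths


def write_index(root, text_ids):
    """Write index.csv with one row per text and one yes/no column per stage."""
    root = Path(root)
    index_path = root / "index.csv"
    with index_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text_id"] + STAGES)
        for text_id in sorted(text_ids):
            row = [text_id]
            for stage in STAGES:
                # A stage counts as present if any file named <text_id>.* exists.
                present = any((root / stage).glob(text_id + ".*"))
                row.append("yes" if present else "no")
            writer.writerow(row)
    return index_path
```

Calling `init_corpus("corpus")` once and `write_index("corpus", ["some_text_id"])` after each processing step keeps the index in step with the folders; the resulting CSV file can be opened in any spreadsheet program. The identifier `some_text_id` is a placeholder, and simple identifiers without wildcard characters are assumed.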
=Preparation of the texts= | =Preparation of the texts= | ||