A paper on the challenges of reversing and revising the LSJ so that an English to Ancient Greek version can be produced and imported into LSJ.gr. The paper was presented at the 12th Conference “Hellenic Language and Terminology”.
The Liddell-Scott-Jones lexicon (LSJ) is a standard lexicographical work of the Ancient Greek language, available online in a number of different incarnations. Its directionality is from Ancient Greek to English. What if one wants to search from English to Ancient Greek? The Perseus Project, a seminal and authoritative electronic source, offers a reverse search based on a simple term-to-translation(s) logic, with no further processing.
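To make the term-to-translation(s) logic concrete, here is a minimal Python sketch of a naive reversal. It is not the Perseus implementation; the `ENTRIES` data and the `reverse_index` function are hypothetical, and the glosses are simplified for illustration rather than quoted from LSJ.

```python
from collections import defaultdict

# Toy Greek-to-English entries (assumed data, simplified glosses).
ENTRIES = {
    "λόγος": ["word", "speech", "reason"],
    "μῦθος": ["word", "speech", "tale"],
    "ἵππος": ["horse"],
}


def reverse_index(entries):
    """Map each English translation to the sorted headwords that carry it."""
    index = defaultdict(set)
    for headword, translations in entries.items():
        for translation in translations:
            # The only "processing" is trimming and lowercasing the key.
            index[translation.strip().lower()].add(headword)
    return {english: sorted(greek) for english, greek in index.items()}


index = reverse_index(ENTRIES)
print(index["word"])   # every headword glossed as "word"
print(index["horse"])
```

Because the translation string is used as a lookup key exactly as it appears, any defect in that string (a truncated gloss, embedded Greek, an archaic spelling) carries straight into the reversed index.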
Such a simple reversal is far from satisfactory and is subject to a number of pitfalls. This paper explores them and proposes a framework for their remediation on a linguistic and computational level. Some of the types of issues identified:
1. Missing term elements (“commander of a” for “τελάρχης”)
2. Missing Greek-derived equivalents (no “cephalalgia” in “κεφαλαλγία” and no “pankration” in “παγκράτιον”)
3. Use of Greek as part of the translation (for example “παρανυμφεύω” rendered as “act as παράνυμφος”)
4. Use of anaphora (“σκοτωματικός” as “causing dizziness | suffering from it”)
5. Use of Latin instead of English, especially for taboo words (“crepitus ventris” for “ἔριθος”)
6. Use of archaic English spellings (“shew” instead of “show”, “connexion” instead of “connection”)
7. Use of a hyphen between words (“to-morrow”)
8. Abbreviated forms of the headword in phrases resulting in inflectional ambiguity (“κλυτὰ δ. βένθεσι λίμνης”)
9. Incomplete example phrases with mid-phrase ellipsis (“τὸ ὕδωρ… αὐ. μὲν οὔκ ἐστι”)
10. Typos and linguistic errors (“ἐντονία” instead of “εὐτονία”)
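Some of these issue types lend themselves to mechanical detection or repair. The Python sketch below covers only three of them (Greek inside a translation, archaic spellings, hyphenated words) and is an illustration, not the method used in the paper. The replacement table `MODERN_SPELLINGS` and the function names are hypothetical; a real table would be far larger and would need manual review.

```python
import re

# Greek and Coptic block plus Greek Extended (polytonic) block.
GREEK_RE = re.compile(r"[\u0370-\u03FF\u1F00-\u1FFF]")

# Hypothetical, deliberately tiny replacement table built from the examples above.
MODERN_SPELLINGS = {
    "shew": "show",
    "connexion": "connection",
    "to-morrow": "tomorrow",
}

# A word is a run of ASCII letters, optionally joined by internal hyphens.
WORD_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def contains_greek(translation):
    """Flag translations that still contain Greek characters (issue 3)."""
    return bool(GREEK_RE.search(translation))


def modernize(translation):
    """Replace listed archaic or hyphenated forms (issues 6 and 7).

    Matching is case-insensitive; the replacement is always lowercase,
    so original capitalization is not preserved.
    """
    def swap(match):
        word = match.group(0)
        return MODERN_SPELLINGS.get(word.lower(), word)

    return WORD_RE.sub(swap, translation)


print(contains_greek("act as παράνυμφος"))
print(modernize("shew the connexion to-morrow"))
```

Issues such as anaphora, abbreviated headwords, and mid-phrase ellipsis depend on context and are unlikely to be resolved by pattern matching alone.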
A script was created to extract the term/translation equivalents from the XML file. Phase I consisted of a) analysis of the output in Ancient Greek to English format, b) identification and categorization of the issues, and c) a plan for their remediation. Phase II was an analysis and further revision of the reversed material. Phase III was the preparation and publication of the output in wiki format as an interactive supplemental resource to LSJ proper.
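The extraction step might look roughly like the following Python sketch. The actual script and the schema of the XML file are not described here, so the element names (`entry`, `orth`, `tr`), the `SAMPLE_XML` fragment, and the `extract_pairs` function are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the element names are assumed, not taken from the real file.
SAMPLE_XML = """
<dictionary>
  <entry>
    <orth>παρανυμφεύω</orth>
    <sense><tr>act as παράνυμφος</tr></sense>
  </entry>
  <entry>
    <orth>σκοτωματικός</orth>
    <sense><tr>causing dizziness</tr>, <tr>suffering from it</tr></sense>
  </entry>
</dictionary>
"""


def extract_pairs(xml_text):
    """Return (headword, translation) pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.iter("entry"):
        headword = entry.findtext("orth", default="").strip()
        for tr in entry.iter("tr"):
            # itertext() also collects text from any nested markup.
            text = "".join(tr.itertext()).strip()
            if headword and text:
                pairs.append((headword, text))
    return pairs


for headword, translation in extract_pairs(SAMPLE_XML):
    print(headword, "->", translation)
```

Keeping the output as flat pairs makes it straightforward to review the material in Ancient Greek to English order first, as in Phase I, before reversing it.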