Fine-Tuning Pre-Trained Language Models for Authorship Attribution of the Pseudo-Dionysian Ars Rhetorica

Abstract

This paper explores the use of pre-trained language models for Ancient Greek in the context of authorship attribution. The study adopts a two-step approach: first, the models are fine-tuned on a domain-specific corpus using a masked language modeling (MLM) objective; second, a classifier is trained on top of the fine-tuned model for the authorship attribution task. The analysis centers on a corpus of texts on rhetorical theory from the Second Sophistic period, with particular focus on the Pseudo-Dionysian Ars Rhetorica. The experimental results suggest that this approach offers valuable insights into the authorship of ancient texts. Notably, the findings align with some traditional scholarly views on the Ars Rhetorica while also opening the door to reconsidering long-discarded hypotheses about the treatise's internal structure. This study highlights how the integration of natural language processing and classical philology can significantly advance discussions in ancient literary scholarship.
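The MLM objective used in the first fine-tuning step can be illustrated with a minimal sketch of BERT-style token masking. This is not the paper's actual implementation (which would use a pre-trained Ancient Greek model and a full training loop); the masking ratios follow the standard BERT recipe, and the sample vocabulary is invented for illustration:

```python
import random

MASK = "[MASK]"
# Hypothetical sample vocabulary used for the 10% random-replacement case.
SAMPLE_VOCAB = ["λόγος", "τέχνη", "ῥητορική", "καί", "περί"]

def mlm_mask(tokens, mask_prob=0.15, rng=None):
    """BERT-style masking: ~mask_prob of positions become prediction targets.
    Of those, 80% are replaced with [MASK], 10% with a random token,
    and 10% are left unchanged; the model must recover the original token."""
    rng = rng or random.Random(0)
    inputs = list(tokens)
    labels = [None] * len(tokens)  # None = position not used in the loss
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # the original token is the training target
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK
            elif r < 0.9:
                inputs[i] = rng.choice(SAMPLE_VOCAB)
            # else: keep the token unchanged (but still predict it)
    return inputs, labels
```

After fine-tuning on such masked inputs, the resulting model's representations serve as the basis for the second step, where a classifier is trained to assign candidate authors.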