Sentiment Below the Surface: Omissive and Evocative Strategies in Literature and Beyond

Abstract

As one of the most complex forms of expression, literary texts continue to challenge Sentiment Analysis (SA) tools, which are often developed for other domains. At the same time, SA is becoming an increasingly central method in literary analysis itself, which raises the question of what challenges are inherent to literary SA. We address this question by probing units from a variety of literary fiction texts where humans and systems diverge in their valence scoring, seeking to relate such disagreements to semantic traits that literary theory treats as central to implicit sentiment evocation. The contribution of this study is twofold. First, we present a corpus of valence-annotated fiction (English and Danish literary texts from the 19th and 20th centuries) representing different genres. Second, we test whether sentences where humans and models disagree in sentiment annotation are characterized by specific semantic traits by looking at the distribution and correlation of these traits across four different corpora. We find that items where humans detected significant sentiment, but where models did not, consistently show lower levels of arousal, dominance, and interoception, and higher levels of concreteness. Furthermore, we find that the degree of human-model disagreement associated with these semantic traits is linked more to the interiority-exteriority continuum than to direct sensory information. Finally, we show that this interaction of features linked to implicit sentiment varies across textual domains. Our findings confirm that sentiment evocation exploits a more diverse and subtle set of semantic channels than those captured by simple sentiment analysis.