Rediscovering the 1890s: A Norwegian Poetry Corpus

Abstract

What language was used most for written poetry in Norway in the 1890s? What was the gender distribution of published poets in that decade? To what degree does the lyrical subject "reveal" itself, explicitly or implicitly, in the poems? This article presents answers to these questions in relation to a corpus of poetry written by Norwegian and Danish poets and published between 1890 and 1899. The corpus contains 3,440 poems from 81 books, encoded in TEI XML and annotated with author and publication metadata, as well as lyrical features such as rhyme schemes and the presence of the lyrical subject. The objective of constructing this corpus is to fill a resource gap for research on Norwegian poetry. It provides empirical data for investigating historical claims about the literary period and for comparing findings from both close and distant readings of the data. It is structured in a well-known format (TEI) that enables the testing and development of automatic poetry analysis tools, as well as the addition of further annotations. We have published the data in a public repository, making it easily findable, accessible, interoperable and reusable (FAIR).