Viability of Zero-shot Classification and Search of Historical Photos
Abstract
Multimodal neural networks are models that learn concepts across multiple modalities. Such models can perform tasks like zero-shot classification: associating images with textual labels without task-specific training. This promises both easier and more flexible use of digital photo archives, e.g., annotation and search. We investigate whether existing multimodal models can perform these tasks when the data differs from typical computer vision training sets, using historical photos from a cultural context outside the English-speaking world.
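To make the zero-shot setup concrete, the sketch below shows how a CLIP-style multimodal model scores an image against free-text candidate labels; this is a minimal illustration of the general technique, not the paper's specific pipeline. The model checkpoint, image path, and label set are placeholders chosen for the example.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Illustrative CLIP checkpoint; any CLIP-style multimodal model would do.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("archive_photo.jpg")  # hypothetical archival photo
labels = ["a fishing boat", "a farmhouse", "a city street"]  # hypothetical labels

# Embed the image and the candidate labels jointly.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores;
# a softmax over the labels yields zero-shot "classification" probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
for label, p in zip(labels, probs[0]):
    print(f"{label}: {p:.3f}")
```

The same image and text embeddings support search: embedding a text query once and ranking archive photos by cosine similarity gives retrieval without any archive-specific training.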