Computational History and the Transformation of Public Discourse in Finland, 1640–1910
The consortium Computational History and the Transformation of Public Discourse in Finland, 1640–1910 (COMHIS) is a cooperation between four partners: the Faculty of Humanities at the University of Helsinki, the Departments of Cultural History and Information Technology at the University of Turku, and the Centre for Preservation and Digitisation of the National Library of Finland. This brings together complementary expertise on the research subject (eighteenth- and nineteenth-century history), methodology (computational sciences and language technology) and data (the preservation and enhancement of digital resources). The consortium establishes national collaboration in a key research area and is linked to leading international research centers through existing collaborations.

The objective is to reassess the scope, nature, development, and transnational connections of public discourse in Finland, 1640–1910. Two complementary approaches will be used: one based on library catalogue metadata and the other on text mining the full text of all digitized Finnish newspapers and journals published before 1910. Previous research has largely approached public discourse in Finland from the perspective of the breakthrough of the Finnish language, the role of elite discourse at the university, early Swedish-language newspapers, and book history. COMHIS combines all these perspectives and further analyzes how language barriers, elite culture and popular debate, text reuse, and different publication channels interacted. Earlier historians have not been able to analyze, for example, the entire record of Finnish publications, including newspapers. The aim of this undertaking is to begin to fill this gap through multidisciplinary collaboration and to identify overlooked moments of transformation in Finnish public discourse. Digitized newspaper and journal collections as extensive as the Finnish ones do not exist in many other European contexts. COMHIS thus has a unique opportunity to carry out pioneering work. To conduct its research, COMHIS introduces the concept of open data analytical ecosystems, which represents a key methodological innovation in the field of digital humanities.

Contact: Hannu Salmi, hansalmi(a)utu.fi

Publications:

  • Kimmo Kettunen, Tuula Pääkkönen: “Measuring Lexical Quality of a Historical Finnish Newspaper Collection? Analysis of Garbled OCR Data with Basic Language Technology Tools and Means”, LREC 2016.
  • Kimmo Kettunen, Eetu Mäkelä, Juha Kuokkala, Teemu Ruokolainen, Jyrki Niemi: “Modern Tools for Old Content - in Search of Named Entities in a Finnish OCRed Historical Newspaper Collection 1771-1910”, LWDA 2016: 124-135.
  • Tuula Pääkkönen, Jukka Kervinen, Kimmo Kettunen, Asko Nivala, Eetu Mäkelä: “Exporting Finnish Digitized Historical Newspaper Contents for Offline Use”, D-Lib Magazine 22(7/8) (2016).
  • Mikko Tolonen, Jani Marjanen, Niko Ilomäki, Hege Roivainen and Leo Lahti: “Printing in a Periphery: a Quantitative Study of Finnish Knowledge Production, 1640-1828”, Proceedings of Digital Humanities 2016, long papers, Kraków, Poland, July 2016.
  • Mikko Tolonen, Leo Lahti and Niko Ilomäki: “A Quantitative Analysis of History in the ESTC catalogue”, Liber Quarterly 25(2), pp. 87–116, 2016. DOI: http://doi.org/10.18352/lq.10112
  • Aleksi Vesanto, Asko Nivala, Tapio Salakoski, Hannu Salmi and Filip Ginter: “A System for Identifying and Exploring Text Repetition in Large Historical Document Corpora”, in Proceedings of the 21st Nordic Conference of Computational Linguistics, Gothenburg, Sweden, 23–24 May 2017 (Linköping 2017), pp. 330–333, http://www.ep.liu.se/ecp/131/049/ecp17131049.pdf
  • Aleksi Vesanto, Asko Nivala, Heli Rantala, Tapio Salakoski, Hannu Salmi and Filip Ginter: “Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771-1910”, in Proceedings of the 21st Nordic Conference of Computational Linguistics, Gothenburg, Sweden, 23–24 May 2017 (Linköping 2017), pp. 54–58, http://www.ep.liu.se/ecp/133/010/ecp17133010.pdf

Open code and software:

  • https://comhis.github.io/
  • https://github.com/avjves/textreuse-blast
