New version of the digitized Dialect Atlas of Finnish by Lauri Kettunen

Keywords: dialects, spoken language, speaker areas

In this data release paper, we present a new digitized version of the Dialect Atlas of Finnish (Kettunen 1940a) to promote its use in both linguistic and interdisciplinary research. The Atlas was compiled by Lauri Kettunen (1885–1963) in the 1920s and 1930s as part of an international trend of documenting the variation of local languages in atlases. At the time of data collection, some of the Finnish-speaking municipalities were located outside the area known today as Finland: in Sweden, Norway and Russian Karelia. The Atlas makes it possible to study the “natural” linguistic landscape of Finnish as it was before it became more mixed through urbanization, the media and the mass movement of Karelian evacuees into the area of present-day Finland during the Second World War (e.g. Lynch et al. 2022). The first version of the digitized data was prepared by Embleton & Wheeler (1997, 2000) at York University. It was further curated by the BEDLAN team and published as an undocumented version in Finnish in collaboration with the Institute for the Languages of Finland (see http://urn.fi/urn:nbn:fi:csc-kata20151130145346403821).

Here we provide easy-to-use and well-documented raw data from the Dialect Atlas of Finnish. It includes 213 linguistic traits organised into 213 maps, each of which describes the spatial distribution of the variants of one linguistic trait across 525 municipalities. Thus, each data point is a municipality (or parish). Each linguistic trait has 2–14 variants; most municipalities have only one of these variants, some have two, and in very rare cases a municipality has three or four. Most of the traits describe morpho-phonological variation of the dialects: for example, how consonant gradation varies across dialects (such as the case of pata: padan / paðan / paran / palan / poan etc.) or whether a schwa vowel exists between certain consonants (for example, ‘old’ expressed as vanha vs. van(a)ha vs. vanaha). The Atlas provides no sociolinguistic information about the people interviewed.
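The structure described above (traits, municipalities, and one or more variants per municipality) can be illustrated with a minimal Python sketch. It does not reflect the actual file layout of the release: the trait identifiers (pata_genitive, vanha_schwa), the municipality labels (M1–M3), the variant assignments and the function names are all hypothetical and invented for illustration. Only the variant forms themselves are taken from the examples in the text.

```python
from collections import defaultdict

# Toy records in long format: (trait, municipality, variant).
# ASSUMPTION: trait ids, municipality labels and assignments are invented;
# they are not taken from the Atlas data.
records = [
    ("pata_genitive", "M1", "padan"),
    ("pata_genitive", "M2", "paran"),
    ("pata_genitive", "M3", "paran"),
    ("pata_genitive", "M3", "palan"),  # a municipality may have several variants
    ("vanha_schwa", "M1", "vanha"),
    ("vanha_schwa", "M2", "vanaha"),
    ("vanha_schwa", "M3", "vanha"),
]


def variants_by_municipality(rows):
    """Group rows into {trait: {municipality: set of variants}}."""
    table = defaultdict(lambda: defaultdict(set))
    for trait, municipality, variant in rows:
        table[trait][municipality].add(variant)
    return table


def shared_trait_fraction(table, a, b):
    """Fraction of traits on which municipalities a and b share a variant.

    Only traits attested in both municipalities are counted; returns None
    if there are no such traits.
    """
    traits = [t for t in table if a in table[t] and b in table[t]]
    if not traits:
        return None
    shared = sum(1 for t in traits if table[t][a] & table[t][b])
    return shared / len(traits)


table = variants_by_municipality(records)
print(sorted(table["pata_genitive"]["M3"]))
print(shared_trait_fraction(table, "M2", "M3"))
```

Storing the variants of each municipality as a set keeps the common single-variant case and the rarer multi-variant cases in the same shape, so comparisons between municipalities reduce to set intersections.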

To facilitate the use of the digitized Atlas, we offer a complete release of the data, including English translations of the linguistic traits. To enable spatial analysis and visualization of the dialect data, we provide geospatial polygon information that describes the boundaries and central coordinates of the historical municipalities used in the Atlas. Together, these additions make this valuable dialect data more findable, accessible, interoperable and reusable (FAIR Principles). The data is available on Zenodo, and a more detailed data description is available at https://journals.uio.no/dhnbpub/article/view/12270

Details about the resource

Content
  • Language: Finnish
  • Form: geographic information, information about language variants by parish, polygons and coordinates of speaker areas
  • Timescale: 1880–1930
Authors
Jenni Santaharju (Contact person)
Kaj Syrjänen (Data manager)
Outi Vesakoski (Project leader)
Unni Leino (Project leader)
Terhi Honkola (Researcher)
Perttu Seppä (Researcher)
Availability

Available at

https://doi.org/10.5281/zenodo.10078078 

Contact person(s)

Outi Vesakoski, outves *at* utu.fi

Usage License

Creative Commons Attribution 4.0

Referring

Permanent Address of Dataset

https://doi.org/10.5281/zenodo.10078078 

Reference instructions

Santaharju, J., Syrjänen, K., Honkola, T., Seppä, P., Vesakoski, O., & Leino, U. (2023). New version of the digitized Dialect Atlas of Finnish by Lauri Kettunen [Data set]. In Digital Humanities in the Nordic and Baltic Countries Publications (0.1). Digital Humanities in the Nordic and Baltic Countries 8th Conference (DHNB2024), Reykjavik.