Finnish Dialect Syntax Archive

Keywords: dialects

The Finnish Dialect Syntax Archive is a corpus of transcribed dialect interviews. It covers all the dialect areas of Finland and is the first Finnish corpus to be morphologically and syntactically coded. The corpus was developed jointly by the University of Turku and the Institute for the Languages of Finland.

The corpus represents every dialect of Finnish spoken in present-day Finland, as well as the dialects that were spoken, before those areas were lost, in the areas surrendered to the Soviet Union after World War II.

The interviewees were born between 1860 and 1910 (most of them in the 1880s). The interviews were conducted in the 1950s, 1960s and 1970s, when the interviewees were on average 80 years old.

For each parish dialect, one recording of approximately one hour has been chosen. The interviews have been transcribed and morphologically and syntactically coded.

Details about the resource

Content
  • Language: Finnish
  • Form: audio, transcriptions
  • Genre: interview
  • Dataset size: 133 hours of audio, 133 texts (70,190 sentences, 1,078,183 words)
  • Timescale: 1952–1964

Approximately 85% of the recordings are from the 1960s: 9 recordings are from the 1950s, 119 from the 1960s and 14 from the 1970s.

Annotations
  • lemmatisation
  • morphology
  • syntax

Authors
  • Osmo Ikola: founder and original principal investigator (PI)
  • Nobufumi Inaba: senior researcher, Finnish Syntax Archive
  • Marja-Liisa Helasvuo: chairperson of the steering committee (Finnish Syntax Archive) and chair of the department
  • Tommi Kurki: member of the steering committee, Finnish Syntax Archive

Availability

Contact person

Nobufumi Inaba, ninaba *at* utu.fi

Other notices

Contains personal data

Referring

Permanent Address of Dataset

Reference instructions

University of Turku, School of Languages and Translation Studies, & Institute for the Languages of Finland (2015). The Finnish Dialect Corpus of the Syntax Archive, Helsinki Korp Version [data set]. Kielipankki. http://urn.fi/urn:nbn:fi:lb-2016040702
 
University of Turku, School of Languages and Translation Studies, & Institute for the Languages of Finland (2021). The Finnish Dialect Corpus of the Syntax Archive, Downloadable Version [data set]. Kielipankki. http://urn.fi/urn:nbn:fi:lb-2020112935