Finnish Dialect Syntax Archive
Keywords: dialects
The Finnish Dialect Syntax Archive is a corpus of transcribed dialect interviews. It covers all the dialect areas of Finland and is the first Finnish corpus to have been morphologically and syntactically coded. The corpus was developed jointly by the University of Turku and the Institute for the Languages of Finland.
The corpus represents every dialect of Finnish spoken in present-day Finland, as well as the dialects that were spoken, before their loss, in the areas surrendered to the Soviet Union after World War II.
The interviewees were born between 1860 and 1910 (most of them in the 1880s), and the interviews were conducted in the 1950s, 1960s and 1970s, when the interviewees were on average 80 years old.
One recording of approximately one hour has been chosen from each parish dialect. The interviews are transcribed and morphologically and syntactically coded.
Details about the resource
- Language: Finnish
- Form: audio, transcriptions
- Genre: interview
- Dataset size: 133 hours of audio, 133 texts (70,190 sentences, 1,078,183 words)
- Timescale: 1952–1964
Approximately 85% of the recordings are from the 1960s: 9 recordings are from the 1950s, 119 from the 1960s and 14 from the 1970s.
- lemmatisation
- morphology
- syntax
| Osmo Ikola | founder and original principal investigator (PI) |
| Nobufumi Inaba | senior researcher, Finnish Syntax Archive |
| Marja-Liisa Helasvuo | chairperson of the steering committee (Finnish Syntax Archive) and chair of the department |
| Tommi Kurki | member of the steering committee, Finnish Syntax Archive |
Available at
Contact person
| Nobufumi Inaba | ninaba *at* utu.fi |