Linguistic Variation in the Province of Satakunta in the 21st Century

Keywords: dialects, spoken language

Sapu*, a sociolinguistic research project at the University of Turku, started in 2007. From the beginning, the goal of the project was to study the current dialects and the regional and social variation in the province of Satakunta, one of the historical provinces and administrative units of Finland. Dialectally, however, the province is divided in two: the boundary between 1) the Southwestern Dialects and 2) the Dialects of Häme and the Transitional Dialects is traditionally considered one of the most distinctive in Finland. The project was funded by the Finnish Cultural Foundation and the Turku University Foundation.

To create a representative overview of the current linguistic situation, an extensive corpus of data was recorded from people living along the historical dialect border in 2007–2013 and 2016–2019 (262 hours of recordings, of which 213 hours have been transcribed). This dataset consists of these recordings and transcriptions. A grammatically annotated Sapu corpus (lemmatized and morphologically and syntactically annotated) is also available. The annotated Sapu corpus consists of 36 units (six informants representing different age cohorts from each of six selected dialects).

*Sapu is the abbreviation of Satakuntalaisuus puheessa (’Satakunta in the Speech’), the Finnish name of the project. The official English name of the project is "Linguistic Variation in the Province of Satakunta in the 21st Century".

Details about the resource

Content
  • Language: Finnish
  • Form: audio, transcriptions, database with background information about informants
  • Genre: interviews, situation recordings
  • Dataset size: 270 texts, 262 hours of audio
  • Timescale: 2007–2019
Annotations
  • lemmatisation
  • morphology
  • syntax
Authors
  • Tommi Kurki, founder and principal investigator (PI)
  • Kirsti Siitonen, founder and member of the steering committee
  • Nobufumi Inaba, researcher
Availability

The dataset is available on request from the contact person below.

Contact person

Tommi Kurki tommi.kurki *at* utu.fi

Other notices

The dataset contains sensitive personal data.