Research Unit for Volgaic Languages: Materials

Research materials at the Research Unit for Volgaic Languages

The Research Unit for Volgaic Languages maintains a collection of electronic language corpora that is under continuous development. In the future, the materials will be made available through the joint materials portal of the School of Languages and Translation Studies.

For the time being, those wishing to use the electronic corpora must request access from the project coordinator by emailing volgaserver@utu.fi. Access is usually granted to researchers, teachers and students at universities and research institutes.

The word lists for the different languages compiled by the Research Unit are available through the website of the Finno-Ugrian Society.

The materials can be divided into the following types: 

  • unannotated texts (plain texts without grammatical analyses)
  • grammatically annotated texts (all words include morphological tags)
  • parallel texts (the same text in many languages)
  • corpora of the history of literary languages (texts from different decades)
  • word lists (words with word-class tags, but no semantic information)

The available materials listed by language

Mari
  • unannotated texts (mainly Meadow Mari)
  • a corpus of the history of the Meadow Mari literary language (texts from different decades between 1909 and 2008)
  • parallel texts: a youth novel (Meadow Mari, Hill Mari) and a book about Finland (Meadow Mari)
  • a word list containing Meadow Mari and Hill Mari vocabulary
Mordvin
  • unannotated texts (Erzya and Moksha)
  • morphologically annotated texts (Erzya and Moksha)
  • a corpus of the history of the Erzya and Moksha literary languages (texts from different decades between 1920 and 2008)
  • parallel texts: a youth novel and a book about Finland (Erzya and Moksha)
  • a word list containing Erzya and Moksha vocabulary
Udmurt
  • unannotated texts
  • parallel texts: a youth novel and a book about Finland
  • a word list
Komi
  • unannotated Komi-Permyak texts
  • parallel texts: a youth novel (Komi and Komi-Permyak) and a book about Finland (Komi)
  • a word list (Komi)
Chuvash
  • unannotated texts
  • parallel text: a youth novel
  • a word list
Tatar
  • unannotated texts
  • parallel text: a youth novel
  • a word list
Other languages
  • parallel texts: a youth novel (Finnish, Khanty, Mansi, Hungarian, Russian) and a book about Finland (Finnish, Russian)