Turku 'Pavlik Morozov' Corpus

Keywords: parallel texts, literary language

The corpus contains V. Gubarev's youth novel Pavlik Morozov in the following languages: Erzya (two versions), Moksha, Hill Mari, Meadow Mari, Udmurt, Komi-Permyak, Komi, Hungarian (two versions), Khanty, Mansi, Finnish; Russian; Chuvash, Tatar.

Size: ca. 10,000 words, 1,600 aligned sentences per language version.

The corpus is accessible through the Finno-Ugric Corpora portal.

Details about the resource

Content
  • Language: Russian, Finnish, Erzya, Moksha, Meadow Mari, Hill Mari, Udmurt, Komi-Permyak, Komi, Khanty, Mansi, Hungarian, Chuvash, Tatar
  • Form: written language
  • Genre: fiction
  • Dataset size: ca. 10,000 words, 1,600 aligned sentences per language version
Authors
Jorma Luutonen et al. (coordinator)
Availability

Contact person

Jussi Ylikoski, volgaserver *at* utu.fi