Archive of Finnish and Finno-Ugric Languages

The Archive of Finnish and Finno-Ugric Languages is a research archive specializing in developing and maintaining data collections of Finnish and other Finno-Ugric languages.

The Department of Finnish and Finno-Ugric languages specializes in developing research infrastructures by producing data and corpora. The Department is an internationally acknowledged producer of annotated digital corpora of Finnish and related languages. The data and corpora are maintained by the Archive of Finnish and Finno-Ugric Languages and are available to researchers and students in Finland and internationally. Their metadata are also available at the Digilang Language Resource Portal.

The Archive of Finnish and Finno-Ugric Languages is part of the UTU-Digilang research infrastructure, which brings together digital language resources and language technology tools developed in the School of Languages and Translation Studies and in the Department of Computing at the University of Turku.

Finnish data and corpora

The Finnish language corpora are maintained at the Syntax Archive. The majority of the corpora have been grammatically annotated. The Finnish Language Recording Archive houses digital audio and video data available to researchers and students.

> Read more about the Syntax Archive

Finno-Ugric language corpora

The Research Unit for Volgaic Languages has developed several digital language corpora of Finno-Ugric languages in the Volga region. The corpora include Mari, Mordvin, Udmurt and Komi, as well as corpora of the contact languages Chuvash and Tatar. 

> Read more about the Research Unit for Volgaic Languages