I'm a linguist working primarily with the Uralic languages and language technology. I have worked with different archives and memory organizations, and serve as the librarian and archivist of the Finno-Ugrian Society. My research work has, besides linguistics, regularly addressed the use and digitization of archived materials.
I work as an information specialist at the National Library of Finland. My work is primarily related to minority language support in our digital services, especially for the Sámi languages spoken in Finland.
- 🔭 I'm currently finalizing my PhD thesis on morphological variation in the Komi language
- 📔 I regularly work on the normalization of dialectal and historical texts
- 📜 I know both R and Python at an advanced level
- 👯 I'm looking for new collaborations on:
  - Speech technologies (forced alignment, speaker detection, speaker identification)
  - Dependency parsing
  - Linguistic data visualization and cartography
- 💬 Ask me about text and speech recognition, or Uralic languages
- 📫 How to reach me: [email protected]
I currently work or collaborate with various organizations; the list below is not exhaustive:
- Language Documentation meets Language Technology: The Next Step in the Description of Komi, a research project funded by the Kone Foundation and led by Rogier Blokland and Michael Rießler (ended in 2021)
- Inclusive Technology for Marginalised Languages, with a goal to develop ASR technologies for endangered Uralic languages
- The Saami Culture Archive of University of Oulu, with a goal of building data management workflows for the archive using CSC's tools
- FU-Lab, The Finno-Ugric Laboratory for Support of the Electronic Representation of Regional Languages
- Dr. Jack Rueter's research team, consisting of me, Khalid Alnajjar and Mika Hämäläinen
I have regularly taught the following courses and workshops. Please contact me if you would like to organize something along these lines at your institution.
- Data management and publishing best practices
- Multimedia management in language documentation
- Using natural language processing in the language documentation context
- Advanced manipulation of ELAN corpora with Python and R
- Linguistic data analysis with spoken language corpora
- Text recognition tools: model fine-tuning and extracting data from recognition results
I currently live in Helsinki, Finland. I have previously lived in:
- 🥖 Paris, France
- 🍻 Hamburg, Germany
- ⛰️ Freiburg im Breisgau, Germany
- 🌲 Syktyvkar, Russia
- 🍷 Rovereto, Italy
- 🚣 Sulkava, Finland
- Finnish
- Komi
- Russian
- English
- Italian
- Please feel free to also contact me in Northern Saami, Aanaar Saami, Skolt Saami, French, German, Estonian, Karelian, Udmurt and Swedish