Data science projects

Here you can find information about some projects that I have participated in or managed. All results were achieved in cooperation with others, so not all the glory belongs to me.

Creation of ASR models

At Feelingstream I helped to create ASR models from scratch for the Scandinavian languages and Estonian:
  • data collection and preparation for transcription
  • leading the manual transcription team (including training)
  • data preparation for model training
  • model training and optimization
  • model testing
  • introducing the model to the customer
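
To illustrate the model testing step: ASR models are commonly compared by word error rate (WER), the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The snippet below is only a minimal sketch of that metric, not the evaluation code used at Feelingstream. The function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER of `hypothesis` against `reference`.

    Hypothetical helper for illustration: words are split on whitespace,
    with no text normalization (casing, punctuation) applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

In practice the transcripts would usually be normalized (lowercasing, removing punctuation) before scoring, so that formatting differences are not counted as recognition errors.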

I've also helped to adapt models to the needs of specific customers.

Some models that I've trained as a hobby are publicly available.


Automatic topic extraction from texts

When analyzing customer conversations, you quickly reach a point where you want to understand what the texts are about, and you don't want to do that manually.

For that I've experimented with different solutions, beginning with classical LDA and ending with LLMs.

I've also written a Master's thesis on this topic.

An analysis of Estonian Parliament stenograms is publicly available.


Preparations for ISO 27001 certification

Data science is easy: you pull data from the web, find code on GitHub and train your model.

It is a different thing to build an organization with processes and clarity about who can access what data and when. ISO 27001 certification helped to build the foundations for Feelingstream's information security management.

For the certification I helped (together with many partners) to:

  • create a framework for the information security management system
  • implement it
  • describe and implement important infosec processes
  • carry out risk assessment and risk management
  • etc.