Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Page Not Found

Page not found. Your pixels are in another canvas.

Academic Website of Emily Öhman

Academic website of Emily Öhman containing information on publications, teaching, and research interests.

Course material (for students)

Jupyter notebook markdown generator

Posts

Future Blog Post

less than 1 minute read

Published: January 01, 2199

This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.

publications

European intercultural workplace: Sverige

Published in Göteborgs Universitet, 2007

Use Google Scholar for full citation

Recommended citation: Jens Allwood, Natalia Lindström, Margreth Börjesson, Charlotte Edebäck, Randi Myhre, Kaarlo Voionmaa, Emily Öhman, European intercultural workplace: Sverige. Göteborgs Universitet, 2007.

Recent Changes in Indefinite Pronouns with Human Reference: A diachronic corpus study of 200 years of -one/-body and -man indefinite pronoun variation in Late Modern and Present-day English

Published in Linnaeus University, 2015

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Recent Changes in Indefinite Pronouns with Human Reference: A diachronic corpus study of 200 years of -one/-body and -man indefinite pronoun variation in Late Modern and Present-day English. Linnaeus University, 2015.

Language Change Database: A new online resource

Published in ICAME journal, 2016

Use Google Scholar for full citation

Recommended citation: Terttu Nevalainen, Turo Vartiainen, Tanja Säily, Joonas Kesäniemi, Agata Dominowska, Emily Öhman, Language Change Database: A new online resource. ICAME journal, 2016.

The challenges of multi-dimensional sentiment analysis across languages

Published in Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES) at COLING 2016, 2016

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Timo Honkela, Jörg Tiedemann. The challenges of multi-dimensional sentiment analysis across languages. In Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES) at COLING 2016, 2016.

Sentimentator: Gamifying Fine-grained Sentiment Annotation

Published in the proceedings of Digital Humanities in the Nordic Countries 2018, 2018

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Kaisla Kajava. Sentimentator: Gamifying Fine-grained Sentiment Annotation. In the proceedings of Digital Humanities in the Nordic Countries 2018, 2018.

Creating a dataset for multilingual fine-grained emotion-detection using gamification-based annotation

Published in Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis at EMNLP 2018, 2018

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Kaisla Kajava, Jörg Tiedemann, Timo Honkela. Creating a dataset for multilingual fine-grained emotion-detection using gamification-based annotation. In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis at EMNLP 2018, 2018.

Towards the inevitable demise of everybody?

Published in ICAME 40, 2019

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Tanja Säily, Mikko Laitinen, Towards the inevitable demise of everybody? ICAME 40, 2019.

Challenges in Annotation: Annotator Experiences from a Crowdsourced Emotion Annotation Task

Published in Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020

Use Google Scholar for full citation

Recommended citation: Emily Öhman. Challenges in Annotation: Annotator Experiences from a Crowdsourced Emotion Annotation Task. In Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020.

Emotion Preservation in Translation: Evaluating Datasets for Annotation Projection

Published in Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020

Use Google Scholar for full citation

Recommended citation: Kaisla Kajava, Emily Öhman, Hui Piao, Jörg Tiedemann, Emotion Preservation in Translation: Evaluating Datasets for Annotation Projection. In Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020.

LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT?

Published in the proceedings of SemEval-2020: International Workshop on Semantic Evaluation - COLING 28th International Conference on Computational Linguistics, Barcelona, Spain, 2020

Use Google Scholar for full citation

Recommended citation: Marc Pàmies, Emily Öhman, Kaisla Kajava, Jörg Tiedemann. LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT? In the proceedings of SemEval-2020: International Workshop on Semantic Evaluation - COLING 28th International Conference on Computational Linguistics, Barcelona, Spain, 2020.

XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection

Published in Proceedings of the 28th International Conference on Computational Linguistics, 2020

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Marc Pàmies, Kaisla Kajava, Jörg Tiedemann. XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection. In Proceedings of the 28th International Conference on Computational Linguistics, 2020.

Emotion Annotation: Rethinking Emotion Categorization

Published in DHN Post-Proceedings, 2021

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Emotion Annotation: Rethinking Emotion Categorization. In DHN Post-Proceedings, 2020.

Affect and Emotions in Finnish Literature: Combining Qualitative and Quantitative Approach

Published in University of Helsinki, 2021

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Riikka Rossi, Affect and Emotions in Finnish Literature: Combining Qualitative and Quantitative Approach. University of Helsinki, 2021.

The Language of Emotions: Building and Applying Computational Methods for Emotion Detection for English and Beyond

Published in University of Helsinki, 2021

Use Google Scholar for full citation

Recommended citation: Emily Öhman, The Language of Emotions: Building and Applying Computational Methods for Emotion Detection for English and Beyond. University of Helsinki, 2021.

Japanese Beauty Marketing on Social Media: Critical Discourse Analysis Meets NLP

Published in the proceedings of ICON 2021, 2021

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Amy Metcalfe. Japanese Beauty Marketing on Social Media: Critical Discourse Analysis Meets NLP. In the proceedings of ICON 2021, 2021.

The Validity of Lexicon-based Sentiment Analysis in Interdisciplinary Research

Published in the proceedings of NLP4DH @ ICON 2021, 2021

Use Google Scholar for full citation

Recommended citation: Emily Öhman, The Validity of Lexicon-based Sentiment Analysis in Interdisciplinary Research. In the proceedings of NLP4DH @ ICON 2021, 2021.

Hate speech, Censorship, and Freedom of Speech: The Changing Policies of Reddit

Published in Journal of Data Mining & Digital Humanities, 2022

Use Google Scholar for full citation

Recommended citation: Emily Öhman, Elissa Nakajima, Hate speech, Censorship, and Freedom of Speech: The Changing Policies of Reddit. Journal of Data Mining & Digital Humanities, 2022.

SELF & FEIL: Emotion Lexicons for Finnish

Published in the proceedings of the 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022

Use Google Scholar for full citation

Recommended citation: Emily Öhman, SELF & FEIL: Emotion Lexicons for Finnish. In the proceedings of The 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022.

Strategic sentiments and emotions in post-Second World War party manifestos in Finland

Published in Journal of Computational Social Science, 2022

Use Google Scholar for full citation

Recommended citation: Koljonen, Juha & Emily Öhman & Pertti Ahonen & Mikko Mattila, Strategic sentiments and emotions in post-Second World War party manifestos in Finland. In Journal of Computational Social Science. 2022.

Computational Exploration of the Origin of Mood in Literary Texts

Published in Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities @ AACL 2022, 2022

Use Google Scholar for full citation

Recommended citation: Öhman, E. and Rossi, R., 2022. Computational Exploration of the Origin of Mood in Literary Texts. NLP4DH 2021, p.8.

From hate speech recognition to happiness indexing: critical issues in datafication of emotion in text mining

Published in Handbook of Critical Studies of Artificial Intelligence, 2023

In press.

Recommended citation: Laaksonen, S.M., Pääkkönen, J. and Öhman, E., 2022. From hate speech recognition to happiness indexing: critical issues in datafication of emotion in text mining. In Handbook of Critical Studies of Artificial Intelligence. Edward Elgar.

Introduction to Text Analytics

Published in Sage Publishing, 2024

Coming in November 2024.

Recommended citation: Öhman, E., 2024. Introduction to Text Analytics. Sage Publishing.

research

Coding for Humanities

2020-11-30 to 2021-03-31

An EADH small grants project in which I created Python notebooks geared towards humanities and other non-STEM students. The project is available here.

Negative emotions in literature: A computational approach to tone and mood

2022-04-01 to 2024-03-31

JSPS Kakenhi Early career researcher: Negative emotions in literature: A computational approach to tone and mood: 2022-2024. 1,950,000 JPY

The Language of Emotions: Building and Applying Computational Methods for Emotion Detection for English and Beyond

2016-01-01 to 2021-03-30

My doctoral project focusing on computational methods for emotion detection in text.

Un-Conventional Communicators in the Corona Crisis (UnCoCo)

2021-01-01 to 2022-11-30

Helsingin Sanomain Säätiö, Unconventional communicators in the corona-crisis, 2020-2022. 99,000 € (group application)

resources

Sentimentator

Published: University of Helsinki, 2018

Sentimentator GitHub

EADH notebooks

Published: EADH 2020 Small Grants, 2020

EADH project page on GitHub

XED

Published: COLING, 2020

XED on Hugging Face

talks

Language Change Database: a new online resource

Published: October 01, 2015

Introducing LCD.

Sentiment Analysis as a Research Topic

Published: June 01, 2016

What is sentiment analysis?

Sentiment Analysis for low-resource languages

Published: June 15, 2016

Movie subtitles as parallel texts for low-resource languages

Sentiment Analysis for Theoretical Philosophy

Published: December 01, 2016

How can we use sentiment analysis in the field of theoretical philosophy?

The Challenges of Multi-dimensional Sentiment Analysis Across Languages

Published: December 10, 2016

We outline a pilot study on multi-dimensional and multilingual sentiment analysis of social media content. We use parallel corpora of movie subtitles as a proxy for colloquial language in social media channels and a multilingual emotion lexicon for fine-grained sentiment analyses. Parallel data sets make it possible to study the preservation of sentiments and emotions in translation and our assessment reveals that the lexical approach shows great inter-language agreement. However, our manual evaluation also suggests that the use of purely lexical methods is limited and further studies are necessary to pinpoint the cross-lingual differences and to develop better sentiment classifiers.

Lexicon-based Sentiment Analysis

Published: February 01, 2018

Using lexicons as a quick and dirty tool to analyze emotions in multilingual data

Sentimentator: A Sentiment and Emotion Annotation Platform

Published: March 20, 2018

We introduce Sentimentator, a publicly available gamified web-based annotation platform for fine-grained sentiment annotation at the sentence level. Sentimentator is unique in that it moves beyond binary classification. We use a ten-dimensional model which allows for the annotation of 51 unique sentiments and emotions. The platform is gamified with a complex scoring system designed to reward users for high-quality annotations. Sentimentator introduces several unique features that have previously not been available, or at best very limited, for sentiment annotation. In particular, it provides streamlined multi-dimensional annotation optimized for sentence-level annotation of movie subtitles. Because the platform is publicly available, it will benefit anyone interested in fine-grained sentiment analysis and emotion detection, as well as annotation of other datasets.

Creating a Dataset for Multilingual Fine-grained Emotion-detection Using Gamification-based Annotation

Published: October 01, 2018

This paper introduces a gamified framework for fine-grained sentiment analysis and emotion detection. We present a flexible tool, Sentimentator, that can be used for efficient annotation based on crowdsourcing and a self-perpetuating gold standard. We also present a novel dataset with multi-dimensional annotations of emotions and sentiments in movie subtitles that enables research on sentiment preservation across languages and the creation of robust multilingual emotion detection tools. The tools and datasets are public and open-source and can easily be extended and applied for various purposes.

Computational Bias

Published: October 10, 2018

Mitigating and acknowledging biased data and how to ethically deal with biases in data.

Multilingual Emotion Analysis

Published: November 01, 2018

Emotion analysis using transfer learning

Sentiment Analysis for Social Media Analytics

Published: November 15, 2018

Special caveats for conducting sentiment analysis of social media data

Biased Algorithms

Published: February 01, 2019

How do biased data affect algorithms in education?

Teaching Computational Methods to Humanities Students

Published: February 20, 2019

Best practice solutions for teaching computational methods to humanities students.

Towards the Inevitable Demise of Everybody? A multifactorial analysis of -one/-body/-man variation in indefinite pronouns in historical American English

Published: June 01, 2019

Language change in real time: the case of indefinite pronouns in English

Emotion Annotation

Published: October 01, 2020

With the prevalence of machine learning in natural language processing and other fields, an increasing number of crowd-sourced data sets are created and published. However, very little has been written about the annotation process from the point of view of the annotators. This pilot study aims to help fill the gap and provide insights into how to maximize the quality of the annotation output of crowd-sourced annotations, with a focus on fine-grained sentence-level sentiment and emotion annotation from the annotators' point of view.

Mitigating the effects of bias in data for machine learning algorithms

Published: November 01, 2020

How can we mitigate the effects of bias in the underlying data for machine learning tasks?

Sentiment annotation and analysis in Finnish

Published: November 02, 2020

What special problems arise when working with a language like Finnish for sentiment analysis?

XED: Multilabel Emotion Dataset

Published: December 10, 2020

We introduce XED, a multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English sentences (30k), as well as projected annotations for 30 additional languages, providing new resources for many low-resource languages. We use Plutchik's core emotions to annotate the dataset with the addition of neutral to create a multilabel multiclass dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to show that XED performs on par with other similar datasets and is therefore a useful tool for sentiment analysis and emotion detection.

Concepts of Beauty on Japanese Social Media

Published: July 21, 2021

How is beauty discussed on social media and what kind of images are used to convey beauty to consumers?

Coding for Digital Humanities

Published: September 01, 2021

For this panel, I, together with the other winners of the EADH small grants 2020, presented our projects.

Breaking into Academia

Published: October 01, 2021

For this panel, I, together with two more senior academics, discussed the dos and don'ts of breaking into academia as a career.

Japanese Beauty Marketing on Social Media: Critical Discourse Analysis Meets NLP

Published: December 19, 2021

This project is a pilot study intending to combine traditional corpus linguistics, Natural Language Processing, critical discourse analysis, and digital humanities to gain an up-to-date understanding of how beauty is being marketed on social media, specifically Instagram, to followers. We use topic modeling combined with critical discourse analysis and NLP tools for insights into the "Japanese Beauty Myth" and show an overview of the dataset that we make publicly available.

The Validity of Lexicon-based Emotion Analysis in Interdisciplinary Research

Published: December 20, 2021

Lexicon-based sentiment and emotion analysis methods are widely used particularly in applied Natural Language Processing (NLP) projects in fields such as computational social science and digital humanities. These lexicon-based methods have often been criticized for their lack of validation and accuracy – sometimes fairly. However, in this paper, we argue that lexicon-based methods work well particularly when moving up in granularity and show how useful lexicon-based methods can be for projects where neither qualitative analysis nor a machine learning-based approach is possible. Indeed, we argue that the measure of a lexicon's accuracy should be grounded in its usefulness.

SELF & FEIL Emotion Lexicons for Finnish

Published: March 15, 2022

I introduce a Sentiment and Emotion Lexicon for Finnish (SELF) and a Finnish Emotion Intensity Lexicon (FEIL). Sentiment analysis and emotion detection require annotated data regardless of the chosen approach, but most existing resources are for the English language. To overcome this, the SELF and FEIL lexicons use projected annotations from existing resources with carefully edited translations and domain adaptations. In this paper the creation process and translation issues are explained in detail to allow others to create similar lexicons for other languages. The usefulness of SELF and FEIL is demonstrated via several interdisciplinary affect-related projects. To the best of our knowledge, this is the first comprehensive sentiment and emotion lexicon for Finnish.

Creating an army of hacker-scholars

Published: July 21, 2022

International similarities and differences in what students struggle with on programming courses aimed at non-STEM students.

teaching

Project Course in English Linguistics

Graduate course, University of Helsinki, Department of Modern Languages, English Philology, 2015

For graduate students of English philology and the English teaching program. Focusing on practical tasks for the Language Change Database and teaching students to digest information from academic papers focused on diachronic corpus linguistics.

Introduction to Language Technology

Undergraduate course, University of Helsinki, Department of Digital Humanities, 2015

This course was a compulsory course for all students of languages. The focus was on teaching both the applications of language technology and how language and computers are intertwined, but also many practical tools. These tools included:

Methods for Digital Humanities

Graduate course, University of Helsinki, Department of Digital Humanities, 2016

This course is a graduate-level course teaching digital humanities methods to students of various humanities and social science disciplines.

Citizen Science: Crowd-sourcing as a Tool for Collecting Quantitative and Qualitative Data

Graduate course, University of Helsinki, Department of Digital Humanities, 2021

This course is a graduate-level UNAEuropa course teaching students about crowdsourcing and citizen science for various digital humanities projects. It was co-taught with Suzie Thomas, and the focus was split between cultural heritage studies and crowdsourcing annotations for language technology projects.
