Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Academic website of Emily Öhman containing information on publications, teaching, and research interests.
This is a page not in the main menu
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Published in Göteborgs Universitet, 2007
Use Google Scholar for full citation
Recommended citation: Jens Allwood, Natalia Lindström, Margreth Börjesson, Charlotte Edebäck, Randi Myhre, Kaarlo Voionmaa, Emily Öhman, European intercultural workplace: Sverige. Göteborgs Universitet, 2007.
Published in Linnaeus University, 2015
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Recent Changes in Indefinite Pronouns with Human Reference: A diachronic corpus study of 200 years of -one/-body and -man indefinite pronoun variation in Late Modern and Present-day English. Linnaeus University, 2015.
Published in ICAME journal, 2016
Use Google Scholar for full citation
Recommended citation: Terttu Nevalainen, Turo Vartiainen, Tanja Säily, Joonas Kesäniemi, Agata Dominowska, Emily Öhman, Language Change Database: A new online resource. ICAME journal, 2016.
Published in Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES) at COLING 2016, 2016
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Timo Honkela, Jörg Tiedemann. The challenges of multi-dimensional sentiment analysis across languages. In Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES) at COLING 2016, 2016.
Published in the proceedings of Digital Humanities in the Nordic Countries 2018, 2018
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Kaisla Kajava. Sentimentator: Gamifying Fine-grained Sentiment Annotation. In the proceedings of Digital Humanities in the Nordic Countries 2018, 2018.
Published in Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis at EMNLP 2018, 2018
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Kaisla Kajava, Jörg Tiedemann, Timo Honkela. Creating a dataset for multilingual fine-grained emotion-detection using gamification-based annotation. In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis at EMNLP 2018, 2018.
Published in ICAME 40, 2019
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Tanja Säily, Mikko Laitinen, Towards the inevitable demise of everybody? ICAME 40, 2019.
Published in Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020
Use Google Scholar for full citation
Recommended citation: Emily Öhman. Challenges in Annotation: Annotator Experiences from a Crowdsourced Emotion Annotation Task. In Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020.
Published in Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020
Use Google Scholar for full citation
Recommended citation: Kaisla Kajava, Emily Öhman, Hui Piao, Jörg Tiedemann, Emotion Preservation in Translation: Evaluating Datasets for Annotation Projection. In Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020.
Published in the proceedings of SemEval-2020: International Workshop on Semantic Evaluation - COLING 28th International Conference on Computational Linguistics, Barcelona, Spain, 2020
Use Google Scholar for full citation
Recommended citation: Marc Pàmies, Emily Öhman, Kaisla Kajava, Jörg Tiedemann. LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT? In the proceedings of SemEval-2020: International Workshop on Semantic Evaluation - COLING 28th International Conference on Computational Linguistics, Barcelona, Spain, 2020.
Published in Proceedings of the 28th International Conference on Computational Linguistics, 2020
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Marc Pàmies, Kaisla Kajava, Jörg Tiedemann. XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection. In Proceedings of the 28th International Conference on Computational Linguistics, 2020.
Published in DHN Post-Proceedings, 2021
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Emotion Annotation: Rethinking Emotion Categorization. In DHN Post-Proceedings, 2020.
Published in University of Helsinki, 2021
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Riikka Rossi, Affect and Emotions in Finnish Literature: Combining Qualitative and Quantitative Approach. University of Helsinki, 2021.
Published in University of Helsinki, 2021
Use Google Scholar for full citation
Recommended citation: Emily Öhman, The Language of Emotions: Building and Applying Computational Methods for Emotion Detection for English and Beyond. University of Helsinki, 2021.
Published in the proceedings of ICON 2021, 2021
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Amy Metcalfe. Japanese Beauty Marketing on Social Media: Critical Discourse Analysis Meets NLP. In the proceedings of ICON 2021, 2021.
Published in the proceedings of NLP4DH @ ICON 2021, 2021
Use Google Scholar for full citation
Recommended citation: Emily Öhman, The Validity of Lexicon-based Sentiment Analysis in Interdisciplinary Research. In the proceedings of NLP4DH @ ICON 2021, 2021.
Published in Journal of Data Mining & Digital Humanities, 2022
Use Google Scholar for full citation
Recommended citation: Emily Öhman, Elissa Nakajima, Hate speech, Censorship, and Freedom of Speech: The Changing Policies of Reddit. Journal of Data Mining & Digital Humanities, 2022.
Published in the proceedings of The 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022
Use Google Scholar for full citation
Recommended citation: Emily Öhman, SELF & FEIL: Emotion Lexicons for Finnish. In the proceedings of The 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022.
Published in Journal of Computational Social Science, 2022
Use Google Scholar for full citation
Recommended citation: Koljonen, Juha & Emily Öhman & Pertti Ahonen & Mikko Mattila, Strategic sentiments and emotions in post-Second World War party manifestos in Finland. In Journal of Computational Social Science. 2022.
Published in Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities @ AACL 2022, 2022
Use Google Scholar for full citation
Recommended citation: Öhman, E. and Rossi, R., 2022. Computational Exploration of the Origin of Mood in Literary Texts. NLP4DH 2021, p.8.
Published in Handbook of Critical Studies of Artificial Intelligence, 2023
In-press.
Recommended citation: Laaksonen, S.M., Pääkkönen, J. and Öhman, E., 2022. From hate speech recognition to happiness indexing: critical issues in datafication of emotion in text mining. In Handbook of Critical Studies of Artificial Intelligence. Edward Elgar.
Published in Sage Publishing, 2024
Coming in November 2024.
Recommended citation: Öhman, E., 2024. Introduction to Text Analytics. Edward Elgar.
2020-11-30 to 2021-03-31
An EADH small grants project in which I created Python notebooks geared towards humanities and other non-STEM students. The project is available here.
2022-04-01 to 2024-03-31
JSPS Kakenhi Early career researcher: Negative emotions in literature: A computational approach to tone and mood: 2022-2024. 1,950,000 JPY
2016-01-01 to 2021-03-30
My doctoral project focusing on computational methods for emotion detection in text.
2021-01-01 to 2022-11-30
Helsingin Sanomain Säätiö, Unconventional communicators in the corona-crisis, 2020-2022. 99,000 € (group application)
Published: University of Helsinki, 2018
Published: EADH 2020 Small Grants, 2020
Published: CoLing, 2020
Published:
Introducing LCD.
Published:
What is sentiment analysis?
Published:
Movie subtitles as parallel texts for low resource languages
Published:
How can we use sentiment analysis in the field of theoretical philosophy?
Published:
We outline a pilot study on multi-dimensional and multilingual sentiment analysis of social media content. We use parallel corpora of movie subtitles as a proxy for colloquial language in social media channels and a multilingual emotion lexicon for fine-grained sentiment analyses. Parallel data sets make it possible to study the preservation of sentiments and emotions in translation and our assessment reveals that the lexical approach shows great inter-language agreement. However, our manual evaluation also suggests that the use of purely lexical methods is limited and further studies are necessary to pinpoint the cross-lingual differences and to develop better sentiment classifiers.
Published:
Using lexicons as a quick and dirty tool to analyze emotions in multilingual data
Published:
We introduce Sentimentator, a publicly available gamified web-based annotation platform for fine-grained sentiment annotation at the sentence level. Sentimentator is unique in that it moves beyond binary classification. We use a ten-dimensional model which allows for the annotation of 51 unique sentiments and emotions. The platform is gamified with a complex scoring system designed to reward users for high-quality annotations. Sentimentator introduces several unique features that have previously not been available, or at best very limited, for sentiment annotation. In particular, it provides streamlined multi-dimensional annotation optimized for sentence-level annotation of movie subtitles. Because the platform is publicly available, it will benefit anyone and everyone interested in fine-grained sentiment analysis and emotion detection, as well as annotation of other datasets.
Published:
This paper introduces a gamified framework for fine-grained sentiment analysis and emotion detection. We present a flexible tool, Sentimentator, that can be used for efficient annotation based on crowdsourcing and a self-perpetuating gold standard. We also present a novel dataset with multi-dimensional annotations of emotions and sentiments in movie subtitles that enables research on sentiment preservation across languages and the creation of robust multilingual emotion detection tools. The tools and datasets are public and open-source and can easily be extended and applied for various purposes.
Published:
Mitigating and acknowledging biased data and how to ethically deal with biases in data.
Published:
Emotion analysis using transfer learning
Published:
Special caveats for conducting sentiment analysis of social media data
Published:
How do biased data affect algorithms in education?
Published:
Best practice solutions for teaching computational methods to humanities students.
Published:
Language change in real time: the case of indefinite pronouns in English
Published:
With the prevalence of machine learning in natural language processing and other fields, an increasing number of crowd-sourced data sets are created and published. However, very little has been written about the annotation process from the point of view of the annotators. This pilot study aims to help fill the gap and provide insights into how to maximize the quality of the annotation output of crowd-sourced annotations with a focus on fine-grained sentence-level sentiment and emotion annotation from the annotators' point of view.
Published:
How can we mitigate the effects of bias in the underlying data for machine learning tasks?
Published:
What special problems arise when working with a language like Finnish for sentiment analysis?
Published:
We introduce XED, a multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English sentences (30k), as well as projected annotations for 30 additional languages, providing new resources for many low-resource languages. We use Plutchik's core emotions to annotate the dataset with the addition of neutral to create a multilabel multiclass dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to show that XED performs on par with other similar datasets and is therefore a useful tool for sentiment analysis and emotion detection.
Published:
How is beauty discussed on social media and what kind of images are used to convey beauty to consumers?
Published:
For this panel, I, together with the other winners of the EADH small grants 2020, presented our projects.
Published:
For this panel, I, together with two more senior academics, discussed the dos and don'ts of breaking into academia as a career.
Published:
This project is a pilot study intending to combine traditional corpus linguistics, Natural Language Processing, critical discourse analysis, and digital humanities to gain an up-to-date understanding of how beauty is being marketed on social media, specifically Instagram, to followers. We use topic modeling combined with critical discourse analysis and NLP tools for insights into the "Japanese Beauty Myth" and show an overview of the dataset that we make publicly available.
Published:
Lexicon-based sentiment and emotion analysis methods are widely used particularly in applied Natural Language Processing (NLP) projects in fields such as computational social science and digital humanities. These lexicon-based methods have often been criticized for their lack of validation and accuracy – sometimes fairly. However, in this paper, we argue that lexicon-based methods work well particularly when moving up in granularity and show how useful lexicon-based methods can be for projects where neither qualitative analysis nor a machine learning-based approach is possible. Indeed, we argue that the measure of a lexicon's accuracy should be grounded in its usefulness.
Published:
I introduce a Sentiment and Emotion Lexicon for Finnish (SELF) and a Finnish Emotion Intensity Lexicon (FEIL). Sentiment analysis and emotion detection require annotated data regardless of the chosen approach, but most existing resources are for the English language. To overcome this, the SELF and FEIL lexicons use projected annotations from existing resources with carefully edited translations and domain adaptations. In this paper the creation process and translation issues are explained in detail to allow others to create similar lexicons for other languages. The usefulness of SELF and FEIL is demonstrated via several interdisciplinary affect-related projects. To the best of our knowledge, this is the first comprehensive sentiment and emotion lexicon for Finnish.
Published:
International similarities and differences in what students struggle with on programming courses aimed at non-STEM students.
Graduate course, University of Helsinki, Department of Modern Languages, English Philology, 2015
For graduate students of English philology and the English teaching program. Focusing on practical tasks for the Language Change Database and teaching students to digest information from academic papers focused on diachronic corpus linguistics.
Undergraduate course, University of Helsinki, Department of Digital Humanities, 2015
This course was compulsory for all students of languages. It covered the applications of language technology and how language and computers are intertwined, as well as many practical tools. These tools included:
Graduate course, University of Helsinki, Department of Digital Humanities, 2016
This course is a graduate level course teaching digital humanities methods to students of various humanities and social science disciplines.
Graduate course, University of Helsinki, Department of Digital Humanities, 2021
This course is a graduate-level UNAEuropa course teaching students about crowdsourcing and citizen science for various digital humanities projects. It was co-taught with Suzie Thomas, and the focus was split between cultural heritage studies and crowdsourcing annotations for language technology projects.
Graduate course, Tampere University, Faculty of Information Technology and Communication Sciences, 2021
This course is a graduate level course on digital intimacy. I was a guest lecturer and talked about computational approaches to measuring intimacy and emotion in online sources.
Undergraduate course, compulsory course, Waseda University, School of International Liberal Studies, 2022
A compulsory course for all SILS students.
Undergraduate course, Introductory course, Waseda University, School of International Liberal Studies (open to all of Waseda), 2022
This undergraduate-level introductory course has taught digital humanities methods to liberal arts students since 2021.
Undergraduate course, Advanced course, Waseda University, School of International Liberal Studies (open to all of Waseda), 2022
This undergraduate-level advanced course has taught intermediate practical programming skills, with a focus on data analysis and social media analysis, since 2021.
Undergraduate course, Intermediate course, Waseda University, School of International Liberal Studies (open to all of Waseda), 2022
This undergraduate-level intermediate course has taught practical programming skills for beginners since 2021.