From Emoji Usage to Categorical Emoji Prediction


Emoji usage has increased drastically in recent years, and emojis are becoming one of the most common ways to convey emotions and sentiments in social messaging applications. Several research works recommend emojis automatically so that users do not have to browse a library of thousands of emojis. To improve emoji recommendation, we present and distribute two resources: an emoji embedding model learned from real usage, and an emoji clustering based on these embeddings that automatically identifies groups of emojis. Assuming that emojis are part of written natural language and can be treated as words, we used only unsupervised learning methods to extract patterns and knowledge from real emoji usage in tweets. As a result, emotion categories of face emojis were obtained directly from text in a fully reproducible way. These resources and this methodology have multiple uses; for example, they could improve our understanding of emojis or enhance emoji recommendation.
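
To make the idea of "emojis as words" concrete, here is a minimal sketch of the general pipeline the abstract describes: represent each emoji by the textual context it appears in, then group emojis with similar representations. It is not the method or the resources distributed with the paper. The toy tweets are invented, simple co-occurrence counts stand in for a learned embedding model, and the greedy threshold grouping (with the hypothetical helpers `context_vectors`, `cosine`, and `cluster`) is an assumption chosen only to keep the example dependency-free.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus (invented for illustration): each tweet is a list of tokens,
# and emojis are treated as ordinary tokens alongside words.
TWEETS = [
    ["so", "happy", "today", "😀"],
    ["happy", "birthday", "😄"],
    ["so", "happy", "😄"],
    ["feeling", "sad", "today", "😢"],
    ["so", "sad", "😭"],
    ["sad", "news", "😢"],
]
EMOJIS = {"😀", "😄", "😢", "😭"}


def context_vectors(tweets, emojis):
    """Count, for each emoji, the tokens that co-occur with it in a tweet.

    These sparse count vectors are a stand-in for learned embeddings.
    """
    vectors = defaultdict(Counter)
    for tokens in tweets:
        for tok in tokens:
            if tok in emojis:
                vectors[tok].update(t for t in tokens if t != tok)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.5):
    """Greedy grouping: join the first cluster whose seed is similar enough.

    The threshold is an arbitrary assumption for this toy data.
    """
    clusters = []
    for emoji in sorted(vectors):
        for group in clusters:
            if cosine(vectors[emoji], vectors[group[0]]) >= threshold:
                group.append(emoji)
                break
        else:
            clusters.append([emoji])
    return clusters


groups = cluster(context_vectors(TWEETS, EMOJIS))
print(groups)
```

On this toy data the two groups that emerge roughly separate "happy" from "sad" face emojis, with no emotion labels supplied, which is the unsupervised property the abstract relies on. A real setting would need a large tweet corpus and a proper embedding and clustering method.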

In the 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLING 2018)

This work received the Best Verifiability, Reproducibility, and Working Description award.

Gaël Guibon
Post-doctoral Researcher

My research ranges from emoji and emotion prediction and recommendation to studies of French lexical evolution.