About

EmoLib is a library that extracts affect and emotion from incoming text by tagging the text according to the feeling it expresses or conveys.

The processing structure of EmoLib is built from several classes that form a modular framework, called a pipeline because the tagging process is sequential. This pipeline is the AffectiveTagger.

The classes that define the architecture of EmoLib are described as follows:

  • Tokenizer - Splits a text string into individual units, called tokens. These tokens follow regular patterns established by the grammar of the language. It is the INPUTTER of the pipeline.
  • Sentence Splitter - Segments the incoming text into paragraphs and sentences.
  • Part-Of-Speech Tagger - Disambiguates the function of nouns, verbs and adjectives in the sentence according to the context.
  • Word Sense Disambiguator - Determines the correct sense of a word according to the context and augments the word with the appropriate set of synonyms.
  • Stemmer - Strips suffixes from words in order to group those that share a common meaning, thus improving Information Retrieval (IR) performance.
  • Emotional Keyword Spotter - Determines the emotional dimensions of the words that carry affective content.
  • Statistic - Computes the average emotional dimensions of the words in the text.
  • Classifier - Labels the text with the most appropriate emotion according to the affective attributes.
  • Affective Formatter - Presents the obtained results. It is the OUTPUTTER of the pipeline.
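
The sequential structure described above can be illustrated with a minimal Python sketch. It is not EmoLib's actual API: every function name, the toy lexicon and its (valence, activation) values, and the threshold used by the classifier are hypothetical and chosen only for illustration. The Sentence Splitter, Part-Of-Speech Tagger and Word Sense Disambiguator stages are left out to keep the sketch short.

```python
import re
from statistics import mean

# Hypothetical toy lexicon: word -> (valence, activation), each in [0, 1].
# These values are invented for illustration; they are not EmoLib data.
LEXICON = {
    "love": (0.9, 0.6),
    "happy": (0.8, 0.7),
    "sad": (0.2, 0.3),
    "hate": (0.1, 0.8),
}


def tokenize(doc):
    # INPUTTER: split the raw text into lowercase word tokens.
    doc["tokens"] = re.findall(r"[a-z']+", doc["text"].lower())
    return doc


def stem(doc):
    # Crude suffix stripping that stands in for a real stemmer.
    doc["stems"] = [
        re.sub(r"(ly|s)$", "", t) if len(t) > 3 else t for t in doc["tokens"]
    ]
    return doc


def spot_keywords(doc):
    # Keep the emotional dimensions of the words found in the lexicon.
    doc["dimensions"] = [LEXICON[s] for s in doc["stems"] if s in LEXICON]
    return doc


def compute_statistic(doc):
    # Average each emotional dimension over the spotted keywords.
    dims = doc["dimensions"]
    doc["average"] = tuple(mean(axis) for axis in zip(*dims)) if dims else None
    return doc


def classify(doc):
    # Assumed rule: label by average valence, or neutral if no keyword was found.
    if doc["average"] is None:
        doc["label"] = "neutral"
    else:
        doc["label"] = "positive" if doc["average"][0] >= 0.5 else "negative"
    return doc


def format_output(doc):
    # OUTPUTTER: present the result as a short string.
    return f'{doc["label"]}: {doc["text"]}'


# The pipeline is an ordered list of stages; each stage enriches the same record.
PIPELINE = [tokenize, stem, spot_keywords, compute_statistic, classify]


def tag(text):
    doc = {"text": text}
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    print(format_output(tag("I love happy days")))
```

Because each stage only reads and extends a shared record, a stage can be replaced or a new one inserted without touching the others, which is the practical benefit of a modular pipeline.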

Please be careful when copying and pasting text directly from the Internet, since some problematic characters may cause the system to malfunction.
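
One way to reduce this risk is to clean pasted text before submitting it. The sketch below is only an assumption about what a problematic character might be (invisible control and format characters, and compatibility forms such as non-breaking spaces); it is not part of EmoLib, and the function name `sanitize` is hypothetical.

```python
import unicodedata


def sanitize(text: str) -> str:
    """Normalize pasted text and drop invisible control/format characters."""
    # NFKC folds compatibility forms (for example a non-breaking space)
    # into their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    kept = []
    for ch in text:
        if ch in "\n\t":
            kept.append(ch)  # keep ordinary line breaks and tabs
        elif unicodedata.category(ch) in ("Cc", "Cf"):
            continue  # control or invisible format character, e.g. zero-width space
        else:
            kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    print(sanitize("caf\u00e9\u200b\u00a0ok\x00"))
```

Accented letters and other legitimate non-ASCII characters are preserved; only characters without a visible form are removed.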

EmoLib is currently being developed by Alexandre Trilla as part of his Ph.D. thesis, under the supervision of Francesc Alías at Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, Barcelona, Spain. Feel free to contact them with any comments or suggestions.