Blog
-- Thoughts on data analysis, software
development and innovation management. Comments are welcome
Post 4
FreeLing exposed
19-Nov-2008
Last week, on Friday the 14th, the presentation of FreeLing at Universitat Pompeu Fabra
was a total success. I think that Dr. Padro, the speaker and one of the developers of FreeLing,
showed a tool that anybody working in the NLP field ought to know.
This text processing library consists of a series of independent modules that must be
chained into a pipeline of data processors in order to produce the desired results.
The advantage is that these modules are developed independently, so they can be switched, added or
removed at will. The disadvantage is that assembling the pipeline is not trivial for a person without
experience in C++, say a linguist, the default end user at whom FreeLing is aimed.
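The pipeline idea described above can be sketched in a few lines. This is only an illustration in Python, not FreeLing's C++ API: the stage names (tokenize, lowercase, drop_punctuation) and the run_pipeline helper are hypothetical, chosen to show how independent processors can be chained, swapped or removed.

```python
from typing import Callable, List

# Hypothetical stages for illustration only; they are NOT FreeLing modules.
Sentences = List[List[str]]


def tokenize(lines: List[str]) -> Sentences:
    """Split each input line into whitespace-separated tokens."""
    return [line.split() for line in lines]


def lowercase(sentences: Sentences) -> Sentences:
    """Lowercase every token."""
    return [[tok.lower() for tok in sent] for sent in sentences]


def drop_punctuation(sentences: Sentences) -> Sentences:
    """Strip leading/trailing punctuation and discard tokens left empty."""
    marks = ".,;:!?"
    return [[tok.strip(marks) for tok in sent if tok.strip(marks)]
            for sent in sentences]


def run_pipeline(stages: List[Callable], data):
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        data = stage(data)
    return data


if __name__ == "__main__":
    text = ["FreeLing is a library."]
    # Stages can be added, removed or reordered without touching the others.
    print(run_pipeline([tokenize, lowercase, drop_punctuation], text))
    print(run_pipeline([tokenize, drop_punctuation], text))
```

Since each stage only agrees on the shape of the data it receives and returns, removing lowercase from the list changes the result but requires no change to the remaining stages.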
To my mind, this very well defined framework should be configured
through an XML file, so that the whole application need not be recompiled every time the configuration
changes. I am personally following this philosophy for EmoLib, borrowing the fabulous
"Configuration Manager" from the CMU Sphinx-4 speech recognition engine.
I understand that this is not the goal of the library, but a tiny detail like this one could
greatly enlarge the range of users of FreeLing. And I insist on it being a "tiny detail"
because, compared to the amount of work needed to produce this library, the effort is negligible.
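As a rough sketch of what XML-driven configuration could look like, the Python snippet below builds a processing chain from an XML description at run time. Everything in it is an assumption made for illustration: the pipeline/stage element names, the REGISTRY of stages and the load_pipeline helper are hypothetical, and the file format is not the one used by the Sphinx-4 Configuration Manager, by FreeLing or by EmoLib.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration format, invented for this sketch.
CONFIG = """
<pipeline>
  <stage name="tokenize"/>
  <stage name="lowercase"/>
</pipeline>
""".strip()

# Hypothetical registry mapping stage names to processing functions.
REGISTRY = {
    "tokenize": lambda lines: [line.split() for line in lines],
    "lowercase": lambda sents: [[tok.lower() for tok in s] for s in sents],
}


def load_pipeline(xml_text):
    """Return the list of stage functions named in the XML, in order."""
    root = ET.fromstring(xml_text)
    stages = []
    for node in root.findall("stage"):
        name = node.get("name")
        if name not in REGISTRY:
            raise ValueError(f"unknown stage: {name}")
        stages.append(REGISTRY[name])
    return stages


def run(xml_text, data):
    """Build the pipeline from the XML and apply it to the data."""
    for stage in load_pipeline(xml_text):
        data = stage(data)
    return data


if __name__ == "__main__":
    print(run(CONFIG, ["Hello World"]))
```

Changing the behaviour then means editing the XML text (for instance, deleting the lowercase stage) rather than rebuilding the program.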
There are many large hand-crafted data files that give FreeLing distinctive
features and place it among the top language analyzers, especially for Spanish
and Catalan. For this, as well as for lots of other reasons, I do believe that
FreeLing is worth a look.
Post 3
Speech technologies wiki
23-Oct-2008
Following the ideal of free information, and responding to the need for a good means of documentation,
a wiki specialised in speech technologies has been created. My Ph.D. Thesis advisor, Dr. Francesc Alias,
recommended that I jot down in a safe place anything important I learned and would probably need sometime.
The proposal of a wiki then came up on the discussion board, and Dr. Alias received it with enthusiasm. Finally,
a speech technologies wiki has been created in order to maintain an accurate information repository about our
research topics and tools.
The corresponding "Wiki" link is accessible on the menu of the homepage.
The wiki engine of choice has been DokuWiki. It is a highly modular, plugin-extensible
wiki engine based on flat text files, so no database problems are expected. It is a mature
free software PHP project born in Germany, and it is used to provide enterprise solutions for
documentation services.
Post 2
NLP disambiguation
16-Oct-2008
This topic aroused my curiosity yesterday, while attending the scientific
research methodologies class taught by Dr. Mundet, when he suggested that
the clear advantage of Barack Obama over John McCain in the 2008 Presidential
Election was due, in part, to the special speech techniques that Obama
uses. It is not my business to discuss the politics of the two
candidates, but rather to analyse the scientific side of this claim, if there is
one.
I am presently working on emotion identification from text using
semantic disambiguation [Garcia and Alias, 2008], so I first thought of
NLP (Natural Language Processing) techniques as tools to analyse
the special speech techniques that Obama uses. I did some "googling"
trying to relate Obama to NLP and in fact many links were found, but
none matched my actual query. In those results, NLP referred to
Neuro-linguistic Programming, an interpersonal communication model
that could be used as a manipulation and hypnotic technique. It's far
more interesting than I first thought.
The cited article proposes a system that tags
incoming text with five different
emotions: happiness, surprise, sadness, wrath and fear. It would be
really interesting to know to what extent the speeches of Barack
Obama convey each of these emotions.
--
[Garcia and Alias, 2008] Garcia, D. and Alias, F., "Emotion
identification from text using semantic disambiguation",
Procesamiento del Lenguaje Natural,
n. 40, (ISSN: 1135-5948), pp. 67-74. (in Spanish)
Post 1
Hello world!
15-Oct-2008
This is my first post, and with it, this blog is born. My name is Alexandre
Trilla and I am a researcher at the Department of Media Technologies (DTM) at
Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, Barcelona.
I've been at the DTM for a week and a half, and my experience could not have
been more splendid: intense and interesting work, and highly qualified (and nice)
colleagues... and bosses. My Ph.D. Thesis advisor will be Dr. Francesc Alias.
After the very satisfactory experience of working under his direction and
leadership during the development of my Master's Thesis, I have freely
chosen to repeat the experience with this new academic adventure.
To sum up, I am very eager to keep learning and carrying on my research
here, filling this blog with lots of interesting (when possible) and useful
pieces of knowledge about speech technologies and Machine Learning.
Indeed, I want this blog to be a scientific blog. So,
after fixing some setbacks with my new account and after having some fun
coding this homepage, I declare this blog open.