Alexandre Trilla, PhD - Research Engineer

Blog

-- Thoughts on data analysis, software development and innovation management. Comments are welcome

FreeLing exposed

19-Nov-2008

Last week, on Friday the 14th, the presentation of FreeLing at Universitat Pompeu Fabra was a total success. I think that Dr. Padro, the speaker and one of the developers of FreeLing, showed a tool that anybody working in the NLP field should know.

This text processing library consists of a series of independent modules that must be chained into a pipeline of data processors in order to produce the desired results. The advantage is that these modules are developed independently, so they can be switched, added or removed at will. The disadvantage is that this process is not trivial for a person with no experience in C++, say a linguist, the default end user at whom FreeLing is aimed. To my mind, this very well-defined framework should be configured through an XML file, so that the whole application need not be recompiled every time the configuration changes. I am personally following this philosophy for EmoLib, borrowing the fabulous "Configuration Manager" from the CMU Sphinx-4 speech recognition engine. I understand that this is not the goal of the library, but a tiny detail like this one could greatly widen the range of FreeLing users. And I insist on calling it a "tiny detail" because, compared to the amount of work needed to produce this library, the effort is negligible.
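To make the XML idea concrete, here is a minimal sketch in Python. It does not use FreeLing, EmoLib or the Sphinx-4 Configuration Manager: the module names (tokenize, lowercase, drop_short), the registry and the XML layout are all hypothetical, chosen only to show how an ordered list of modules read from a configuration file could replace a hard-coded, compiled pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for analysis modules; these are NOT FreeLing's API.
def tokenize(text):
    # Split raw text into whitespace-separated tokens.
    return text.split()

def lowercase(tokens):
    # Normalise every token to lower case.
    return [t.lower() for t in tokens]

def drop_short(tokens):
    # Discard tokens shorter than three characters.
    return [t for t in tokens if len(t) >= 3]

# Registry mapping the names used in the XML to the callables above.
MODULES = {"tokenize": tokenize, "lowercase": lowercase, "drop_short": drop_short}

def build_pipeline(xml_text):
    """Read <module name="..."/> elements in document order and chain them."""
    root = ET.fromstring(xml_text)
    stages = [MODULES[m.get("name")] for m in root.findall("module")]

    def run(data):
        # Feed the output of each stage into the next one.
        for stage in stages:
            data = stage(data)
        return data

    return run

# Assumed configuration layout; editing it needs no recompilation.
CONFIG = """
<pipeline>
  <module name="tokenize"/>
  <module name="lowercase"/>
  <module name="drop_short"/>
</pipeline>
"""

pipeline = build_pipeline(CONFIG)
print(pipeline("FreeLing is a Library for NLP"))
```

Removing or reordering a `<module>` line changes the behaviour of the pipeline without touching the code, which is the kind of flexibility a non-programmer end user would benefit from.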

There are lots of huge hand-crafted data files that give FreeLing many distinctive features and take it to the top of the language analyzers, especially for Spanish and Catalan. For this, as well as for many other reasons, I do believe that FreeLing is worth a look.

Post 3

Speech technologies wiki

23-Oct-2008

Following the ideal of free information and responding to the need for a good means of documentation, a wiki specialised in speech technologies has been created. My Ph.D. thesis advisor, Dr. Francesc Alias, recommended that I jot down in a safe place anything important I learned and would probably need sometime. The proposal of a wiki then came up for discussion, and Dr. Alias received it with enthusiasm. Finally, a speech technologies wiki has been created in order to maintain an accurate information repository about our research topics and tools. The corresponding "Wiki" link is accessible from the menu of the homepage.

The wiki engine of choice is DokuWiki. It is a highly modular, plugin-extensible wiki engine based on flat text files, so no database problems are expected. It is a mature free software PHP project born in Germany, and it is used to provide enterprise documentation solutions.

Post 2

NLP disambiguation

16-Oct-2008

This topic aroused my curiosity yesterday while I was attending the scientific research methodologies class, conducted by Dr. Mundet, when he suggested that the clear advantage of Barack Obama over John McCain in the 2008 Presidential Election was due, in part, to the special speech techniques that Obama uses. It is not my business to discuss the political positions of the two candidates, but rather to analyse the scientific aspect of this claim, if there is one.

I am presently dealing with emotion identification from text using semantic disambiguation [Garcia and Alias, 2008], so I first thought of NLP (Natural Language Processing) techniques as tools to analyse the special speech techniques that Obama uses. I did some "googling" trying to relate Obama to NLP and in fact many links were found, but none matched my actual query. In those results, NLP referred to Neuro-linguistic Programming, an interpersonal communication model that could be used as a manipulation and hypnotic technique. It's far more interesting than I first thought.

The cited article proposes a system that tags incoming text with five different emotions: happiness, surprise, sadness, wrath and fear. It would be really interesting to know to what extent the speeches of Barack Obama convey each of these emotions.

--
[Garcia and Alias, 2008] Garcia, D. and Alias, F., "Emotion identification from text using semantic disambiguation", Procesamiento del Lenguaje Natural, n. 40, (ISSN: 1135-5948), pp. 67-74. (in Spanish)

Post 1

Hello world!

15-Oct-2008

This is my first post, and with it, this blog is born. My name is Alexandre Trilla and I am a researcher at the Department of Media Technologies (DTM) at Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, Barcelona.

I've been at the DTM for a week and a half, and my experience could not have been more splendid. Most intense and interesting work, most qualified (and nice) colleagues... and bosses. My Ph.D. thesis advisor will be Dr. Francesc Alias. After the most satisfactory experience of working under his direction and leadership during the development of my Master's Thesis, I have freely chosen to repeat the experience in this new academic adventure.

To sum up, I am very eager to keep learning and to carry on my research here, filling this blog with lots of interesting (when possible) and useful pieces of knowledge about speech technologies and Machine Learning. Indeed, I want this blog to be a scientific blog. So, after fixing some setbacks with my new account and after having some fun coding this homepage, I declare this blog open.
