Alexandre Trilla, PhD - Research Engineer

Blog

-- Thoughts on data analysis, software development and innovation management. Comments are welcome

FreeLing exposed

19-Nov-2008

Last Friday, the 14th, the presentation of FreeLing at Universitat Pompeu Fabra was a total success. I think that Dr. Padro, the speaker and one of the developers of FreeLing, showed a tool that anybody working in the NLP field ought to know.

This text processing library consists of a series of independent modules that must be chained into a pipeline of data processors in order to produce the desired results. The advantage is that these modules are developed independently, so they can be swapped, added or removed at will. The disadvantage is that assembling the pipeline is not trivial for someone without C++ experience, say a linguist, who is the default end user at whom FreeLing is aimed. To my mind, such a well-defined framework should be configured through an XML file, so that the whole application does not need to be recompiled every time the configuration changes. I am personally following this philosophy for EmoLib, borrowing the fabulous "Configuration Manager" from the CMU Sphinx-4 speech recognition engine. I understand that this is not the goal of the library, but a tiny detail like this one could greatly broaden the range of FreeLing users. And I insist on calling it a "tiny detail" because, compared to the amount of work needed to produce this library, the effort would be negligible.
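
To make the idea concrete, here is a minimal sketch in Python of a pipeline whose stages are chosen by an XML file rather than hard-coded. It is not FreeLing's API, nor the Sphinx-4 Configuration Manager: the stage functions, the `REGISTRY` table and the `<pipeline>`/`<module>` element names are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical processing stages. These are illustrative stand-ins,
# not actual FreeLing modules.
def tokenize(text):
    return text.split()

def lowercase(tokens):
    return [t.lower() for t in tokens]

def drop_short(tokens):
    return [t for t in tokens if len(t) > 2]

# Maps the names used in the XML file to the stage implementations.
REGISTRY = {
    "tokenize": tokenize,
    "lowercase": lowercase,
    "drop_short": drop_short,
}

# Assumed configuration format: one <module> element per stage, in order.
CONFIG = """
<pipeline>
  <module name="tokenize"/>
  <module name="lowercase"/>
  <module name="drop_short"/>
</pipeline>
"""

def build_pipeline(xml_text):
    """Return the list of stage functions named in the XML, in document order."""
    root = ET.fromstring(xml_text)
    return [REGISTRY[m.get("name")] for m in root.findall("module")]

def run(pipeline, data):
    """Feed the output of each stage into the next one."""
    for stage in pipeline:
        data = stage(data)
    return data

result = run(build_pipeline(CONFIG), "FreeLing is a Text Processing library")
```

With this arrangement, removing or reordering a `<module>` line changes the behaviour of the pipeline without touching (or recompiling) any code, which is the kind of convenience a non-programmer would appreciate.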

FreeLing ships with many large, hand-crafted data files that give it distinctive features and place it among the top language analyzers, especially for Spanish and Catalan. For this, as well as for lots of other reasons, I do believe that FreeLing is worth a look.