20 Jan 2009

Today is a good day to start learning Python

I was just trying to solve a problem for my brother, and I found Python really exciting. It doesn't seem to be as good as Perl for regular expressions, but its modularization looks very clean, and it also has a very large set of libraries.

Of course, I am not planning to move to Python, but I will keep an eye on the language. It could be a nice choice for building a quick prototype :).

3 Jan 2009

Liblinear is amazingly fast!

I decided to stop using LibSVM for the linear case because it is not optimized for it. Then I had a look at liblinear, developed by the same team as LibSVM. Liblinear is recommended for document classification because it removes lots of operations that are unnecessary in the linear case. It also has a (very recent) Java port, which is located here.

Now, working with 50 categories and 0.5 GB of data takes less than 10 seconds on a 2 GHz Core 2 Duo laptop. Those timings are impressive! The interface of this library is very similar to LibSVM's, so it is very easy to migrate from one library to the other. Of course, if you have a good design, all you have to do is update or swap your corresponding facade class.
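
To make the facade point a bit more concrete, here is a rough sketch in Python. It is not my actual code and it does not call LibSVM or liblinear at all: `Backend`, `MajorityBackend`, `CentroidBackend` and `DocumentClassifier` are all made-up names, and the two backends are toy stand-ins for the two libraries. The only point is that the rest of the application talks to the facade, so swapping the library means swapping one object.

```python
from collections import Counter
from typing import Protocol, Sequence


class Backend(Protocol):
    """Hypothetical interface the facade expects from a wrapped library."""

    def fit(self, rows: Sequence[Sequence[float]], labels: Sequence[int]) -> None: ...
    def predict_one(self, row: Sequence[float]) -> int: ...


class MajorityBackend:
    """Toy stand-in for one library: always predicts the most common label."""

    def fit(self, rows, labels):
        self.label = Counter(labels).most_common(1)[0][0]

    def predict_one(self, row):
        return self.label


class CentroidBackend:
    """Toy stand-in for another library: predicts the nearest class centroid."""

    def fit(self, rows, labels):
        sums, counts = {}, Counter(labels)
        for row, label in zip(rows, labels):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        # Average the accumulated feature sums to get one centroid per label.
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }

    def predict_one(self, row):
        def dist(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))

        return min(self.centroids, key=dist)


class DocumentClassifier:
    """Facade: the only class the rest of the application talks to."""

    def __init__(self, backend: Backend):
        self._backend = backend

    def train(self, rows, labels):
        self._backend.fit(rows, labels)

    def classify(self, row):
        return self._backend.predict_one(row)


# Tiny made-up dataset, only to exercise the two backends.
rows = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
labels = [0, 0, 1, 1, 1]

for backend in (MajorityBackend(), CentroidBackend()):
    clf = DocumentClassifier(backend)
    clf.train(rows, labels)
    print(type(backend).__name__, clf.classify([0.1, 0.1]))
```

The calling code in the loop never changes; only the backend passed to the constructor does. A real wrapper around LibSVM or liblinear would have to adapt to whatever training and prediction calls each library actually exposes.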

BTW: Happy new year!