Friday, March 13, 2009

active learning with python & libsvm (or, on hacking libsvm)

Arguably the most popular Support Vector Machine (SVM) library is libsvm. It is widely used, cross-platform and open-source. Moreover, while the library itself is written in C++, myriad interfaces expose it to everything from Python to Matlab to brainf*ck. OK, not brainf*ck.

Here I'll focus on the bridge from Python. In particular, I was interested in using active learning, a useful framework for interactively training classifiers. In active learning, an expert provides labeled training examples for a model (e.g., an SVM), and at each step the model selects new points from the remaining unlabeled data for the expert to label. This differs from the canonical learning framework, wherein you train your classifier on a random set of labeled data. The intuition is that the classifier can pick out informative examples for the expert to label at each iteration of training; in other words, active learning makes better use of the expert's effort. The most popular active learning strategy for SVMs (i.e., method of selecting maximally informative examples for labeling) is known as SIMPLE: at each step it picks the unlabeled point closest to the separating hyperplane (a short sketch follows below). Computing that distance requires access to |w|, the norm of the vector orthogonal to the separating hyperplane.
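
To make this concrete, here's a minimal sketch of a single SIMPLE query step in Python. It's illustrative only: decision_value is a hypothetical stand-in for whatever decision-value accessor your SVM wrapper provides, and norm_w is the |w| this post is about extracting.

# Return the index of the unlabeled example closest to the hyperplane.
# decision_value(model, x) is a hypothetical helper returning the raw
# decision value f(x); norm_w is |w| for the current model.
def simple_query(model, unlabeled, decision_value, norm_w):
    distances = [abs(decision_value(model, x)) / norm_w for x in unlabeled]
    return distances.index(min(distances))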

Unfortunately, libsvm doesn't compute this automatically. The libsvm folks do, however, outline how one might go about it in their FAQ:
The distance is |decision_value| / |w|. We have |w|^2 = w^Tw = alpha^T Q alpha = 2*(dual_obj + sum alpha_i). Thus in svm.cpp please find the place where we calculate the dual objective value (i.e., the subroutine Solve()) and add a statement to print w^Tw.
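For the curious, that identity is the standard one for the C-SVC dual that Solve() minimizes; sketched in LaTeX:

\min_{\alpha}\ \tfrac{1}{2}\alpha^\top Q\alpha - e^\top\alpha, \qquad Q_{ij} = y_i y_j K(x_i, x_j)

so at the optimum dual_obj = \tfrac{1}{2}\alpha^\top Q\alpha - \sum_i \alpha_i, and since w = \sum_i \alpha_i y_i \phi(x_i),

\|w\|^2 = w^\top w = \alpha^\top Q\alpha = 2(\text{dual\_obj} + \textstyle\sum_i \alpha_i)
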
The following steps were thus required to extract |w| from the library: (1) hack the C++ code as outlined above, and (2) expose the computed value via Python. The first step was simple enough. After some modifications to various structs (e.g., svm_model) to keep hold of the computed |w|^2 value, I ultimately added the following function to svm.cpp:
// get |w|^2 (w_2 is the field added to struct svm_model to hold it)
double svm_get_model_w2(struct svm_model *model)
{
	return model->w_2;
}
Next, the header (svm.h) needed to be updated with a matching declaration, and libsvm rebuilt. On *nix (including OS X) this can be done with make. On Windows, I managed it with Visual Studio, first making sure the vcvar.bat file was on my path:
>"C:\Program Files\Microsoft Visual Studio 8\VC\bin\vcvar.bat"
Then:
> nmake -f Makefile.win clean
> nmake -f Makefile.win all
Next we need to modify the Python interface. In the "Python" directory, add the function declaration to the svmc.i interface file:
double svm_get_model_w2(struct svm_model *model);
Finally, rebuild the interface:
> swig -python svmc.i
> python setup.py build build_ext --inplace
Now the added function is available as svm.svmc.svm_get_model_w2 (a usage sketch follows below). If you're interested, the source is available at http://code.google.com/p/libsvm288fork/
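
Putting the pieces together, here's a usage sketch based on the old-style svm.py wrapper bundled with libsvm 2.88. Treat the details as assumptions: m.model is taken to hold the underlying struct svm_model pointer, and predict_values to return a dict of decision values; check your copy of svm.py if the names differ.

import math
import svm
from svm import svm_problem, svm_parameter, svm_model, LINEAR

# Train a toy linear classifier.
param = svm_parameter(kernel_type=LINEAR, C=10)
m = svm_model(svm_problem([1, 1, -1, -1],
                          [[1, 1], [2, 1], [-1, -1], [-2, -1]]), param)

# |w| via the freshly exposed accessor (m.model is assumed to be the
# raw struct svm_model pointer held by the wrapper).
norm_w = math.sqrt(svm.svmc.svm_get_model_w2(m.model))

# SIMPLE: query the unlabeled point nearest the hyperplane.
unlabeled = [[0.5, 0.2], [3.0, 2.0], [-0.1, 0.0]]
dists = [abs(m.predict_values(x).values()[0]) / norm_w for x in unlabeled]
print 'query index:', dists.index(min(dists))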

2 comments:

  1. Hi, I'm new and thus probably wrong. But as the Simple Margin strategy in active learning states:
    ".. and choose as the next instance to query the instance that comes closest to the hyperplane F" (S. Tong and D. Koller, "Support Vector Machine Active Learning with Applications to Text Classification", Journal of Machine Learning Research 2 (2001) 45-66)
    isn't it enough to calculate the sample with minimum |decision_value|? After all, w is constant.

  2. I think you're right. In a two-class SVM, w is constant, and you won't need its value for active learning.

    For multi-class SVM, libsvm trains several models, and each model will have a different w. I've handled that here: https://github.com/mpenkov/libsvm
