The tree kernel has been encoded inside the well-known SVM-light software written by Thorsten Joachims (www.joachims.org). The input format and the new options are compatible with those of the original SVM-light 5.01. Moreover, as a tree kernel may be useful in conjunction with other features (not expressible with a parse tree), a way is provided to combine the tree kernels with the default and custom SVM-light kernels. The embedded combinations are the sum and the product of kernels. Other combination functions can be specified by simply modifying the code.
- Tree kernel and kernel combinations [Moschitti, 2004] inside SVM-light 5.01 [Joachims, 1999]
- Subset tree kernel [Collins and Duffy, 2002]
- Subtree kernel (see [Vishwanathan and Smola, 2002] for a definition)
The input format is similar to the original SVM-light format:

1 (S (N Paul)(VP (V delivers))) |STD|
-1 (VP (V delivers)(PP (IN in)(
-1 (VP (V delivers)(NP (D a)(N talk))) |STD|

where the parse trees are in the usual Penn Treebank format and |STD| is used to separate the parse trees from the standard features.
To combine trees with standard features we need to provide both types of features as follows:

1 (S (N Paul)(VP (V delivers))) |STD| 1:0.082913 1000:0.466706 ...
-1 (VP (V delivers)(PP (IN in)(
-1 (VP (V delivers)(NP (D a)(N talk))) |STD| 1:0.039222 1501:0.045687 ...
In any case the original SVM-light input format can be used without any changes:

1 1:0.082913 1000:0.466706 ...
-1 1:0.010287 1001:0.113421 ...
-1 1:0.039222 1501:0.045687 ...
Note that even when we don’t use the standard flat features we will use the “|STD|” string as a tree terminator (so please include it).
Important: follow the syntax of the tree exactly (for the moment the tree loader procedure is not robust). There must be a space between a non-terminal and a terminal or an open parenthesis, and no space between two parentheses. Moreover, for the moment the expected input is a parse tree; this means that a pre-terminal (the last non-terminal before a leaf) is always followed by exactly one leaf. Soon I will modify the program input to make it more robust and usable with more general trees.
The “svm_classify” and the “svm_learn” commands maintain the original format:

usage: svm_learn [options] example_file model_file
Arguments:
  example_file -> file with training data
  model_file   -> file to store the learned decision rules in

usage: svm_classify [options] example_file model_file
Arguments:
  example_file -> file with testing data
  model_file   -> file to retrieve the learned decision rules
The new options are shown hereafter:
Kernel options:
  -t int       -> type of kernel function:
                  0: linear (default)
                  1: polynomial (s a*b+c)^d
                  2: radial basis function exp(-gamma ||a-b||^2)
                  3: sigmoid tanh(s a*b + c)
                  4: user defined kernel from kernel.h
                  5: SubTree and SubSet Tree kernels
                  6: K1*K2 or r*K1 + K2 kernel combinations, where K1 is the
                     tree kernel and K2 is a second kernel from 0 to 4 above
  -D [0,1]     -> run the SubTree (0) or the SubSet Tree (1) kernel (default 1)
  -L float     -> decay factor in tree kernels (default 0.4)
  -C ['*','+'] -> combination operator between K1 and K2 (default '*')
  -S [0,4]     -> second kernel K2 (default polynomial, i.e. 1)
  -T float     -> r value for K1 when the '+' operator is used (default 0.3)
  -N [0,3]     -> 0 = no normalization, 1 = normalizes K1, 2 = normalizes K2,
                  3 = normalizes both K1 and K2
  -d int       -> parameter d in polynomial kernel
  -g float     -> parameter gamma in rbf kernel
  -s float     -> parameter s in sigmoid/poly kernel
  -r float     -> parameter c in sigmoid/poly kernel
  -u string    -> parameter of user defined kernel
Examples:

./svm_learn -t 5 example_file model_file (the subset tree kernel alone is used)
./svm_learn -t 6 -C '+' -S 1 -d 3 example_file model_file (the subset tree kernel is summed to a polynomial kernel of degree 3)
./svm_learn -t 6 -C '+' -D 0 -S 2 -g .1 example_file model_file (the subtree kernel is summed to a radial basis function kernel with gamma = 0.1)
- Example data is included (it contains PropBank Argument 0 as the positive class and Argument 1 as the negative class)
[Moschitti, 2004] Alessandro Moschitti. A Study on Convolution Kernels for Shallow Semantic Parsing. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), Barcelona, Spain, 2004.
[Joachims, 1999] Thorsten Joachims. Making Large-Scale SVM Learning Practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, 1999.
[Collins and Duffy, 2002] Michael Collins and Nigel Duffy. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. In Proceedings of ACL 2002, 2002.
[Vishwanathan and Smola, 2002] S.V.N. Vishwanathan and A.J. Smola. Fast Kernels on Strings and Trees. In Proceedings of Neural Information Processing Systems (NIPS), 2002.
Maintained by Alessandro Moschitti (moschitti@info.uniroma2.it)