The tree kernel has been encoded inside the well-known SVMlight software written by Thorsten Joachims (www.joachims.org). The input format and the new options are compatible with those of the original SVMlight 5.01. Moreover, since a tree kernel may be useful in conjunction with other features (not expressible with a parse tree), a way is provided to combine the tree kernels with the default and custom SVMlight kernels. The embedded combinations are the sum and the product of kernels. Other combination functions can be specified by simply modifying the code.
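To make the two embedded combinations concrete, here is a small Python sketch (illustration only; the tool itself is C code, and both kernels below are toy stand-ins, not the actual implementation):

```python
# Sketch of the two embedded kernel combinations: '*' -> K1*K2 and
# '+' -> r*K1 + K2, where K1 would be the tree kernel and K2 a
# standard kernel on the flat features (both faked here).

def poly_kernel(a, b, s=1.0, c=1.0, d=3):
    """Polynomial kernel (s*<a,b> + c)^d on flat feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return (s * dot + c) ** d

def combine(k1, k2, op="*", r=0.3):
    """Combine a tree-kernel value k1 with a standard-kernel value k2."""
    if op == "*":
        return k1 * k2
    if op == "+":
        return r * k1 + k2
    raise ValueError("unknown combination operator: %r" % op)

k1 = 2.0                                  # pretend tree-kernel value K1(T1, T2)
k2 = poly_kernel([1.0, 0.0], [1.0, 1.0])  # K2 on the flat features -> 8.0
print(combine(k1, k2, "*"))               # product combination
print(combine(k1, k2, "+", r=0.3))        # weighted-sum combination
```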

Tree Kernel and kernel combinations [Moschitti, 2004] inside SVMlight 5.01 [Joachims, 1999]:
- Subset tree kernel [Collins and Duffy, 2002]
- Subtree kernel (see [Vishwanathan and Smola, 2002] for a definition).
The input format is similar to the original SVMlight format:

1 (S (N Paul)(VP (V delivers))) STD
1 (VP (V delivers)(PP (IN in)(
1 (VP (V delivers)(NP (D a)(N talk))) STD

where the parse trees are in the usual Penn Treebank format and STD separates the parse trees from the standard features.
To combine trees with standard features, we need to provide both types of features as follows:

1 (S (N Paul)(VP (V delivers))) STD 1:0.082913 1000:0.466706 ...
1 (VP (V delivers)(PP (IN in)(
1 (VP (V delivers)(NP (D a)(N talk))) STD 1:0.039222 1501:0.045687 ...
In any case, the original SVMlight input format can be used without any changes:

1 1:0.082913 1000:0.466706 ...
1 1:0.010287 1001:0.113421 ...
1 1:0.039222 1501:0.045687 ...
Note that even when we do not use the standard flat features, the "STD" string must still be used as a tree terminator (so please include it).
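As a sketch, a line in this format can be produced as follows (a hypothetical Python helper for illustration, not part of the tool):

```python
def svm_tk_line(label, tree, features=None):
    """Hypothetical helper (not part of the tool) that formats one
    example: label, parse tree, the mandatory 'STD' terminator, then
    any sparse flat features as index:value pairs."""
    parts = [str(label), tree, "STD"]
    if features:
        parts += ["%d:%g" % (i, v) for i, v in sorted(features.items())]
    return " ".join(parts)

# With and without flat features -- 'STD' is emitted in both cases.
print(svm_tk_line(1, "(S (N Paul)(VP (V delivers)))"))
print(svm_tk_line(1, "(S (N Paul)(VP (V delivers)))",
                  {1: 0.082913, 1000: 0.466706}))
```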
Important: follow the syntax of the trees exactly (for the moment the tree-loading procedure is not robust). There is a space between a non-terminal and a terminal or parenthesis, and there is no space between two parentheses. Moreover, for the moment the expected input is a parse tree; this means that a preterminal (the last non-terminal before a leaf) is always followed by exactly one leaf. Soon I will modify the program input to make it more robust and usable with more general trees.
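The strict syntax above can be sketched as a small recursive-descent loader (an illustrative Python approximation of the rules, not the tool's actual C procedure):

```python
import re

def parse_tree(s):
    """Illustrative loader for the strict syntax described above:
    '(LABEL ...)', a space after each label, no space between two
    parentheses, and each preterminal followed by exactly one leaf.
    Raises ValueError on malformed input."""
    pos = 0
    at = lambda i: s[i] if i < len(s) else ""

    def token():
        nonlocal pos
        m = re.match(r"[^\s()]+", s[pos:])
        if not m:
            raise ValueError("expected symbol at position %d" % pos)
        pos += len(m.group(0))
        return m.group(0)

    def expect(ch):
        nonlocal pos
        if at(pos) != ch:
            raise ValueError("expected %r at position %d" % (ch, pos))
        pos += 1

    def node():
        nonlocal pos
        expect("(")
        label = token()
        expect(" ")
        if at(pos) == "(":                     # non-terminal: parse children
            children = [node()]
            while at(pos) == "(":              # no space between parentheses
                children.append(node())
            expect(")")
            return (label, children)
        word = token()                         # preterminal: one leaf only
        expect(")")
        return (label, word)

    tree = node()
    if pos != len(s):
        raise ValueError("trailing characters at position %d" % pos)
    return tree

print(parse_tree("(S (N Paul)(VP (V delivers)))"))
```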
The "svm_classify" and "svm_learn" commands maintain the original format:

usage: svm_learn [options] example_file model_file
Arguments:
  example_file -> file with training data
  model_file   -> file to store the learned decision rules in

usage: svm_classify [options] example_file model_file
Arguments:
  example_file -> file with testing data
  model_file   -> file to retrieve the learned decision rules
The new options are shown hereafter:

Kernel options:
  -t int       -> type of kernel function:
                   0: linear (default)
                   1: polynomial (s a*b+c)^d
                   2: radial basis function exp(-gamma ||a-b||^2)
                   3: sigmoid tanh(s a*b + c)
                   4: user-defined kernel from kernel.h
                   5: SubTree and SubSet Tree kernels
                   6: K1*K2 or r*K1 + K2 kernel combinations, where K1 is
                      the tree kernel and K2 is a second kernel from 0 to 4
                      defined above
  -D [0,1]     -> run the SubTree (0) or the SubSet Tree (1) kernel (default 1)
  -L float     -> decay factor in tree kernels (default 0.4)
  -C ['*','+'] -> combination operator between K1 and K2 (default '*')
  -S [0,4]     -> second kernel K2 (default polynomial, i.e. 1)
  -T float     -> r value for K1 when the '+' operator is used (default 0.3)
  -N [0,3]     -> 0 = no normalization, 1 = normalizes K1, 2 = normalizes K2,
                  3 = normalizes both K1 and K2
  -d int       -> parameter d in polynomial kernel
  -g float     -> parameter gamma in rbf kernel
  -s float     -> parameter s in sigmoid/poly kernel
  -r float     -> parameter c in sigmoid/poly kernel
  -u string    -> parameter of user-defined kernel
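For intuition, the SubSet Tree kernel of [Collins and Duffy, 2002], with the decay factor set by -L, can be sketched as follows (a Python illustration of the recursion, not the C implementation; setting sigma = 0 gives the SubTree variant):

```python
# Trees are (label, children): children is a list of nodes, and a
# leaf word is a node with an empty child list.

def nt(label, *children):
    return (label, list(children))

def leaf(word):
    return (word, [])

def all_nodes(t):
    out = [t]
    for c in t[1]:
        out += all_nodes(c)
    return out

def production(n):
    # A node's production: its label plus the labels of its children
    # (for a preterminal, the child label is the leaf word itself).
    return (n[0], tuple(c[0] for c in n[1]))

def delta(n1, n2, lam, sigma):
    # Decayed count of common fragments rooted at n1 and n2.
    if not n1[1] or not n2[1] or production(n1) != production(n2):
        return 0.0
    if all(not c[1] for c in n1[1]):           # preterminal, same leaf
        return lam
    prod = lam
    for c1, c2 in zip(n1[1], n2[1]):
        prod *= sigma + delta(c1, c2, lam, sigma)
    return prod

def tree_kernel(t1, t2, lam=0.4, sigma=1):
    # sigma = 1 -> SubSet Tree kernel; sigma = 0 -> SubTree kernel.
    return sum(delta(a, b, lam, sigma)
               for a in all_nodes(t1) for b in all_nodes(t2))

t1 = nt("S", nt("N", leaf("Paul")), nt("VP", nt("V", leaf("delivers"))))
t2 = nt("S", nt("N", leaf("Mary")), nt("VP", nt("V", leaf("delivers"))))
print(tree_kernel(t1, t2))   # fragments shared under V, VP, and S
```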
Examples:

./svm_learn -t 5 example_file model_file (the SubSet Tree kernel alone is used)
./svm_learn -t 6 -C '+' -S 1 -d 3 example_file model_file (the SubSet Tree kernel is summed to a polynomial kernel with degree d = 3)
./svm_learn -t 6 -C '+' -D 0 -S 2 -g .1 example_file model_file (the SubTree kernel is summed to a radial basis function kernel with gamma = 0.1)
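Kernel normalization (the -N option) presumably applies the usual cosine normalization K'(x,y) = K(x,y) / sqrt(K(x,x) K(y,y)); a minimal sketch under that reading (the exact behaviour is defined in the C source):

```python
import math

def normalize(K, x, y):
    """Normalized kernel K'(x,y) = K(x,y) / sqrt(K(x,x)*K(y,y)),
    our reading of what the -N flags apply to K1 and/or K2;
    the authoritative definition is in the tool's C code."""
    return K(x, y) / math.sqrt(K(x, x) * K(y, y))

# With a plain dot product, normalization yields the cosine similarity.
dot = lambda a, b: sum(p * q for p, q in zip(a, b))
print(normalize(dot, [3.0, 4.0], [4.0, 3.0]))
```

Normalizing makes kernel values comparable in scale, which matters when summing a tree kernel with a standard kernel via -C '+'.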
Example data is provided (it contains the PropBank Argument 0 as the positive class and Argument 1 as the negative class).
[Moschitti, 2004] Alessandro Moschitti. A Study on Convolution Kernels for Shallow Semantic Parsing. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), Barcelona, Spain, 2004.
[Joachims, 1999] Thorsten Joachims. Making Large-Scale SVM Learning Practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, 1999.
[Collins and Duffy, 2002] Michael Collins and Nigel Duffy. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. In Proceedings of ACL 2002, 2002.
[Vishwanathan and Smola, 2002] S.V.N. Vishwanathan and A.J. Smola. Fast Kernels on Strings and Trees. In Proceedings of Neural Information Processing Systems (NIPS), 2002.
Maintained by Alessandro Moschitti moschitti@info.uniroma2.it 