Click here for Top of Page
Right Arrow: Next
Right Arrow: Previous
Newsletter

FEATURE:

 

ICPR 2010

K.S. Fu Prize Lecture

Towards the Unification of

Structural and Statistical Pattern Recognition

 

By Horst Bunke, IAPR Fellow (Switzerland)

Reviewed by Cem Keskin (Turkey)

The session started with Prof. Anil Jain, IAPR Fellow, introducing Prof. Horst Bunke, IAPR Fellow. Prof. Bunke started his talk by mentioning the revolutionary work of K.S. Fu. He then mentioned the two common approaches to pattern recognition, namely statistical and structural approaches. The former approach mainly makes use of feature vectors, whereas the latter uses strings, trees, and—most importantly—graphs as representatives. The talk continued with a comparison of these approaches.

The advantages of statistical approaches, as Prof. Bunke pointed out, are that they have strong theoretical foundations, and that there are many powerful associated algorithms. Yet, these methods usually use feature vectors of fixed dimensionality (based on the application), and make use of only unary features and not relations.

On the other hand, structural approaches have a variable representation size, which is based on the size of the graph, and generally have a higher representational power, Prof. Bunke added. However, unlike the statistical approaches, these methods suffer from a lack of a strong mathematical foundation, and from a lack of algorithmic tools. Prof. Bunke concluded from this argument, that by unifying both approaches, we can come up with stronger methods.

Prof. Bunke then explained what has been done in the so called classical period. He first mentioned the graph edit distance, which is the minimum amount of distortion needed to transform one graph into another. An edit path is defined as the sequence of such distortions as applied to the graph. A cost function associated with each type of distortion is used to estimate a cost for such a path, and the minimal such cost is called the graph edit distance. Finally, Prof. Bunke mentioned that faster algorithms have been discovered for this cost estimation problem.

Prof. Bunke continued his talk with the median graph finding problem, which can potentially represent a set of graphs with a single graph. Basically, by coming up with a distance measure between graphs, it is possible to find a graph either in the graph set or in the universal set that has minimal distance from all the graphs in the set. He gave some examples of handwritten letters where the estimated universal median was a very intuitive representation of the letters. He also showed that the median for graphs has similar properties to its statistical counterpart, i.e. the median graph is also very robust against noise and outliers. He finally mentioned that by making use of such median graphs (possibly by assigning weights to each affecting graph), it is possible to apply some well-known algorithms from the statistical domain, namely k-means clustering and self organizing maps. He concluded his discussion of the classical period with some examples and results on using SOM and k-NN for graphs.

The difference of the modern period, according to Prof. Bunke, is that the unification of statistical and structural methods is pursued in a more systematic manner by using tools such as graph kernels and graph embedding. Prof. Bunke then showed how the well known kernel trick can be applied to graphs by mapping them to points in R^n, i.e. via graph embedding. He mentioned that their favorite method for graph embedding is choosing a subset of graphs at hand and calculating the graph edit distance of all the graphs with the selected subset, forming a distance vector, describing the position of the graph in the so called dissimilarity space.

To demonstrate the efficiency of these methods, Prof. Bunke showed a few classification results of experiments conducted on several distinct data set types. For most of these data sets, which include handwritten letters, digits, fingerprints, web data, protein data, and molecular structures, the graph embedding method shows significant improvement over k-NN and SVM based on basic similarities.

After mentioning some additional work and literature surveys on graphs, Prof. Bunke went over some current activities in this area. He especially emphasized that the power of graphical representations is not fully explored for some sequential problems, such as handwriting recognition, where traditionally HMMs are used.

Prof. Bunke concluded his talk with a short summary and answered some questions regarding the time complexity of graph matching and the problem dependency of similarity measures for graphs. He said that currently, graphs with 1000 nodes can be matched in seconds using approximate methods, and that the similarity measure is always problem dependent.

Professor King-Sun Fu was instrumental in the founding of IAPR, served as its first president, and is widely recognized for his extensive contributions to the field of pattern recognition.

The K.S. Fu Prize is a biennial award that is given to a living person in the recognition of an outstanding technical contribution to the field of pattern recognition. 

This year’s recipient was

Prof Horst Bunke, IAPR Fellow,

Research Group on

Computer Vision

and Artificial Intelligence IAM, University of Bern, Switzerland

Prof. Horst Bunke presenting the K.S. Fu Prize Lecture at ICPR 2010.