Graph Propositionalization for Learning from Structured Data

Author: Thashmee Karunaratne; Stockholms Universitet; [2007]

Abstract: Learning from structured data is challenging because the concepts, patterns, or relations to be learned are hidden within the structure itself. State-of-the-art methods include logic-based and graph-based approaches. Propositionalization approaches, a subclass of both, construct feature vectors by either a logic-based or a graph-based method, allowing an attribute-value learner to build the predictive model. Logic-based methods use extensive biases during the construction of the hypothesis language and during search. In contrast, graph-based methods use constraints such as the connectedness of the discovered sub-graphs. Almost all graph-based approaches require the NP-complete graph isomorphism test. In addition, existing graph-based methods do not make effective use of additional relevant background knowledge. This thesis reports experiments carried out to investigate whether the predictive performance of a learner can be increased by eliminating one or more of these limitations of state-of-the-art methods for learning from structured data. Our approach is a graph propositionalization method, described in the three papers included in this thesis. The first paper introduces a method called finger printing, which discovers substructures (sub-graphs) in the structured data. These substructures form the feature set that the attribute-value learner uses to build the predictive model. The second paper extends the graph propositionalization concept introduced in the first paper so that it avoids the graph isomorphism problem and can also discover disconnected sub-graphs.
The third paper extends the approach further by expanding the general graph representation to include relevant background knowledge in the graphs, thereby achieving enhanced classification accuracy for graph-based learning. Results from our experiments show that our method outperforms the state-of-the-art methods. They also show that the predictive performance of a graph-based learner is significantly improved by incorporating additional relevant background knowledge effectively into the graphs. <em>Submitted to Stockholm University in partial fulfilment of the requirements for the degree of Licentiate of Philosophy</em>
