Structure Discovery Queries in Disk-Based Semantic Web Databases

Started by aruljothi, Apr 11, 2009, 06:53 PM

Link analysis tasks are fundamental to analytical applications in scientific research, business, national security, and other domains. Such tasks involve finding associations or interactions between entities, e.g., people, chemicals, or genes. In graph-theoretic terms, this amounts to finding arbitrary sub-graph structures that link a given set of entities. In contrast, the traditional graph pattern matching query paradigm focuses on finding sub-graphs that match a structure given in the query. Consequently, an important problem is developing methods for evaluating structure discovery queries, particularly when the data resides on disk. In such cases, query processing must avoid loading the whole database graph into memory and must use indexing and evaluation techniques that mitigate the inherently I/O-bound nature of navigating disk-based graphs. In this paper, we present a computational framework for efficiently evaluating a class of structure discovery queries. It is based on an algebraic approach to solving path problems, which leads to a natural disk storage model for graph data using traditional B+ tree index structures. Preliminary evaluation results are promising and show a significant improvement in query performance over other approaches.
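
To make the idea of a structure discovery query concrete, here is a minimal Python sketch. It is not the algebraic framework from the paper, whose details the abstract does not give. It only illustrates the query style: given two entities, find some path that links them, fetching neighbors through ordered range scans over a key-sorted index rather than from a graph held in memory. The sorted list stands in for a B+ tree, and the data and the names `build_index`, `neighbors`, and `find_association` are all hypothetical.

```python
from bisect import bisect_left
from collections import deque

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("bob", "knows", "carol"),
    ("carol", "authorOf", "paper1"),
    ("dave", "authorOf", "paper1"),
]


def build_index(triples):
    """Return a key-sorted list of (node, predicate, neighbor) entries.

    Each triple is stored in both directions ("^" marks the inverse) so
    associations can be followed regardless of edge direction. The sorted
    list is only a stand-in for an on-disk B+ tree (an assumption).
    """
    entries = []
    for s, p, o in triples:
        entries.append((s, p, o))
        entries.append((o, "^" + p, s))
    return sorted(entries)


def neighbors(index, node):
    """Range-scan the index for every entry whose key starts with `node`."""
    i = bisect_left(index, (node,))
    while i < len(index) and index[i][0] == node:
        yield index[i][1], index[i][2]
        i += 1


def find_association(index, source, target, max_hops=6):
    """Breadth-first search for a shortest path linking two entities.

    Returns an alternating list [node, predicate, node, ...], or None if
    no path of at most `max_hops` edges exists.
    """
    queue = deque([(source, [source])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) // 2 >= max_hops:  # path alternates nodes and predicates
            continue
        for pred, nxt in neighbors(index, node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [pred, nxt]))
    return None


index = build_index(TRIPLES)
print(find_association(index, "alice", "dave"))
```

Unlike a pattern matching query, the caller does not specify the shape of the connecting structure; the search discovers it. A real disk-based system would replace `neighbors` with a range scan over an actual B+ tree and would need to limit random I/O, which is the problem the paper targets.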