In this paper, a new technique for the optimization of (partially) bound queries over disjunctive Datalog programs with stratified negation is presented. The technique exploits the propagation of query bindings and extends the Magic Set optimization technique (originally defined for non-disjunctive programs).
An important feature of disjunctive Datalog programs is non-monotonicity, which calls for non-deterministic implementations, such as backtracking search. A distinguishing characteristic of the new method is that the optimization can be exploited also during the non-deterministic phase. In particular, after some assumptions have been made during the computation, parts of the program may become irrelevant to a query under these assumptions. This allows for dynamic pruning of the search space. In contrast, the effect of the previously defined Magic Set methods for disjunctive Datalog is limited to the deterministic portion of the process. In this way, the potential performance gain by using the proposed method can be exponential, as could be observed empirically.
The correctness of the method is established and proved in a formal way thanks to a strong relationship between Magic Sets and unfounded sets that has not been studied in the literature before. This knowledge allows for extending the method and the correctness proof also to programs with stratified negation in a natural way.
The proposed method has been implemented in the DLV system and various experiments on synthetic as well as on real-world data have been conducted. The experimental results on synthetic data confirm the utility of Magic Sets for disjunctive Datalog, and they highlight the computational gain that may be obtained by the new method with respect to the previously proposed Magic Set method for disjunctive Datalog programs. Further experiments on data taken from a real-life application show the benefits of the Magic Set method within an application scenario that has received considerable attention in recent years, the problem of answering user queries over possibly inconsistent databases originating from integration of autonomous sources of information.
Downloads
Downloads per month over past year