Fixpoint semantics and optimization of recursive Datalog programs with aggregates*

CARLO ZANIOLO; MOHAN YANG; ARIYAM DAS; ALEXANDER SHKAPSKY; TYSON CONDIE; MATTEO INTERLANDI

doi:10.1017/S1471068417000436

Published online by Cambridge University Press: 23 August 2017


CARLO ZANIOLO, MOHAN YANG, ARIYAM DAS, ALEXANDER SHKAPSKY, TYSON CONDIE: University of California, Los Angeles, Los Angeles, CA, USA
MATTEO INTERLANDI: Microsoft, Redmond, WA, USA


Abstract

A highly desirable Datalog extension, investigated by many researchers over the last 30 years, allows the basic SQL aggregates min, max, count and sum to be used in recursive rules. In this paper, we propose a simple, comprehensive solution that extends the declarative least-fixpoint semantics of Horn Clauses, along with the optimization techniques used in the bottom-up implementation approach adopted by many Datalog systems. We start by identifying a large class of programs of great practical interest in which the use of min or max in recursive rules does not compromise the declarative fixpoint semantics of the programs using those rules. Then, we revisit the monotonic versions of the count and sum aggregates proposed by Mazuran et al. (2013b, The VLDB Journal 22, 4, 471–493), named mcount and msum, respectively. Since mcount, and also msum on positive numbers, are monotonic in the lattice of set-containment, they preserve the fixpoint semantics of Horn Clauses. However, in many applications of practical interest, their use can lead to inefficiencies that can be eliminated by combining them with max, whereby mcount and msum become the standard count and sum. Therefore, the semantics and optimization techniques of Datalog are extended to recursive programs with min, max, count and sum, making possible the advanced applications of superior performance and scalability demonstrated by BigDatalog (Shkapsky et al. 2016. In SIGMOD. ACM, 1135–1149) and Datalog-MC (Yang et al. 2017. The VLDB Journal 26, 2, 229–248).
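
As a rough illustration of what using min inside a recursive rule means operationally, the following Python sketch evaluates a shortest-path program bottom-up until a fixpoint is reached. It is not the authors' implementation and does not reproduce the BigDatalog or Datalog-MC engines. The function name `shortest_paths`, the rule notation in the docstring, and the assumption of non-negative edge costs (needed here for termination on cyclic graphs) are all choices made for this example.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str, float]


def shortest_paths(edges: Iterable[Edge]) -> Dict[Tuple[str, str], float]:
    """Bottom-up evaluation of a recursive program with min (informal notation):

        path(X, Y, min<D>) <- edge(X, Y, D).
        path(X, Y, min<D>) <- path(X, Z, D1), edge(Z, Y, D2), D = D1 + D2.

    Assumes non-negative costs; a negative cycle would prevent termination.
    """
    edge_list = list(edges)
    inf = float("inf")

    best: Dict[Tuple[str, str], float] = {}   # current minimum per (X, Y)
    delta: Dict[Tuple[str, str], float] = {}  # facts improved in the last round

    # Exit rule: every edge is a candidate path.
    for x, y, d in edge_list:
        if d < best.get((x, y), inf):
            best[(x, y)] = d
            delta[(x, y)] = d

    # Recursive rule, applied only to facts that changed in the previous
    # round (in the spirit of semi-naive evaluation). The min is applied
    # during the iteration: non-minimal distances are discarded at once.
    while delta:
        new_delta: Dict[Tuple[str, str], float] = {}
        for (x, z), d1 in delta.items():
            for z2, y, d2 in edge_list:
                if z2 != z:
                    continue
                d = d1 + d2
                if d < best.get((x, y), inf):
                    best[(x, y)] = d
                    new_delta[(x, y)] = d
        delta = new_delta

    return best


if __name__ == "__main__":
    graph = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "a", 1)]
    for pair, dist in sorted(shortest_paths(graph).items()):
        print(pair, dist)
```

Keeping only the current minimum for each pair is what lets the loop stop on cyclic graphs: without the min, the recursive rule would keep deriving ever longer paths around the cycle.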

Keywords

Datalog, Constraints, Recursion, Aggregates

Type: Regular Papers
Information: Theory and Practice of Logic Programming, Volume 17, Special Issue 5-6: 33rd International Conference on Logic Programming, September 2017, pp. 1048-1065

DOI: https://doi.org/10.1017/S1471068417000436
Copyright: Copyright © Cambridge University Press 2017


Footnotes

† Work done while at UCLA.

This work was supported in part by NSF grants IIS-1218471, IIS-1302698 and CNS-1351047 and U54EB020404 awarded by NIH Big Data to Knowledge (BD2K).

References

Aref, M. et al. 2015. Design and implementation of the LogicBlox system. In Proc. of SIGMOD. ACM, 1371–1382.

Arni, F., Ong, K., Tsur, S., Wang, H. and Zaniolo, C. 2003. The deductive database system LDL++. Theory and Practice of Logic Programming 3, 1, 61–94.

Chimenti, D., O'Hare, A. B., Krishnamurthy, R., Tsur, S., West, C. and Zaniolo, C. 1987. An overview of the LDL system. IEEE Data Engineering Bulletin 10, 4, 52–62.

Condie, T., et al. 2017. Advanced Applications by Least-Fixpoint Algorithms Specified using Aggregates in Datalog. Technical Report 170012, UCLA CSD.

Faber, W., Pfeifer, G. and Leone, N. 2011. Semantics and complexity of recursive aggregates in answer set programming. Artificial Intelligence 175, 1, 278–298.

Furfaro, F., Greco, S., Ganguly, S. and Zaniolo, C. 2002. Pushing extrema aggregates to optimize logic queries. Information Systems 27, 5 (July), 321–343.

Ganguly, S., Greco, S. and Zaniolo, C. 1995. Extrema predicates in deductive databases. Journal of Computer and System Sciences 51, 2, 244–259.

Gelfond, M. and Zhang, Y. 2014. Vicious circle principle and logic programs with aggregates. Theory and Practice of Logic Programming 14, 4–5, 587–601.

Greco, S., Zaniolo, C. and Ganguly, S. 1992. Greedy by choice. In Proc. of PODS. ACM, 105–113.

Kemp, D., Meenakshi, K., Balbin, I. and Ramamohanarao, K. 1989. Propagating constraints in recursive deductive databases. In North American Conference on Logic Programming, 981–998.

Kemp, D. B. and Stuckey, P. J. 1991. Semantics of logic programs with aggregates. In Proc. of ISLP, 387–401.

Liu, Y. A., Stoller, S. D., Lin, B. and Gorbovitski, M. 2012. From clarity to efficiency for distributed algorithms. SIGPLAN Notices 47, 10 (October), 395–410.

Mazuran, M., Serra, E. and Zaniolo, C. 2013a. A declarative extension of Horn clauses, and its significance for Datalog and its applications. Theory and Practice of Logic Programming 13, 4–5, 609–623.

Mazuran, M., Serra, E. and Zaniolo, C. 2013b. Extending the power of Datalog recursion. The VLDB Journal 22, 4, 471–493.

Morris, K. A., Ullman, J. D. and Gelder, A. V. 1986. Design overview of the NAIL! system. In Proc. of ICLP, 554–568.

Mumick, I. S., Pirahesh, H. and Ramakrishnan, R. 1990. The magic of duplicates and aggregates. In VLDB. Morgan Kaufmann Publishers Inc., 264–277.

Mumick, I. S. and Shmueli, O. 1995. How expressive is stratified aggregation? Annals of Mathematics and Artificial Intelligence 15, 3–4, 407–435.

Pelov, N., Denecker, M. and Bruynooghe, M. 2007. Well-founded and stable semantics of logic programs with aggregates. Theory and Practice of Logic Programming 7, 3, 301–353.

Przymusinski, T. C. 1988. Perfect model semantics. In Proc. of ICLP/SLP, 1081–1096.

Ramakrishnan, R., Srivastava, D. and Sudarshan, S. 1992. CORAL - control, relations and logic. In Proc. of PVLDB, 238–250.

Ross, K. A. and Sagiv, Y. 1992. Monotonic aggregation in deductive databases. In Proc. of PODS, 114–126.

Seo, J., Park, J., Shin, J. and Lam, M. S. 2013. Distributed socialite: a Datalog-based language for large-scale graph analysis. Proceedings of the VLDB Endowment 6, 14, 1906–1917.

Shkapsky, A., Yang, M., Interlandi, M., Chiu, H., Condie, T. and Zaniolo, C. 2016. Big data analytics with Datalog queries on Spark. In Proc. of SIGMOD. ACM, 1135–1149.

Shkapsky, A., Yang, M. and Zaniolo, C. 2015. Optimizing recursive queries with monotonic aggregates in DeALS. In Proc. of ICDE. IEEE, 867–878.

Shkapsky, A., Zeng, K. and Zaniolo, C. 2013. Graph queries in a next-generation Datalog system. Proceedings of the VLDB Endowment 6, 12, 1258–1261.

Son, T. C. and Pontelli, E. 2007. A constructive semantic characterization of aggregates in answer set programming. Theory and Practice of Logic Programming 7, 3, 355–375.

Srivastava, D. and Ramakrishnan, R. 1992. Pushing constraint selections. In Journal of Logic Programming, 301–315.

Sudarshan, S. and Ramakrishnan, R. 1991. Aggregation and relevance in deductive databases. In Proc. of VLDB, 501–511.

Swift, T. and Warren, D. S. 2010. Tabling with answer subsumption: Implementation, applications and performance. In Proc. of JELIA, 300–312.

Vaghani, J., Ramamohanarao, K., Kemp, D. B., Somogyi, Z., Stuckey, P. J., Leask, T. S. and Harland, J. 1994. The Aditi deductive database system. VLDB Journal 3, 2, 245–288.

van Emden, M. H. and Kowalski, R. A. 1976. The semantics of predicate logic as a programming language. Journal of the ACM 23, 4, 733–742.

Van Gelder, A. 1993. Foundations of aggregation in deductive databases. In Deductive and Object-Oriented Databases. Springer, 13–34.

Wang, J., Balazinska, M. and Halperin, D. 2015. Asynchronous and fault-tolerant recursive Datalog evaluation in shared-nothing engines. Proceedings of the VLDB Endowment 8, 12, 1542–1553.

Yang, M., Shkapsky, A. and Zaniolo, C. 2015. Parallel bottom-up evaluation of logic programs: DeALS on shared-memory multicore machines. In Technical Communications of ICLP.

Yang, M., Shkapsky, A. and Zaniolo, C. 2017. Scaling up the performance of more powerful Datalog systems on multicore machines. The VLDB Journal 26, 2, 229–248.

Yang, M. and Zaniolo, C. 2014. Main memory evaluation of recursive queries on multicore machines. In Proc. of IEEE Big Data, 251–260.

Zaniolo, C., Ceri, S., Faloutsos, C., Snodgrass, R. T., Subrahmanian, V. S. and Zicari, R. 1997. Advanced Database Systems. Morgan Kaufmann.

Zaniolo, C., Yang, M., Das, A. and Interlandi, M. 2016. The magic of pushing extrema into recursion: Simple, powerful Datalog programs. In Proc. of AMW.

Zhou, N.-F., Barták, R. and Dovier, A. 2015. Planning as tabled logic programming. Theory and Practice of Logic Programming 15, 4–5, 543–558.

Zhou, N.-F., Kameya, Y. and Sato, T. 2010. Mode-directed tabling for dynamic programming, machine learning, and constraint solving. In Proc. of ICTAI '10. Washington, DC, USA, 213–218.
