Leon: A Distributed RDF Engine for Multi-query Processing | Database Systems for Advanced Applications

  • Authors:
  • Xintong Guo, Harbin Institute of Technology, Harbin, China
  • Hong Gao, Harbin Institute of Technology, Harbin, China
  • Zhaonian Zou, Harbin Institute of Technology, Harbin, China

Database Systems for Advanced Applications: 24th International Conference, DASFAA 2019, Chiang Mai, Thailand, April 22–25, 2019, Proceedings, Part I, Apr 2019, Pages 742–759, https://doi.org/10.1007/978-3-030-18576-3_44

Published: 24 April 2019

Abstract

Although similar queries recur constantly in real query logs, few RDF systems address the resulting multi-query problem. In this paper, we propose Leon, a distributed RDF system that also handles the multi-query problem. First, we apply a characteristic-set-based partitioning scheme. This scheme (i) supports fully parallel processing of joins within characteristic sets and (ii) minimizes data communication by transmitting intermediate results directly instead of broadcasting them. Then, Leon revisits the classical problem of multi-query optimization (MQO) in the context of RDF/SPARQL. In light of the NP-hardness of multi-query optimization for SPARQL, we propose a heuristic algorithm that partitions the input batch of queries into groups and discovers the common sub-queries of multiple SPARQL queries. Our MQO algorithm incorporates a subtle cost model to generate execution plans.
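As a rough illustration of the notion behind the partitioning scheme, the sketch below computes characteristic sets in the usual sense (the set of predicates that a subject emits) and places all triples of subjects with the same characteristic set on one worker. It is a toy under stated assumptions, not Leon's actual scheme: the function names, the round-robin worker assignment, and the sample triples are all hypothetical.

```python
from collections import defaultdict


def characteristic_sets(triples):
    """Map each subject to the frozenset of predicates it emits."""
    preds = defaultdict(set)
    for s, p, _o in triples:
        preds[s].add(p)
    return {s: frozenset(ps) for s, ps in preds.items()}


def partition_by_characteristic_set(triples, num_workers):
    """Send every triple of a subject to the worker chosen for its characteristic set.

    Assumption: workers are assigned round-robin over the sorted distinct
    characteristic sets; a real system would balance by size and cost.
    """
    cs_of = characteristic_sets(triples)
    distinct = sorted(set(cs_of.values()), key=lambda cs: sorted(cs))
    worker_of = {cs: i % num_workers for i, cs in enumerate(distinct)}
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[worker_of[cs_of[s]]].append((s, p, o))
    return dict(parts)


# Hypothetical sample data.
triples = [
    ("alice", "name", "Alice"), ("alice", "worksAt", "HIT"),
    ("bob", "name", "Bob"), ("bob", "worksAt", "HIT"),
    ("HIT", "locatedIn", "Harbin"),
]
cs = characteristic_sets(triples)
parts = partition_by_characteristic_set(triples, num_workers=2)
```

In this toy, `alice` and `bob` share one characteristic set, so a star-shaped join on `name` and `worksAt` touches a single worker and needs no communication.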

Our experiments with synthetic and real datasets verify that (i) Leon's startup overhead is low; (ii) Leon consistently outperforms centralized RDF engines by 1–2 orders of magnitude and is competitive with state-of-the-art distributed RDF engines; and (iii) our MQO approach consistently demonstrates a 10× speedup over the baseline method.
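To make the grouping idea from the abstract more concrete, here is a minimal sketch of one possible heuristic: queries are grouped greedily by the Jaccard similarity of their predicate sets, and the common sub-query of a group is taken as the triple patterns shared verbatim by all its members. This is an assumption-laden simplification and not the paper's algorithm: it has no cost model, the threshold is arbitrary, and it assumes that shared patterns already use identical variable names, whereas real common sub-query detection requires variable renaming.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_queries(queries, threshold=0.5):
    """Greedily put each query into the first group whose seed is similar enough."""
    groups = []  # each group is a list of query names; the first is the seed
    for name, patterns in queries.items():
        preds = {p for _s, p, _o in patterns}
        for group in groups:
            seed_preds = {p for _s, p, _o in queries[group[0]]}
            if jaccard(preds, seed_preds) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def common_subquery(queries, group):
    """Triple patterns that appear verbatim in every query of the group."""
    shared = set(queries[group[0]])
    for name in group[1:]:
        shared &= set(queries[name])
    return shared


# Hypothetical batch of basic graph patterns, one list of triple patterns per query.
queries = {
    "q1": [("?x", "name", "?n"), ("?x", "worksAt", "?w")],
    "q2": [("?x", "name", "?n"), ("?x", "worksAt", "?w"), ("?w", "locatedIn", "?c")],
    "q3": [("?p", "cites", "?q")],
}
groups = group_queries(queries)
shared = common_subquery(queries, groups[0])
```

Here `q1` and `q2` end up in one group whose shared patterns could be evaluated once and reused, while `q3` stays on its own.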

Published in

Database Systems for Advanced Applications: 24th International Conference, DASFAA 2019, Chiang Mai, Thailand, April 22–25, 2019, Proceedings, Part I

Apr 2019

828 pages

ISBN: 978-3-030-18575-6

DOI: 10.1007/978-3-030-18576-3

  • Editors:
  • Guoliang Li, Tsinghua University, Beijing, China
  • Jun Yang, Duke University, Durham, NC, USA
  • Joao Gama, University of Porto, Porto, Portugal
  • Juggapong Natwichai, Chiang Mai University, Chiang Mai, Thailand
  • Yongxin Tong, Beihang University, Beijing, China

© Springer Nature Switzerland AG 2019

Publisher: Springer-Verlag, Berlin, Heidelberg
