- Authors:
- Xintong Guo, Harbin Institute of Technology, Harbin, China
- Hong Gao, Harbin Institute of Technology, Harbin, China
- Zhaonian Zou, Harbin Institute of Technology, Harbin, China
Database Systems for Advanced Applications: 24th International Conference, DASFAA 2019, Chiang Mai, Thailand, April 22–25, 2019, Proceedings, Part I, Apr 2019, Pages 742–759, https://doi.org/10.1007/978-3-030-18576-3_44
Published: 24 April 2019
Leon: A Distributed RDF Engine for Multi-query Processing
Abstract
Although similar queries recur frequently in real query logs, few RDF systems address the resulting multi-query problem. In this paper, we propose Leon, a distributed RDF system that also handles the multi-query problem. First, we apply a characteristic-set-based partitioning scheme. This scheme (i) supports fully parallel processing of joins within characteristic sets and (ii) minimizes data communication by transmitting intermediate results directly instead of broadcasting them. Then, Leon revisits the classical problem of multi-query optimization (MQO) in the context of RDF/SPARQL. In light of the NP-hardness of multi-query optimization for SPARQL, we propose a heuristic algorithm that partitions the input batch of queries into groups and discovers the common sub-queries of multiple SPARQL queries. Our MQO algorithm incorporates a subtle cost model to generate execution plans.
Our experiments with synthetic and real datasets verify that (i) Leon's startup overhead is low; (ii) Leon consistently outperforms centralized RDF engines by 1–2 orders of magnitude and is competitive with state-of-the-art distributed RDF engines; and (iii) our MQO approach consistently demonstrates a 10× speedup over the baseline method.
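To make the partitioning idea in the abstract concrete, the following is a minimal sketch, not Leon's actual implementation. It assumes the usual definition of a characteristic set (the set of predicates attached to a subject) and places all triples whose subjects share a characteristic set on the same worker. The function names and the round-robin worker assignment are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def characteristic_sets(triples):
    """Map each subject to the frozenset of predicates it appears with."""
    predicates = defaultdict(set)
    for subject, predicate, _obj in triples:
        predicates[subject].add(predicate)
    return {s: frozenset(ps) for s, ps in predicates.items()}


def partition_by_characteristic_set(triples, num_workers):
    """Group triples by their subject's characteristic set.

    Assumption (not from the paper): characteristic sets are assigned
    to workers round-robin, in a sorted order so the result is
    reproducible across runs.
    """
    cs_of = characteristic_sets(triples)
    distinct = sorted(set(cs_of.values()), key=lambda cs: sorted(cs))
    worker_of = {cs: i % num_workers for i, cs in enumerate(distinct)}

    partitions = defaultdict(list)
    for subject, predicate, obj in triples:
        partitions[worker_of[cs_of[subject]]].append((subject, predicate, obj))
    return dict(partitions)
```

Under this placement, every triple of a given subject lands on one worker, so a star-shaped join over that subject's predicates can be evaluated locally without shipping intermediate results.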
Published in
Database Systems for Advanced Applications: 24th International Conference, DASFAA 2019, Chiang Mai, Thailand, April 22–25, 2019, Proceedings, Part I
Apr 2019
828 pages
ISBN:978-3-030-18575-6
DOI:10.1007/978-3-030-18576-3
- Editors:
- Guoliang Li, Tsinghua University, Beijing, China
- Jun Yang, Duke University, Durham, NC, USA
- Joao Gama, University of Porto, Porto, Portugal
- Juggapong Natwichai, Chiang Mai University, Chiang Mai, Thailand
- Yongxin Tong, Beihang University, Beijing, China
© Springer Nature Switzerland AG 2019
Publisher
Springer-Verlag
Berlin, Heidelberg
Qualifiers
- Article