Efficient access methods for very large distributed graph databases
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10347/26514
Metadata
Title: | Efficient access methods for very large distributed graph databases |
Author: | Luaces Cachaza, David; Ríos Viqueira, José Ramón; Cotos Yáñez, José Manuel; Flores González, Julián Carlos |
Affiliation: | Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías da Información; Universidade de Santiago de Compostela. Departamento de Electrónica e Computación |
Subject: | Graph databases | Subgraph search | Graph query processing | Graph indexing | Subgraph isomorphism | Large scale processing |
Date of Issue: | 2021 |
Publisher: | Elsevier |
Citation: | Information Sciences, 573 (2021), 65-81. https://doi.org/10.1016/j.ins.2021.05.047 |
Abstract: | Subgraph searching is an essential problem in graph databases, but it is also challenging because it involves the NP-complete subgraph isomorphism sub-problem. Filter-Then-Verify (FTV) methods mitigate the performance overhead by using an index to prune graphs that do not fit the query in a filtering stage, which reduces the number of subgraph isomorphism evaluations in a subsequent verification stage. In real applications such as molecular substructure searching, subgraph searching has to be applied to very large databases (tens of millions of graphs). Previous surveys have identified the FTV solutions GraphGrepSX (GGSX) and CT-Index as the best ones for large databases (thousands of graphs); however, they cannot reach reasonable performance on very large ones (tens of millions of graphs). This paper proposes a generic approach for the distributed implementation of FTV solutions. In addition, three previous methods that improve the performance of GGSX and CT-Index are adapted to be executed in clusters. The evaluation shows that the resulting solutions provide a large performance improvement (a 70% to 90% reduction in filtering time) in a centralized configuration, and that they can be used to achieve efficient subgraph searching over very large databases in cluster configurations. |
Publisher version: | https://doi.org/10.1016/j.ins.2021.05.047 |
URI: | http://hdl.handle.net/10347/26514 |
DOI: | 10.1016/j.ins.2021.05.047 |
ISSN: | 0020-0255 |
Rights: | © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) Attribution-NonCommercial-NoDerivatives 4.0 International |
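The Filter-Then-Verify idea summarized in the abstract can be illustrated with a minimal, centralized sketch. It is not the paper's implementation, nor GGSX or CT-Index: it assumes small undirected graphs without self-loops, uses the presence of labeled simple paths as a simplified stand-in for an index feature, and verifies candidates with a brute-force subgraph isomorphism test. All function names (`make_graph`, `path_features`, `build_index`, `is_subgraph`, `search`) are hypothetical.

```python
from itertools import permutations


def make_graph(labels, edge_list):
    """Build an undirected labeled graph: (vertex -> label, set of edges)."""
    return labels, {frozenset(e) for e in edge_list}


def path_features(graph, max_len=3):
    """Label sequences of all simple paths with at most max_len vertices."""
    labels, edges = graph
    adj = {v: set() for v in labels}
    for e in edges:
        u, v = tuple(e)  # assumes no self-loops
        adj[u].add(v)
        adj[v].add(u)
    feats = set()

    def walk(path):
        feats.add(tuple(labels[v] for v in path))
        if len(path) < max_len:
            for n in adj[path[-1]]:
                if n not in path:
                    walk(path + [n])

    for v in labels:
        walk([v])
    return feats


def build_index(database, max_len=3):
    """Index: graph id -> set of path features (presence only, no counts)."""
    return {gid: path_features(g, max_len) for gid, g in database.items()}


def is_subgraph(query, graph):
    """Brute-force verification: try every injective vertex mapping."""
    q_labels, q_edges = query
    g_labels, g_edges = graph
    q_nodes = list(q_labels)
    for image in permutations(g_labels, len(q_nodes)):
        m = dict(zip(q_nodes, image))
        labels_ok = all(q_labels[v] == g_labels[m[v]] for v in q_nodes)
        if labels_ok and all(frozenset(m[v] for v in e) in g_edges for e in q_edges):
            return True
    return False


def search(query, database, index, max_len=3):
    """Filter with the index, then verify only the surviving candidates."""
    q_feats = path_features(query, max_len)
    candidates = [gid for gid, feats in index.items() if q_feats <= feats]
    answers = [gid for gid in candidates if is_subgraph(query, database[gid])]
    return candidates, answers


# Toy database of three labeled graphs and a C-O query (illustrative data only).
database = {
    "g1": make_graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    "g2": make_graph({0: "C", 1: "N"}, [(0, 1)]),
    "g3": make_graph({0: "C", 1: "O", 2: "C"}, [(0, 1), (1, 2)]),
}
index = build_index(database)
query = make_graph({0: "C", 1: "O"}, [(0, 1)])
candidates, answers = search(query, database, index)
print(candidates, answers)
```

In this sketch the filter never discards a true answer, because every labeled path of the query must also appear in any graph that contains it, but it can let false candidates through, which is why the verification stage is still needed.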
Collections
- CiTIUS-Artigos [177]
- EC-Artigos [146]