GRAPE for fast and scalable graph processing and random-walk-based embedding

Cappelletti, Luca; Cano Gutiérrez, Carlos

doi:10.1038/s43588-023-00465-8

Identifiers

URI: https://hdl.handle.net/10481/84883

DOI: 10.1038/s43588-023-00465-8

Publisher

Springer Nature

Subject

Mathematics and computing

Physical sciences

Software

Date

2023-06-26

Bibliographic reference

Cappelletti, L., Fontana, T., Casiraghi, E. et al. GRAPE for fast and scalable graph processing and random-walk-based embedding. Nat Comput Sci 3, 552–568 (2023). [https://doi.org/10.1038/s43588-023-00465-8]

Sponsor

National Center for Gene Therapy and Drugs based on RNA Technology, PNRR-NextGenerationEU program G43C22001320007; United States Department of Health & Human Services National Institutes of Health (NIH) - USA NIH National Cancer Institute (NCI) U01-CA239108-02; Transition Grant Line 1A Project NIMI PARTENARIATI H2020' 1R24OD011883-01; United States Department of Health & Human Services National Institutes of Health (NIH) - USA U01-CA239108-02 DE-AC02-05CH11231; United States Department of Energy (DOE); European Union (EU) Marie Curie Actions PSR2015-1720GVALE_01 PID2021-128970OA-I00

Abstract

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.
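As a rough illustration of the random-walk sampling that random-walk-based embedding methods build on, the following is a minimal Python sketch. It is not GRAPE's implementation or API: the compressed sparse row (CSR) layout and the names `build_csr` and `random_walk` are assumptions chosen for this example, and the walk is a plain uniform first-order walk.

```python
import random


def build_csr(num_nodes, edges):
    """Build a CSR adjacency (offsets, neighbors) for an undirected graph."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # offsets[n]..offsets[n + 1] delimits the neighbors of node n.
    offsets = [0] * (num_nodes + 1)
    for node in range(num_nodes):
        offsets[node + 1] = offsets[node] + degree[node]
    neighbors = [0] * offsets[-1]
    cursor = offsets[:-1].copy()  # next free slot for each node
    for u, v in edges:
        neighbors[cursor[u]] = v
        cursor[u] += 1
        neighbors[cursor[v]] = u
        cursor[v] += 1
    return offsets, neighbors


def random_walk(offsets, neighbors, start, length, rng):
    """Sample a uniform random walk visiting at most `length` nodes."""
    walk = [start]
    while len(walk) < length:
        node = walk[-1]
        begin, end = offsets[node], offsets[node + 1]
        if begin == end:  # node without neighbors: stop early
            break
        walk.append(neighbors[rng.randrange(begin, end)])
    return walk


# Hypothetical toy graph: a 4-cycle with one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
offsets, neighbors = build_csr(4, edges)
rng = random.Random(42)
walks = [random_walk(offsets, neighbors, node, 5, rng) for node in range(4)]
```

In a typical pipeline, walks like these are then fed to a skip-gram or similar model to learn node embeddings. Each walk is independent of the others, which is why this step is a natural target for parallelization.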

Collections

DCCIA - Articles

Except where otherwise noted, this item's license is described as Attribution 4.0 International