| dc.contributor.author | Fernández Basso, Carlos Jesús | |
| dc.contributor.author | Ruiz Jiménez, María Dolores | |
| dc.contributor.author | Martín Bautista, María José | |
| dc.date.accessioned | 2023-05-31T07:21:48Z | |
| dc.date.available | 2023-05-31T07:21:48Z | |
| dc.date.issued | 2023-04-30 | |
| dc.identifier.citation | Fernandez-Basso, C. et al. New Spark solutions for distributed frequent itemset and association rule mining algorithms. Cluster Computing. [https://doi.org/10.1007/s10586-023-04014-w] | es_ES |
| dc.identifier.uri | https://hdl.handle.net/10481/82042 | |
| dc.description | Funding for open access publishing: Universidad de Granada/CBUA. The research reported in this paper was partially supported by the BIGDATAMED project, which has received funding from the Andalusian Government (Junta de Andalucía) under grant agreement No P18-RT-1765, by grants PID2021-123960OB-I00 and TED2021-129402B-C21 funded by Ministerio de Ciencia e Innovación, by ERDF A way of making Europe, and by the European Union NextGenerationEU. In addition, this work has been partially supported by the Ministry of Universities through the EU-funded Margarita Salas programme NextGenerationEU. Funding for open access charge: Universidad de Granada/CBUA | es_ES |
| dc.description.abstract | The large amount of data generated every day makes it necessary to re-implement methods so that they can handle massive data efficiently. This is the case for Association Rules, an unsupervised data mining tool capable of extracting information in the form of IF-THEN patterns. Although several methods have been proposed for extracting frequent itemsets (the phase that precedes association rule mining) from very large databases, high computational cost and lack of memory remain major problems when processing large data. Therefore, the aim of this paper is threefold: (1) to review existing algorithms for frequent itemset and association rule mining, (2) to develop new, efficient frequent itemset Big Data algorithms using distributed computing, as well as a new association rule mining algorithm in Spark, and (3) to compare the proposed algorithms with existing proposals while varying the number of transactions and the number of items. For this purpose, we have used the Spark platform, which has been demonstrated to outperform existing distributed algorithmic implementations. | es_ES |
| dc.description.sponsorship | Universidad de Granada/CBUA | es_ES |
| dc.description.sponsorship | Junta de Andalucía, P18-RT-1765 | es_ES |
| dc.description.sponsorship | Ministry of Science and Innovation, Spain (MICINN); Instituto de Salud Carlos III; Spanish Government; PID2021-123960OB-I00, TED2021-129402B-C21 | es_ES |
| dc.description.sponsorship | ERDF A way of making Europe | es_ES |
| dc.description.sponsorship | European Union NextGenerationEU | es_ES |
| dc.description.sponsorship | Ministry of Universities through the EU | es_ES |
| dc.language.iso | eng | es_ES |
| dc.publisher | Springer | es_ES |
| dc.rights | Attribution 4.0 International | * |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
| dc.subject | Big Data | es_ES |
| dc.subject | Data Mining | es_ES |
| dc.subject | Association Rule | es_ES |
| dc.subject | Frequent Itemset | es_ES |
| dc.subject | Distributed computing | es_ES |
| dc.subject | Spark | es_ES |
| dc.title | New Spark solutions for distributed frequent itemset and association rule mining algorithms | es_ES |
| dc.type | journal article | es_ES |
| dc.rights.accessRights | open access | es_ES |
| dc.identifier.doi | 10.1007/s10586-023-04014-w | |
| dc.type.hasVersion | VoR | es_ES |