Evaluation of the Limit of Detection in Network Dataset Quality Assessment with PerQoDA

Wasielewska, Katarzyna; Soukup, Dominik; Cejka, Tomas; Camacho Páez, José

Author's version (561.3 KB)

Identifiers

URI: https://hdl.handle.net/10481/81204

Publisher

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2022, 4th Workshop on Machine Learning for Cybersecurity (MLCS)

Subject

Dataset quality assessment

Permutation testing

Network dataset

Network security

Attack detection

Machine learning

Classification

Date

2022

Sponsor

This work is partially funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 893146, by the Agencia Estatal de Investigación in Spain, grant No PID2020-113462RB-I00, and by the Ministry of Interior of the Czech Republic (Flow-Based Encrypted Traffic Analysis) under grant number VJ02010024. The authors would like to thank Szymon Wojciechowski for his support on the Weles tool.

Abstract

Machine learning is recognised as a relevant approach to detecting attacks and other anomalies in network traffic. However, there are still no suitable network datasets that would enable effective detection. Preparing a network dataset is difficult, both for privacy reasons and because of the lack of tools for assessing dataset quality. In a previous paper, we proposed a new method for data quality assessment based on permutation testing. This paper presents a parallel study on the limits of detection of such an approach. We focus on the problem of network flow classification and use well-known machine learning techniques. The experiments were performed using publicly available network datasets.
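
As an illustration of the general idea behind permutation testing mentioned in the abstract, the following is a minimal sketch of a generic label-permutation test, not the PerQoDA method itself. The nearest-centroid classifier, the synthetic two-feature "flows", and every function name here are assumptions made for the example. The premise is that if the labels carry information about the features, a classifier should score clearly better on the true labels than on randomly shuffled ones.

```python
import random


def split_indices(n, seed=0):
    """Return a fixed, seeded half/half split of row indices (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    half = n // 2
    return idx[:half], idx[half:]


def nearest_centroid_accuracy(X, y, split_seed=0):
    """Fit per-class centroids on the train half and score on the test half."""
    train_idx, test_idx = split_indices(len(X), split_seed)
    sums, counts = {}, {}
    for i in train_idx:
        acc = sums.setdefault(y[i], [0.0] * len(X[i]))
        for j, value in enumerate(X[i]):
            acc[j] += value
        counts[y[i]] = counts.get(y[i], 0) + 1
    centroids = {lbl: [s / counts[lbl] for s in sums[lbl]] for lbl in sums}

    def predict(x):
        # Assign the label of the closest centroid (squared Euclidean distance).
        return min(
            centroids,
            key=lambda lbl: sum((a - b) ** 2 for a, b in zip(x, centroids[lbl])),
        )

    hits = sum(1 for i in test_idx if predict(X[i]) == y[i])
    return hits / len(test_idx)


def permutation_p_value(X, y, n_permutations=200, seed=0):
    """Compare the score on true labels with scores on shuffled labels."""
    rng = random.Random(seed)
    true_score = nearest_centroid_accuracy(X, y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the link between features and labels
        if nearest_centroid_accuracy(X, shuffled) >= true_score:
            at_least_as_good += 1
    # The +1 terms count the unpermuted labelling as one of the permutations.
    p_value = (1 + at_least_as_good) / (1 + n_permutations)
    return true_score, p_value


def make_toy_flows(n_per_class=40, seed=1):
    """Hypothetical stand-in for flow records: two well-separated classes."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(0)
        X.append([rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)])
        y.append(1)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_flows()
    score, p = permutation_p_value(X, y)
    print(f"accuracy on true labels: {score:.3f}, permutation p-value: {p:.4f}")
```

A small p-value suggests that the labels are informative with respect to the features. A p-value near 1 suggests that the classifier cannot distinguish the true labelling from a random one, which in this toy setting would point to a labelling or feature problem.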

Collections

DTSTC - Conference and congress communications, ...

Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License