A novel keyframe extraction method for video classification using deep neural networks

Kiziltepe, Rukiye Savran; Gan, John Q.; Escobar Pérez, Juan José

doi:10.1007/s00521-021-06322-x

SavranKızıltepe2021.pdf (1.071Mb)

Identifiers

URI: http://hdl.handle.net/10481/70404

DOI: 10.1007/s00521-021-06322-x

Publisher

Springer

Subject

Deep learning

Convolutional neural networks

Recurrent neural networks

Keyframe extraction

Video classification

Date

2021-08-02

Bibliographic reference

Savran Kızıltepe, R., Gan, J.Q. & Escobar, J.J. A novel keyframe extraction method for video classification using deep neural networks. Neural Comput & Applic (2021). [https://doi.org/10.1007/s00521-021-06322-x]

Sponsor

Ministry of National Education - Turkey; Spanish Ministry of Science, Innovation, and Universities PGC2018-098813-B-C31; ERDF fund

Abstract

Combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs) produces a powerful architecture for video classification problems as spatial–temporal information can be processed simultaneously and effectively. Using transfer learning, this paper presents a comparative study to investigate how temporal information can be utilized to improve the performance of video classification when CNNs and RNNs are combined in various architectures. To enhance the performance of the identified architecture for effective combination of CNN and RNN, a novel action template-based keyframe extraction method is proposed by identifying the informative region of each frame and selecting keyframes based on the similarity between those regions. Extensive experiments on KTH and UCF-101 datasets with ConvLSTM-based video classifiers have been conducted. Experimental results are evaluated using one-way analysis of variance, which reveals the effectiveness of the proposed keyframe extraction method in the sense that it can significantly improve video classification accuracy.
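
The idea of selecting keyframes by comparing the informative regions of frames can be illustrated with a minimal sketch. This is not the action template-based method of the paper, whose details are not given in this record. It assumes, purely for illustration, that the informative region is the bounding box of pixels that changed since the previous frame, and that similarity is the overlap (IoU) of two such boxes. The function names and the parameters `diff_thresh` and `max_similarity` are hypothetical.

```python
# Illustrative stand-in only: frames are grayscale images given as
# nested lists of ints. None of these names come from the paper.

def informative_region(prev, curr, diff_thresh=10):
    """Bounding box (top, left, bottom, right) of changed pixels, or None."""
    height, width = len(curr), len(curr[0])
    rows = [r for r in range(height)
            if any(abs(prev[r][c] - curr[r][c]) > diff_thresh
                   for c in range(width))]
    if not rows:
        return None  # nothing moved between the two frames
    cols = [c for c in range(width)
            if any(abs(prev[r][c] - curr[r][c]) > diff_thresh
                   for r in range(height))]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def box_iou(a, b):
    """Intersection over union of two (top, left, bottom, right) boxes."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, bottom - top) * max(0, right - left)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union else 0.0


def select_keyframes(frames, diff_thresh=10, max_similarity=0.5):
    """Indices of frames whose region differs enough from the last keyframe's."""
    keyframes, last_box = [], None
    for i in range(1, len(frames)):
        box = informative_region(frames[i - 1], frames[i], diff_thresh)
        if box is None:
            continue  # static frame: no informative region
        if last_box is None or box_iou(box, last_box) < max_similarity:
            keyframes.append(i)
            last_box = box
    return keyframes
```

Under these assumptions, a lower `max_similarity` keeps fewer frames, because a frame is selected only when its region overlaps little with the region of the most recent keyframe.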

Collections

DICAR - Articles

Except where otherwise noted, this item's license is described as Attribution 3.0 Spain