Online Multichannel Speech Enhancement Based on Recursive EM and DNN-Based Speech Presence Estimation

Martín Doñas, Juan M.; Jensen, Jesper; Tan, Zheng-Hua; Gómez García, Ángel Manuel; Peinado Herreros, Antonio Miguel

doi:10.1109/TASLP.2020.3036776

dc.contributor.author	Martín Doñas, Juan M.
dc.contributor.author	Jensen, Jesper
dc.contributor.author	Tan, Zheng-Hua
dc.contributor.author	Gómez García, Ángel Manuel
dc.contributor.author	Peinado Herreros, Antonio Miguel
dc.date.accessioned	2021-11-15T08:08:01Z
dc.date.available	2021-11-15T08:08:01Z
dc.date.issued	2020-11-09
dc.identifier.citation	Martín-Doñas, J. M., Jensen, J., Tan, Z. H., Gomez, A. M., & Peinado, A. M. (2020). Online Multichannel Speech Enhancement Based on Recursive EM and DNN-Based Speech Presence Estimation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 3080-3094.	es_ES
dc.identifier.uri	http://hdl.handle.net/10481/71502
dc.description.abstract	This article presents a recursive expectation-maximization algorithm for online multichannel speech enhancement. A deep neural network mask estimator is used to compute the speech presence probability, which is then improved by means of statistical spatial models of the noisy speech and noise signals. The clean speech signal is estimated using beamforming, single-channel linear postfiltering and speech presence masking. The clean speech statistics and speech presence probabilities are finally used to compute the acoustic parameters for beamforming and postfiltering by means of maximum likelihood estimation. This iterative procedure is carried out on a frame-by-frame basis. The algorithm integrates the different estimates in a common statistical framework suitable for online scenarios. Moreover, our method can successfully exploit spectral, spatial and temporal speech properties. Our proposed algorithm is tested in different noisy environments using the multichannel recordings of the CHiME-4 database. The experimental results show that our method outperforms other related state-of-the-art approaches in noise reduction performance, while allowing low-latency processing for real-time applications.	es_ES
dc.description.sponsorship	Spanish MICINN/FEDER (Grant Number: PID2019-104206GB-I00)	es_ES
dc.description.sponsorship	Spanish Ministry of Universities National Program FPU (Grant Number: FPU15/04161)	es_ES
dc.language.iso	eng	es_ES
dc.publisher	IEEE	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Spain	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.subject	Deep learning (DL)	es_ES
dc.subject	Speech enhancement	es_ES
dc.subject	Beamforming	es_ES
dc.title	Online Multichannel Speech Enhancement Based on Recursive EM and DNN-Based Speech Presence Estimation	es_ES
dc.type	journal article	es_ES
dc.rights.accessRights	open access	es_ES
dc.identifier.doi	10.1109/TASLP.2020.3036776

Files in this item

Name: single.pdf
Size: 1.846 MB
Format: PDF

This item appears in the following Collection(s)

DTSTC - Articles


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Spain