JAIT 2024 Vol.15(2): 281-287
doi: 10.12720/jait.15.2.281-287

An Integrated Deep Learning Model for Concurrent Speech Dereverberation and Denoising

Vijay M. Mane 1,*, Seema S. Arote 1, and Shakil A Shaikh 2
1. Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Technology, Pune, India
2. Department of Electronics and Computer Engineering, Pravara Rural Engineering College, Loni, India
Email: vijay.mane@vit.edu (V.M.M.); seema.arote@gmail.com (S.S.A.); shaikhshakil1968@gmail.com (S.A.S.)
*Corresponding author

Manuscript received August 10, 2023; revised September 13, 2023; accepted October 7, 2023; published February 24, 2024.

Abstract—Speech is arguably the simplest and most efficient form of human-to-human communication, as well as the most intuitive and effective means of human-machine interaction. In real-world settings, speech is often degraded by both reverberation and ambient noise, which harms its intelligibility and quality. For denoising, model-based approaches have been researched thoroughly and several practical solutions exist. Dereverberation has received comparatively little study, although significant advances have been made in model-based dereverberation strategies. The resulting approach can be applied to any deep neural network that outputs masks in the time-frequency domain, adding only a few trainable variables and a computational overhead that is low for state-of-the-art neural networks. This article develops a deep learning-based approach that removes early reverberation, late reverberation, and noise from speech signals in order to enhance speech quality. The method is tested on data from three simulated rooms (a conference room, a seminar hall, and a room taken from the paper's reference [7]) with a Reverberation Time (RT60) of 0.3 s, several noise types, including Additive White Gaussian Noise (AWGN) and realistic noise such as babble and restaurant noise, and a range of signal-to-noise ratio values. The proposed technique outperforms baseline multichannel dereverberation and denoising algorithms as well as a state-of-the-art multichannel dereverberation and denoising algorithm, yielding a considerable improvement.
Keywords—deep learning, dereverberation, denoising, Room Impulse Response (RIR)
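
The test conditions named in the abstract (a reverberation time of 0.3 s and additive white Gaussian noise at a chosen signal-to-noise ratio) can be illustrated with a minimal sketch. This is not the authors' simulation pipeline: the exponentially decaying noise used as a room impulse response, the 16 kHz sampling rate, the 10 dB SNR, the sine tone standing in for speech, and the helper names `synthetic_rir` and `add_awgn` are all assumptions made for illustration.

```python
import numpy as np

def synthetic_rir(rt60, fs, rng):
    """Crude stand-in for a room impulse response (assumption, not a room simulator):
    white noise shaped by an exponential decay that drops 60 dB after rt60 seconds."""
    n = int(round(rt60 * fs))
    t = np.arange(n) / fs
    # A 60 dB amplitude drop is a factor of 1000, reached at t = rt60.
    envelope = np.exp(-np.log(1000.0) * t / rt60)
    rir = rng.standard_normal(n) * envelope
    return rir / np.max(np.abs(rir))

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise scaled to the requested SNR (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(signal.shape) * np.sqrt(noise_power)
    return signal + noise, noise

rng = np.random.default_rng(0)
fs = 16000                                   # assumed sampling rate in Hz
clean = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)  # 1 s tone standing in for speech

rir = synthetic_rir(0.3, fs, rng)            # RT60 = 0.3 s, as in the abstract
reverberant = np.convolve(clean, rir)[: len(clean)]
noisy, noise = add_awgn(reverberant, 10.0, rng)       # 10 dB SNR is an assumed value

# Check that the realised SNR is close to the requested one.
measured_snr_db = 10 * np.log10(np.mean(reverberant ** 2) / np.mean(noise ** 2))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

A dereverberation and denoising model would take `noisy` as input and be evaluated against `clean`; realistic experiments would replace the synthetic decay with simulated or measured room impulse responses and the tone with recorded speech.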

Cite: Vijay M. Mane, Seema S. Arote, and Shakil A Shaikh, "An Integrated Deep Learning Model for Concurrent Speech Dereverberation and Denoising," Journal of Advances in Information Technology, Vol. 15, No. 2, pp. 281-287, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.