Single-Channel Speech Enhancement Based on Deep Neural Networks

2020

Author : Zhiheng Ouyang
Release : 2020
Kind : eBook

Book Synopsis Single-Channel Speech Enhancement Based on Deep Neural Networks by : Zhiheng Ouyang

Speech enhancement (SE) aims to improve the quality of degraded speech. Recently, researchers have turned to deep learning as a primary tool for speech enhancement, typically using deterministic models trained in a supervised manner. A neural network is usually trained as a mapping function that converts features of noisy speech into targets from which clean speech can be reconstructed. Most of these methods have focused on estimating the spectral magnitude of clean speech, because estimating spectral phase with neural networks is difficult due to phase wrapping. As an alternative, complex spectrum estimation implicitly resolves the phase estimation problem and has been shown to outperform spectral magnitude estimation.

In the first contribution of this thesis, a fully convolutional neural network (FCN) is proposed for complex spectrogram estimation. Stacked frequency-dilated convolutions are employed so that the receptive field grows exponentially along the frequency axis. The proposed network also has an efficient implementation that requires far fewer parameters than conventional deep neural networks (DNNs) and convolutional neural networks (CNNs) while still yielding comparable performance.

Speech enhancement is only useful in noisy conditions, yet conventional SE methods often do not adapt to different noise conditions. In the second contribution, we propose a model that provides an automatic "on/off" switch for speech enhancement. It scales its computational complexity with the signal-to-noise ratio (SNR) by detecting clean or near-clean speech, which requires no processing. By adopting an information maximizing generative adversarial network (InfoGAN) in a deterministic, supervised manner, we incorporate an SNR indicator into the model at little additional cost to the system.

We evaluate the proposed SE methods against two objectives: speech intelligibility and application to automatic speech recognition (ASR). Experimental results show that the CNN-based model is applicable to both objectives, while the InfoGAN-based model is more useful in terms of speech intelligibility. The experiments also show that SE for ASR may be more challenging than improving speech intelligibility, since a range of factors, including the training dataset and the neural network model, affect ASR performance.
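The excerpt's claim that stacked frequency-dilated convolutions give an exponentially growing receptive field can be checked with a little arithmetic. The sketch below is only an illustration under assumptions the excerpt does not state: a kernel size of 3, stride 1, and dilation rates that double at each layer (1, 2, 4, ...). The helper names `receptive_field` and `doubling_dilations` are hypothetical and are not taken from the thesis.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in frequency bins, of stacked stride-1 1-D convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d bins.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def doubling_dilations(num_layers):
    """Assumed dilation schedule: 1, 2, 4, ... (one rate per layer)."""
    return [2 ** i for i in range(num_layers)]


if __name__ == "__main__":
    # Compare undilated stacking (linear growth) with doubling dilation
    # (roughly doubles with every added layer).
    for n in range(1, 7):
        plain = receptive_field(3, [1] * n)
        dilated = receptive_field(3, doubling_dilations(n))
        print(f"{n} layers: plain={plain:3d}  dilated={dilated:3d}")
```

Under these assumptions, four undilated layers cover 9 frequency bins while four dilated layers cover 31, with the same number of weights per layer, which is one way such a network can stay small while still seeing a wide frequency context.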

Deep Neural Network Approach for Single Channel Speech Enhancement Processing

2016 University of Ottawa theses

Author : Dongfu Li
Release : 2016
Genre : University of Ottawa theses
Kind : eBook


Speech Signal Processing Based on Deep Learning in Complex Acoustic Environments

2024-09-04 Computers

Author : Xiao-Lei Zhang
Release : 2024-09-04
Genre : Computers
Kind : eBook

Book Synopsis Speech Signal Processing Based on Deep Learning in Complex Acoustic Environments by : Xiao-Lei Zhang

Speech Signal Processing Based on Deep Learning in Complex Acoustic Environments provides a detailed discussion of deep learning-based robust speech processing and its applications. The book begins with the basics of deep learning and common deep network models, followed by front-end algorithms for deep learning-based speech denoising, speech detection, single-channel speech enhancement, multi-channel speech enhancement, and multi-speaker speech separation, and the applications of deep learning-based speech denoising in speaker verification and speech recognition.
Provides a comprehensive introduction to the development of deep learning-based robust speech processing.
Covers speech detection, speech enhancement, dereverberation, multi-speaker speech separation, robust speaker verification, and robust speech recognition.
Focuses on a historical overview and then covers methods that demonstrate outstanding performance in practical applications.

New Era for Robust Speech Recognition

2017-10-30 Computers

Author : Shinji Watanabe
Release : 2017-10-30
Genre : Computers
Kind : eBook

Book Synopsis New Era for Robust Speech Recognition by : Shinji Watanabe

This book covers the state of the art in deep neural network-based methods for noise robustness in distant speech recognition applications. It provides insights and detailed descriptions of some of the new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also describe real-world applications, benchmark tools, and datasets widely used in the field. This book is intended for researchers and practitioners working in speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.

Audio Source Separation

2018-03-01 Technology & Engineering

Author : Shoji Makino
Release : 2018-03-01
Genre : Technology & Engineering
Kind : eBook

Book Synopsis Audio Source Separation by : Shoji Makino

This book provides the first comprehensive overview of the fascinating topic of audio source separation based on non-negative matrix factorization, deep neural networks, and sparse component analysis. The first section of the book covers single-channel source separation based on non-negative matrix factorization (NMF). After an introduction to the technique, two further chapters describe separation of known sources using non-negative spectrogram factorization, and temporal NMF models. In section two, NMF methods are extended to multi-channel source separation. Section three introduces deep neural network (DNN) techniques, with chapters on multi-channel and single-channel separation, and a further chapter on DNN-based mask estimation for monaural speech separation. In section four, sparse component analysis (SCA) is discussed, with chapters on source separation using audio directional statistics modelling, multi-microphone MMSE-based techniques, and diffusion map methods. The book brings together leading researchers to provide tutorial-like and in-depth treatments of major audio source separation topics, with the objective of becoming the definitive source for a comprehensive, authoritative, and accessible treatment. This book is written for graduate students and researchers who are interested in audio source separation techniques based on NMF, DNN, and SCA.
