Understanding Transient Interference Suppression in Speech Enhancement
Transient interference, characterized by abrupt sounds followed by decaying oscillations, poses a challenge for standard speech enhancement algorithms. This article delves into the statistical modeling and problem formulation of transient suppression, exploring band-to-band filters, spectral variances, and more to effectively mitigate transient effects in speech signals.
- Speech enhancement
- Transient interference
- Statistical modeling
- Spectral variance
- Band-to-band filters
Introduction: Transient Interference Suppression
- A transient is an abrupt or impulsive sound followed by decaying oscillations, e.g. keyboard typing and door knocking.
- Common single-channel speech enhancement algorithms are not suitable for the abrupt nature of transients.
- For example: spectral subtraction [Boll, 79], decision directed [Ephraim & Malah, 84], LSA [Ephraim & Malah, 85], OM-LSA [Cohen & Berdugo, 01].
- Figure panels: noisy speech; speech enhanced via OM-LSA.
Cohen, Gannot and Talmon, PART II: TRANSIENT SUPPRESSION \11
Introduction: Problem Formulation
- The measurement consists of a speech signal contaminated by transient interference and stationary noise.
- The signal measured by a microphone is given by the sum of these three components.
- The transient interference is modeled as a sequence of impulses of varying amplitudes convolved with an impulse response that characterizes the transient interference type.
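The signal model on this slide can be sketched in a few lines. This is a minimal illustration that assumes additive mixing and a plain convolution of the impulse train with the impulse response; the names `make_impulse_response`, `make_impulse_train`, and `convolve`, the sinusoid standing in for speech, and all numeric values are hypothetical and are not taken from the slides.

```python
import math
import random


def make_impulse_response(length, decay, rng):
    """Random oscillations under an exponentially decaying envelope (assumed shape)."""
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]


def make_impulse_train(n_samples, positions, amplitudes):
    """Sequence of impulses of varying amplitudes, zero everywhere else."""
    train = [0.0] * n_samples
    for pos, amp in zip(positions, amplitudes):
        train[pos] = amp
    return train


def convolve(signal, kernel):
    """Direct-form convolution, truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for n, s in enumerate(signal):
        if s == 0.0:
            continue  # the impulse train is sparse, so skip empty samples
        for k, h in enumerate(kernel):
            if n + k < len(out):
                out[n + k] += s * h
    return out


rng = random.Random(0)
n_samples = 400

# Toy stand-in for speech: a slow sinusoid (illustration only).
speech = [0.1 * math.sin(2 * math.pi * 0.01 * n) for n in range(n_samples)]
# Stationary noise: low-level white Gaussian noise.
noise = [rng.gauss(0.0, 0.01) for _ in range(n_samples)]

# Transient interference: impulse train convolved with the impulse response.
h = make_impulse_response(length=80, decay=0.05, rng=rng)
train = make_impulse_train(n_samples, positions=[50, 220], amplitudes=[1.0, 0.5])
transient = convolve(train, h)

# Microphone signal: sum of speech, transient interference, and noise.
measured = [x + t + d for x, t, d in zip(speech, transient, noise)]
```

Each impulse launches one copy of the impulse response scaled by the impulse amplitude, so the same transient type (one impulse response) can occur at different times and loudness levels.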
Introduction: Problem Formulation
- The measured signal is represented by its STFT, indexed by time frame and frequency bin; by linearity, it is the sum of the STFTs of the speech, the transient interference, and the stationary noise.
- We use analysis and synthesis windows of a given length and time shift.
- The transient interference in the STFT domain is expressed through the STFT of the impulse sequence and the band-to-band filter of the impulse response.
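To make the frame and bin indexing concrete, the following is a minimal STFT sketch. It assumes a periodic Hann analysis window and uses a naive DFT so that it needs only the standard library; the window length of 32 and time shift of 8 are hypothetical values, and a library FFT would normally replace the inner sum.

```python
import cmath
import math


def hann(length):
    """Periodic Hann analysis window (an assumed choice of window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / length) for n in range(length)]


def stft(signal, win_len, hop):
    """Return frames[l][k]: the STFT at time frame l and frequency bin k."""
    window = hann(win_len)
    n_bins = win_len // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                for n in range(win_len))
            for k in range(n_bins)
        ])
    return frames


# Test signal: a sinusoid that completes 4 cycles per 32-sample window.
signal = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
Y = stft(signal, win_len=32, hop=8)

# Frequency bin with the largest magnitude in the first time frame.
peak_bin = max(range(len(Y[0])), key=lambda k: abs(Y[0][k]))
```

Because the transform is linear, applying `stft` to a sum of signals gives the sum of their STFTs, which is what allows the measurement to be split into speech, transient, and noise terms in this domain.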
Transient Interference Modeling: Statistical Model
- We propose a statistical model of the band-to-band filters.
- The model has a parameter that denotes the decay rate of the filter, together with two zero-mean, mutually independent, i.i.d. Gaussian random variables that represent the abrupt and the decaying parts of the transient.
- The spectral variances of these two variables determine the spectral variance of the filter.
Transient Interference Modeling: Statistical Model
- We model the spectral variance as a fixed value across the frequency bins, determined by the impulse amplitude.
- Consider the set of time frame indices that contain an impulse; the impulse amplitudes are i.i.d. positive random variables with a given mean and variance.
- The speech is uncorrelated with the transient amplitude.
- Thus, the spectral variance of the measurements combines the spectral variances of the speech, the transient interference, and the stationary noise.
Transient Enhancement: Transient Enhancement Using OM-LSA
- Observation: just as stationary noise varies slowly with respect to speech, speech varies slowly with respect to transient interference.
- Employ the MCRA [Cohen & Berdugo, 02] to estimate the PSD of the slower speech.
- Use short time frames to reduce the variations of the speech between sequential frames.
- Carry out temporal smoothing with a small recursion parameter.
- Use two sliding windows to capture speech phoneme onsets.
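The effect of a small recursion parameter can be illustrated with a first-order recursive average of the per-bin power. This is only a sketch of the temporal smoothing step, not the MCRA estimator itself; the function name `recursive_smooth` and the values 0.3 and 0.9 are hypothetical.

```python
def recursive_smooth(power_frames, alpha):
    """First-order recursion per bin: S(l) = alpha * S(l-1) + (1 - alpha) * P(l)."""
    smoothed = []
    prev = list(power_frames[0])  # initialise with the first frame
    for frame in power_frames:
        prev = [alpha * s + (1.0 - alpha) * p for s, p in zip(prev, frame)]
        smoothed.append(prev)
    return smoothed


# A single frequency bin whose power steps from 1 to 4 at frame 5.
power = [[1.0]] * 5 + [[4.0]] * 5

fast = recursive_smooth(power, alpha=0.3)  # small parameter: quick tracking
slow = recursive_smooth(power, alpha=0.9)  # large parameter: heavy smoothing
```

With the small parameter the estimate reaches about 3.1 in the first frame after the step, while the large parameter yields about 1.3. Fast tracking of this kind is what lets the estimate follow the speech when the speech plays the role of the slowly varying component.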
Transient Enhancement: Transient Enhancement Using OM-LSA
- Consider the spectral gain of the OM-LSA estimator, based on the two sliding windows and fast recursion.
- This gain yields an initial estimate of the transient interference.
Transient Enhancement: Abrupt and Oscillatory Parts Estimation
- Adapt a statistical model for room reverberation [Habets et al., 09], which combines an exponentially decaying spectral envelope with random oscillations.
- Extract the spectral variance of the decaying part.
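An exponentially decaying spectral envelope is a straight line on a logarithmic scale, which suggests one simple way to read off a decay rate. The sketch below is an illustration under stated assumptions, not the estimator used in the slides: it assumes the amplitude envelope decays as exp(-rate * frame), so the variance decays as exp(-2 * rate * frame), and the function names and numeric values are hypothetical.

```python
import math


def decaying_variance(initial_variance, decay_rate, n_frames):
    """Spectral variance per frame under an exponential amplitude envelope."""
    return [initial_variance * math.exp(-2.0 * decay_rate * l)
            for l in range(n_frames)]


def fit_decay_rate(variances):
    """Least-squares slope of log-variance against frame index, as a decay rate."""
    n = len(variances)
    xs = range(n)
    ys = [math.log(v) for v in variances]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope / 2.0  # variance decays twice as fast as the amplitude


variances = decaying_variance(initial_variance=2.0, decay_rate=0.15, n_frames=20)
estimated_rate = fit_decay_rate(variances)
```

On noiseless synthetic data the fit recovers the decay rate used to generate it; with real measurements the log-variance would be noisy and the fit only approximate.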
Transient Enhancement: Abrupt and Oscillatory Parts Estimation
- Assume that the two components are mutually independent.
- The resulting representation follows [Habets et al., 09].
Transient Enhancement: Experimental Results
- Figure panels: clean transient event; noisy speech; transient enhanced via OM-LSA; estimated decaying part.