Blind modelling of binaural unmasking for binaural speech intelligibility modelling at positive and negative SNRs
The equalization-cancellation (EC) model predicts the binaural masking level difference by equalizing interaural differences in level and time and increasing the signal-to-noise ratio (SNR) through destructive and constructive interference. Here, a blind EC model is introduced that relies solely on the mixture of speech and noise, replacing the unrealistic requirement of previous versions for separate clean speech and noise signals. The model uses two parallel EC paths, which either maximize or minimize the EC output level in each frequency band. If the SNR is negative, minimization improves the SNR by removing the interferer component from the mixed signal. If the SNR is positive, maximization improves the SNR by enhancing the target component. In each frequency band, either the minimizing or the maximizing path is selected blindly based on an envelope-frequency-selective amplitude modulation (AM) analysis. The need to consider positive SNRs is investigated in a binaural speech intelligibility experiment in which speech reception thresholds (SRTs) are obtained at positive SNRs. Results show a clear binaural release from masking for speech in noise at positive SNRs. The suggested AM-steered selection in the EC stage demonstrates that a simple signal-driven process can be used to explain binaural unmasking of speech in humans.
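
The two parallel EC paths can be illustrated with a minimal sketch. It is not the model's implementation, and it rests on several simplifying assumptions: a single broadband channel instead of a filterbank, interaural time equalization only (no level equalization), integer-sample delays applied as circular shifts, subtraction for the minimizing path and addition for the maximizing path, and white-noise stand-ins for speech and interferer. The names `ec_delays` and `make_mixture` are hypothetical, and the AM-based selection between the two paths is not shown.

```python
import numpy as np


def ec_delays(left, right, max_delay=10):
    """Search integer delays for the minimizing and maximizing EC paths.

    Returns (d_min, d_max): the delay that minimizes the level of
    left - shifted(right) and the delay that maximizes the level of
    left + shifted(right). Level equalization is omitted (assumption).
    """
    delays = np.arange(-max_delay, max_delay + 1)
    min_levels = []
    max_levels = []
    for d in delays:
        # Circular shift as a stand-in for a true delay line (assumption).
        aligned = np.roll(right, -d)
        min_levels.append(np.mean((left - aligned) ** 2))  # cancellation
        max_levels.append(np.mean((left + aligned) ** 2))  # enhancement
    d_min = int(delays[np.argmin(min_levels)])
    d_max = int(delays[np.argmax(max_levels)])
    return d_min, d_max


def make_mixture(snr_db, n=4000, target_itd=-3, noise_itd=5, seed=0):
    """Build a two-ear mixture of white-noise 'target' and 'interferer'.

    Each source reaches the right ear shifted by its own interaural
    delay in samples (hypothetical values chosen for illustration).
    """
    rng = np.random.default_rng(seed)
    target = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    gain = 10 ** (snr_db / 20)  # target level relative to the interferer
    left = gain * target + noise
    right = gain * np.roll(target, target_itd) + np.roll(noise, noise_itd)
    return left, right


# Negative SNR: the minimizing path locks onto the interferer's delay.
left, right = make_mixture(snr_db=-20.0)
d_min_neg, _ = ec_delays(left, right)

# Positive SNR: the maximizing path locks onto the target's delay.
left, right = make_mixture(snr_db=20.0)
_, d_max_pos = ec_delays(left, right)

print("minimizing path delay at -20 dB SNR:", d_min_neg)
print("maximizing path delay at +20 dB SNR:", d_max_pos)
```

In this toy setting each path aligns with whichever source dominates the mixture. At positive SNR the minimizing path would therefore cancel the target rather than the interferer, which is why a blind rule for choosing between the two paths is needed.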