University of Salford

Utilising the Precedence Effect With an Object-Based Approach to Audio to Improve Speech Intelligibility.

Posted on 2022-09-12 - 18:17 authored by Philippa Demonte
Resources used to conduct a subjective, quantitative speech-in-noise test (SINT), and the data collected.

The listening experiment tested how the psychoacoustic phenomenon of the precedence effect can be utilised with augmented loudspeaker arrays in an object-based audio paradigm to improve speech intelligibility in the home environment. A practical application of this research will be in the implementation of media device orchestration, i.e. the creation of low-cost, ad hoc loudspeaker arrays using commonly found devices, such as mobile phones, laptop computers, tablets, and smart speakers, to spatialise audio in the home.

This speech-in-noise test was conducted under controlled conditions in the Listening Room at the University of Salford in March 2020. In each trial, audio was reproduced by one of three different loudspeaker arrays, and subjects listened to spoken sentences played simultaneously with noise. They were tasked with correctly identifying target words. Correct word scores, collated and converted to word recognition percentages, act as a quantifiable proxy for speech intelligibility. After the data were confirmed to fulfil the criterion for its use, they were analysed statistically using a two-way repeated-measures analysis of variance (RMANOVA).
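The conversion from correct word scores to word recognition percentages can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names, the condition labels, the number of target words per sentence, and the tallies are all hypothetical.

```python
from collections import defaultdict


def word_recognition_percentage(correct_words, target_words):
    """Percentage of target words that were correctly identified."""
    if target_words <= 0:
        raise ValueError("target_words must be positive")
    if not 0 <= correct_words <= target_words:
        raise ValueError("correct_words must lie between 0 and target_words")
    return 100.0 * correct_words / target_words


def percentages_by_condition(trials):
    """Collate (condition, correct_words, target_words) records per condition."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for condition, n_correct, n_targets in trials:
        correct[condition] += n_correct
        total[condition] += n_targets
    return {c: word_recognition_percentage(correct[c], total[c]) for c in total}


# Made-up tallies for illustration only; these are NOT values from the dataset.
example_trials = [
    ("condition_a", 3, 5),
    ("condition_a", 4, 5),
    ("condition_b", 5, 5),
]
print(percentages_by_condition(example_trials))
```

Scores are pooled per condition before the percentage is taken, so trials with different numbers of target words are weighted by their word counts rather than equally.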

The three configurations of loudspeaker arrays were:

* L1R1_base (a two-loudspeaker control condition):
a stereo pair of front left and front right loudspeakers at -/+30 degrees azimuth and 2 m distance from the listener position; speech + noise reproduced by both loudspeakers.

* L1R1C2 (three loudspeakers):
L1R1_base + an additional (AUX) loudspeaker in the true front centre position (0 degrees azimuth and 1.7 m distance from the listener position) reproducing just speech.

* L1R1R2 (three loudspeakers):
L1R1_base + an AUX loudspeaker in the right-hand position (+90 degrees azimuth and 1.7 m distance from the listener position) reproducing just speech.
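The three configurations above can be written down as a small data structure, which may help when working with the dataset. The azimuths and distances are taken from the descriptions above; the dictionary layout, the function name, and the coordinate convention (listener at the origin, +y straight ahead, +x to the listener's right, positive azimuth towards the right) are assumptions made for this sketch.

```python
import math

# (azimuth in degrees, distance in metres) for each loudspeaker.
LOUDSPEAKERS = {
    "L1": (-30.0, 2.0),
    "R1": (30.0, 2.0),
    "C2": (0.0, 1.7),   # AUX, true front centre, speech only
    "R2": (90.0, 1.7),  # AUX, right-hand side, speech only
}

# Loudspeakers used by each array configuration.
ARRAYS = {
    "L1R1_base": ["L1", "R1"],
    "L1R1C2": ["L1", "R1", "C2"],
    "L1R1R2": ["L1", "R1", "R2"],
}


def to_cartesian(azimuth_deg, distance_m):
    """Convert (azimuth, distance) to (x, y) in metres under the assumed convention."""
    azimuth = math.radians(azimuth_deg)
    return (distance_m * math.sin(azimuth), distance_m * math.cos(azimuth))


for name, members in ARRAYS.items():
    positions = {m: to_cartesian(*LOUDSPEAKERS[m]) for m in members}
    print(name, {m: (round(x, 2), round(y, 2)) for m, (x, y) in positions.items()})
```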

For the array configurations with three loudspeakers, the precedence effect was initiated by applying a 10 ms delay to the speech signal reproduced by the AUX loudspeaker, such that the sound source (first arrivals) would still be perceived as being from the phantom centre between the L1 and R1 loudspeakers, but with a boost to the speech signal. The relevant equalisation (EQ) was also applied to the speech signal for the C2 and R2 AUX loudspeakers, in order to maintain the same perceived comb-filtering effects across all three loudspeaker array configurations.
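As a rough worked example of the timing involved, the sketch below converts the 10 ms delay into samples and estimates how far the AUX speech lags behind the first arrival at the listener position. The sample rate (48 kHz) and speed of sound (343 m/s) are assumptions, as is the premise that the 10 ms delay is applied to the signal with no further compensation for the shorter AUX distance; none of these details are stated in the description above.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly room temperature
SAMPLE_RATE_HZ = 48_000     # assumed; not stated in the description

AUX_DELAY_MS = 10.0         # delay applied to the AUX speech signal
MAIN_DISTANCE_M = 2.0       # L1 and R1 to the listener
AUX_DISTANCE_M = 1.7        # C2 or R2 to the listener


def delay_in_samples(delay_ms, sample_rate_hz):
    """Number of whole samples corresponding to a delay in milliseconds."""
    return round(delay_ms * 1e-3 * sample_rate_hz)


def apply_delay(signal, n_samples):
    """Delay a signal by prepending n_samples of silence."""
    return [0.0] * n_samples + list(signal)


def arrival_lag_ms(delay_ms, main_distance_m, aux_distance_m,
                   speed_of_sound=SPEED_OF_SOUND_M_S):
    """Lag of the AUX speech behind the L1/R1 first arrival at the listener."""
    travel_main_ms = 1000.0 * main_distance_m / speed_of_sound
    travel_aux_ms = 1000.0 * aux_distance_m / speed_of_sound
    return delay_ms + travel_aux_ms - travel_main_ms


print(delay_in_samples(AUX_DELAY_MS, SAMPLE_RATE_HZ))
print(round(arrival_lag_ms(AUX_DELAY_MS, MAIN_DISTANCE_M, AUX_DISTANCE_M), 2))
```

Under these assumptions, the AUX loudspeaker sits 0.3 m closer to the listener, so its speech arrives a little under 1 ms sooner than the applied delay alone would suggest, and still well after the first arrivals from L1 and R1.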

Analysis of the results is provided in the PhD thesis by P. Demonte.




S3A: Future Spatial Audio for an Immersive Listener Experience at Home

Engineering and Physical Sciences Research Council

