Effect of Background Music Arrangement and Tempo on Foreground Speech Intelligibility: wav audio files - background music
A zip folder with sub-folders containing .wav files of background music and of speech-shaped noise (SSN), the control masking noise. The files were used with Tang & Cooke's (2016) HEGP OIM (high energetic glimpse proportion objective intelligibility metrics) and in a quantitative, subjective speech-in-noise test to investigate whether background music arrangement (in terms of timbre and music arrangement) and tempo have a significant effect on speech intelligibility.
The investigation was conducted at the University of Salford in 2018 towards the PhD thesis by P. Demonte (2022).
The speech-in-noise test used the original dialogue recording of the Revised Speech Perception In Noise test (RSPIN) spoken sentences (Kalikow, Stevens and Elliott, 1977; Bilger, 1984; Bilger et al., 1984), available on CD-R. Headphone playback of the dialogue was calibrated to an average level of 63 dB(A).
The master background music audio files were generated in GarageBand using Apple Loops. The control background noise, speech-shaped noise, is a purely energetic masker used for comparison against the music; it was produced using white noise and samples of the spoken dialogue.
The background music and speech-shaped noise audio files in this zip folder were set relative to the dialogue playback level to produce a glimpse proportion value for the dialogue of 10 (GP10), as per the output of Tang & Cooke's (2016) HEGP OIM within a MATLAB script using an iterative 'for' loop. That is to say, the background masking noises were set to different speech-to-noise ratios so that each produced the same energetic masking level; any significant differences in their effect on speech intelligibility would therefore be attributable to other factors.
Playback of the dialogue and masking noise audio files was via an Adobe Audition digital audio workstation.
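The level-matching idea described above can be sketched in Python as follows. This is an illustration only and not the MATLAB script used in the study: `toy_glimpse_percentage` is a hypothetical, much simplified stand-in for Tang & Cooke's HEGP metric (it counts broadband frames in which the speech exceeds the masker by an assumed 3 dB), and all function names, the frame length and the SNR search range are assumptions made for the example.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def frame_levels_db(samples, frame_len):
    """RMS level in dB of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        level = rms(samples[start:start + frame_len])
        levels.append(20 * math.log10(max(level, 1e-12)))
    return levels


def toy_glimpse_percentage(speech, masker, frame_len=160, threshold_db=3.0):
    """Hypothetical stand-in metric, NOT the HEGP OIM: percentage of frames
    in which the speech level exceeds the masker level by threshold_db."""
    s = frame_levels_db(speech, frame_len)
    m = frame_levels_db(masker, frame_len)
    n = min(len(s), len(m))
    if n == 0:
        raise ValueError("signals are shorter than one frame")
    glimpsed = sum(1 for a, b in zip(s, m) if a > b + threshold_db)
    return 100.0 * glimpsed / n


def scale_to_snr(speech, masker, snr_db):
    """Scale the masker so that the speech-to-noise ratio equals snr_db."""
    gain = rms(speech) / (rms(masker) * 10 ** (snr_db / 20))
    return [gain * x for x in masker]


def find_snr_for_target(speech, masker, target=10.0, snrs=None):
    """Step through candidate SNRs in a 'for' loop and keep the one whose
    metric value is closest to the target (10 here, as in GP10)."""
    if snrs is None:
        snrs = [x * 0.5 for x in range(-60, 41)]  # assumed range: -30 to +20 dB
    best_snr, best_gp = None, None
    for snr_db in snrs:
        gp = toy_glimpse_percentage(speech, scale_to_snr(speech, masker, snr_db))
        if best_gp is None or abs(gp - target) < abs(best_gp - target):
            best_snr, best_gp = snr_db, gp
    return best_snr, best_gp
```

Running `find_snr_for_target` separately for each masker would generally return a different SNR per masker for the same target value, which is the sense in which the files share a glimpse proportion but not a speech-to-noise ratio.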
For an overview of the speech-to-noise ratios and glimpse proportions of each speech-noise .wav file pairing, see the Excel spreadsheet: https://doi.org/10.17866/rd.salford.19753936 - Effect of Background Music Arrangement and Tempo on Foreground Speech Intelligibility: Listening experiment settings (SNRs, GP, HEGP) spreadsheets.
Music - created in GarageBand using Apple Loops
M1 (Apple Loop: Fireplace All): string quartet playing in a legato style;
M2 (Apple Loop: Countdown Cello 01): solo cello playing a single note in a staccato, bowed style;
M3 (Apple Loops: Countdown Cello 01; Laid Back Classic 01; African King Gyl 04; Big Maracas 03): cello, electric guitar, and lightly-percussive instrumentation;
M4 (Apple Loops: Countdown Cello 01; Laid Back Classic 01; African King Gyl 04; Big Maracas 03; Lake Shift Bass; Barricade Arpeggio; High Octane Arpeggio; Altered State Beat 02): cello, electric guitar, and more heavily percussive instrumentation;
M5_T0: speech-shaped noise (SSN); a purely energetic masking noise used as a control condition to compare any effects of the background music against. No defined tempo.
T1: 60 beats per minute (BPM);
T2: 100 BPM;
T3: 140 BPM.
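As a rough aid to navigating the conditions, the music and tempo codes listed above can be enumerated as in the Python sketch below. Two things here are assumptions rather than facts from this record: that every music arrangement was paired with every tempo, and that condition labels follow the `M<n>_T<n>` pattern suggested by `M5_T0`. The function name `condition_labels` is hypothetical, and the actual sub-folder and file names in the zip folder may differ.

```python
from itertools import product

# Short descriptions paraphrased from the list of music arrangements.
ARRANGEMENTS = {
    "M1": "string quartet, legato",
    "M2": "solo cello, single staccato note",
    "M3": "cello, electric guitar, light percussion",
    "M4": "cello, electric guitar, heavier percussion",
}

# Tempo codes and their values in beats per minute.
TEMPI_BPM = {"T1": 60, "T2": 100, "T3": 140}

# The speech-shaped noise control has no defined tempo.
CONTROL = "M5_T0"


def condition_labels():
    """Return labels for an assumed fully crossed music x tempo design,
    followed by the speech-shaped noise control condition."""
    labels = [f"{m}_{t}" for m, t in product(ARRANGEMENTS, TEMPI_BPM)]
    labels.append(CONTROL)
    return labels
```

Under these assumptions the function yields twelve music conditions plus the control.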
GP10 refers to the arbitrary glimpse proportion (=10) of the spoken sentences relative to the background music or speech-shaped noise level.
The audio file names in this zip folder also reflect:
* RSPIN list number;
* RSPIN sentence number;
* the semantic level of the RSPIN sentence that corresponds to each masking noise file (HP = high predictability; LP = low predictability);
* the target word of the RSPIN sentence that corresponds to each masking noise.
For further details, contact:
email (1): email@example.com
email (2): firstname.lastname@example.org
S3A: Future Spatial Audio for an Immersive Listener Experience at Home
Engineering and Physical Sciences Research Council