ReadMe_Harvard_British_English_recording_2019.txt

Author: Philippa Demonte (ORCID ID: orcid.org/0000-0001-5810-2737; SCOPUS ID: 57202968540)
Acoustics Research Group, University of Salford, United Kingdom
e-mail: p.demonte@edu.salford.ac.uk
tel: +44 (0)7979 578 482
Year: 2019

OVERVIEW:
The HARVARD speech corpus is a database of 720 phonetically-balanced sentences, divided into 72 lists of 10 sentences. See harvard.txt for an overview of the sentences and the online and journal references.
This document outlines the details of a high-quality digital audio recording of the HARVARD speech corpus in its entirety by a female native British English speaker.

AVAILABILITY:
The audio .wav files which constitute this recording of the corpus are hosted on the University of Salford's Figshare site (https://salford.figshare.com). The files include:
* HARVARD_raw_wav_201218.zip (approximately 859 MB uncompressed)
  - featuring 72 .wav files of the lists of spoken sentences and one .wav file of recorded room atmos
* HARVARD_edited_wav_201218.zip (approximately 350 MB uncompressed)
  - featuring 720 .wav files of each individual spoken sentence
* HARVARD_edited_EndPointed_wav_201218.zip (approximately 505 MB uncompressed)
  - featuring 1440 .wav files: 720 files with the audio end-pointed, and 720 files additionally zero-padded at the front and end such that all are of 5 seconds duration
* EndPoint.m
  - the Matlab script created for end-pointing the edited audio files
The audio is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/legalcode).

RECORDING DETAILS:
* Recording date: 20th December 2018
* Talker / Recording engineer: Philippa Demonte
  ==> a female native (Standard Southern) British English speaker in her 40s.
  ==> The talker made a conscious effort to take in-breaths away from the microphone, to articulate clearly, and to utter all sentences at a steady rate.
* Location: semi-anechoic chamber, Acoustics Research Centre, University of Salford, UK
  ==> acoustically treated to standards BS ISO 3744 / ISO 3745 and BS 4196
* Microphone: Electro-Voice RE20
  ==> Element type: Dynamic
  ==> Large diaphragm
  ==> Frequency response: 45 Hz - 18,000 Hz, i.e. relatively flat response
  ==> Polar pattern: Cardioid; no coloration at 180 degrees off-axis
  ==> Impedance: 150 ohms balanced
  ==> Sensitivity, Open Circuit Voltage, 1 kHz: 1.5 mV/pascal
  ==> Hum pickup level, typical (60 Hz/1 millioersted field): -130 dBm
  ==> mic on stand at a height of ~ 1.40 m, placed: 0.89 m from front of room, 1.66 m from both sides of room, 3.18 m from back of room
  ==> the talker faced towards the front of the room at a distance of 0.1 m from the microphone. A 2-screen pop filter was placed in between, at 0.05 m from the microphone.
* Soundcard: Focusrite Scarlett 2i2
  ==> mic plugged into Channel 1; Line; without pad; without 48V (phantom power)
* DAW: Adobe Audition 2017
* Reproduction: Mono
* Sampling rate: 48,000 Hz
* Bit depth: 32 bit
* Gain: around -33 to -20 dB, i.e. a balance point between having a high enough gain for the dialogue and minimising electrical noise from the mic cable.
* Input: Focusrite; Output: Focusrite; Master Clock: Focusrite; Latency: 200 ms; Monitoring: via headphones.
* Saved as: .wav files (+ accompanying .pkf files).
  The filename format is: Harvard list (number) .wav
* Sentences were recorded in groups of 10, i.e. by list

POST-PRODUCTION PROCESSING:
* The raw .wav files were edited into individual .wav files of each HARVARD sentence using Adobe Audition 2017.
* The edited .wav files were then end-pointed using a Matlab script (see EndPoint.m). The script determines the locations of the upcross points in each file, and then sets all amplitude values to zero before the 3rd index and after the 3rd-to-last index.
  The filename format is: Harvard - list number - sentence number - _0.wav
* For the purpose of use in a speech-in-noise test, the researcher required all .wav files to be 5 seconds in duration. Hence, each edited and end-pointed .wav file was additionally zero-padded by 1 second at the front and 1+ seconds at the end.
  The filename format is: Harvard - list number - sentence number - _5.wav

No further processing was applied to these .wav files, as filtering, EQ, amplitude normalisation, and so on would have compromised the high quality of this audio recording.
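
ILLUSTRATIVE SKETCH (not part of the original processing chain):
The end-pointing and zero-padding steps described under POST-PRODUCTION PROCESSING can be sketched in Python as below. This is not a port of EndPoint.m, whose exact logic is not reproduced in this document. It assumes that an "upcross point" means a sample at which the waveform crosses zero in the upward direction, and that "the 3rd index" refers to the 3rd such point. The function names (upcross_indices, end_point, pad_to_duration) and the fallback for files with too few crossings are hypothetical. Reading and writing the .wav files is left out.

```python
def upcross_indices(samples):
    """Indices i where the signal crosses zero going upward,
    i.e. samples[i - 1] < 0 <= samples[i] (assumed meaning of 'upcross')."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < 0 <= samples[i]]


def end_point(samples, skip=3):
    """Zero all samples before the skip-th upcross point and after the
    skip-th-to-last upcross point; the signal length is unchanged."""
    ups = upcross_indices(samples)
    if len(ups) < 2 * skip - 1:
        # Assumption: with too few crossings, leave the signal untouched.
        return list(samples)
    start, stop = ups[skip - 1], ups[-skip]
    return [s if start <= i <= stop else 0.0
            for i, s in enumerate(samples)]


def pad_to_duration(samples, sample_rate=48000, duration_s=5.0, lead_s=1.0):
    """Prepend lead_s seconds of zeros, then append zeros until the
    result is exactly duration_s seconds long."""
    target = int(round(sample_rate * duration_s))
    lead = int(round(sample_rate * lead_s))
    tail = target - lead - len(samples)
    if tail < 0:
        raise ValueError("signal is too long for the requested duration")
    return [0.0] * lead + list(samples) + [0.0] * tail


if __name__ == "__main__":
    # Toy signal at a toy sample rate, purely for demonstration.
    toy = [-1.0, 1.0] * 6
    end_pointed = end_point(toy)
    padded = pad_to_duration(end_pointed, sample_rate=10)
    print(len(padded))
```

With the defaults (48,000 Hz, 5 seconds, 1 second of leading silence), any sentence shorter than 3 seconds ends up with at least 1 second of trailing zeros, which matches the "1 second at the front and 1+ seconds at the end" padding described above.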