CONVENTION PAPER (10011):
Philippa J. Demonte, Yan Tang, Richard J. Hughes, Trevor J. Cox, Bruno M. Fazenda, Ben G. Shirley (2018). Speech-To-Screen: Spatial separation of dialogue from noise towards improved speech intelligibility for the small screen. Convention Paper presented at the 144th Audio Engineering Society Convention, Milan, Italy.
http://www.aes.org/e-lib/browse.cfm?elib=19407

ACKNOWLEDGEMENTS:
This work was supported by the EPSRC Programme Grant S3A: Future Spatial Audio for an Immersive Experience at Home (EP/L000539/1).

FIGSHARE PROJECT CONTAINS:
* S2S Raw Data.zip - zip folder containing 100 x Matlab (.mat) files of raw data
* VOS_data.txt - text file containing correct-word percentages for each of 20 subjects for each of the 16 combinations of conditions tested

3RD PARTY / EXTERNAL LINKS USED IN THIS LISTENING EXPERIMENT:
* GRID audio-visual speech corpus: http://spandh.dcs.shef.ac.uk/gridcorpus/
* Salford-BBC Spatially-sampled Binaural Room Impulse Responses: http://www.bbc.co.uk/rd/publications/sbsbrir

SET-UP FOR 'SPEECH-TO-SCREEN' SPEECH-IN-NOISE LISTENING EXPERIMENT:
* 2 x masking noise types: SSN - speech-shaped noise; SMN - speech-modulated noise
* 2 x video conditions: 0 (video off) - audio only; 1 (video on**) - video + audio
* 4 x auralisations:
  INT - 'internalised'; all audio as uncorrelated stereo in headphones (control condition)
  NS  - noise in stereo; speech binaurally auralised at the screen using HRTFs
  SN  - speech in stereo; noise binaurally auralised at the screen using HRTFs
  EXT - 'externalised'; all audio binaurally auralised at the screen using HRTFs
* SBBC ss BRIRs: 0 deg for speech (point source); +/-30 deg for noise (diffuse source)
* Playback via headphones, with head-tracking (OMNI-Track-Trio system + BBC dynamic rendering)
* Matlab GUI for audio-visual playback and entry of data by participants

** NOTE: ahead of the experiment, the 320 selected audio-video clips from the GRID corpus required syncing for the 'video-on' condition.
Cross-correlation and time-shifting were applied to achieve this.

GRID SENTENCES USED (20 per speaker):
8 male speakers (2, 3, 12, 17, 19, 26, 27, 32) and 8 female speakers (4, 7, 15, 16, 22, 23, 25, 34) selected from the GRID corpus.

SPEAKER 2: bbal7p; bbal8a; bbas1p; bbas2a; bbaz5p; bbaz6a; bbbf7p; bbbf8a; bgaa5n; bgag9n; bgah1p; bgah2a; bgahzs; bgan3n; bgan4s; bgan5p; bgan6a; bgat7n; bgat8s; bgat9p
SPEAKER 3: bbaf3a; bbal4n; bbal5s; bbal6p; bbal7a; bbar8n; bbar9s; bbas1a; bbaszp; bbaz2n; bbaz3s; bbaz4p; bbaz5a; bbbf4n; bbbf5s; bbbf6p; bbbf7a; bgaa4n; bgag8n; bgat6n
SPEAKER 4: bbbf4s; bbir3n; lbad6a; lbaq2s; lbid1p; lbij4s; lbip8s; lbix2s; lbix3p; lrwe8s; lrwfza; pgip9n; pgix4s; prbc9p; prwqzs; pwaj5p; pwap7n; srbh6s; swab3n; swin9p
SPEAKER 7: bbal3a; bbar5s; bbay8n; bbaz1a; bgaa2p; bgat2n; brir7s; brizzn; bwwn1s; bwwt4n; lgal2n; lgar6n; lgilzp; lgir4p; lrwy9a; lwbl1s; sbba3s; srinzp; srit3s; srit5a
SPEAKER 12: bbae1n; bbae2s; bbae3p; bbak6s; bbak8a; bbaq9n; bbar1p; bbar2a; bbarzs; bbay3n; bbay4s; bbay5p; bbay6a; bbbe6s; bbiq5n; bgam3n; bgis3n; lgbe7n; lwwk9n; pbah4s
SPEAKER 15: bbad8n; bbak5a; bbay2p; bgiz7a; brae4n; brik4n; briq8n; lgbe5s; lgid9a; lgik2p; lgiyzp; lrwq6p; lrwy1a; lwwe2n; lwwe3s; lwwr2p; pbia2n; pbig6n; sragzn; srim1s
SPEAKER 16: bbak4a; bgaz9p; bgbg1p; bgbm3n; bgbm6a; bgbtza; bwbz7p; bwwgza; bwws8a; lgid7p; lrwk1p; lwbq8a; lwby2a; sbbf9p; sbbm4a; sbbs7p; sbws9n; sram5p; sriz5p; swig2s
SPEAKER 17: bbak3a; bbakzn; bbaq4n; bbaq6p; bbaq7a; bbax8n; bbax9s; bbay1a; bbayzp; bbid2n; bgwa1a; bgwazp; bgwg3s; bgwg4p; bgwg5a; bgwm7s; bgwm8p; bgwm9a; pbat9a; sbwt1a
SPEAKER 19: bbaj8n; bbaj9s; bbak1a; bbakzp; bbaq3s; bbaq4p; bbaz6n; bbax7s; bbax8p; bbbd8n; bbbd9s; bbbezp; lbao6n; lbavzn; lbio5a; lbiu8p; lbiu9a; pbag7s; pbbt8n; pbit2p
SPEAKER 22: bgiy7n; brbk7p; brid5p; lbab5n; lbih7p; lbiu5p; lrac2s; lrii1n; lrio8a; lrwp9p; lwbd4a; lwbx6a; pbam8s; pbbn2s; pgav2a; sbae9p; sbbl5n; sbws3n; srif1p; swis4s
SPEAKER 23: bbad3a; bbaj7a; bbaq1a; bbax5a; bgwm2p; bgws5s; brij8p; bwwf2p; lbab4n; lbah9s; lbaizp; lbao3s; lrib7s; pgab6n; pgih8p; pgio2p; sbbfnz; srbm1s; srbmzn; swas6n
SPEAKER 25: bgwl9s; lbab2n; lbab3s; lbau5s; lrai2n; lrav3a; lrbi9a; lriu6n; lwap3s; lwbj2n; pgiozp; pgiu4p; pgwv7a; prwh6p; prwo1a; pwwb8p; pwwo4n; pwwo6p; sbar5a; sggbnzn
SPEAKER 26: bbac7n; bbac8s; bbac9p; bbadza; bbaj1n; bbaj3p; lbab2s; lbih3p; lbin6s; lbiu1p; lgaj4s; lgiv9p; lrih9p; lrwp5p; lwbp6s; lwbx1p; pbwa5p; prbuza; prwn7n; srizza
SPEAKER 27: bbajzn; bbaxzp; bgal1a; lbab1s; lbabzn; lbah4n; lbah5s; lban8n; lbau2n; lbau3s; lbia6n; lbia7s; lbia8p; lbia9a; lgaj2n; lrwv9a; lwbp4n; lwbp5s; lwbv8n; pbas6n
SPEAKER 32: bbai6s; bbap2a; bbav4s; bbib7n; bric4s; brii7n; briv5n; lbwu7p; lgac6a; lgap1n; lgipza; lrw09p; lwbc1n; lwbi5n; lwbi8a; lwbv3n; lwbv6a; lwwc5n; pbwg3p; pgan8a
SPEAKER 34: bbac1p; bbai3n; bbai4s; bbbc6a; bric3p; brii5n; brii7p; bripzs; briv4s; bwaj6s; lgao9n; lgib9p; lgicza; lgio7p; lwau7n; lwbv1n; pgig7p; sbid1n; sbix4s; srbr3n

SECTIONS IN .MAT FILES (RAW DATA):
'pract' = practice session with 10 sentences and different combinations of conditions
'duration' = duration (seconds) of each section of 80 speech-in-noise sentences
'subj' = listening experiment identifier for each participant (1-20)
'sent' = (column 1) identifier for the sentence played, e.g. lgipza.wav ("Lay green in P-zero again"); (column 2) identifier for the combination of playback conditions, e.g. SMN_0_EXT
'heard' = target words heard - letter-number pairs input by participants via the GUI, e.g. 'G4'
'rts' = reaction time (seconds) of each participant for each input after the speech-in-noise sentence played
'speaker' = identifier for the GRID speaker who uttered each sentence

FOR STATISTICAL ANALYSIS:
Ignoring the practice sessions, apply a correct-word score to each sentence, i.e.
compare the letter-number pair at positions 4 and 5 of 'sent' with 'heard':
* 1 - if the participant correctly hears BOTH the letter and the number
* 0.5 - if the participant correctly hears EITHER the letter OR the number
* 0 - if the participant hears neither the letter nor the number correctly

Then:
* Calculate averages: 20 sentences per participant, per combination of playback conditions (16 combinations)
* Calculate percentages
* Check the data for normal distribution and sphericity
* Apply a 3-way repeated-measures ANOVA
* Apply post-hoc pairwise analyses and calculate ratio-gain improvements

For further information, contact:
* Philippa Demonte (p.demonte@edu.salford.ac.uk)
* Yan Tang (y.tang@salford.ac.uk)
* Rick Hughes (r.j.hughes@salford.ac.uk)
* Trevor Cox (PI for University of Salford/S3A) (t.j.cox@salford.ac.uk)
* Bruno Fazenda (b.m.fazenda@salford.ac.uk)
* Ben Shirley (b.g.shirley@salford.ac.uk)

ReadMe Figshare.txt - last updated: 29 June 2018 by P. Demonte
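APPENDIX - SCORING SKETCH:
For reference, the per-sentence scoring and aggregation rule described under FOR STATISTICAL ANALYSIS can be sketched as follows. This is an illustrative Python sketch only, not the analysis code used for the paper; the function names are hypothetical, and it assumes GRID filename conventions as in the README ('z' in the filename denotes the digit zero, as in lgipza = "Lay green in P-zero again").

```python
def score_sentence(sent_id, heard):
    """Score one GRID sentence against a participant's response.

    sent_id : GRID filename stem such as 'lgipza'; positions 4 and 5
              (1-indexed) hold the target letter and digit, with 'z'
              encoding the digit zero.
    heard   : letter-number pair entered via the GUI, e.g. 'G4'.
    Returns 1.0 (both correct), 0.5 (exactly one correct), or 0.0.
    """
    target_letter = sent_id[3].upper()
    target_digit = sent_id[4].upper()
    if target_digit == 'Z':          # 'z' in the filename means zero
        target_digit = '0'
    heard = heard.strip().upper()
    heard_letter = heard[0] if len(heard) > 0 else ''
    heard_digit = heard[1] if len(heard) > 1 else ''
    score = 0.0
    if heard_letter == target_letter:
        score += 0.5
    if heard_digit == target_digit:
        score += 0.5
    return score


def condition_percentages(scores_by_condition):
    """Average per-sentence scores for each condition combination
    (e.g. 'SMN_0_EXT') and convert to correct-word percentages."""
    return {cond: 100.0 * sum(scores) / len(scores)
            for cond, scores in scores_by_condition.items()}
```

For example, score_sentence('lgipza', 'P0') returns 1.0 and score_sentence('lgipza', 'P4') returns 0.5. In the actual analysis, the resulting per-participant percentages (20 sentences per participant for each of the 16 condition combinations) fed the 3-way repeated-measures ANOVA described above.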