Inharmonic Speech Segregation

Authors: Sara Popham, Dana Boebinger, Dan P. W. Ellis, Hideki Kawahara, Josh H. McDermott
Updated: Tue 29 May 2018
Source: http://mcdermottlab.mit.edu/inharmonic_speech_examples/index.html
Type: audio files
Languages: English
Keywords: sound-sources, brain, frequency, harmonics, English
Open Access: yes
License: CC BY 4.0
Publications: Popham, S., Boebinger, D., Ellis, D.P. et al. (2018). Inharmonic speech reveals the role of harmonicity in the cocktail party problem. Nature Communications. 9, 2122. https://doi.org/10.1038/s41467-018-04551-8
Citation: Popham, S., Boebinger, D., Ellis, D., Kawahara, H., McDermott, J. (2018). Example Stimuli from Experiments on Inharmonic Speech Segregation. Massachusetts Institute of Technology: McDermott Lab. http://mcdermottlab.mit.edu/inharmonic_speech_examples/index.html
Summary:

The “cocktail party problem” requires us to discern individual sound sources from mixtures of sources. The brain must use knowledge of natural sound regularities for this purpose. One much-discussed regularity is the tendency for frequencies to be harmonically related (integer multiples of a fundamental frequency). To test the role of harmonicity in real-world sound segregation, we developed speech analysis/synthesis tools to perturb the carrier frequencies of speech, disrupting harmonic frequency relations while maintaining the spectrotemporal envelope that determines phonemic content. We find that violations of harmonicity cause individual frequencies of speech to segregate from each other, impair the intelligibility of concurrent utterances despite leaving intelligibility of single utterances intact, and cause listeners to lose track of target talkers. However, additional segregation deficits result from replacing harmonic frequencies with noise (simulating whispering), suggesting additional grouping cues enabled by voiced speech excitation. Our results demonstrate acoustic grouping cues in real-world sound segregation.
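
To make the notion of harmonicity concrete, the following minimal sketch contrasts a harmonic series (integer multiples of a fundamental frequency) with a perturbed, inharmonic set of component frequencies. It is an illustration only, not the analysis/synthesis tools used to create these stimuli: the function names, the uniform random jitter, the `max_jitter` value of 0.3, and the choice to leave the fundamental unshifted are all assumptions made for this example.

```python
import random


def harmonic_frequencies(f0, n_components):
    """Return the first n_components integer multiples of f0 (a harmonic series)."""
    return [k * f0 for k in range(1, n_components + 1)]


def jittered_frequencies(f0, n_components, max_jitter=0.3, seed=0):
    """Return component frequencies that are no longer integer multiples of f0.

    Each component above the fundamental is shifted by a random fraction of f0
    drawn uniformly from [-max_jitter, max_jitter]. This jitter scheme is a
    hypothetical stand-in, not the procedure used for the dataset.
    """
    rng = random.Random(seed)
    freqs = [f0]  # assumption: the fundamental itself is left in place
    for k in range(2, n_components + 1):
        freqs.append((k + rng.uniform(-max_jitter, max_jitter)) * f0)
    return freqs


def is_harmonic(freqs, f0, tol=1e-9):
    """Check whether every frequency is an integer multiple of f0 (within tol)."""
    return all(abs(f / f0 - round(f / f0)) < tol for f in freqs)


harmonic = harmonic_frequencies(100.0, 4)
inharmonic = jittered_frequencies(100.0, 4)
print(harmonic, is_harmonic(harmonic, 100.0))
print(inharmonic, is_harmonic(inharmonic, 100.0))
```

In this sketch only the component frequencies change; a real stimulus would additionally need to preserve the spectrotemporal envelope of the speech, which this example does not model.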