Max Planck Institute for Empirical Aesthetics
Lecture by Emmanuel Ponsot: Uncovering the mental code of high-level inferences from speech prosody
Beyond words, speech carries a great deal of information about a speaker through its prosodic structure. Humans have developed a remarkable ability to infer others’ states and attitudes from the temporal dynamics of the different dimensions of speech prosody (i.e., pitch, intensity, timbre, rhythm). However, we still lack a computational understanding of how high-level social or emotional impressions are built from these low-level dimensions. We recently developed a data-driven framework that combines voice-processing techniques with psychophysical reverse correlation to expose the mental representations, or ‘prototypes’, that underlie such inferences in speech. In this talk, I will show several examples of how this framework has been used, including a recent study in which we investigated how intonation drives judgments of the social traits of dominance and trustworthiness. This approach offers a principled way to reverse-engineer the algorithms the brain uses to make high-level inferences from the acoustical characteristics of others’ speech. It holds promise for future research seeking to understand why certain emotional or social judgments differ across cultures and why these inferences may be impaired in some neurological disorders.
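For readers unfamiliar with reverse correlation, the general idea can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the speaker's actual pipeline: the observer is simulated, the "pitch contours" are random numbers rather than manipulated voice recordings, and the names `simulate_reverse_correlation`, `template`, and `noise_sd` are hypothetical. On each trial the simulated listener hears two random contours and picks the one that better matches a hidden internal prototype; averaging the chosen contours and subtracting the average of the rejected ones gives an estimate of that prototype.

```python
import numpy as np


def simulate_reverse_correlation(template, n_trials=2000, noise_sd=1.0, seed=0):
    """Estimate a hidden prototype from simulated two-interval choices.

    All names and parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_points = template.shape[0]

    # Two random pitch contours per trial (arbitrary units, one value per time point).
    stimuli = rng.normal(0.0, 1.0, size=(n_trials, 2, n_points))

    # Hypothetical observer: evidence is the match between each contour and the
    # hidden template, plus internal decision noise.
    evidence = stimuli @ template + rng.normal(0.0, noise_sd, size=(n_trials, 2))
    chosen = evidence.argmax(axis=1)

    # Split the stimuli into chosen and rejected contours.
    trials = np.arange(n_trials)
    chosen_stimuli = stimuli[trials, chosen]
    rejected_stimuli = stimuli[trials, 1 - chosen]

    # First-order kernel: mean chosen contour minus mean rejected contour.
    return chosen_stimuli.mean(axis=0) - rejected_stimuli.mean(axis=0)


# Hidden prototype used only to drive the simulated observer (made-up values).
template = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 0.5])
kernel = simulate_reverse_correlation(template)

# Correlation between the recovered kernel and the hidden prototype.
similarity = np.corrcoef(kernel, template)[0, 1]
print(f"correlation between kernel and template: {similarity:.2f}")
```

In a real experiment the template is unknown and the choices come from human listeners, so the recovered kernel is itself the object of interest; the simulation only shows why averaging stimuli by response can reveal what drives the judgment.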