What neuronal and cognitive representations and computations form the basis for the transformation from “vibrations in the ear” (sounds) to “abstractions in the head” (words)? Successful communication using spoken language requires a speech processing system that can negotiate between the demands of auditory perception and motor outputs, on the one hand, and the representational and computational requirements of the language system, on the other. The speech processing system is situated at the interface of acoustics and sensorimotor physiology, at one end, and linguistics, at the other end. The effortlessness and automaticity of the perceptuo-motor and linguistic processes as we experience them belie the number and complexity of the bottom-up and top-down operations that - in aggregate - constitute the perception of speech and comprehension of language. Developing a mechanistic understanding of this system remains one of the foundational challenges for cognitive neuroscience. The research carried out draws - in a relentlessly interdisciplinary fashion - from neuroscience, linguistics, psychology, computation, and other relevant domains.
Screaming is an ability we share with many other primates, and one that we possess long before we learn to express our affective state with speech. Previous studies focusing on fearful screams highlighted certain acoustic features, such as roughness, that are not exploited by speech (Arnal et al., 2015) and that activate the amygdala and other subcortical structures critical for danger appraisal.
The tone of the voice carries information about the emotional state or intention of a speaker. Whereas the acoustic features that distinguish contrasting prosodic signals have attracted considerable attention in recent decades (particularly since Banse & Scherer, 1996), how emotions and intentions are communicated remains poorly understood.
In a foundational study published in 1955, Miller and Nicely measured the perceptual confusions among 16 consonants followed by the vowel [a] (ba, da, ga, etc.). The stimuli were subjected to different kinds of linear distortions, i.e., additive noise and variations in bandwidth.
Bilingualism has become common. Infants' auditory cortex undergoes structural maturation during the first three years of life (Yakovlev & Lecours, 1967). Infants develop an auditory capacity to specifically recognize the acoustic patterns used in their native language (e.g., Kuhl et al., 1992).
Humans naturally tune in to the rhythm of speech (Giraud & Poeppel, 2012). Recent work has shown that low-frequency brain rhythms concurrently track the main constituents in a linguistic hierarchy: phrases and sentences (Ding et al., 2016).