When we listen to someone speaking, we quickly and effortlessly understand what is being said. This ease obscures the complexity of the neural processes that underlie comprehension. One of the first tasks of the perceptual system is to break the incoming speech stream into units, or segments, that can serve as the basis for subsequent processing steps. Very little is known, however, about the neural mechanisms of linguistic decoding, that is, how information about the physical stimulus is mapped onto linguistic information stored in the brain (informally speaking, words).
Natural sounds, music, and vocal sounds have a rich temporal structure over multiple timescales, and behaviorally relevant acoustic information is usually carried on more than one timescale. For example, speech conveys linguistic information at several scales: 20-80 ms for phonemic information, 100-300 ms for syllabic information, and more than 1000 ms for intonation information. Therefore, successful perceptual analysis of auditory signals requires the auditory system to extract acoustic information at multiple scales.
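To relate these timescales to the frequency bands of neural oscillations, each duration can be converted to a rate by taking its reciprocal. The following is a minimal sketch of that arithmetic using only the ranges quoted above; the helper name `ms_to_hz` is hypothetical and introduced for illustration only.

```python
# Convert the timescales quoted in the text (in ms) to rates in Hz.
# ms_to_hz is a hypothetical helper name used only for this illustration.

def ms_to_hz(duration_ms):
    """Rate in Hz of an event that repeats once every duration_ms milliseconds."""
    return 1000.0 / duration_ms

# (shortest, longest) duration of each unit, as given in the text.
timescales_ms = {
    "phonemic": (20, 80),
    "syllabic": (100, 300),
}

for label, (short_ms, long_ms) in timescales_ms.items():
    # The longest duration gives the lowest rate, and vice versa.
    low_hz, high_hz = ms_to_hz(long_ms), ms_to_hz(short_ms)
    print(f"{label}: {low_hz:.1f}-{high_hz:.1f} Hz")

# Intonation is open-ended (more than 1000 ms), so only an upper bound exists.
print(f"intonation: below {ms_to_hz(1000):.1f} Hz")
```

Under this conversion, syllabic information falls at roughly 3.3-10 Hz, phonemic information at 12.5-50 Hz, and intonation below 1 Hz.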
The precise role of cortical oscillations in speech processing is still under investigation. Current evidence suggests that the phase alignment of delta/theta-band (Δ/θ; 2-8 Hz) neural oscillations in the auditory cortex is involved in the segmentation of speech. In particular, neural oscillations in the theta band correspond to the slow energy fluctuations of the speech signal at the syllabic rate.
Recently, Overath, McDermott, Zarate and Poeppel showed that brain regions involved in speech-specific processing (i.e., the superior temporal sulcus) are activated even by strongly corrupted speech stimuli.
In another recent study, Ding et al. (2016) showed that spectral peaks in neural activity corresponded to multiple levels of linguistic structure (e.g., peaks in the delta and theta range corresponded to the phrase and syllable rate, respectively). Because the stimuli contained no acoustic or prosodic cues at the phrasal time scale, the peaks in the delta range must have been generated internally.
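The logic of this argument can be illustrated with a toy simulation: if the stimulus envelope fluctuates only at the syllable rate, any additional spectral peak in the response at the phrase rate cannot be driven by the acoustics. The sketch below is not the analysis used in the study. The 4 Hz syllable rate, the 2 Hz phrase rate, the sampling rate, the amplitudes, and all function names are assumptions chosen for illustration, and the "response" is a noise-free sum of two sinusoids rather than recorded data.

```python
import cmath
import math

SAMPLE_RATE_HZ = 32      # assumed sampling rate of the simulated signals
DURATION_S = 8           # assumed duration; frequency resolution is 1/8 Hz
SYLLABLE_RATE_HZ = 4.0   # assumed syllable rate (theta range)
PHRASE_RATE_HZ = 2.0     # assumed phrase rate (delta range)

def sinusoid(rate_hz, amplitude=1.0):
    """Sampled sinusoid at rate_hz lasting DURATION_S seconds."""
    n = SAMPLE_RATE_HZ * DURATION_S
    return [amplitude * math.sin(2 * math.pi * rate_hz * i / SAMPLE_RATE_HZ)
            for i in range(n)]

def stimulus_envelope():
    """Toy acoustic envelope: energy fluctuates at the syllable rate only."""
    return sinusoid(SYLLABLE_RATE_HZ)

def simulated_response():
    """Toy neural response: syllable-rate tracking plus a phrase-rate component."""
    syllabic = sinusoid(SYLLABLE_RATE_HZ)
    phrasal = sinusoid(PHRASE_RATE_HZ, amplitude=0.5)
    return [s + p for s, p in zip(syllabic, phrasal)]

def amplitude_spectrum(signal, sample_rate_hz):
    """Naive DFT returning (frequency_hz, amplitude) pairs up to Nyquist.

    The 2/n scaling is exact for bins strictly between 0 Hz and Nyquist,
    which is where the components of interest lie here.
    """
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        spectrum.append((k * sample_rate_hz / n, 2 * abs(coeff) / n))
    return spectrum

def peak_frequencies(spectrum, threshold):
    """Frequencies whose amplitude exceeds the threshold."""
    return [freq for freq, amp in spectrum if amp > threshold]

stimulus_peaks = peak_frequencies(
    amplitude_spectrum(stimulus_envelope(), SAMPLE_RATE_HZ), threshold=0.25)
response_peaks = peak_frequencies(
    amplitude_spectrum(simulated_response(), SAMPLE_RATE_HZ), threshold=0.25)

print("stimulus peaks (Hz):", stimulus_peaks)   # [4.0]
print("response peaks (Hz):", response_peaks)   # [2.0, 4.0]
```

In this toy case the 2 Hz peak appears in the simulated response but not in the stimulus envelope, which is the pattern that motivates attributing the delta-range peak to internally generated activity.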