RE: The quest for the ideal dataset: do we have the right tools?

Perhaps this may be of relevance or interest. It is a syllabus from a graduate course offered by Chris Shera and Jennifer Melcher back in 2005 or so. See pp. 2ff. for the list of questions provided. CB

Tuvan throat singing

Slides relevant to the talk (7/7/17) and small-group discussion (7/18/17) can be found here (large ppt) and here (much smaller pdf, sans audio/video).

Identification of linear time-variant systems (applied in a completely different domain):

Learning Speech Signal Processing in Automatic Speech Recognition

Tutorial: Neurobiology of Birdsong

This is a general review of the neurobiology of birdsong and song learning: Translating Birdsong

This review proposes a model of birdsong learning that draws from the mammalian basal ganglia/RL literature: Reinforcement learning in the songbird

A short review on comparative anatomy of mammalian and avian basal ganglia circuitry: Birdbrains and basal ganglia anatomy

Some original research articles on:

This is a PowerPoint of the tutorial: Birdsong Neurobiology Tutorial

Small-group discussion: State-dependent auditory filters in biological systems and their applicability to speech recognition

Board from discussion

Feed-forward adaptation (dynamic range adaptation, gain control, statistics of natural sounds):

Robust encoding of noisy speech & other natural sounds:

Stimulus-specific adaptation


Plasticity, learning, behavior (top-down). This is a VERY small sample of a big literature.

Speaker adaptation: automatic speech recognition perspective:

Small-group discussion: Bayesianity in neural coding -- Predictive of detection and discrimination in noise?

Evidence of Bayesian effects in perception

Humans integrate visual and haptic information optimally

A Bayesian explanation of human audiovisual integration in speech recognition

Bayesianity in visual perception, including skewed distributions

A Bayesian Hypothesis on Olfaction

Dealing with anti-Bayesian percepts
Efficient coding in a Bayesian framework:

Hypotheses on the mechanism of probabilistic inference in neural networks
Representing likelihood distributions with a recurrent neural network:

Dynamic adaptation of population information to match sound statistics:

Bayesian inference in auditory categorization:

How can probabilistic inference be achieved in natural neural networks? (Pecevski and Maass)

Bayesian inference via population coding

Optimal representation of acoustic information
Efficient neural coding of natural sounds

The other side of the discussion - weaknesses of the Bayesian approach in explaining neural information integration

Tutorial by Stephen David: Integrating behavioral state into auditory encoding models

Low-dimensional parameterization of encoding models:

Reward-dependent plasticity of STRFs in auditory cortex:

Task-related plasticity of STRFs in midbrain vs. cortex:

Seminar by Nick Lesica: The neural code for speech

Here are a few of the recent papers that I will touch on:

Tutorial by Nick Lesica: Why do hearing aids fail to restore normal auditory perception?

This is a draft of a review, comments very welcome!

This is a good summary of the effects of hearing loss on the auditory nerve representation of speech

Talks by Phil Garner on vocoding and prosody

A paper on the continuous pitch tracker:
A paper by Pierre-Edouard on emphasis transfer:

Discussion of analysis approaches for multi-neuron recordings

NIPS paper by Pachitariu et al. describing fitting linear dynamical systems to multi-neuron data

Discussion on biologically feasible machine recognition of speech

Notes by Phil Garner; updated after the second discussion. Correlation with actual discussion decays with time.

Notes by Hynek Hermansky:

Tutorial by Andrei Kozlov: Central auditory neurons are flexible gates in a multidimensional feature space

Group discussion on applications to Machine Learning

Paper about gamma-theta coupled oscillations in a spiking network to decode syllable boundaries:

Seminar by Jonathan Simon: Neural representations of speech in human auditory cortex

Ding, N. and J. Z. Simon (2012) Neural Coding of Continuous Speech in Auditory Cortex during Monaural and Dichotic Listening, J Neurophysiol 107, 78-89.

Ding, N. and J. Z. Simon (2012) The Emergence of Neural Encoding of Auditory Objects While Listening to Competing Speakers, PNAS, 109(29), 11854-11859.

Ding, N. and J. Z. Simon (2013) Adaptive Temporal Encoding Leads to a Background Insensitive Cortical Representation of Speech, J Neurosci 33(13), 5728-5735.

Ding, N., M. Chatterjee and J. Z. Simon (2014) Robust Cortical Entrainment to the Speech Envelope Relies on the Spectro-temporal Fine Structure, NeuroImage 88, 41–46.

Presacco, A., J. Z. Simon and S. Anderson (2016) Evidence of Degraded Representation of Speech in Noise, in the Aging Midbrain and Cortex, J Neurophysiol 116, 2346–2355.

Presacco, A., J. Z. Simon and S. Anderson (2016) Effect of Informational Content of Noise on Speech Representation in the Aging Midbrain and Cortex, J Neurophysiol 116, 2356–2367.

Puvvada, K. C., and J. Z. Simon (2017) Cortical Representations of Speech in a Multi-talker Auditory Scene, bioRxiv 124750. doi:10.1101/124750

Seminar by Heather Read on Cortical physiology for discriminating ears

Tutorial by Andrea Hasenstaub on Biophysics of cortical computation

Discussion on Hopf bifurcation and traveling waves

Tutorial by Pascal Martin: The hair bundle as sensor and amplifier in the hair cell

Seminar by Pascal Martin: On the physical limit of hair-bundle mechanosensitivity

Tutorial by Dolores Bozovic: The role of chaos and noise in hair cell sensitivity

Tutorial by Hynek Hermansky: Human hearing and speech technology

Tutorial by Jim Hudspeth: Mechanical amplification by ion channels and myosin molecules in hair cells of the inner ear

Seminar by Frank Jülicher: Critical amplification in the cochlea

Tutorial talk by Richard Lyon on auditory images

Primarily consisting of Chapter 21 in Dick's newly available book "Human and Machine Hearing", which you can learn a bit about here.

Small Discussion by Ankit Patel: Theory of convolutional neural nets

Our NIPS paper on theory of Convnets

Tutorial by Heather Read: Cortical pathway organization for encoding frequency, timing and location of sound

Rodriguez, Read & Escabi, J Neurophys, 2010
Spectral and Temporal Modulation Tradeoff in the Inferior Colliculus

Chen, Read, & Escabi, J Neurosci, 2012
Precise feature based time scales and frequency decorrelation lead to a sparse auditory code

Higgins et al., J Neurosci, 2010
Specialization of Binaural Responses in Ventral Auditory Cortices

Escabi, Read, Viventi et al., J Neurophys, 2014
A high-density, high-channel count, multiplexed μECoG array for auditory-cortex recordings

Storace, Higgins, & Read, J Comp. Neurology, 2011
Thalamocortical pathway specialization for sound frequency resolution

Storace, Higgins, & Read, J Comp. Neurology, 2010
Thalamic label patterns suggest primary and ventral auditory fields are distinct core regions

Storace et al., J. Neurosci, 2012
Gene expression identifies distinct ascending glutamatergic pathways to frequency-organized auditory cortex in the rat brain

Read, Winer, & Schreiner, PNAS, 2001
Modular organization of intrinsic connections associated with spectral tuning in cat auditory cortex

Tutorial by Tobias Reichenbach: Active mechanics and fluid dynamics of the inner ear

All articles from Tobias are available here.

Tutorial talk by Daniel Robert (some cited articles)

Tutorial by Jonathan Simon: How the brain solves the cocktail party problem: evidence from human auditory neuroscience

Tutorial Slides

The Cocktail Party Book:
The Auditory System at the Cocktail Party, Middlebrooks, J., J. Z. Simon, A. R. Popper and R. R. Fay (Eds.), (Springer: New York) 2017.

Simon, J. Z. (2017) Human Auditory Neuroscience and the Cocktail Party Problem, In The Auditory System at the Cocktail Party, Ed.: Middlebrooks, J., J. Z. Simon, A. R. Popper and R. R. Fay (Springer: New York), 169-197.

Gutschalk, A., & Dykstra, A. R. (2014). Functional imaging of auditory scene analysis. Hearing Research, 307, 98–110.

Snyder, J. S., Gregg, M. K., Weintraub, D. M., & Alain, C. (2012). Attention, awareness, and the perception of auditory scenes. Frontiers in Psychology, 3, 15.

Scott, S. K., & McGettigan, C. (2013). The neural processing of masked speech. Hearing Research, 303, 58–66.

Lee, A. K., Larson, E., Maddox, R. K., & Shinn-Cunningham, B. G. (2014). Using neuroimaging to understand the cortical mechanisms of auditory selective attention. Hearing Research, 307, 111–120.

Ahveninen, J., Kopco, N., & Jaaskelainen, I. P. (2014). Psychophysics and neuronal bases of sound localization in humans. Hearing Research, 307, 86–97.

Simon, J. Z. (2015) The Encoding of Auditory Objects in Auditory Cortex: Insights from Magnetoencephalography, Intl J Psychophysiol 95, 184–190.

Also, articles by Simon are available at <>, including:

Akram, S., A. Presacco, J. Z. Simon, S. A. Shamma and B. Babadi (2016) Robust Decoding of Selective Auditory Attention from MEG in a Competing-Speaker Environment via State-Space Modeling, NeuroImage 124, 906–917.

Akram, S., J. Z. Simon and B. Babadi (2016) Dynamic Estimation of the Auditory Temporal Response Function from MEG in Competing-Speaker Environments, IEEE Trans Biomed Eng, doi: 10.1109/TBME.2016.2628884.

Tutorial by Malcolm Slaney: Auditory attention: from saliency to models to applications

Lecture by Malcolm Slaney: Objective measures of listening effort: a better metric for hearing

Lecture by Hynek Hermansky: Life-long learning in machine recognition of speech

Discussion led by Malcolm Slaney: Stamp-collecting in auditory cortex

On representations in ANNs:

Discussion led by Ralf Schlüter: Problems of segmentation and search in speech

Lecture by Catherine Carr: Evolution of hearing

Lecture by Ruedi Stoop: What can Sex, flies and videotapes and Noam Chomsky tell us about the brain?

Peter Norvig's discussion on Chomsky and statistical learning: