Elia Formisano and Bruno L. Giordano © Bruno L. Giordano

NASCE

ERC Synergy Grant


NASCE (Horizon Europe - ERC SyG 101167313)
Natural Auditory Scenes in Humans and Machines

In an office, on a metro, or at home, diverse sounds may surround you: a computer fan, footsteps and indistinct chatter, distant cars, the train slowing down. These sounds shape our awareness of the environment, even when visual cues are absent. Despite the ecological relevance of this ability, how the brain parses the acoustic scene into semantic objects remains a major scientific challenge. Moreover, individuals with hearing impairments, including those relying on hearing aids or cochlear implants, face both auditory and cognitive challenges in environments with multiple sound sources.

The NASCE project aims to provide a mechanistic understanding of real-world auditory scene analysis (ASA) through a novel framework: the Semantic Segmentation Hypothesis (SSH). The SSH posits that semantic representations drive real-world ASA, shifting the focus from early acoustic processing to the semantic analysis of everyday sounds. It addresses fundamental questions such as: How does the brain create semantic representations of sound sources? How do these representations interact with acoustic processing and support our ability to recognize sounds in a scene?

NASCE integrates methods from cognitive psychology, neuroscience, and AI. Using neuroimaging and behavioral paradigms, we aim to reveal how the brain dynamically represents auditory scenes across brain regions during relevant listening tasks. Using deep neural networks and ontologies, we aim to build neuroscientifically grounded computational models that simulate brain and behavioral responses to the same scenes and tasks as in the experiments. Finally, with advanced analytical methods we will consolidate behavioral, computational, and neuroscientific insights and establish the SSH as a groundbreaking theory of ASA. NASCE promotes a paradigm shift that will fundamentally reshape our understanding of ecological hearing. Moreover, it paves the way for machine-hearing applications of computational models that mimic human auditory cognition.
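
As a purely illustrative sketch of how model representations can be related to brain responses, the snippet below implements a generic representational similarity analysis (RSA), one common way to compare the two. It correlates pairwise scene dissimilarities computed from hypothetical deep-network embeddings with those computed from hypothetical brain responses. All data, shapes, and variable names here are placeholder assumptions and do not describe NASCE's actual analysis pipeline.

    # Illustrative sketch only: generic representational similarity analysis (RSA).
    # All names and shapes are hypothetical placeholders, not NASCE's pipeline.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)

    n_scenes = 50      # hypothetical number of auditory scenes
    model_dim = 128    # hypothetical DNN embedding dimensionality
    n_voxels = 500     # hypothetical number of fMRI voxels in a region

    # Placeholder data: in practice these would be DNN embeddings of the scenes
    # and brain responses recorded while participants listen to the same scenes.
    model_embeddings = rng.normal(size=(n_scenes, model_dim))
    brain_responses = rng.normal(size=(n_scenes, n_voxels))

    # Representational dissimilarity matrices (condensed form): pairwise
    # distances between scenes in model space and in brain-response space.
    model_rdm = pdist(model_embeddings, metric="correlation")
    brain_rdm = pdist(brain_responses, metric="correlation")

    # Rank-correlate the two RDMs: higher values mean the model's notion of
    # scene similarity better matches the brain's.
    rho, p_value = spearmanr(model_rdm, brain_rdm)
    print(f"model-brain RDM correlation: rho={rho:.3f}, p={p_value:.3g}")

With random placeholder data the correlation is expected to be near zero; the point is only to show the shape of an analysis that links model and neural representations of the same scenes.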

  • Aix-Marseille Université/CNRS