
Automatic speech recognition (ASR) models encode rich phonetic information in their internal representations, yet the structure and dynamics of these representations remain difficult to interpret. This talk explores how articulatory features such as place and manner of articulation and voicing can be traced across transformer-based ASR models using probing methods. Drawing on analyses of wav2vec 2.0 and Whisper, we examine how these models encode coarticulation and how robust their representations remain under signal perturbations such as added noise and silent or noisy gaps. By following articulatory feature trajectories across encoder layers, we reveal patterns consistent with phonetic expectations and highlight differences between ASR architectures in their ability to represent and recover linguistic information. Together, these findings demonstrate how linguistically informed probing can improve the interpretability of latent representations in ASR systems.
Dr. Iona Gessinger's research interests lie in how people perceive and produce speech, and in how and why they adapt it to different interlocutors, such as virtual agents. She received her doctorate in 2022 from Universität Saarbrücken for her work on “Phonetic Accommodation of Human Interlocutors in the Context of Human-Computer Interaction” and subsequently conducted research at University College Dublin, where she worked with Prof. Dr. Benjamin Cowan in the School of Information and Communication Studies and Prof. Dr. Julie Carson-Berndsen in the School of Computer Science. She is currently a postdoctoral researcher in the DFG project “Judeo-Spanish in Bulgaria: a contact language between archaism and innovation”, led by Prof. Dr. Christoph Gabriel (JGU Mainz) and Prof. Dr. Bistra Andreeva (Universität Saarbrücken).