Modeling of Speech-dependent Own Voice Transfer Characteristics for Hearables with an In-ear Microphone

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo

Abstract

Many hearables contain an in-ear microphone, which may be used to capture the own voice of the user. However, because the hearable occludes the ear canal, the in-ear microphone mostly records body-conducted speech, which typically suffers from band limitation and amplification at low frequencies. Since the occlusion effect is determined by the ratio between the air-conducted and body-conducted components of own voice, the own voice transfer characteristics between the outer face of the hearable and the in-ear microphone depend on the speech content and the individual talker. In this paper, we propose a speech-dependent model of the own voice transfer characteristics based on phoneme recognition, assuming a linear time-invariant relative transfer function for each phoneme. We consider both individual models and models averaged over several talkers. Experimental results based on recordings with a prototype hearable show that the proposed speech-dependent model simulates in-ear signals more accurately than a speech-independent model in terms of technical measures, especially under utterance mismatch and talker mismatch. Additionally, simulation results show that talker-averaged models generalize better to different talkers than individual models.
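As a rough illustration of the idea of one linear time-invariant relative transfer function (RTF) per phoneme, the following sketch applies a phoneme-specific RTF to each STFT frame of the outer microphone signal and resynthesizes an in-ear signal by overlap-add. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names (`stft`, `istft`, `simulate_in_ear`), the STFT parameters, the frame-wise multiplication in the STFT domain, and the placeholder RTF values are all hypothetical, and the per-frame phoneme labels are assumed to come from some phoneme recognizer that is not shown here.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(X, n_fft=512, hop=256):
    """Overlap-add synthesis without a synthesis window.

    With a periodic Hann analysis window and 50 % overlap, the shifted
    windows sum to one, so interior samples are reconstructed exactly.
    """
    n_frames = X.shape[0]
    y = np.zeros((n_frames - 1) * hop + n_fft)
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    for i in range(n_frames):
        y[i * hop : i * hop + n_fft] += frames[i]
    return y


def simulate_in_ear(outer, frame_phonemes, rtfs, n_fft=512, hop=256):
    """Apply the RTF of each frame's phoneme to the outer microphone STFT.

    outer          : 1-D outer microphone signal
    frame_phonemes : one phoneme label per STFT frame (assumed to be given)
    rtfs           : dict mapping label -> complex RTF of length n_fft // 2 + 1
    """
    X = stft(outer, n_fft, hop)
    if len(frame_phonemes) != X.shape[0]:
        raise ValueError("need exactly one phoneme label per STFT frame")
    H = np.stack([rtfs[p] for p in frame_phonemes])
    return istft(H * X, n_fft, hop)


# Toy usage with placeholder values (not estimated from any recordings).
fs, n_fft, hop = 16000, 512, 256
rng = np.random.default_rng(0)
outer = rng.standard_normal(fs)  # stand-in for one second of speech
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

rtfs = {
    # vowel-like placeholder: boost below 1 kHz, attenuate above
    "a": np.where(freqs < 1000.0, 2.0, 0.1).astype(complex),
    # fricative-like placeholder: strong broadband attenuation
    "s": np.full(freqs.shape, 0.05, dtype=complex),
}

n_frames = 1 + (len(outer) - n_fft) // hop
labels = ["a" if (i // 10) % 2 == 0 else "s" for i in range(n_frames)]
in_ear = simulate_in_ear(outer, labels, rtfs, n_fft, hop)
```

A speech-independent model corresponds to the special case in which every frame uses the same RTF. A talker-averaged model could, under the same assumptions, be obtained by averaging the per-phoneme RTFs of several talkers before calling `simulate_in_ear`.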

Links

Journal paper: https://doi.org/10.1051/aacus/2024032

arXiv preprint: https://arxiv.org/abs/2310.06554

Dataset of German own voice recordings: https://doi.org/10.5281/zenodo.10844599

Spectrograms

Example spectrograms of recorded own voice signals at the outer and in-ear microphones, and of simulated in-ear own voice signals.

Audio Examples

recorded outer microphone
recorded in-ear microphone
simulated in-ear (speech-independent individual)
simulated in-ear (speech-independent talker-averaged)
simulated in-ear (speech-dependent individual)
simulated in-ear (speech-dependent talker-averaged)