Subjective quality evaluation of personalized own voice reconstruction systems

Mattes Ohlenbusch, Christian Rollwage, Simon Doclo, Jan Rennies

Abstract

Own voice pickup technology for hearable devices facilitates communication in noisy environments. Own voice reconstruction (OVR) systems enhance the quality and intelligibility of the recorded noisy own voice signals. Since disturbances affecting the recorded own voice signals depend on individual factors, personalized OVR systems may outperform generic ones. In this paper, we propose personalizing OVR systems through data augmentation and fine-tuning, and compare the personalized systems to their generic counterparts. We investigate the influence of personalization on speech quality as assessed by objective metrics, and conduct a subjective listening test to evaluate quality under various conditions. In addition, we assess the prediction accuracy of the objective metrics by comparing predicted quality with subjectively measured quality. Our findings suggest that personalized OVR provides benefits over generic OVR only for some talkers. Our results also indicate that performance comparisons between systems are not always accurately predicted by objective metrics. In particular, certain disturbances lead to a consistent overestimation of quality relative to the subjective ratings.

Links

Dataset of German own voice recordings: https://doi.org/10.5281/zenodo.10844599

Transfer function measurements for simulating environmental noise at hearable microphones: https://doi.org/10.5281/zenodo.11196867

Results

Figure: Subjective MUSHRA quality ratings (averaged over sentences) for speech in the low predicted benefit case.

Figure: Subjective MUSHRA quality ratings (averaged over sentences) for speech in the high predicted benefit case.
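The averaging over sentences mentioned in the captions can be sketched as follows. This is a minimal illustration, not the evaluation code used for the paper: the record layout (listener, condition, sentence, score), the function name `average_over_sentences`, and the example scores are all hypothetical placeholders, and grouping by listener and condition is an assumption.

```python
from collections import defaultdict
from statistics import mean


def average_over_sentences(ratings):
    """Average MUSHRA scores over sentences.

    `ratings` is an iterable of (listener, condition, sentence, score)
    tuples (an assumed layout). Returns a dict that maps
    (listener, condition) to the mean score across all sentences.
    """
    grouped = defaultdict(list)
    for listener, condition, _sentence, score in ratings:
        grouped[(listener, condition)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


# Placeholder scores on the 0-100 MUSHRA scale, NOT data from the study.
example_ratings = [
    ("listener_1", "MWF", "sentence_1", 40),
    ("listener_1", "MWF", "sentence_2", 60),
    ("listener_1", "EBEN", "sentence_1", 70),
    ("listener_1", "EBEN", "sentence_2", 80),
]

averaged = average_over_sentences(example_ratings)
for (listener, condition), score in sorted(averaged.items()):
    print(listener, condition, score)
```

Averaging per listener first keeps one value per listener and condition, which is a common starting point for plotting rating distributions across listeners.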

Audio Examples (recorded pseudo-diffuse factory noise at 0 dB SNR)

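Processing condition | Low predicted benefit from personalization | High predicted benefit from personalization
Clean outer microphone | (audio) | (audio)
Noisy outer microphone | (audio) | (audio)
Noisy in-ear microphone | (audio) | (audio)
EBEN | (audio) | (audio)
MWF | (audio) | (audio)
Generic data augmentation | (audio) | (audio)
Generic data augmentation, generic fine-tuning | (audio) | (audio)
Generic data augmentation, personalized fine-tuning | (audio) | (audio)
Personalized data augmentation, personalized fine-tuning | (audio) | (audio)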
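
For readers who want to create comparable noisy mixtures, the sketch below shows one way to scale a noise signal to a target SNR such as the 0 dB used for the audio examples. It assumes the SNR is defined as the ratio of broadband average powers over the whole signal; the exact SNR definition and reference microphone used for the examples are not stated here, and the function names and toy signals are hypothetical.

```python
import math


def average_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def measure_snr_db(speech, noise):
    """Broadband SNR in dB (assumed definition: ratio of average powers)."""
    return 10 * math.log10(average_power(speech) / average_power(noise))


def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return a scaled copy of `noise` so that the speech-to-noise
    power ratio equals `target_snr_db`."""
    speech_power = average_power(speech)
    noise_power = average_power(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must have non-zero power")
    # Solve speech_power / (gain**2 * noise_power) = 10**(target_snr_db / 10)
    gain = math.sqrt(speech_power / (noise_power * 10 ** (target_snr_db / 10)))
    return [gain * x for x in noise]


# Toy signals standing in for real recordings (placeholders only).
speech = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(16000)]
noise = [0.1 * math.sin(2 * math.pi * 1234 * n / 16000 + 0.5) for n in range(16000)]

scaled_noise = scale_noise_to_snr(speech, noise, target_snr_db=0.0)
noisy = [s + v for s, v in zip(speech, scaled_noise)]
print(round(measure_snr_db(speech, scaled_noise), 6))
```

With real recordings, the same scaling would typically be applied after loading the signals as arrays, and the power might be computed over speech-active segments only, which is a design choice left open in this sketch.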