CVX-Optimized Beamforming and Vector Taylor Series Compensation with German ASR Employing Star-Shaped Microphone Array

Details

Authors: Juan A. Morales-Cordovilla (23) Hannes Pessentheiner (23) Martin Hagmüller (23) José A. González (24) Gernot Kubin (23)
Keywords: distant speech recognition ; cvx ; optimized beamforming ; vector Taylor series compensation ; star-shaped microphone array ; reverberant and noisy environment ; natural mixing ; German database
Publication: Lecture Notes in Computer Science
Year: 2014
Issue: 1
DOI: 10.1007/978-3-319-13623-3_16
Source: SpringerLink
Type: Journal

Abstract

This paper addresses the problem of distant speech recognition in reverberant, noisy conditions employing a star-shaped microphone array and vector Taylor series (VTS) compensation. First, a beamformer yields an enhanced single-channel signal by applying convex (CVX) optimization over three spatial dimensions, given the spatio-temporal position of the target speaker as prior knowledge. Then, VTS compensation is applied to the speech features extracted from the time-domain signal produced by the beamformer. Finally, the compensated features are used for speech recognition. Due to a lack of existing resources in German to evaluate the proposed enhancement framework, this paper also introduces a new speech database. In particular, we present a medium-vocabulary German database for microphone arrays, made of embedded clean signals contaminated with real room impulse responses and mixed in a ‘natural’ way with real noises. We show that the proposed enhancement framework performs better than other related systems on the presented database.
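The two enhancement stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' method. As stand-ins it assumes a free-field steering vector, the textbook distortionless minimum-variance problem (a convex problem with a closed-form solution) for the beamformer, and the standard log-mel mismatch function that VTS compensation linearizes. The six-microphone layout, source position, frequency, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def steering_vector(mic_positions, source_position, freq_hz, c=343.0):
    """Free-field steering vector for a source at a known 3-D position (assumed model)."""
    dists = np.linalg.norm(mic_positions - source_position, axis=1)
    delays = (dists - dists[0]) / c  # delays relative to the first microphone
    return np.exp(-2j * np.pi * freq_hz * delays)

def distortionless_weights(noise_cov, d):
    """Minimize w^H R w subject to w^H d = 1; convex, solved here in closed form."""
    r_inv_d = np.linalg.solve(noise_cov, d)
    return r_inv_d / (d.conj() @ r_inv_d)

def vts_mismatch(clean_logmel, noise_logmel):
    """Additive-noise mismatch y = x + log(1 + exp(n - x)) in the log-mel domain."""
    return clean_logmel + np.log1p(np.exp(noise_logmel - clean_logmel))

# Hypothetical star-like layout: one center microphone plus five on a 10 cm circle.
angles = 2 * np.pi * np.arange(5) / 5
mics = np.vstack([
    np.zeros((1, 3)),
    np.column_stack([0.1 * np.cos(angles), 0.1 * np.sin(angles), np.zeros(5)]),
])
source = np.array([1.0, 0.5, 0.3])  # assumed known target-speaker position (m)

d = steering_vector(mics, source, freq_hz=1000.0)
noise_cov = np.eye(len(mics))       # spatially white noise, for illustration only
w = distortionless_weights(noise_cov, d)

# Noisy log-mel features predicted from assumed clean-speech and noise features.
clean = np.array([2.0, 0.0, -1.0])
noise = np.array([-3.0, 0.0, 1.0])
noisy = vts_mismatch(clean, noise)
```

With an identity noise covariance the weights reduce to delay-and-sum, `w = d / 6`. A real system would estimate the covariance from data and would typically hand the constrained problem to a convex solver when further constraints are added.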