Speaker Localization and Tracking in the Presence of Unreliable Microphones

Many scenarios require estimating the speakers' locations, tracking the speakers over time, or both. The estimation problem is complicated by the presence of sensor and environmental noise. Reverberation adds secondary reflections to the speaker signal, which bias its apparent direction. In addition, a scenario with several concurrent speakers requires simultaneous estimation of several dominant directions.

This work considers multiple-speaker scenarios, with a known number of speakers, observed by multiple microphone pairs. Novel localization and tracking methods for the multiple-source scenario are presented. The methods, applied in the short-time Fourier transform (STFT) domain, are built on a probabilistic model for the phase ratio of closely spaced microphone pairs. A mixture of Gaussians (MoG) model is adapted to describe the speakers' locations, and the likelihood of its parameters is maximized using the expectation-maximization (EM) procedure.
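As a rough illustration of the EM procedure mentioned above, the following is a minimal sketch of batch EM for a generic one-dimensional mixture of Gaussians with a known number of components. It is not the model from the talk: the phase-ratio observations, the microphone geometry, and any constraints on the parameters are omitted, and the names `gauss_pdf` and `em_mog_1d` are hypothetical.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_mog_1d(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Batch EM for a 1-D mixture of Gaussians (generic sketch, assumed setup).

    The number of components is fixed by the length of the initial guesses.
    Returns the estimated (means, variances, weights).
    """
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    n = len(data)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[j] * gauss_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            if total <= 0.0:
                # Numerical underflow: fall back to uniform responsibilities.
                resp.append([1.0 / k] * k)
            else:
                resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the weighted samples.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj <= 0.0:
                continue  # component received no mass; keep its old parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return means, variances, weights
```

Calling `em_mog_1d(samples, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])` on samples drawn from two well-separated clusters would be expected to move the two means toward the cluster centers.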

In this work we consider the problem of estimating the three-dimensional coordinates of multiple speakers rather than only the associated time difference of arrival (TDOA). Unlike TDOA estimation, coordinate estimation requires spatially distributed microphone pairs. With an arbitrary constellation of microphone pairs, it is quite common to observe unreliable readings from a subset of the pairs. This phenomenon can be attributed to microphones that are placed close to noise sources, faulty or hidden microphones, or microphones that suffer from significant sound reflections. To mitigate this problem, we propose a method, based on variance estimation, that automatically estimates the reliability of each microphone pair. It is experimentally shown that the proposed scheme improves localization performance by discarding unreliable microphone readings.

For the tracking scenario, we propose to use variants of the recursive EM (REM) method. Titterington suggested a Newton-based recursion in which the iteration index is replaced by the time index. We dub this method TREM. In this work we also extend Titterington's method to handle the constrained maximization encountered in the MoG formulation of the problem at hand. The recursive, constrained maximum likelihood estimator (MLE) is obtained by incorporating the Lagrange multiplier method into the Newton iterations.
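For orientation, Titterington's recursion is commonly written in the following unconstrained form, up to indexing conventions. The notation is assumed here and does not come from the abstract: \hat{\theta}_t is the parameter estimate after t observations, y_{t+1} is the new observation, f is the observed-data density, and I_c is the complete-data Fisher information matrix for a single observation. The constrained extension described above is not shown.

```latex
\hat{\theta}_{t+1}
  = \hat{\theta}_t
  + \frac{1}{t+1}\, I_c(\hat{\theta}_t)^{-1}\,
    \nabla_{\theta} \log f\!\left(y_{t+1};\, \hat{\theta}_t\right)
```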

Another REM version was suggested by Cappé and Moulines. We dub this method CREM. Compared to Titterington's algorithm, this version is more directly related to the usual EM algorithm. The method is based on smoothing the results of the E-step over time. The M-step is similar to the corresponding stage in the batch EM algorithm.
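The idea of smoothing the E-step results over time can be sketched for a generic one-dimensional mixture of Gaussians as follows. This is an illustrative sketch only, not the CREM tracker from the talk: the class name `OnlineEM1D`, the step-size schedule `(t + offset) ** -step_exp`, and the variance floor are all assumptions made for the example.

```python
import math


class OnlineEM1D:
    """Online EM for a 1-D Gaussian mixture (assumed sketch).

    Per-component sufficient statistics from the E-step are smoothed over
    time; the M-step then maps the smoothed statistics to parameters in the
    same way as batch EM.
    """

    def __init__(self, means, variances, weights, step_exp=0.6, offset=10, var_floor=1e-3):
        self.means = list(means)
        self.variances = list(variances)
        self.weights = list(weights)
        self.step_exp = step_exp    # hypothetical step-size decay exponent
        self.offset = offset        # hypothetical offset that damps early updates
        self.var_floor = var_floor
        self.t = 0
        # Statistics consistent with the initial parameters:
        # E[z_j], E[z_j * x], E[z_j * x^2].
        self.s0 = list(weights)
        self.s1 = [w * m for w, m in zip(weights, means)]
        self.s2 = [w * (v + m * m) for w, m, v in zip(weights, means, variances)]

    def _pdf(self, x, mean, var):
        return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

    def update(self, x):
        """Process one new observation and refresh the parameter estimates."""
        self.t += 1
        gamma = (self.t + self.offset) ** (-self.step_exp)
        k = len(self.means)
        # E-step for the new sample only.
        joint = [self.weights[j] * self._pdf(x, self.means[j], self.variances[j]) for j in range(k)]
        total = sum(joint)
        resp = [1.0 / k] * k if total <= 0.0 else [p / total for p in joint]
        for j in range(k):
            # Time smoothing of the E-step statistics.
            self.s0[j] += gamma * (resp[j] - self.s0[j])
            self.s1[j] += gamma * (resp[j] * x - self.s1[j])
            self.s2[j] += gamma * (resp[j] * x * x - self.s2[j])
            # M-step: same mapping from statistics to parameters as batch EM.
            if self.s0[j] > 0.0:
                self.weights[j] = self.s0[j]
                self.means[j] = self.s1[j] / self.s0[j]
                var = self.s2[j] / self.s0[j] - self.means[j] ** 2
                self.variances[j] = max(var, self.var_floor)
        return self.means, self.variances, self.weights
```

Because each statistic is updated as a convex combination of its previous value and the new sample's contribution, the weights keep summing to one without an explicit normalization step.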

07/11/2012 - 15:00
Ofer Shvarts
Email for registration:
Bar-Ilan University
Building 1103, Room 329