Transcript and Presenter's Notes

Title: Position Calibration of Acoustic Sensors and Actuators


1
Position Calibration of Acoustic Sensors and Actuators on Distributed General Purpose Computing Platforms
Vikas Chandrakant Raykar
University of Maryland, College Park
2
Motivation
  • Many emerging multimedia applications use
    multiple audio/video sensors and actuators.

(Diagram: microphones, speakers, cameras, and displays connected to distributed capture, distributed rendering, number crunching, and other applications; the current thesis scope is highlighted.)
3
What can you do with multiple microphones?
  • Speaker localization and tracking.
  • Beamforming or Spatial filtering.

4
Some Applications
Speech recognition
Hands-free voice communication
Novel interactive audio-visual interfaces
Multichannel speech enhancement
Smart conference rooms
Audio/image based rendering
Audio/video surveillance
Speaker localization and tracking
Multichannel echo cancellation
Source separation and dereverberation
Meeting recording
5
More Motivation
  • Current work has focused on setting up all the
    sensors and actuators on a single dedicated
    computing platform.
  • Dedicated infrastructure is required in terms of
    sensors, multi-channel interface cards, and
    computing power.
  • On the other hand:
  • Computing devices such as laptops, PDAs,
    tablets, cellular phones, and camcorders have
    become pervasive.
  • Audio/video sensors on different laptops can be
    used to form a distributed network of sensors.

6
(No Transcript)
7
(No Transcript)
8
Common TIME and SPACE
  • Put all the distributed audio/visual input/output
    capabilities of all the laptops into a common
    TIME and SPACE.
  • This thesis deals with common SPACE, i.e.,
    estimating the 3D positions of the sensors and
    actuators.
  • Why common SPACE?
  • Most array processing algorithms require that
    precise positions of microphones be known.
  • Manual measurement is painful, tedious, and
    imprecise.

9
This thesis is about..
(Figure: 3D coordinate axes X, Y, Z.)
10
If we know the positions of the speakers..
If the distances are not exact, or if we have more
speakers, solve in the least-squares sense.
(Figure: unknown microphone position in the X-Y
plane.)
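A minimal sketch of this least-squares idea (illustrative only, not the thesis code; the speaker positions, noise level, and helper names are assumptions), using numpy and scipy:

```python
# Sketch: locate one microphone from noisy distances to speakers at known
# 2D positions, in the least-squares sense.
import numpy as np
from scipy.optimize import least_squares

speakers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 2.0], [3.0, 2.0]])  # known positions
true_mic = np.array([1.2, 0.7])
distances = np.linalg.norm(speakers - true_mic, axis=1)
distances = distances + np.random.normal(scale=0.01, size=distances.shape)  # inexact measurements

def residuals(mic):
    # Difference between distances implied by a candidate position and the measurements.
    return np.linalg.norm(speakers - mic, axis=1) - distances

estimate = least_squares(residuals, x0=np.zeros(2)).x
print(estimate)  # close to true_mic
```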
11
If positions of speakers unknown
  • Consider M microphones and S speakers.
  • What can we measure?

Distance between each speaker and all microphones,
or Time Of Flight (TOF): an M x S TOF matrix.
Assume the TOF is corrupted by Gaussian noise; we
can derive the ML estimate.
Calibration signal
12
Nonlinear Least Squares..
More formally, we can derive the ML estimate using
a Gaussian noise model.
Find the coordinates which minimize this cost.
13
Maximum Likelihood (ML) Estimate..
If the noise is Gaussian and independent, the ML
estimate is the same as least squares.
We can define a noise model and derive the ML
estimate, i.e., maximize the likelihood.
Gaussian noise
14
Reference Coordinate System
Reference coordinate system: multiple global minima.
Fix a reference: origin, X axis, positive Y axis.
Similarly in 3D: 1. Fix the origin (0,0,0). 2. Fix
the X axis (x1,0,0). 3. Fix the Y axis (x2,y2,0).
4. Fix the positive Z axis; x1, x2, y2 > 0.
Which to choose? Later.
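Because any rigid motion or reflection of all the points gives the same distances, the cost has multiple global minima; a minimal 2D sketch (assumed details, not the thesis code) of mapping an estimated configuration into such a reference frame:

```python
# Sketch (2D case): fix the reference frame by putting the first point at the
# origin, the second on the positive X axis, and the third in the upper half
# plane (y > 0), removing translation, rotation, and reflection.
import numpy as np

def fix_reference_frame(points):
    points = np.asarray(points, dtype=float)
    points = points - points[0]                       # origin at point 0
    angle = np.arctan2(points[1, 1], points[1, 0])    # angle of point 1
    c, s = np.cos(-angle), np.sin(-angle)
    rotation = np.array([[c, -s], [s, c]])
    points = points @ rotation.T                      # point 1 on the positive X axis
    if points[2, 1] < 0:                              # reflect so point 2 has y > 0
        points[:, 1] = -points[:, 1]
    return points
```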
15
On a synchronized platform all is well..
16
However, on a distributed system..
17
The journey of an audio sample..
Network
This laptop wants to play a calibration signal on
the other laptop: a play command is issued in
software. When will the sound actually be played
out from the loudspeaker?
18
On a Distributed system..
(Timeline figure: time origin; playback started and
signal emitted by source j; capture started and
signal received by microphone i.)
19
M x S TOF measurements
Joint estimation..
Microphone and speaker coordinates: 3(M+S) - 6
Microphone capture start times: M - 1 (assume tm_1 = 0)
Speaker emission start times: S
In total 4M + 4S - 7 parameters to estimate from
M x S observations. Can reduce the number of
parameters.
20
Use Time Difference of Arrival (TDOA)..
Same formulation as above, but with fewer
parameters.
21
Assuming M = S = K: minimum K required..
22
Nonlinear least squares..
Levenberg-Marquardt method
The cost is a function of a large number of
parameters; unless we have a good initial guess it
may not converge to the minimum. An approximate
initial guess is required.
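A hedged sketch of this refinement step (function names and shapes are illustrative, not the thesis code), assuming scipy, that the TOFs have already been converted to distances, and omitting the reference-frame constraints from slide 14 for brevity:

```python
# Refine approximate microphone and speaker coordinates with
# Levenberg-Marquardt, minimizing the mismatch between measured distances
# (TOF times speed of sound) and the distances implied by the coordinates.
import numpy as np
from scipy.optimize import least_squares

def residuals(params, M, S, dist_meas):
    mics = params[:3 * M].reshape(M, 3)
    spks = params[3 * M:].reshape(S, 3)
    pred = np.linalg.norm(mics[:, None, :] - spks[None, :, :], axis=2)  # M x S distances
    return (pred - dist_meas).ravel()

def refine(mics0, spks0, dist_meas):
    M, S = dist_meas.shape
    x0 = np.concatenate([mics0.ravel(), spks0.ravel()])  # a good initial guess matters
    sol = least_squares(residuals, x0, args=(M, S, dist_meas), method='lm')
    return sol.x[:3 * M].reshape(M, 3), sol.x[3 * M:].reshape(S, 3)
```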
23
Closed form Solution..
  • Say we are given all pairwise distances between
    N points: can we get the coordinates?

24
Classical Metric Multidimensional Scaling (MDS)
Dot product matrix: symmetric, positive
semidefinite, rank 3.
Given B, can you get X? .... Singular Value
Decomposition.
Same as Principal Component Analysis, but we can
measure only the pairwise distance matrix.
25
How to get the dot product matrix from the pairwise
distance matrix
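A minimal sketch of this step (helper name assumed, not the thesis code): double-center the squared distance matrix to obtain the dot product matrix B, then recover coordinates from its top eigenvectors.

```python
# Classical metric MDS sketch: D is the N x N pairwise distance matrix.
import numpy as np

def classical_mds(D, dim=3):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix (centroid as origin)
    B = -0.5 * J @ (D ** 2) @ J              # dot product matrix from squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dim]  # keep the dim largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale         # coordinates up to rotation/reflection
```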
26
Centroid as the origin
Later shift it to our original reference.
Slightly perturb each GPC location into two to get
the initial guess for the microphone and speaker
coordinates.
27
Example of MDS
28
MDS is more general..
  • Instead of pairwise distances we can use pairwise
    dissimilarities.
  • When the distances are Euclidean, MDS is
    equivalent to PCA.
  • E.g., face recognition, wine tasting.
  • Can get the significant cognitive dimensions.

29
Can we use MDS? Two problems:
1. We do not have the complete pairwise distances.
2. The measured distances include the effect of the
lack of synchronization.
(Matrix figure: the unavailable entries are marked
UNKNOWN.)
30
Clustering approximation
31
Clustering approximation
32
Finally the complete algorithm
TOF matrix -> clustering approximation -> approximate
distance matrix between GPCs (plus approximate ts and
tm) -> dot product matrix -> MDS to get approximate
GPC locations (fixing the dimension and coordinate
system) -> perturb -> approximate microphone and
speaker locations -> TDOA based nonlinear
minimization -> microphone and speaker locations
and tm.
33
Sample result in 2D
34
Algorithm Performance
  • The performance of our algorithm depends on:
  • Noise variance in the estimated distances.
  • Number of microphones and speakers.
  • Microphone and speaker geometry.
  • One way to study the dependence is to run a lot
    of Monte Carlo simulations.
  • Alternatively, we can derive the covariance matrix
    and bias of the estimator.
  • The ML estimate is implicitly defined as the
    minimum of a certain error function.
  • We cannot get an exact analytical expression for
    the mean and variance.
  • Or, given a noise model, we can derive bounds on
    how badly our algorithm can perform.
  • The Cramer-Rao bound.

35
Estimator Variance
  • We can use the implicit function theorem and a
    Taylor series expansion to get approximate
    expressions for the bias and variance.
  • J. A. Fessler. Mean and variance of implicitly
    defined biased estimators (such as penalized
    maximum likelihood): applications to tomography.
    IEEE Trans. Image Processing, 5(3):493-506, 1996.
  • Amit Roy Chowdhury and Rama Chellappa.
    Statistical bias and the accuracy of 3D
    reconstruction from video. Submitted to the
    International Journal of Computer Vision.
  • Using a first-order Taylor series expansion:

Rank deficit: remove the known parameters.
Jacobian
36
  • Gives the lower bound on the variance of any
    unbiased estimator.
  • Does not depend on the estimator, just on the data
    and the noise model.
  • Basically tells us to what extent the noise
    limits our performance, i.e., you cannot get a
    variance lower than the CR bound.

Rank deficit: remove the known parameters.
Jacobian
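A small numerical sketch of such a bound (inputs and function name are assumptions, not the thesis code): with i.i.d. Gaussian noise of standard deviation sigma on the measured distances, the Fisher information is J^T J / sigma^2 and its inverse, taken over the free parameters, is the Cramer-Rao bound.

```python
# Cramer-Rao bound from the Jacobian of the distance model. Columns for the
# parameters fixed by the reference coordinate system must be dropped first,
# otherwise the Fisher information matrix is rank deficient.
import numpy as np

def cramer_rao_bound(jacobian, sigma, free_cols):
    J = jacobian[:, free_cols]       # keep only the free (unknown) parameters
    fisher = (J.T @ J) / sigma ** 2  # Fisher information under i.i.d. Gaussian noise
    return np.linalg.inv(fisher)     # lower bound on the covariance of any unbiased estimator
```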
37
Different Estimators..
38
Number of sensors matters
39
Number of sensors matters
40
Geometry also matters
41
Geometry also matters
42
(No Transcript)
43
Calibration Signal
44
Time Delay Estimation
  • Compute the cross-correlation between the signals
    received at the two microphones.
  • The location of the peak in the cross correlation
    gives an estimate of the delay.
  • The task is complicated for two reasons:
  • 1. Background noise.
  • 2. Channel multi-path due to room
    reverberations.
  • Use the Generalized Cross Correlation (GCC);
    see the sketch after this list.
  • W(ω) is the weighting function.
  • PHAT (Phase Transform) weighting
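A minimal GCC-PHAT sketch (illustrative, assuming numpy; not the thesis implementation): whiten the cross-power spectrum so that only phase information remains, then read the delay off the peak of the inverse FFT.

```python
# GCC-PHAT delay estimate between two microphone signals x and y.
import numpy as np

def gcc_phat_delay(x, y, fs):
    n = len(x) + len(y)
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    cross = X * np.conj(Y)
    cross = cross / (np.abs(cross) + 1e-12)  # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)
    cc = np.concatenate((cc[-(len(y) - 1):], cc[:len(x)]))  # reorder so zero lag sits at index len(y)-1
    lag = np.argmax(np.abs(cc)) - (len(y) - 1)
    return lag / fs  # delay of x relative to y, in seconds
```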

45
Time Delay Estimation
46
Synchronized setup: bias 0.08 cm, sigma 3.8 cm
47
Distributed Setup
48
Experimental results using real data
49
Related Previous work
  • J. M. Sachar, H. F. Silverman, and W. R.
    Patterson III. Position calibration of
    large-aperture microphone arrays. ICASSP 2002.
  • Y. Rockah and P. M. Schultheiss. Array shape
    calibration using sources in unknown locations,
    Part II: Near-field sources and estimator
    implementation. IEEE Trans. Acoust., Speech,
    Signal Processing, ASSP-35(6):724-735, June 1987.
  • J. Weiss and B. Friedlander. Array shape
    calibration using sources in unknown locations:
    a maximum-likelihood approach. IEEE Trans.
    Acoust., Speech, Signal Processing,
    37(12):1958-1966, December 1989.
  • R. Moses, D. Krishnamurthy, and R. Patterson. A
    self-localization method for wireless sensor
    networks. EURASIP Journal on Applied Signal
    Processing, Special Issue on Sensor Networks,
    2003(4):348-358, March 2003.

50
Our Contributions
  • Novel setup for array processing.
  • Position calibration in a distributed scenario.
  • Closed form solution for the non-linear
    minimization routine.
  • Expressions for the mean and variance of the
    estimators.
  • Study the effect of sensor geometry.


51
Acknowledgements
  • Dr. Ramani Duraiswami and Prof. Rama Chellappa
  • Prof. Yegnanarayana
  • Dr. Igor Kozintsev and Dr. Rainer Lienhart,
    Intel Research
  • Prof. Min Wu and Prof. Shihab Shamma
  • Prof. Larry Davis


52
Thank You! Questions?