Title: Object Recognition
1Object Recognition
2Geometric Task
Given two configurations of points in three-dimensional space,
find those rotations and translations of one of the point sets
that produce large superimpositions of corresponding 3-D points.
3Geometric Task (continued)
- Aspects
- Object representation (points, vectors, segments)
- Object resemblance (distance function)
- Transformation (translations, rotations, scaling)
4Transformations
- Translation
- Translation and Rotation
- Rigid Motion (Euclidean Transformation)
- Translation, Rotation and Scaling
5Distance Functions
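- Two point sets: $A = \{a_i\}$, $i = 1, \dots, n$, and $B = \{b_j\}$, $j = 1, \dots, m$
- Pairwise correspondence: $(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$
(1) Exact matching: $\|a_{k_i} - b_{t_i}\| = 0$
(2) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N} < \varepsilon$
- Hausdorff distance: $h(A,B) = \max_{a \in A} \min_{b \in B} \|a - b\|$
- $H(A,B) = \max(h(A,B), h(B,A))$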
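The RMSD and Hausdorff definitions can be sketched in a few lines. This is a brute-force illustration only; the function names are hypothetical, and points are assumed to be plain coordinate tuples.

```python
import math

def rmsd(pairs):
    """Root mean square distance over matched pairs [(a, b), ...]."""
    total = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(total / len(pairs))

def directed_hausdorff(A, B):
    """h(A, B): largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(directed_hausdorff(A, B))  # 0.0: every point of A coincides with a point of B
print(directed_hausdorff(B, A))  # 3.0: (4, 0, 0) is 3 away from its nearest point of A
print(hausdorff(A, B))           # 3.0
```

The example shows why h is not symmetric: A lies entirely inside B, but B has a point far from A.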
6Exact Point Matching in R2
1. Determine the centroids $C_A$, $C_B$ (i.e. arithmetic means) of the sets A and B.
2. Determine the polar coordinates of all points in A using $C_A$ as the origin. Then sort A
lexicographically with respect to these polar coordinates (angle, length), obtaining a sequence
$(\varphi_1, r_1), \dots, (\varphi_n, r_n)$. Let $S_A = (\psi_1, r_1), \dots, (\psi_n, r_n)$, where
$\psi_i = \varphi_i - \varphi_{i-1}$ with indices taken modulo n. Compute in the same way the
corresponding sequence $S_B$ of the set B.
3. Determine whether $S_B$ is a cyclic shift of $S_A$ (i.e. $S_B$ is a substring of $S_A S_A$).
Running time: O(n log n)
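The three steps can be sketched as follows. This is a minimal sketch under stated assumptions: no point coincides with the centroid, all polar angles are distinct, and comparisons use a small tolerance `tol` because of floating-point arithmetic. The cyclic-shift test here is a simple quadratic scan; a linear-time string matcher such as KMP on $S_A S_A$ would be needed to reach the stated bound. All names are hypothetical.

```python
import math

def signature(points):
    """Cyclic sequence of (angle gap, radius) around the centroid, sorted by polar angle."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    polar = sorted((math.atan2(y - cy, x - cx), math.hypot(x - cx, y - cy))
                   for x, y in points)
    two_pi = 2.0 * math.pi
    # Gap to the cyclic predecessor; polar[-1] precedes polar[0].
    return [((polar[i][0] - polar[i - 1][0]) % two_pi, polar[i][1]) for i in range(n)]

def same(u, v, tol):
    return all(abs(p - q) <= tol for p, q in zip(u, v))

def congruent(A, B, tol=1e-9):
    """True if B is a rotated and translated copy of A (up to tol)."""
    if len(A) != len(B):
        return False
    sa, sb = signature(A), signature(B)
    doubled = sa + sa                      # S_A S_A
    n = len(sa)
    return any(all(same(doubled[s + i], sb[i], tol) for i in range(n))
               for s in range(n))

A = [(0, 0), (4, 0), (4, 1), (0, 3)]
B = [(5, 2), (2, -2), (5, -2), (4, 2)]     # A rotated by 90 degrees, shifted by (5, -2), reordered
C = [(0, 0), (-4, 0), (-4, 1), (0, 3)]     # mirror image of A
print(congruent(A, B))  # True
print(congruent(A, C))  # False: a reflection is not a rigid motion of this set
```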
7Approximate Matching in R2, R3 (Hausdorff
distance)
E: Euclidean motion (translation and rotation), $|A| = m$, $|B| = n$
- Select from A diametrically opposing points r and k. $O(m \log m)$
- For each r' from B define $T_{r'}$: the translation that takes r to r'.
- For each k' (k' ≠ r') define $R_{k'}$: the rotation around r' that makes r', k, k' collinear.
- Let $E_{r'k'} = R_{k'} T_{r'}$. Let E be the motion with $h(E(A),B) = \min_{r',k'} h(E_{r'k'}(A),B)$.
- $h(E(A),B) \le 4\, h(E_{opt}(A),B)$
- $O(n^2 m \log_2 n)$
- In R3:
- $h(E(A),B) \le 8\, h(E_{opt}(A),B)$
- $O(n^3 m \log_2 n)$
M.T. Goodrich, J.S.B. Mitchell, M.W. Orletsky
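A minimal sketch of the R2 procedure above, assuming A has at least two points and B has at least two distinct points. It uses brute force both for the diametral pair and for the Hausdorff evaluation, so it does not reach the stated running times; the function names are hypothetical.

```python
import math
from itertools import combinations

def directed_hausdorff(A, B):
    return max(min(math.dist(a, b) for b in B) for a in A)

def approx_match(A, B):
    """Return (smallest h(E(A), B) over the candidate motions, the moved copy of A)."""
    # Diametrically opposing pair of A (brute force here, not O(m log m)).
    r, k = max(combinations(A, 2), key=lambda pair: math.dist(*pair))
    angle_rk = math.atan2(k[1] - r[1], k[0] - r[0])
    best = (math.inf, None)
    for rp in B:                          # r': the translation takes r to r'
        for kp in B:                      # k': the rotation about r' aligns k with k'
            if kp == rp:
                continue
            theta = math.atan2(kp[1] - rp[1], kp[0] - rp[0]) - angle_rk
            c, s = math.cos(theta), math.sin(theta)
            moved = [(rp[0] + c * (x - r[0]) - s * (y - r[1]),
                      rp[1] + s * (x - r[0]) + c * (y - r[1])) for x, y in A]
            h = directed_hausdorff(moved, B)
            if h < best[0]:
                best = (h, moved)
    return best

A = [(0, 0), (4, 0), (4, 1), (0, 3)]
B = [(5, -2), (5, 2), (4, 2), (2, -2), (9, 9)]   # a rigid copy of A plus one extra point
h, moved = approx_match(A, B)
print(h)   # close to 0: some candidate motion reproduces the true one
```

Only n(n-1) candidate motions are examined, which is what makes the method approximate rather than optimal.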
8Superposition - best least-squares (RMSD) rigid alignment
Given two sets of 3-D points $P = \{p_i\}$, $Q = \{q_i\}$, $i = 1, \dots, n$,
find a 3-D rotation $R_0$ and translation $T_0$ such that
$\min_{R,T} \sum_i \|R p_i + T - q_i\|^2 = \sum_i \|R_0 p_i + T_0 - q_i\|^2$.
A closed-form solution exists for this task. It can be computed in O(n) time.
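The slides do not say which closed form is meant. One standard route, given here as an assumption, is the SVD-based (Kabsch) solution: the sums take O(n) time and the SVD is of a fixed 3x3 matrix.

```latex
\begin{align*}
\bar p &= \frac{1}{n}\sum_{i=1}^{n} p_i, \qquad \bar q = \frac{1}{n}\sum_{i=1}^{n} q_i \\
H &= \sum_{i=1}^{n} (p_i - \bar p)(q_i - \bar q)^{\mathsf T} = U \Sigma V^{\mathsf T} \\
R_0 &= V \,\mathrm{diag}\bigl(1,\, 1,\, \det(V U^{\mathsf T})\bigr)\, U^{\mathsf T} \\
T_0 &= \bar q - R_0\, \bar p
\end{align*}
```

The determinant factor keeps $R_0$ a proper rotation (no reflection) when the unconstrained optimum would be a mirror image.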
9Model Database
10Scene
11Recognition
Lamdan, Schwartz, Wolfson, Geometric Hashing, 1988.
12Geometric Matching Task = Geometric Pattern Discovery
13Remarks
- The superimposition pattern is not known a priori (pattern discovery).
- The matching recovered can be inexact.
- We are not necessarily looking for the largest superimposition, since other
matchings may have biological meaning.
14Straightforward Algorithm
- For each pair of triplets, one from each molecule, which define almost congruent triangles,
compute the rigid motion that superimposes them.
- Count the number of point pairs which are almost superimposed, and sort the hypotheses by
this number.
15Naive algorithm (continued)
- For the highest-ranking hypotheses, improve the transformation by replacing it with the best RMSD
transformation for all the matching pairs.
- Complexity, assuming order of n points in both molecules: $O(n^7)$.
- ($O(n^3)$ if one exploits protein backbone geometry.)
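The hypothesis generation and scoring steps can be sketched as below. This is a sketch under stated assumptions: points are 3-D tuples, `eps` is a hypothetical tolerance used both for "almost congruent" and for "almost superimposed", and the final RMSD refinement of the top hypotheses is left out.

```python
import math
from itertools import combinations, permutations

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(dot(u, u))
    return (u[0] / n, u[1] / n, u[2] / n)

def frame(p1, p2, p3):
    """Right-handed orthonormal frame of an ordered triangle, or None if degenerate."""
    normal = cross(sub(p2, p1), sub(p3, p1))
    if dot(normal, normal) < 1e-18:
        return None
    e1, e3 = unit(sub(p2, p1)), unit(normal)
    return p1, (e1, cross(e3, e1), e3)

def sides(t):
    return (math.dist(t[0], t[1]), math.dist(t[1], t[2]), math.dist(t[2], t[0]))

def superimpose(fa, fb, x):
    """Apply to x the rigid motion that carries frame fa onto frame fb."""
    (oa, ea), (ob, eb) = fa, fb
    c = [dot(sub(x, oa), e) for e in ea]          # coordinates of x in frame fa
    return tuple(ob[i] + sum(c[j] * eb[j][i] for j in range(3)) for i in range(3))

def straightforward(A, B, eps=0.1):
    hypotheses = []
    for ta in combinations(A, 3):
        fa = frame(*ta)
        if fa is None:
            continue
        for tb in permutations(B, 3):             # ordered, so every vertex pairing is tried
            fb = frame(*tb)
            if fb is None or any(abs(s - t) > eps for s, t in zip(sides(ta), sides(tb))):
                continue
            moved = [superimpose(fa, fb, a) for a in A]
            score = sum(min(math.dist(a, b) for b in B) <= eps for a in moved)
            hypotheses.append((score, ta, tb))
    hypotheses.sort(key=lambda h: h[0], reverse=True)
    return hypotheses

A = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5), (1, 1, 1)]
# A rotated by 90 degrees about the z axis, shifted by (1, 2, 3), plus one outlier.
B = [(1, 2, 3), (1, 5, 3), (-3, 2, 3), (1, 2, 8), (0, 3, 4), (10, 10, 10)]
best_score, tri_a, tri_b = straightforward(A, B)[0]
print(best_score)  # 5: all points of A are superimposed by the top hypothesis
```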
16Geometric Hashing - Preprocessing
- Pick a reference frame satisfying pre-specified constraints.
- Compute the coordinates of all the other points (in a pre-specified neighborhood) in this
reference frame.
- Use each coordinate as an address to the hash (look-up) table and record in that entry the
(ref. frame, shape signature, point).
- Repeat the above steps for each reference frame.
17Geometric Hashing - Recognition 1
- For the target protein do:
- Pick a reference frame satisfying pre-specified constraints.
- Compute the coordinates of all other points in the current reference frame.
- Use each coordinate to access the hash table to retrieve all the records
(ref. frame, shape signature, point).
18Geometric Hashing - Recognition 2
- For records with matching shape signature, vote for the (ref. frame).
- Compute the transformations of the high-scoring hypotheses.
- Repeat the above steps for each reference frame.
- Cluster similar transformations.
- Extend best matches.
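The preprocessing and voting steps can be sketched as follows. Several choices here are assumptions for illustration, not taken from the slides: every ordered non-degenerate triplet serves as a reference frame, the shape signature is the quantized side lengths of the frame triangle, and `cell` is a hypothetical bin size. Neighboring bins are not searched, and transformation clustering and match extension are left out.

```python
import math
from collections import defaultdict
from itertools import permutations

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def local_coords(pts, i, j, k):
    """(index, coordinates) of the remaining points in the frame of triangle (i, j, k)."""
    p1, a, b = pts[i], sub(pts[j], pts[i]), sub(pts[k], pts[i])
    n = cross(a, b)
    if dot(n, n) < 1e-18:
        return None                                   # degenerate triangle: no frame
    e1 = tuple(c / math.sqrt(dot(a, a)) for c in a)
    e3 = tuple(c / math.sqrt(dot(n, n)) for c in n)
    e2 = cross(e3, e1)
    return [(m, tuple(dot(sub(x, p1), e) for e in (e1, e2, e3)))
            for m, x in enumerate(pts) if m not in (i, j, k)]

def quantize(values, cell):
    return tuple(round(v / cell) for v in values)

def shape_signature(pts, i, j, k, cell):
    return quantize((math.dist(pts[i], pts[j]), math.dist(pts[j], pts[k]),
                     math.dist(pts[k], pts[i])), cell)

def build_table(model, cell=0.5):
    """Preprocessing: hash every point in every reference frame."""
    table = defaultdict(list)
    for ref in permutations(range(len(model)), 3):
        coords = local_coords(model, *ref)
        if coords is None:
            continue
        sig = shape_signature(model, *ref, cell)
        for m, c in coords:
            table[quantize(c, cell)].append((ref, sig, m))
    return table

def recognize(table, target, cell=0.5):
    """Recognition: return (votes, model frame, target frame) of the best hypothesis."""
    best = (0, None, None)
    for ref in permutations(range(len(target)), 3):
        coords = local_coords(target, *ref)
        if coords is None:
            continue
        sig = shape_signature(target, *ref, cell)
        votes = defaultdict(int)
        for _, c in coords:
            for model_ref, model_sig, _ in table.get(quantize(c, cell), ()):
                if model_sig == sig:
                    votes[model_ref] += 1
        for model_ref, v in votes.items():
            if v > best[0]:
                best = (v, model_ref, ref)
    return best

model = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5), (1, 1, 1), (2, -1, 3)]
# The model rotated by 90 degrees about z and shifted by (1, 2, 3), plus one outlier.
target = [(10, 10, 10), (1, 2, 3), (1, 5, 3), (-3, 2, 3), (1, 2, 8), (0, 3, 4), (2, 4, 6)]
votes, model_frame, target_frame = recognize(build_table(model), target)
```

With six model points, a correct pair of frames can collect at most three votes (the three points outside the frame triangle), which is what this example reaches.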
19A 3-D reference frame can be uniquely defined by
the ordered vertices (p1, p2, p3) of a non-degenerate triangle
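The slide does not spell out the construction. One common choice, given here as an assumption, places the origin at $p_1$ and builds a right-handed orthonormal basis from the two triangle edges:

```latex
\begin{align*}
e_1 &= \frac{p_2 - p_1}{\|p_2 - p_1\|}, \qquad
e_3 = \frac{e_1 \times (p_3 - p_1)}{\|e_1 \times (p_3 - p_1)\|}, \qquad
e_2 = e_3 \times e_1 \\
x &\mapsto \bigl(e_1 \cdot (x - p_1),\; e_2 \cdot (x - p_1),\; e_3 \cdot (x - p_1)\bigr)
\end{align*}
```

Non-degeneracy guarantees that the cross product is nonzero, and the vertex order fixes the orientation of the axes, so the frame is unique.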
21Complexity of Geometric Hashing
$O(n^4 + n^4 \cdot \mathrm{BinSize}) \approx O(n^5)$
(Naive algorithm: $O(n^7)$)
22Advantages
- Sequence order independent.
- Can match partial disconnected substructures.
- Pattern detection and recognition.
- Highly efficient.
- Can be applied to protein-protein interfaces, surface motif detection, docking.
- Database Object Recognition: a trivial extension to the method.
- Parallel implementation: straightforward.
23Structural Comparison Algorithms
- Cα backbone matching.
- Secondary structure configuration matching.
- Molecular surface matching.
- Multiple Structure Alignment.
- Flexible (hinge-based) structural alignment.
24Protein Structural Comparison
Flowchart: PDB files -> Feature Extraction -> Geometric Matching -> Verification and Scoring
Flowchart labels: Cα, Backbone, Secondary Structures, H-bonds, Other Inputs, Geometric Hashing,
Flexible Geometric Hashing, Rotation and Translation Possibilities, Transformation Clustering,
Least Square Analysis, Sequence Dependent Weights
25Problems
- Redundancy in representation
- Solution: clustering
- Numerical stability
- Solution: add geometrical constraints
- Accuracy is not always the best policy
- Always compute within a given error threshold