Title: Making the Sky Searchable: Solving the Blind Astrometry Problem
1Making the Sky Searchable: Solving the Blind
Astrometry Problem
- Dustin Lang, Sam Roweis and Keir Mierle
- University of Toronto
- David Hogg and Michael Blanton
- New York University
2Basic Problem
- You show me a picture of the night sky.
- I tell you where on the sky it came from.
3Rules of the game
- We start with a catalogue of stars in the sky,
and from it build an index which is used to
assist us in locating (solving) new test images.
- We can spend as much time as we want building the
index, but solving should be fast.
- Challenges: 1) The sky is big. 2) Both catalogues
and pictures are noisy.
5Distractors and Dropouts
- Bad news: query images may contain some extra
stars that are not in your index catalogue, and
some catalogue stars may be missing from the
image.
- These distractors and dropouts mean that naïve
matching techniques will not work.
6You try
8You try
Hint 1: Missing stars.
Hint 2: Extra stars.
9You try
10Solving the search problem
- This is a huge search problem.
- Exhaustive search? Ha!
- There are millions of patches of the scale of a
typical test image on the sky, plus rotation.
The Sky is Big™
11Index of Features
- We define features that can be extracted from
any particular view of the sky (image).
- Our index is a specially chosen subset of the
features that exist in the catalogue, along with
a record of the place on the sky each feature
came from.
- We target each index at a particular range of
image scales.
12Matching a test image
- When we see a new test image, we compute which
features are present, and use our index to look
up which possible views from the catalogue also
have those features.
- Each feature generates a list of places on the
sky from which the test image may have come.
The features in our index act as hash codes for
locations on the sky.
13Caching Computation
- The idea of an index is that it pushes the
computation from search time back to index
construction time.
- We actually do perform an exhaustive search of
sorts, but it happens during the building of the
index and not at search time, so queries can
still be fast.
14Robust Features for Geometric Hashing
- In simple search domains like text, the indexing
idea can be applied directly.
- However, in our star matching task, the features
we choose must be invariant to scale, rotation and
translation.
- They must also be robust to small positional
noise.
- Finally, there is the additional problem of
distractor and dropout stars.
The features we use are the relative
positions of nearby quadruples of stars
(quads).
15Quads as Robust Features
- We encode the relative positions of nearby
quadruples of stars (ABCD) using a coordinate
system defined by the most widely separated pair
(AB).
- Within this coordinate system, the positions of
the remaining two stars form a 4-dimensional code
for the shape of the quad.
- Swapping A and B, or C and D, reflects the code
(a degeneracy).
- We require C and D to lie in the circle with
diameter AB.
[Figure: a quad with stars A, B, C and D labelled]
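A minimal sketch of how such a code could be computed, assuming flat 2D image coordinates. The slides do not give the exact frame, so the convention used here (A maps to (0, 0) and B maps to (1, 1)) is an assumption, and the function name `quad_code` is hypothetical. The sketch does not resolve the A/B or C/D ordering degeneracy.

```python
from itertools import combinations


def quad_code(stars):
    """Return (xC, yC, xD, yD) for four (x, y) stars.

    Returns None if C or D falls outside the circle with diameter AB.
    """
    pts = [complex(x, y) for x, y in stars]
    # A and B are the most widely separated pair of the four stars.
    i, j = max(combinations(range(4), 2),
               key=lambda pair: abs(pts[pair[0]] - pts[pair[1]]))
    a, b = pts[i], pts[j]
    rest = [pts[k] for k in range(4) if k not in (i, j)]

    def to_frame(z):
        # Similarity transform sending A -> (0, 0) and B -> (1, 1).
        # This frame is an assumed convention, not taken from the slides.
        return (z - a) / (b - a) * complex(1, 1)

    c, d = (to_frame(z) for z in rest)
    # In this frame the circle with diameter AB has centre (0.5, 0.5).
    centre, radius = complex(0.5, 0.5), 2 ** 0.5 / 2
    if abs(c - centre) > radius or abs(d - centre) > radius:
        return None
    return (c.real, c.imag, d.real, d.imag)
```

Because the frame is built from the stars themselves, shifting, rotating or rescaling all four stars together leaves the returned code unchanged up to rounding.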
16Quads as Robust Features
- This geometric hash code is invariant to scale,
translation and rotation.
- It has good positional noise properties.
- It also has the property that if stars are
uniformly distributed in space, codes are
uniformly distributed in 4D.
[Figure: the same quad, with stars A, B, C and D labelled]
17Catalogues: USNO-B 1.0 and TYCHO-2
- USNO-B is an all-sky catalogue compiled from
scans of old Schmidt plates. Contains about 10^9
objects, both stars and galaxies.
- TYCHO-2 is a tiny subset of the 2.5M brightest
stars.
18Making a uniform catalogue
- Starting with USNO and TYCHO, we cut to get a
spatially uniform set of the 150M brightest
stars and galaxies.
- We lay down a fine grid and take the brightest K
objects in each square.
- We split the sky into 12 bite-sized chunks
(healpixes).
Star density (heat map)
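The "brightest K objects in each square" cut can be sketched as follows. This is an illustration under stated assumptions: it uses a flat RA/Dec grid, which is not equal-area on the sphere, and the function name `uniform_cut` and the (ra, dec, magnitude) tuple layout are hypothetical.

```python
from collections import defaultdict
import heapq


def uniform_cut(objects, cell_size, k):
    """Keep the k brightest objects in each grid cell.

    objects: iterable of (ra_deg, dec_deg, magnitude) tuples.
    cell_size: grid spacing in degrees (flat grid, an assumed simplification).
    """
    cells = defaultdict(list)
    for ra, dec, mag in objects:
        cell = (int(ra // cell_size), int(dec // cell_size))
        cells[cell].append((mag, ra, dec))
    kept = []
    for members in cells.values():
        # A smaller magnitude means a brighter object.
        for mag, ra, dec in heapq.nsmallest(k, members):
            kept.append((ra, dec, mag))
    return kept
```

Capping every cell at K objects is what flattens the density: crowded regions lose their faint objects, while sparse regions keep everything they have.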
19Building the index
- Start with the cut catalogue.
- Place a fine grid on the sky.
- Within each pixel, identify a valid quad made of
bright stars whose size is within the target
range of the index.
- Compute 4D codes for those quads and enter them
into the index along with their original
locations.
- Use kd-trees to do it quickly.
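A toy version of "enter codes into the index along with their locations, then look them up within a tolerance" might look like the sketch below. The slides use kd-trees; to stay dependency-free this sketch swaps in a uniform 4D grid of buckets instead, so it illustrates the idea and not the actual index. The class name `CodeIndex` and its methods are hypothetical.

```python
from collections import defaultdict
from itertools import product
import math


class CodeIndex:
    """Toy 4D code index using grid buckets (a stand-in for a kd-tree)."""

    def __init__(self, tolerance):
        self.tol = tolerance
        self.buckets = defaultdict(list)

    def _key(self, code):
        # Bucket width equals the tolerance, so any match lies in the
        # query's own bucket or in one of its immediate neighbours.
        return tuple(math.floor(v / self.tol) for v in code)

    def add(self, code, location):
        self.buckets[self._key(code)].append((tuple(code), location))

    def query(self, code):
        """Return locations of stored codes within the tolerance (4D Euclidean)."""
        base = self._key(code)
        hits = []
        for offset in product((-1, 0, 1), repeat=4):
            key = tuple(b + o for b, o in zip(base, offset))
            for stored, location in self.buckets.get(key, ()):
                if math.dist(stored, code) <= self.tol:
                    hits.append(location)
        return hits
```

All the expensive work (choosing quads and computing codes) happens when `add` is called at build time; a query only touches the 81 buckets around the code it is given.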
22A Typical Final Index
- 144M stars (6 quads/star)
- 205M quads (4-5 arcmin)
- 12 healpixes
[Figure panels: Codes in 4D; Quads on the sky]
23Solving a new test image
- Identify objects (stars and galaxies) in the image
and create a list of their 2D positions.
- Cycle through all possible valid quads (brightest
first) and compute their corresponding codes.
- Look up the codes in the index to find matches
within some tolerance; this stage incurs some
false positive and false negative matches.
- Each code match represents a candidate position,
rotation and scale on the sky. As soon as N quads
agree on a candidate, we proceed to verify that
candidate, using all objects in the image.
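The "as soon as N quads agree" rule can be sketched as a running vote over candidates. This is only an illustration: the (ra, dec, rotation, scale) tuple layout, the tolerances and the function names are hypothetical, and positions are compared on a flat sky without handling RA wrap-around.

```python
def agrees(c1, c2, pos_tol, rot_tol, scale_tol):
    """True if two (ra, dec, rotation_deg, scale) candidates are compatible."""
    ra1, dec1, rot1, s1 = c1
    ra2, dec2, rot2, s2 = c2
    # Compare rotations on a circle, so 359 and 1 degrees are 2 degrees apart.
    rot_diff = abs(rot1 - rot2) % 360.0
    rot_diff = min(rot_diff, 360.0 - rot_diff)
    return (abs(ra1 - ra2) <= pos_tol
            and abs(dec1 - dec2) <= pos_tol
            and rot_diff <= rot_tol
            and abs(s1 / s2 - 1.0) <= scale_tol)


def first_agreed_candidate(candidates, n_required, pos_tol, rot_tol, scale_tol):
    """Return the first candidate supported by n_required votes, else None.

    candidates should arrive in the order the quads were tried
    (brightest first); each candidate counts as one vote for itself.
    """
    seen = []
    for cand in candidates:
        seen.append(cand)
        votes = sum(agrees(cand, other, pos_tol, rot_tol, scale_tol)
                    for other in seen)
        if votes >= n_required:
            return cand
    return None
```

Stopping at the first agreed candidate keeps the cheap voting stage short; false matches from the code lookup rarely agree with each other, so they tend not to reach N votes.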
24A Real Example from SDSS
Query image (after object detection).
An all-sky catalogue.
25A Real Example from SDSS
Query image (after object detection).
Zoomed in by a factor of 1 million.
26A Real Example from SDSS
Query image (after object detection).
The objects in our index.
27A Real Example from SDSS
All the quads in our index which are present in
the query image.
28A Real Example from SDSS
A single quad which we happened to try.
29A Real Example from SDSS
The query image scaled, translated and rotated as
specified by the quad.
30A Real Example from SDSS
The proposed match, on which we run verification.
31A Real Example from SDSS
The verified answer, overlaid on the original
catalogue.
32Final Verification
- After finding N quads that agree about where they
came from on the sky, we run a slower
verification process.
- Project each object in the test image onto the
sky according to the matched quad.
- Count how many objects in the test image are
close to objects in the index.
- Simple, but it works.
33Preliminary Results: SDSS
- The Sloan Digital Sky Survey (SDSS) covers 1/4 of
the sky at high resolution in five different
wavelengths.
- The 2.5 m telescope is located at Apache Point
Observatory.
- 120 MP, liquid-nitrogen-cooled camera.
- Fields are 14 x 9 arcmin (10^-6 of the sky),
2048 x 1361 pixels.
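The field-size figures can be checked with a little arithmetic. The sketch below treats the field as a flat 14 x 9 arcmin rectangle and uses the standard full-sky area of 4π steradians; the variable names are illustrative only.

```python
import math

# Full sky: 4*pi steradians, converted to square degrees, then to
# square arcminutes (3600 square arcminutes per square degree).
FULL_SKY_ARCMIN2 = 4 * math.pi * (180 / math.pi) ** 2 * 3600

field_area = 14 * 9                       # square arcminutes
sky_fraction = field_area / FULL_SKY_ARCMIN2

pixel_scale_x = 14 * 60 / 2048            # arcseconds per pixel, long axis
pixel_scale_y = 9 * 60 / 1361             # arcseconds per pixel, short axis

print(f"fraction of sky: {sky_fraction:.2e}")
print(f"pixel scale: {pixel_scale_x:.3f} x {pixel_scale_y:.3f} arcsec/pixel")
```

The fraction comes out a little under one millionth of the sky, consistent with the figure quoted on the slide, and the implied pixel scale is about 0.4 arcsec per pixel on both axes.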
34Preliminary Results: SDSS
- 336,554 fields (science grade)
- 0 false positives
- 99.84% solved; 530 unsolved
- 99% solve by looking at just the 60 brightest
objects
Assume the pixel scale is known to within 5% (to
speed up solving)
35Preliminary Results: GALEX
- GALEX is a space-based telescope, seeing only in
the ultraviolet.
- It was launched in April 2003 by Caltech and NASA
and is just about finished collecting data now.
- It takes huge (80 arcmin) circular fields with
5 arcsec resolution and spectra of all objects.
36Preliminary Results: GALEX
- GALEX NUV (near-UV) fields can be solved easily
using an index built from bright blue USNO stars.
37Preliminary Results: GALEX
- GALEX FUV (far-UV) fields are much harder to
solve using USNO as a source catalogue.
Frequency band(s) of the test images must have
substantial overlap with those of the catalogue.
38Speed/Memory/Disk
SDSS
- Indexing takes 12 hours, uses 2 GB of memory
and 100 GB of disk.
- Solving a test image almost always takes [...]
(not including object detection).
- Solving many fields is done by coarse
parallelization on about 100 shared CPUs.
All the work is in the hardest 10% of fields.
39Applications
- Live on the web: provide the solver as a service
to astronomers.
- Monitor telescopes in real time: ensure that the
images and headers make sense.
- Merge large catalogues: allow searches for all
images that cover a region; fix up existing
catalogues.
- Collaborative observing: let astronomers add
their own images to the archive and communicate
with each other.
- Historical images: bring in, e.g., the Harvard
Photographic Plate Archive (500,000 glass plates
taken 1880-1990).
- Amateur astronomers: currently a huge but
untapped resource for professional astronomers!
40Thanks!