Title: Classification of GAIA data
1 Classification of GAIA data
- Coryn A.L. Bailer-Jones
- Max-Planck-Institut für Astronomie, Heidelberg
- calj_at_mpia.de
Overview
- GAIA classification objectives and available data
- Approaches to classification: principles and problems
- Example: classification using RVS-like data
- Some specific issues
- Summary
2 GAIA classification objectives
- discrete classification of objects
- as star, galaxy, quasar, solar system object, supernova etc.
- determination of astrophysical parameters (APs) for stars
- Teff, logg, Fe/H, α/Fe, CNO, A(λ), Vrot, Vrad, activity
- combination with parallax to determine stellar
- luminosity, radius, (mass, age)
- identification of unresolved binaries (and parametrization of components where possible)
- efficient identification of new types of objects
Goal: catalogue of object classifications and astrophysical parameters
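As a worked illustration of combining a parallax with photometry and Teff to obtain luminosity and radius, here is a minimal sketch. It uses the standard distance modulus and the Stefan-Boltzmann law in solar units. The function names, the treatment of extinction and bolometric correction as known inputs, and the example values are assumptions made for illustration; they are not taken from the slides.

```python
import math

M_BOL_SUN = 4.74    # IAU 2015 nominal solar absolute bolometric magnitude
T_EFF_SUN = 5772.0  # IAU 2015 nominal solar effective temperature [K]

def absolute_magnitude(apparent_mag, parallax_mas, extinction_mag=0.0):
    """Distance modulus: M = m + 5 log10(parallax / arcsec) + 5 - A."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for this simple estimate")
    parallax_arcsec = parallax_mas / 1000.0
    return apparent_mag + 5.0 * math.log10(parallax_arcsec) + 5.0 - extinction_mag

def luminosity_lsun(abs_mag, bolometric_correction=0.0):
    """Luminosity in solar units from an absolute magnitude plus bolometric correction."""
    m_bol = abs_mag + bolometric_correction
    return 10.0 ** (-0.4 * (m_bol - M_BOL_SUN))

def radius_rsun(lum_lsun, teff_k):
    """Radius in solar units from L = 4 pi R^2 sigma Teff^4."""
    return math.sqrt(lum_lsun) * (T_EFF_SUN / teff_k) ** 2

# Sanity check with a Sun-like star placed at 10 pc (parallax = 100 mas).
abs_mag = absolute_magnitude(apparent_mag=4.74, parallax_mas=100.0)
lum = luminosity_lsun(abs_mag)
rad = radius_rsun(lum, teff_k=5772.0)
```

The sketch ignores parallax errors; for noisy or negative parallaxes a direct inversion like this is not appropriate.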
3 GAIA data
- BBP: 4 broad band filters, all objects
- MBP: 10-20 medium band filters, all objects
- → object classification; stellar Teff, logg, Fe/H, A(λ)
- RVS: 849-874 nm spectrum, 0.04 nm/pixel, G<17
- → stellar Vrad, Vrot, specific element abundances
- Astrometry → parallax, kinematics, unresolved binaries
- Time domain → 50 epochs over 5 years (photometric variability)
- → Inhomogeneous data
Redshift problem: to get RV, need correct SpT template, but to determine SpT (may) need to know λ shift → use MBP data to give SpT and iterate
Generally: use MBP data to give initial classification of RVS data
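The radial velocity step of this loop can be sketched as a cross-correlation against the template selected for the current spectral type estimate. This is a minimal sketch, not the GAIA pipeline: it assumes both spectra are resampled onto the same grid, uniform in ln(wavelength), so that a Doppler shift becomes a uniform pixel shift. The function name and the toy single-line spectrum are hypothetical, and only integer-pixel shifts are recovered.

```python
import numpy as np

C_KMS = 299792.458  # speed of light [km/s]

def rv_from_cross_correlation(observed, template, dln_lambda):
    """Estimate the radial velocity [km/s] of `observed` relative to `template`.

    Both spectra must share a grid uniform in ln(wavelength) with step
    `dln_lambda`. A positive result means the observed spectrum is redshifted.
    """
    obs = observed - observed.mean()
    tpl = template - template.mean()
    ccf = np.correlate(obs, tpl, mode="full")
    # In "full" mode, zero lag sits at index len(tpl) - 1.
    lag = int(np.argmax(ccf)) - (len(tpl) - 1)
    return C_KMS * (np.exp(lag * dln_lambda) - 1.0)

# Toy example: one Gaussian absorption line, shifted redward by 4 pixels.
pixels = np.arange(600)
template = 1.0 - 0.5 * np.exp(-0.5 * ((pixels - 300) / 3.0) ** 2)
observed = np.roll(template, 4)
dln_lambda = 0.04 / 861.5  # RVS-like sampling: 0.04 nm/pixel near 861.5 nm
rv = rv_from_cross_correlation(observed, template, dln_lambda)
```

At this sampling one pixel corresponds to roughly 14 km/s, so a real implementation would refine the peak position to sub-pixel precision. In the iteration described above, the recovered shift would then be used to revisit the spectral type and hence the choice of template.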
4 Classification principles
- Supervised approach
- use pre-classified data (templates) to infer the desired mapping
- apply mapping to any new data to give APs or classes
- But, the desired mapping is generally degenerate...
7 Minimum Distance Methods (MDMs)
- Search for nearest neighbours (templates) in data space
- Assign parameters according to these
- Generally interpolate
- either in data space: φ = f(d; w)
- or in parameter space: D = g(φ; w)
- Need to scale data dimensions
- e.g. k-nn, χ² min, cross-correlation
- a local classification method
φ = astrophysical parameter(s); d1, d2 = data; D = distance to a template
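A minimum distance method of the k-nn kind can be sketched as follows. This is a minimal sketch under stated assumptions: each data dimension is scaled to unit variance over the template grid, the distance is Euclidean, and the parameters of the k nearest templates are combined with inverse-distance weights. The function name, the weighting scheme and the toy template grid are hypothetical choices, not the method used in the talk.

```python
import numpy as np

def knn_parameters(data, template_data, template_params, k=3):
    """Estimate astrophysical parameters for one data vector.

    data:            shape (n_dims,)
    template_data:   shape (n_templates, n_dims)
    template_params: shape (n_templates, n_params)
    """
    # Scale each data dimension so that no single one dominates the distance.
    scale = template_data.std(axis=0)
    scale[scale == 0] = 1.0
    diff = (template_data - data) / scale
    dist = np.sqrt((diff ** 2).sum(axis=1))  # distance to every template
    nearest = np.argsort(dist)[:k]
    # Combine the neighbours' parameters with inverse-distance weights.
    weights = 1.0 / (dist[nearest] + 1e-12)
    return (weights[:, None] * template_params[nearest]).sum(axis=0) / weights.sum()

# Toy template grid: two data dimensions on very different scales, one parameter.
template_data = np.array([[0.0, 0.0], [1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
template_params = np.array([[4000.0], [5000.0], [6000.0], [7000.0]])
estimate = knn_parameters(np.array([1.5, 150.0]), template_data, template_params, k=2)
```

Without the scaling step the second dimension would dominate the distance entirely, which is why the data dimensions need to be scaled.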
9 Classification principles
- selecting just local neighbours in data space can lead to systematic errors or missed solutions
- need to find global (forward) mapping and identify degenerate regions
- more complex in higher dimensional spaces (data or parameters)
- severity of degeneracy depends upon the density of template grid and noise in the data
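To make the degeneracy point concrete, the following toy sketch scans a global forward mapping and reports every parameter range consistent with one observed datum. The forward model is hypothetical and deliberately non-monotonic, and the names, grid and noise levels are assumptions chosen only for illustration.

```python
import numpy as np

def forward_model(phi):
    """Toy forward mapping from a parameter to a datum (hypothetical, non-monotonic)."""
    return (phi - 1.0) ** 2

def consistent_regions(d_obs, sigma, phi_grid, n_sigma=2.0):
    """Return (phi_min, phi_max) intervals whose predicted datum lies
    within n_sigma * sigma of the observed datum d_obs."""
    ok = np.abs(forward_model(phi_grid) - d_obs) <= n_sigma * sigma
    regions, start = [], None
    for i, flag in enumerate(ok):
        if flag and start is None:
            start = i                       # a consistent interval opens
        elif not flag and start is not None:
            regions.append((phi_grid[start], phi_grid[i - 1]))
            start = None                    # the interval closes
    if start is not None:
        regions.append((phi_grid[start], phi_grid[-1]))
    return regions

phi_grid = np.linspace(0.0, 2.0, 2001)
low_noise = consistent_regions(d_obs=0.25, sigma=0.01, phi_grid=phi_grid)
high_noise = consistent_regions(d_obs=0.25, sigma=0.2, phi_grid=phi_grid)
```

With low noise the scan finds two separate solutions, near 0.5 and near 1.5, whereas a purely local neighbour search would return only one of them. With higher noise the two regions merge into a single broad one, showing how the character of the degeneracy depends on the noise in the data.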
10 Artificial Neural Networks (ANNs)
- Functional mapping: astrophysical parameters = f(data; weights)
- Weights determined by training on pre-classified data (templates)
- → least squares minimization of total classification error (numerical methods)
- → global interpolation of data
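The idea of a functional mapping whose weights are set by least squares minimization can be sketched with a very small network. This is a minimal sketch assuming one tanh hidden layer and plain gradient descent; the architecture, learning rate, function names and toy data are assumptions and do not reproduce the network used for the results in this talk.

```python
import numpy as np

def train_mlp(x, y, n_hidden=8, lr=0.05, n_iter=2000, seed=0):
    """Fit y ~ f(x; weights) by gradient descent on the mean squared error.

    x: shape (n_samples, n_inputs), y: shape (n_samples, n_outputs).
    Returns the weights and the loss recorded at every iteration.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, (n_hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(n_iter):
        h = np.tanh(x @ w1 + b1)   # hidden layer
        pred = h @ w2 + b2         # linear output layer: predicted parameters
        err = pred - y
        losses.append(float((err ** 2).mean()))
        # Backpropagate the gradient of the mean squared error.
        g_pred = 2.0 * err / err.size
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = (g_pred @ w2.T) * (1.0 - h ** 2)
        g_w1 = x.T @ g_h
        g_b1 = g_h.sum(axis=0)
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), losses

def predict(weights, x):
    """Apply the trained mapping to new data."""
    w1, b1, w2, b2 = weights
    return np.tanh(x @ w1 + b1) @ w2 + b2

# Toy templates: one data dimension, one smoothly varying parameter.
x_train = np.linspace(-1.0, 1.0, 50)[:, None]
y_train = np.sin(2.0 * x_train)
weights, losses = train_mlp(x_train, y_train)
y_new = predict(weights, np.array([[0.25]]))
```

Because all templates contribute to the same set of weights, the trained function interpolates globally rather than from a few local neighbours.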
11 Classification example with high-res spectra
- Database of 611 real stellar spectra from Cenarro et al. (2001)
- variation over Teff, logg, Fe/H
- coverage 849-874 nm (same as GAIA RVS)
- resolution 0.15 nm @ 0.075 nm/pixel (poorer than GAIA?)
- SNR: median = 70; 90% in range 20-140
- Randomly split data set into two sets
- train a neural network on one set and test its performance on the other.
12 Distribution over APs in Cenarro et al. data
blue = training data (300); red = test data (311)
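The random split of the 611 spectra into 300 training and 311 test spectra can be sketched as below. The function name and the fixed seed are assumptions; the actual assignment of spectra to the two sets used in the talk is not given.

```python
import numpy as np

def random_split(n_total, n_train, seed=0):
    """Return two disjoint index arrays (train, test) from a random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_total)
    return order[:n_train], order[n_train:]

train_idx, test_idx = random_split(n_total=611, n_train=300)
```

Keeping the two index sets disjoint is what makes the test performance an honest estimate of how the network behaves on spectra it has not seen.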
13 Results: Teff and logg
14 Results: Fe/H
15 Requirements of the classification scheme
- produce both discrete classification and continuous parametrization (e.g. star vs. quasar, APs of stars)
- recognition of degeneracies in presence of noise
- (i.e. recognise multiple classifications for given data vector)
- robustly handle missing and censored data
- possible RVS lossy compression (as function of magnitude)
- → handle different amounts/formats of data
- reliable determination of parametrization uncertainties
- accommodate ever-improving stellar models
- all this for a very wide range of types of objects ...
16 Classification schemes
P = probability; APs = astrophysical parameters
17 Model training
- Real spectra and synthetic spectra not identical
- systematic differences (modelling uncertainties, e.g. opacities)
- increased cosmic scatter in real spectra (unaccounted-for APs)
1. Can synthetic spectra be used to reliably parametrize GAIA data?
2. Are performances representative of what can be achieved?
3. Do synthetic spectra give the best optimization of phot/spec systems?
2 and 3 require accurate synthetic spectra (or large set of real spectra)
- Can overcome mismatch problem for (1)
- use real GAIA data of pre-selected targets to apply corrections to synthetic SEDs
- APs of these targets determined from higher resolution ground-based spectra
18 Summary
- classification with GAIA data is a challenging problem
- methods used so far in (astronomical) classification literature are suboptimal for this purpose
- → further development of methods is a high priority
- particular problems to overcome are
- - degeneracy (especially with MBP data and compressed RVS data)
- - inhomogeneous data
- development of classification methods is very dependent on appropriate data (real or synthetic)
- - both of targets of interest
- - and of contaminating objects
19 ICAP: the GAIA classification working group
- WG responsible for addressing classification issues for GAIA
- 14 core members, 17 associate members
GAIA Classification meeting: 2-3 December, Heidelberg, Germany
Anyone interested in classification issues broadly related to GAIA is welcome to attend
http://www.mpia.de/GAIA/
20 Use of astrometric data in classification
- Parallax and proper motion very important for
- identifying quasars
- identifying solar system objects (NEOs, asteroids, KBOs)
- determining stellar properties (gives radius and helps binarity detection)
However, proper motion and Galactic position should not be used to help determine stellar parameters
→ introduces biases based on current knowledge of the Galaxy
→ understanding relationship between kinematics and intrinsic stellar parameters is key GAIA science goal