Title: Calibrating Photometric Redshifts beyond Spectroscopic Limits
1Calibrating Photometric Redshifts beyond
Spectroscopic Limits
Jeffrey Newman Lawrence Berkeley National
Laboratory
2A critical problem
DETF Task Force Report
3But a difficult one
- Future DE experiments plan to use photo-z's for objects far too faint to get spectroscopic z's for en masse.
- High-z/faint spectroscopic redshift survey samples are far from complete.
- Photo-z calibrations for brighter galaxies may not apply directly to fainter galaxies at the same z (smaller galaxies start star formation later; what about Pop. III?).
- How can we test photo-z's for faint galaxies if we can't get complete sets of spectroscopic redshifts?
6Because galaxies cluster together in 3D, they
also cluster together on the sky
Both because dark matter halos cluster with each
other and because more galaxies are found in more
massive halos, all populations of galaxies
cluster with each other - both in 3D and in
projection on the sky.
7Cross-correlations can tell us about p(z)
Consider objects in some photo-z bin, in a region
where there is another set of objects with
spectroscopic z's.
$z_{\rm phot} = 0.7$
8No overlap in z
If none of the photo-z objects are in fact at the
same z as a spectroscopic object, they will not
cluster with it on the sky.
9Some overlap in z
Those photo-z objects which are close in z to a
spectroscopic object will yield a clustering
signal.
10Maximal overlap in z
The cross-correlation is stronger at redshifts
where a greater fraction of the photo-z objects
truly reside.
11Two-point correlation statistics
The simplest clustering observable is the two-point correlation function, the excess probability over random that a second object will be found some distance from another:
$$dP = n\,[1 + \xi(r)]\,dV,$$
where $\xi(r)$ denotes the real-space two-point autocorrelation function of this class (which has average density $n$) at separation $r$.
$\xi(r)$ is the Fourier transform of the power spectrum. It is described well by a power law,
$$\xi(r) = (r/r_0)^{-\gamma},$$
where $r_0 = 3$-$5\ h^{-1}$ Mpc, depending on galaxy type, and $\gamma \approx 1.8$.
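As a minimal sketch of the power-law form above, the following evaluates the correlation function and the expected pair count dP. The function names and the default r0 = 4 h^-1 Mpc (a mid-range pick from the quoted 3-5) are illustrative assumptions, not values from the talk.

```python
def xi_power_law(r, r0=4.0, gamma=1.8):
    """Power-law two-point correlation function xi(r) = (r / r0) ** -gamma.

    r and r0 are in h^-1 Mpc. r0 = 4 is an assumed mid-range value.
    """
    return (r / r0) ** (-gamma)


def expected_neighbors(n, r, dV, r0=4.0, gamma=1.8):
    """Expected number of neighbours in a small volume dV at separation r.

    Implements dP = n * (1 + xi(r)) * dV, with n the mean number density.
    """
    return n * (1.0 + xi_power_law(r, r0, gamma)) * dV


# Clustering excess at a few separations (h^-1 Mpc).
for r in (1.0, 4.0, 10.0):
    print(r, xi_power_law(r))
```

By construction the excess equals 1 at r = r0, so pairs at that separation are twice as common as for a random distribution.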
12Angular cross-correlations
For galaxies in a small spectroscopic bin (e.g. $\Delta z = 0.01$) we can measure the cross-correlation of photometric galaxies about a spectroscopic galaxy, defined by
$$dP_{sp}(\theta) = \Sigma_p\,[1 + w_{sp}(\theta)]\,d\Omega,$$
where
$$w_{sp}(\theta) = \int \xi_{sp}(y)\,p(z)\,dz, \qquad y = (l^2 + D^2\theta^2)^{1/2}.$$
Phillipps (1985) first used cross-correlations to measure clustering (also applied by Masjedi et al. 2006), but we'll use it to get redshift distributions instead (cf. also Schneider et al. 2006, Padmanabhan et al. 2006).
13Additional observables
In addition to $w_{sp}(\theta) = \int \xi_{sp}(y)\,p(z)\,dz$, we also measure the real-space autocorrelation for spectroscopic galaxies, $\xi_{ss}$, and the angular autocorrelation for photometric galaxies,
$$w_{pp}(\theta) = \int \xi_{pp}(y)\,p(z)^2\,dz.$$
For simple biasing, $\xi_{sp} = (\xi_{ss}\,\xi_{pp})^{1/2}$, providing enough information to solve separately for $\xi_{sp}$ and $p(z)$.
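To illustrate how these observables close the system, here is a deliberately simplified toy, not the full method. It assumes the cross-correlation amplitude in each spectroscopic bin is proportional to p(z) times the geometric mean of the two autocorrelation amplitudes. It also assumes the photometric sample's clustering is constant with z, so it drops out on normalization. The line-of-sight and distance factors are omitted, and all names and numbers are hypothetical.

```python
import math


def recover_pz(w_sp_amp, xi_ss_amp, dz):
    """Toy recovery of p(z) from per-bin cross-correlation amplitudes.

    Assumes w_sp_amp[i] is proportional to p(z_i) * sqrt(xi_ss_amp[i] * xi_pp),
    with xi_pp constant in z (so it cancels when p is normalized).
    Geometric factors are ignored in this sketch.
    """
    raw = [w / math.sqrt(a) for w, a in zip(w_sp_amp, xi_ss_amp)]
    norm = sum(raw) * dz  # enforce sum(p) * dz = 1
    return [x / norm for x in raw]


# Hypothetical inputs: Gaussian p(z) with mean 1.0 and sigma 0.1,
# sampled in 80 spectroscopic bins of width 0.01.
dz = 0.01
z = [0.605 + dz * i for i in range(80)]
p_true = [
    math.exp(-0.5 * ((zi - 1.0) / 0.1) ** 2) / (math.sqrt(2 * math.pi) * 0.1)
    for zi in z
]
xi_ss_amp = [1.0 + 0.5 * zi for zi in z]  # made-up z-dependent amplitude
xi_pp_amp = 2.0                           # made-up constant amplitude
w_sp_amp = [p * math.sqrt(a * xi_pp_amp) for p, a in zip(p_true, xi_ss_amp)]

p_rec = recover_pz(w_sp_amp, xi_ss_amp, dz)
mean_z = sum(zi * pi for zi, pi in zip(z, p_rec)) * dz
print(mean_z)
```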
14Assumptions for the following
- We have a spectroscopic sample of galaxies with
well-measured redshifts. For starters, assume it
has a flat redshift distribution (constant
dNs/dz), e.g. 25k galaxies/unit z.
- We want to measure $p(z)$ for a sample of galaxies in one photometric redshift bin whose true redshift distribution is a Gaussian with mean $z = 1$ and sigma $\sigma_z$. For a standard scenario, we take surface density $\Sigma_p = 10$/sq. arcmin and $\sigma_z = 0.1$.
15Assumptions (continued)
- We can ignore lensing, which can also cause correlations (can be removed iteratively).
- The clustering of the photometric sample is independent of z.
- We measure correlations within a 5 $h^{-1}$ Mpc comoving radius (trade-off of signal-to-noise vs. nonlinearities).
- We can ignore sample (cosmic) variance (minimize by using many fields/sampling widely separated regions of sky, remove to first order using the observed fluctuations in $dN_s/dz$).
16Monte Carlo simulations
Generate realizations with realistic correlation measurement errors in each redshift bin, then fit a Gaussian to the inferred p(z) in each realization.
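The procedure can be sketched roughly as follows. This is not the simulation behind the talk's results: the per-bin noise level (0.2) is a made-up placeholder rather than a modelled correlation-measurement error, and the small Gauss-Newton fitter is a stand-in for whatever fitting routine was actually used. The scatter of the fitted mean and sigma across realizations is what sets the quoted uncertainties.

```python
import numpy as np


def fit_gaussian(z, y, start, n_iter=20):
    """Least-squares fit of amp * exp(-0.5 * ((z - mean) / sigma)**2).

    Plain Gauss-Newton iteration starting from `start` = (amp, mean, sigma).
    """
    p = np.array(start, dtype=float)
    for _ in range(n_iter):
        amp, mean, sigma = p
        g = np.exp(-0.5 * ((z - mean) / sigma) ** 2)
        # Jacobian of the model with respect to (amp, mean, sigma).
        jac = np.column_stack([
            g,
            amp * g * (z - mean) / sigma**2,
            amp * g * (z - mean) ** 2 / sigma**3,
        ])
        step, *_ = np.linalg.lstsq(jac, y - amp * g, rcond=None)
        p = p + step
    return p


def monte_carlo(n_real=200, mean=1.0, sigma=0.1, noise=0.2, dz=0.01, seed=0):
    """Scatter of fitted mean and sigma over noisy realizations of p(z).

    `noise` is a hypothetical per-bin Gaussian error on the inferred p(z).
    Returns (average fitted mean, scatter of mean, scatter of sigma).
    """
    rng = np.random.default_rng(seed)
    z = np.arange(mean - 4 * sigma + dz / 2, mean + 4 * sigma, dz)
    amp = 1.0 / (np.sqrt(2 * np.pi) * sigma)
    truth = amp * np.exp(-0.5 * ((z - mean) / sigma) ** 2)
    fits = np.array([
        fit_gaussian(z, truth + rng.normal(0.0, noise, z.size), (amp, mean, sigma))
        for _ in range(n_real)
    ])
    means, sigmas = fits[:, 1], np.abs(fits[:, 2])
    return means.mean(), means.std(ddof=1), sigmas.std(ddof=1)


avg_mean, err_mean, err_sigma = monte_carlo()
print(avg_mean, err_mean, err_sigma)
```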
17Scaling with $\Sigma_p$
18Scaling with $\sigma_z$
19Dominant Errors
Random errors: $1.0 \times 10^{-3}\,(\sigma_z/0.1)^{1.5}\,((dN_s/dz)/25{,}000)^{-0.5}\,(\Sigma_p/10)^{-0.5}$
Field-to-field zero point variations: $< 4.1 \times 10^{-3}\,(\sigma_{zp}/0.01)\,(N_{patch}/4)^{-0.5}$
Systematic errors in $\xi_{ss}$: $< 1.6 \times 10^{-3}\,(\sigma_{sys}/0.02)\,(\sigma_z/0.1)$
Assuming no bias evolution though it exists: $< 3 \times 10^{-3}\,\left(\frac{(db/dz)/b}{0.3}\right)(\sigma_z/0.1)^2$
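The scalings above are easy to evaluate for other survey parameters. The helper names below are hypothetical; the coefficients and exponents are copied from this slide, and the last three terms are upper limits rather than point estimates.

```python
def random_error(sigma_z=0.1, dNs_dz=25_000.0, Sigma_p=10.0):
    """Random error from the quoted scaling.

    sigma_z: width of the photometric bin's true p(z)
    dNs_dz:  spectroscopic galaxies per unit z
    Sigma_p: photometric surface density per sq. arcmin
    """
    return (1.0e-3 * (sigma_z / 0.1) ** 1.5
            * (dNs_dz / 25_000.0) ** -0.5
            * (Sigma_p / 10.0) ** -0.5)


def zero_point_error(sigma_zp=0.01, n_patch=4):
    """Upper limit from field-to-field zero point variations."""
    return 4.1e-3 * (sigma_zp / 0.01) * (n_patch / 4.0) ** -0.5


def xi_ss_systematic(sigma_sys=0.02, sigma_z=0.1):
    """Upper limit from systematic errors in xi_ss."""
    return 1.6e-3 * (sigma_sys / 0.02) * (sigma_z / 0.1)


def bias_evolution_error(frac_db_dz=0.3, sigma_z=0.1):
    """Upper limit from ignoring bias evolution; frac_db_dz is (db/dz)/b."""
    return 3.0e-3 * (frac_db_dz / 0.3) * (sigma_z / 0.1) ** 2


# Fiducial values, then a spectroscopic sample four times larger.
print(random_error(), random_error(dNs_dz=100_000.0))
print(zero_point_error(), xi_ss_systematic(), bias_evolution_error())
```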
20Near-future prospects
Blue: SDSS + AGES + VVDS + DEEP2, 1700 galaxies/unit z at high z
Red: add zCOSMOS + PRIMUS + WiggleZ, 5000 galaxies/unit z at high z
21Monte Carlos for real surveys
Redshift samples will be 3-10x larger than today at most z, with correspondingly smaller errors.
(Figure panels: Current, Future)
22Conclusions
- Reasonably-sized spectroscopic datasets can establish redshift distributions for objects in photometric-only samples, with precisions right around what is necessary for future surveys.
- The spectroscopic sample does not need to be complete, very precise, etc.: we can pick the easiest galaxies to get redshifts for, restrict to only the most secure redshifts, and so forth.
- To minimize systematic errors (and sample variance), it is best to have many surveys/fields sampled.
- What is needed most are larger samples of galaxies at $z = 0.2$-$0.7$ (under way) and especially $z > 1.4$.
23Net scaling
- For both the uncertainty in the mean z of the photometric galaxies and the uncertainty in $\sigma_z$, we get $1.0 \times 10^{-3}\,(\sigma_z/0.1)^{1.5}\,((dN_s/dz)/25{,}000)^{-0.5}\,(\Sigma_p/10)^{-0.5}$.
- If p(z) is made up of multiple, nonoverlapping Gaussian peaks each containing $f_{peak}$ of the probability, errors scale as $f_{peak}^{-1/2}$.
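Inverting the net scaling gives a rough estimate of how large a spectroscopic sample a given target precision would require. This sketch assumes the multi-peak factor simply multiplies the single-Gaussian scaling; the function name and example targets are hypothetical.

```python
def required_dNs_dz(target, sigma_z=0.1, Sigma_p=10.0, f_peak=1.0):
    """Spectroscopic galaxies per unit z needed to reach `target` error.

    Solves target = 1e-3 (sigma_z/0.1)^1.5 (dNs_dz/25000)^-0.5
                    (Sigma_p/10)^-0.5 f_peak^-0.5   for dNs_dz.
    Applying f_peak as a plain multiplier is an assumption of this sketch.
    """
    prefactor = (1.0e-3 * (sigma_z / 0.1) ** 1.5
                 * (Sigma_p / 10.0) ** -0.5
                 * f_peak ** -0.5)
    return 25_000.0 * (prefactor / target) ** 2


# Relaxing the target by 2x cuts the required sample by 4x;
# halving the probability in a peak doubles it.
print(required_dNs_dz(1.0e-3))
print(required_dNs_dz(2.0e-3))
print(required_dNs_dz(1.0e-3, f_peak=0.5))
```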
24Other sources of error
LSST tolerance is $0.002\,(1+z)$, which matches the worst-case systematic errors at $z = 1$.
25What if bias evolves with z for the sample?
- To get these uncertainties, I assumed that the biasing/clustering of the photometric galaxies is constant with z. We can use the angular autocorrelation, plus dN/dz, to infer the average bias of the photometric galaxies.
- If we assume $db/dz = 0$, and it is not, then we will get a biased estimate of the true $\langle z \rangle$: for $b = b_0\,(1 + (db/dz)(z - z_0))$, we will make an error of $(db/dz) \times \sigma_z^2$.
- Observed $db/dz$ for reasonable samples is 0.3, so this corresponds to an error of $3 \times 10^{-3}$ for $\sigma_z = 0.1$.
- In actuality, we should get some handle on $db/dz$ from comparing e.g. photo-z slices; this error can be reduced substantially.
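The size of this error can be checked with a short calculation. The sketch assumes (this is not stated on the slide) that ignoring bias evolution makes the inferred distribution proportional to $b(z)\,p(z)$, with $p(z)$ a normalized Gaussian of mean $z_0$ and width $\sigma_z$, and with $db/dz$ read as the fractional slope appearing in $b = b_0\,(1 + (db/dz)(z - z_0))$.

```latex
\begin{align}
\langle z \rangle_{\rm inferred}
  &= \frac{\int z\,[1 + \beta\,(z - z_0)]\,p(z)\,dz}
          {\int [1 + \beta\,(z - z_0)]\,p(z)\,dz},
  \qquad \beta \equiv db/dz \\
  &= \frac{z_0 + \beta\,\sigma_z^2}{1}
   = z_0 + \beta\,\sigma_z^2 \\
\beta = 0.3,\ \sigma_z = 0.1
  &\;\Rightarrow\; \beta\,\sigma_z^2 = 0.3 \times 0.01 = 3 \times 10^{-3}
\end{align}
```

The numerator uses $\int (z - z_0)\,p\,dz = 0$ and $\int (z - z_0)^2\,p\,dz = \sigma_z^2$, so the shift in the mean is second order in the bin width.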
26Measuring ? in a redshift survey
The observed clustering of galaxies is not
isotropic, as the redshift separation of objects
is a combined result of their distance and
peculiar motions induced by gravity.
Therefore, we commonly measure $w_p(r_p)$: the excess probability two objects are a given separation apart, projected on the sky. This avoids redshift-space effects. If distance $\gg r_0$,
$$w_p(r_p) = \int \xi(r)\,dz = f(\gamma)\, r_p^{1-\gamma}\, r_0^{\gamma}.$$
Conroy et al. 2005/Coil et al. 2005
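For a power-law correlation function, the line-of-sight integral has a standard closed form, with f(gamma) = Gamma(1/2) Gamma((gamma-1)/2) / Gamma(gamma/2). The sketch below checks that expression against a direct numerical integration. The function names and the default r0 = 4 h^-1 Mpc are assumptions for illustration, and the integration variable is treated as a line-of-sight distance.

```python
import math


def wp_analytic(rp, r0=4.0, gamma=1.8):
    """Closed-form projected correlation function for xi = (r/r0)^-gamma."""
    f = math.gamma(0.5) * math.gamma((gamma - 1) / 2) / math.gamma(gamma / 2)
    return f * rp ** (1 - gamma) * r0 ** gamma


def wp_numeric(rp, r0=4.0, gamma=1.8, n=20_000):
    """Midpoint-rule integral of xi(sqrt(rp^2 + l^2)) over all l.

    Uses the substitution l = rp * tan(t), t in (-pi/2, pi/2), so the
    infinite range maps to a finite one; dl = rp / cos(t)^2 dt.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = -math.pi / 2 + (i + 0.5) * h
        r = math.hypot(rp, rp * math.tan(t))
        xi = (r / r0) ** (-gamma)
        total += xi * rp / math.cos(t) ** 2 * h
    return total


for rp in (0.5, 2.0, 8.0):
    print(rp, wp_analytic(rp), wp_numeric(rp))
```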
27Cross-correlations
Generally, we measure the autocorrelation of some
sort of object with other objects of the same
sort. However, we can also measure cross-correlations: the excess probability of finding an object of type 2 near an object of type 1.
For simple, linear biasing, $\xi_{12} = (\xi_{11}\,\xi_{22})^{0.5}$.
Galaxy-QSO clustering vs. galaxy-galaxy clustering
Coil et al. 2006
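The linear-biasing relation follows when each population traces the same underlying matter field with its own constant bias. A minimal numerical illustration, with made-up bias values and matter correlation amplitude:

```python
import math


def cross_correlation(xi_11, xi_22):
    """Cross-correlation under simple linear biasing: sqrt(xi_11 * xi_22)."""
    return math.sqrt(xi_11 * xi_22)


# Hypothetical linear biases and matter correlation at some fixed separation.
b1, b2, xi_matter = 1.2, 2.0, 0.5

xi_11 = b1**2 * xi_matter  # autocorrelation of population 1
xi_22 = b2**2 * xi_matter  # autocorrelation of population 2
xi_12 = cross_correlation(xi_11, xi_22)

# With linear bias the cross term is b1 * b2 * xi_matter, the geometric mean.
print(xi_12, b1 * b2 * xi_matter)
```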
28Given large sets of galaxies with redshifts, we
can infer dN/dz from cross-correlation techniques
Phillipps (1985) showed that high-quality
correlation function measurements can be obtained
by measuring the angular correlation of galaxies
without redshifts (but seen in photometry) around
galaxies of known redshift. This can get around the usual problems with angular correlations: we generally must assume luminosity and clustering are uncoupled and then use a known redshift distribution of sources to interpret angular correlation functions (via Limber's equation).
By cross-correlating with galaxies with
spectroscopic redshifts, though, the analysis
becomes much simpler.
29Angular correlations
Much larger sets of galaxies have photometry than
spectroscopy/redshifts. Their clustering
statistics can be studied using the angular
correlation function $w(\theta)$.
To interpret angular correlations, we need to know the redshift distribution of sources:
$$w(\theta) = \int (dN/dz)^2\,\xi(r,z)\,dz$$
Coil et al. 2004
30Hybrid methods are also possible
Phillipps (1985) showed that high-quality
correlation function measurements can be obtained
by measuring the angular correlation of galaxies
without redshifts (but seen in photometry) around
galaxies of known redshift, e.g. in cases where the photometric dataset is much larger than the redshift dataset. This can get around the usual problems with angular correlations: we typically must assume luminosity and clustering are uncoupled and then use a known redshift distribution of sources to interpret angular correlation functions (via Limber's equation). By
cross-correlating with galaxies with
spectroscopic redshifts, though, the analysis
becomes much simpler.
31From cross-correlations to dN/dz
Assume a spectroscopic survey of a total of $N_s$ galaxies has been performed over the same region in which we desire to calibrate redshift distributions (e.g. for a given photo-z bin).
From that, we know dN/dz for the spectroscopic sample, $n_s(z)$, plus the two-point autocorrelation function for those galaxies, $\xi_{ss}(r)$. For the photometric-only sample, we know its total surface density on the sky, $\Sigma_p$, and its two-point angular autocorrelation function, $w_{pp}(\theta)$.
33Then
We can measure the cross-correlation of galaxies in a small spectroscopic bin (e.g. $\Delta z = 0.01$) with the photometric sample, defined by
$$dP_{sp}(\theta) = \Sigma_p\,[1 + w_{sp}(\theta)]\,d\Omega,$$
where
$$w_{sp}(\theta) = \frac{n_p(z)}{\Sigma_p} \int \xi_{sp}(y)\,dl, \qquad y = (l^2 + d_A^2\,\theta^2)^{1/2}.$$
So given a sample of galaxies with known z and known clustering, we can derive the fraction of a separate sample of photometric galaxies at that z (since, after we measure the cross-correlations, we can get the average clustering of the photometric sample from its angular autocorrelation).
34A recent application
35This allows us to infer dN/dz for a photometric
sample using a spectroscopic survey
The observed clustering on the sky between
galaxies in some photometric-only sample and
galaxies known to be at a given z depends on the
product of the real-space cross-correlation
between locations of the two populations and the
fraction of the photometric sample at that z. In
general, we have sufficient information to
measure the autocorrelation function of the
spectroscopic sample with itself and the angular
autocorrelation of the photometric sample along
with the angular cross-correlation. This provides
enough information to get out dN/dz for the
photometric sample, so long as biasing is simple
or well-modeled.