Network analysis and statistical issues presentation

1
Network analysis and statistical issues
Lucio Baggio. An introductory seminar to ICRR's GW group
2
Topics of this presentation
Gravitational wave burst networks
From the single detector to a worldwide network
IGEC (International Gravitational Event Collaboration)
Long-term search with four detectors: directional search and statistical issues
Setting confidence intervals
From raw data to probability statements: likelihood/Bayesian vs. frequentist methods
False discovery probability
Multiple tests and large surveys change the overall confidence of the first detection
Miscellaneous topics
The LIGO-AURIGA white paper on joint data analysis: problems with non-aligned or different detectors; coherent data analysis.
3
Network analysis is unavoidable as far as
background estimation is concerned
4
Gravitational wave burst events
For fast (1-10 ms) gw signals, the impulse
response of the optimal filter for the signal
amplitude is an exponentially damped oscillation.
Even at a very low amplitude the signals from
astrophysical sources are expected to be rare.
A candidate event in the gravitational wave
channel is any single extreme value in a more or
less constant time window.
Background events come from the extreme
distribution for an (almost) Gaussian stochastic
process
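The statement about extreme values can be made concrete with a minimal sketch. It assumes the filtered output inside one time window consists of n independent standard-normal samples (an idealization; real filtered data are correlated), so the probability that the window maximum exceeds a threshold is 1 − Φ(x)^n. The function names and the choice n = 100 are illustrative and not taken from the slides.

```python
import math

def gaussian_cdf(x):
    """Cumulative distribution of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_window_max_exceeds(threshold_snr, n_independent):
    """Probability that the largest of n independent standard-normal
    samples in one window exceeds threshold_snr (one-sided)."""
    return 1.0 - gaussian_cdf(threshold_snr) ** n_independent

# Hypothetical window holding 100 independent samples.
for snr in (3.0, 4.0, 5.0):
    print(snr, prob_window_max_exceeds(snr, n_independent=100))
```

Under this Gaussian model the background falls very steeply with the threshold, which is why measured distributions with long tails point to non-Gaussian outliers.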
5
The background in practice (1)
Amplitude distribution of events: AURIGA, Jun 12-21, 1997
simulation (Gaussian)
vetoed (χ² test)
L. Baggio et al., χ² testing of optimal filters for gravitational wave signals: an experimental implementation. Phys. Rev. D 61, 102001 (2000)
6
The background in practice (2)
Amplitude distribution of events: AURIGA, Nov. 13-14, 2004
7
The background in practice (3)
Cumulative power distribution of events: TAMA, Nov. 13-14, 2004, from the presentation at the 9th Gravitational Wave Data Analysis Workshop (December 15-18, 2004, Annecy, France)
8
The background in practice (4)

Environmental Monitoring
Try to eliminate locally all possible false
signals
Detectors for many possible sources (seismic,
acoustic, electromagnetic, muon)
Also trend (slowly-varying) information (tilts,
temperature, weather)
Matched filter techniques for 'known' signals:
this can only decrease the background (no confidence
for unmatched signals) but cannot increase the
(unknown) confidence for the remaining signals.

Two good reasons for multiple detector analysis
the rate of background candidates can be
estimated reliably
the background rate of the network can be less
than that of the single detector

Non-coherent methods: coincidences among detectors (also non-GW, e.g. optical, γ-ray, X-ray, neutrino)
Coherent methods: correlations, maximum likelihood (e.g. weighted average)
9
M-fold coincidence search
10
M-fold coincidence search (2)
From IGEC 1997-2000: example of predicted mean
false alarm rates. Notice the dramatic
improvement when adding a third detector: the
occurrence of a 3-fold coincidence would
inevitably be interpreted as a gravitational wave
signal.
In practice, when no signal is detected in
coincidence, the upper limit is determined by the
total observation time
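The drop in false alarm rate with the number of detectors can be illustrated with a standard estimate, given here as a sketch rather than as the formula used on the slides. It assumes independent, stationary Poisson triggers and defines an M-fold coincidence as M events (one per detector) whose arrival times all lie within a spread w. The expected accidental rate is then M · w^(M−1) · λ1 · ... · λM. The event rate of 100 per hour is a hypothetical number chosen only for the example.

```python
def accidental_rate(event_rates, window):
    """Expected rate of accidental M-fold coincidences.

    event_rates: background event rate of each detector (events/second)
    window: maximum spread (seconds) allowed among the M event times
    Assumes independent, stationary Poisson triggers in every detector.
    """
    m = len(event_rates)
    product = 1.0
    for rate in event_rates:
        product *= rate
    return m * window ** (m - 1) * product

SECONDS_PER_YEAR = 365.25 * 86400.0
hourly = 100.0 / 3600.0  # hypothetical: 100 events per hour per detector

for m in (2, 3, 4):
    per_year = accidental_rate([hourly] * m, window=0.1) * SECONDS_PER_YEAR
    print(m, "detectors:", per_year, "accidentals per year")
```

Each added detector multiplies the accidental rate by roughly λ·w, a very small number when triggers are rare and windows are short.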
11
International networks of GW detectors
Interferometers
Operative: GEO600 (Germany/UK), LIGO Hanford 2 km (USA), LIGO Hanford 4 km (USA), LIGO Livingston 4 km (USA), TAMA300 (Japan)
Upcoming: VIRGO (Italy/France), CLIO (Japan)
Resonant bars
ALLEGRO (USA), AURIGA (Italy), EXPLORER (CERN, Geneva), NAUTILUS (Italy)
Map labels: EXPLORER; GEO600; Virgo, AURIGA, NAUTILUS; LIGO; TAMA300, CLIO100
12
International networks of GW detectors
1969: Argonne National Laboratory and University of Maryland. J. Weber, Phys. Rev. Lett. 22, 1320-1324 (1969)
1973-1974: Phys. Rev. D 14, 893-906 (1976)
15 years of worldwide networks
1989: 2 bars, 3 months. E. Amaldi et al., Astron. Astrophys. 216, 325 (1989)
1991: 2 bars, 120 days. P. Astone et al., Phys. Rev. D 59, 122001 (1999)
1995-1996: 2 detectors, 6 months. P. Astone et al., Astropart. Phys. 10, 83 (1999)
1989: 2 interferometers, 2 days. D. Nicholson et al., Phys. Lett. A 218, 175 (1996)
1997-2000: 2, 3, 4 resonant detectors, resp. 2 years, 6 months, 1 month. P. Astone et al., Phys. Rev. D 68, 022001 (2003)
2001: 2 detectors, 11 days. TAMA300-LISM collaboration, Phys. Rev. D 70, 042003 (2004)
2001: 2 detectors, 90 days. P. Astone et al., Class. Quant. Grav. 19, 5449 (2002)
2002: 3 detectors, 17 days. LIGO collaboration, B. Abbott et al., Phys. Rev. D 69, 102001 (2004)
GW detected? If NOT, why?
13
The International Gravitational Event
Collaboration
14
The International Gravitational Event
Collaboration
http://igec.lnl.infn.it
LSU group: ALLEGRO (LSU), http://gravity.phys.lsu.edu , Louisiana State University, Baton Rouge, Louisiana
AURIGA group: AURIGA (INFN-LNL), http://www.auriga.lnl.infn.it , INFN of Padova, Trento, Ferrara, Firenze, LNL; Universities of Padova, Trento, Ferrara, Firenze; IFN-CNR, Trento, Italy
ROG group: EXPLORER (CERN) and NAUTILUS (INFN-LNF), http://www.roma1.infn.it/rog/rogmain.html , INFN of Roma and LNF; Universities of Roma, L'Aquila; CNR IFSI and IESS, Roma, Italy
NIOBE group: NIOBE (UWA), http://www.gravity.pd.uwa.edu.au , University of Western Australia, Perth, Australia
15
The IGEC protocol
The IGEC data come from different data
analyses applied to the individual detector outputs.
The IGEC members are only asked to follow a few
general guidelines in order to characterize in a
consistent way the parameters of the candidate
events and the detector status at any time.
Further data conditioning and background
estimation are performed in a coordinated way
16
Exchanged periods of observation 1997-2000
ALLEGRO
AURIGA
NAUTILUS
EXPLORER
NIOBE
fraction of time in monthly bins
17
The exchanged data
Figure: amplitude (Hz⁻¹ · 10⁻²¹) vs time (hours), showing gaps, events (amplitude and time of arrival), and the minimum detectable amplitude (aka exchange threshold)
18
M-fold coincidence search (revised)
A coincidence is defined when, for all 0 < i, j ≤ M,
|t_i − t_j| < Δt_ij ~ 0.1 s
Coincidence windows Δt_ij depend on the timing error.
To make things even worse, we would like the
sequence of event times to be described by a
(possibly non-homogeneous) Poisson point process,
which means rare and independent triggers, but
this was not the case.
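The time-coincidence condition can be sketched for the two-detector case as follows. This is a minimal illustration and not the IGEC code: it assumes sorted lists of arrival times in seconds and a single fixed window, whereas the actual windows depend on the timing error of each event. The function name is hypothetical.

```python
def two_fold_coincidences(times_a, times_b, window=0.1):
    """Return all pairs (ta, tb) with |ta - tb| < window.

    Both lists must be sorted in increasing order (times in seconds).
    """
    pairs = []
    start = 0
    for ta in times_a:
        # Drop b-events too early to match ta or any later a-event.
        while start < len(times_b) and times_b[start] <= ta - window:
            start += 1
        k = start
        # Collect every b-event inside the window around ta.
        while k < len(times_b) and times_b[k] < ta + window:
            pairs.append((ta, times_b[k]))
            k += 1
    return pairs

print(two_fold_coincidences([1.0, 5.0, 9.0], [1.05, 3.0, 9.2]))
```

An M-fold search can be built on the same idea by requiring the condition for every pair of detectors.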
19
Timing error uncertainty (AURIGA, for δ-like
bursts)
20
Auto- and cross-correlation of time series
(clustering)
Auto-correlation of time of arrival on timescales of ~100 s
No cross-correlation
AL = ALLEGRO, AU = AURIGA, EX = EXPLORER, NA = NAUTILUS, NI = NIOBE; x-axis: seconds, y-axis: counts
21
Amplitude distributions of exchanged events
normalized to each detector threshold for trigger
search
Typical trigger search thresholds: SNR ≈ 3 for ALLEGRO, NIOBE; SNR ≈ 5 for AURIGA, EXPLORER, NAUTILUS.
The amplitude range is much wider than expected from the extreme-value distribution: non-modeled outliers dominate at high SNR.
22
False alarm reduction by amplitude selection
With a small increase of minimum amplitude, the
false alarm rate drops dramatically.
Corollary: selected events naturally have
consistent amplitudes
23
Sensitivity modulation for directional search
Figure, two panels: amplitude (Hz⁻¹ · 10⁻²¹) vs time (hours)
24
A small digression: different antenna patterns
and the relevance of signal polarization
25
Introduction

At any given time, the antenna pattern
is a sinusoidal function of the polarization ψ, i.e. any gravitational wave detector is a linear polarizer
depends on declination δ and right ascension α through the magnitude A and the phase φ

In order to reconstruct the wave amplitude h, any
measured amplitude has to be divided by the antenna pattern F.

This has been extensively used by IGEC: the first
step is a data selection obtained by putting a
threshold ∝ F⁻¹ on each detector

We will characterize the directional sensitivity
of a detector pair by the product of their
antenna patterns F1 and F2
F1F2 is inversely proportional to the square of
the wave amplitude h² in a cross-correlation search
F1F2 is an extension of the AND logic of
IGEC 2-fold coincidence

26
Linearly polarized signals
For a linearly polarized signal, ψ does not vary
with time. The product of the antenna patterns then
depends on ψ.
The relative phase φ1 − φ2 between the detectors
affects the sensitivity of the pair.
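The role of the relative phase can be checked numerically with a small sketch. The formula on the slide is not preserved in this transcript, so the code assumes the common form F = A cos(2ψ + φ) for each detector; the factor 2 and the sign convention are assumptions. With that form, F1·F2 = (A1·A2/2)·[cos(φ1 − φ2) + cos(4ψ + φ1 + φ2)], so the average over polarization angles is (A1·A2/2)·cos(φ1 − φ2).

```python
import math

def antenna_pattern(psi, magnitude, phase):
    """Assumed form F = A cos(2 psi + phi) for a linearly polarized wave."""
    return magnitude * math.cos(2.0 * psi + phase)

def mean_product(a1, phi1, a2, phi2, steps=3600):
    """Average of F1*F2 over a uniform polarization angle psi in [0, pi)."""
    total = 0.0
    for k in range(steps):
        psi = math.pi * k / steps
        total += antenna_pattern(psi, a1, phi1) * antenna_pattern(psi, a2, phi2)
    return total / steps

# Same magnitudes, increasing relative phase between the two detectors.
for dphi in (0.0, math.pi / 4, math.pi / 2):
    print(dphi, mean_product(1.0, 0.0, 1.0, dphi))
```

In this sketch a relative phase of π/2 makes the polarization-averaged product vanish even though both detectors individually have full magnitude.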
27
AURIGA-TAMA sky coverage (1): linearly
polarized signal
Panels: AURIGA², TAMA²
28
Circularly polarized signals

If
the signal is circularly polarized
the amplitude h(t) varies on timescales
longer than 1/f0
Then
The measured amplitude is simply h(t),
therefore it depends only on the magnitude of the
antenna patterns.
In the case of two detectors, the effect of the
relative phase φ1 − φ2 is limited to a spurious
time shift Δt which adds to the light-speed delay
of propagation
(Gürsel and Tinto, Phys. Rev. D 40, 12 (1989))
29
AURIGA-TAMA sky coverage (2): circularly
polarized signal
Panels: AURIGA², TAMA²
30
AURIGA -TAMA sky coverage
Linearly polarized signal
Circularly polarized signal
AURIGA x TAMA
31
IGEC (continued)
32
Data selection at work
Duty time is shortened at each detector in order
to have an efficiency of at least 50%.
A major false alarm reduction is achieved by
excluding low-amplitude events.
Figure: amplitude (Hz⁻¹ · 10⁻²¹) vs time (hours)
33
Duty cycle cut: single detectors
total time when the exchange threshold has been lower
than the gw amplitude
34
Duty cycle cut: network (1)
Galactic Center coverage
35
Duty cycle cut: network (2)
search threshold 6 × 10⁻²¹/Hz
search threshold 3 × 10⁻²¹/Hz
36
False dismissal probability
A coincidence can be missed because of

data conditioning.
The common search threshold Ht guarantees that
no gw signal in the selected data is lost
because of poor network setup.
However, the efficiency of detection is still
undetermined (it depends on the distribution of
signal amplitude, direction, polarization)

Best choice for 1997-2000 data: false dismissal
in time coincidence less than 5% to 30%; no
amplitude consistency test
37
Resampling statistics by time shifts
Figure: amplitude (Hz⁻¹ · 10⁻²¹) vs time (hours)
We can approximately resample the stochastic
process by time shifts. In the resampled data
the gw sources are off, along with any correlated
noise. Ergodicity holds at least up to timescales
of the order of one hour. The samples are
independent as long as the shift is longer than
the maximum time window for coincidence search
(a few seconds)
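The time-shift resampling can be sketched as follows. This is an illustration under simplifying assumptions, not the IGEC pipeline: two detectors, a fixed coincidence window, simulated uniform event times, and no treatment of edge effects or gaps. All names and numbers are hypothetical.

```python
import random

def count_coincidences(times_a, times_b, window):
    """Number of pairs with |ta - tb| < window; both lists sorted."""
    count = 0
    start = 0
    for ta in times_a:
        while start < len(times_b) and times_b[start] <= ta - window:
            start += 1
        k = start
        while k < len(times_b) and times_b[k] < ta + window:
            count += 1
            k += 1
    return count

def shifted_background(times_a, times_b, window, shifts):
    """Accidental coincidence counts, one per time shift applied to b."""
    counts = []
    for shift in shifts:
        shifted = sorted(t + shift for t in times_b)
        counts.append(count_coincidences(times_a, shifted, window))
    return counts

random.seed(1)
duration = 3600.0  # one hour of simulated data
a = sorted(random.uniform(0.0, duration) for _ in range(200))
b = sorted(random.uniform(0.0, duration) for _ in range(200))
# Shifts much longer than the window; the zero shift is the real data.
shifts = [10.0 * k for k in range(-20, 21) if k != 0]
counts = shifted_background(a, b, 0.5, shifts)
print("mean accidental count:", sum(counts) / len(counts))
```

The spread of the counts over the shifts gives an empirical distribution of the accidental background, to be compared with the count found at zero shift.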
38
Poisson statistics
Verified: for each pair of detectors and amplitude
selection, the resampled statistics allow testing
the Poisson hypothesis for accidental
coincidences.
As a fairly large number of tests are performed
over all two-fold combinations, the overall
agreement of the histogram of p-levels with the
uniform distribution has the last word on the
goodness of fit.
39
Setting (frequentist) confidence intervals
40
Unified vs flip-flop approach (1)
Flip-flop method
experimental data x → hypothesis testing (CL) → physical results
null → upper limit μ_up(CL)
claim → estimation (with error bars) μ(x) ± k_CL σ_μ
Unified vs flip-flop approach (2)
Unified approach
experimental data x → confidence belt → physical results
estimation (with confidence interval) μ_min(CL) < μ < μ_max(CL)
42
Setting confidence intervals

IGEC approach is
Frequentist, in that it computes the confidence
level or coverage as the probability that the
confidence interval contains the true value
Unified, in that it prescribes how to set a
confidence interval automatically leading to a gw
detection claim or an upper limit
however, different from FC (Feldman and Cousins)

References
G.J. Feldman and R.D. Cousins, Phys. Rev. D 57 (1998) 3873
B. Roe and M. Woodroofe, Phys. Rev. D 63 (2001) 013009
F. Porter, Nucl. Inst. Meth. A368 (1996), http://www.cithep.caltech.edu/fcp/statistics/
Particle Data Group, http://pdg.lbl.gov/2002/statrpp.pdf
43
A few basics confidence belts and coverage
physical unknown
experimental data
44
A few basics (2)
physical unknown
confidence interval
coverage
experimental data
45
Freedom of choice of confidence belt
Fixed frequentist coverage
Maximization of likelihood
Fine tuning of the false discovery probability
Non-unified approaches
Other requirements...
46
Confidence level, likelihood, maybe probability?
The term CL is often found associated with
equations like
likelihood integral
likelihood ratio relative to the maximum
likelihood ratio (hypothesis testing)
47
Confidence intervals from likelihood integral
48
Example: Poisson background N_b = 7.0
49
Dependence of the coverage on the background
likelihood integral = 0.90
N_b = 0.01, 0.1, 1.0, 3.0, 7.0, 10
50
From likelihood integral to coverage
Plot of the likelihood integral vs. minimum
(conservative) coverage min_N C(N), for sample
values of the background counts N_b, spanning the
range N_b = 0.01 to 10
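The relation between a likelihood integral and the frequentist coverage can be explored with the sketch below. It is not the IGEC construction, whose interval is not reproduced in this transcript: as an assumption, the interval is taken one-sided, [0, N_up], where N_up is the signal value at which the normalized integral of the Poisson likelihood reaches the chosen level. For n observed counts and background b, that integral has the closed form 1 − P(N ≤ n | b + N_up) / P(N ≤ n | b). The coverage is then the probability, at a given true signal, that the interval contains it.

```python
import math

def poisson_pmf(n, mean):
    """P(N = n) for a Poisson variable with mean > 0."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total

def likelihood_integral(n_obs, background, upper):
    """Fraction of the likelihood in the signal s >= 0 lying in [0, upper]."""
    return 1.0 - poisson_cdf(n_obs, background + upper) / poisson_cdf(n_obs, background)

def upper_limit(n_obs, background, level=0.90):
    """Signal value where the likelihood integral reaches level (bisection)."""
    lo, hi = 0.0, 10.0
    while likelihood_integral(n_obs, background, hi) < level:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if likelihood_integral(n_obs, background, mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coverage(signal, background, level=0.90):
    """Probability that [0, upper_limit(n)] contains the true signal."""
    mean = background + signal
    n_max = int(mean + 10.0 * math.sqrt(mean) + 20)
    total = 0.0
    for n in range(n_max + 1):
        if upper_limit(n, background, level) >= signal:
            total += poisson_pmf(n, mean)
    return total

# Background value taken from the example slide, N_b = 7.0.
for s in (0.0, 1.0, 3.0, 5.0):
    print("signal", s, "coverage", coverage(s, 7.0))
```

The coverage generally differs from the value of the likelihood integral and depends on both the true signal and the background, which is the point made by the plots on these slides.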
51
IGEC results (and what we learned from experience)
52
Setting confidence intervals on IGEC results
GOAL: estimate the number (rate) of gw detected
with amplitude ≥ Ht
Example: confidence interval with coverage ≥ 95%
53
Uninterpreted upper limits
on the RATE of BURST GW from the GALACTIC CENTER
DIRECTION whose measured amplitude is greater
than the search threshold; no model is assumed
for the sources, apart from being a random time
series
ensured minimum coverage
Figure: rate (year⁻¹) vs search threshold (Hz⁻¹)
the true rate value is under the curves with a
probability equal to the coverage
54
Upper limits after amplitude selection
systematic search on thresholds: many trials!
all upper limits but one
overall false alarm probability 33%: at least one
detection in case NO GW are in the data
NULL HYPOTHESIS WELL IN AGREEMENT WITH THE
OBSERVATIONS
55
Multiple configurations/selection/grouping within
IGEC analysis
56
Resampling statistics of accidental claims
event time series
coverage   claims
0.90       0.866 (0.555)   1
0.95       0.404 (0.326)   1
Resampling → blind analysis!
57
False discovery rate: setting the probability of
a false claim of detection
58
Why FDR?

When should I care about multiple test procedures?

All-sky surveys: many source directions and
polarizations are tried

Template banks

Wide-open-eyes searches: many analysis pipelines
are tried altogether, with different amplitude
thresholds, signal durations, and so on

Periodic updates of results: every new science
run is a chance for a discovery. "Maybe the next
one is the good one."

Many graphical representations or aggregations of
the data: "If I change the binning, maybe the
signal shows up better."

59
Preliminary (1) hypothesis testing
60
Preliminary (2) p-level

Assume you have a model for the noise that
affects the measurement x.

You derive a test statistic t(x) from x.
F(t) is the distribution of t when x is sampled
from noise only (off-signal). The p-level
associated with t(x) is the value of the
distribution of t at t(x): p = F(t) = P(t > t(x))

Example: χ² test → p is the one-tail χ²
probability associated with n counts (assuming d
degrees of freedom)

Usually, the alternative hypothesis is not known.
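When no analytical noise model is available, the p-level can be estimated empirically. The sketch below assumes a sample of the test statistic obtained from noise-only data (for instance from time-shifted data, which is an assumption of this example and not a statement from the slide), and returns the fraction of that sample exceeding the observed value.

```python
def p_level(observed, noise_only_values):
    """Empirical P(t > observed) from an off-signal sample of the statistic."""
    exceed = sum(1 for t in noise_only_values if t > observed)
    return exceed / len(noise_only_values)

# Hypothetical off-signal values of the test statistic.
off_signal = [0.2, 0.9, 1.4, 2.1, 2.8, 3.5, 4.0, 5.2]
print(p_level(3.0, off_signal))
```

The resolution of such an estimate is limited to 1/N for an off-signal sample of size N, so very small p-levels need very large resampled sets.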
61
Usual multiple testing procedures

For each hypothesis test, the condition {p < α ⇒
reject null} leads to false positives with a
probability α

In case of multiple tests (they need not be the
same test statistic, nor the same tested null
hypothesis), let p = {p1, p2, ..., pm} be the set of
p-levels. m is the trial factor.
We select discoveries using a threshold T(p):
{pj < T(p) ⇒ reject null}.

Uncorrected testing: T(p) = α
The probability that at least one rejection is
wrong is
P(B > 0) = 1 − (1 − α)^m ≈ mα (for mα ≪ 1)
hence false discovery is guaranteed for m large
enough

Fixed total 1st type errors (Bonferroni):
T(p) = α/m
Controls the familywise error rate in the most
stringent manner:
P(B > 0) ≤ α
This makes mistakes rare
but in the end efficiency (2nd type errors)
becomes negligible!!
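The two thresholds can be compared numerically with a few lines of code. The sketch assumes m independent tests, each with a true null hypothesis; the function names are illustrative.

```python
def familywise_error(alpha, m):
    """P(B > 0): probability of at least one false rejection in m
    independent tests, each performed at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test threshold that keeps the familywise error below alpha."""
    return alpha / m

alpha = 0.05
for m in (1, 10, 100, 1000):
    uncorrected = familywise_error(alpha, m)
    corrected = familywise_error(bonferroni_threshold(alpha, m), m)
    print(m, uncorrected, corrected)
```

The uncorrected probability approaches 1 as m grows, while the Bonferroni-corrected one stays below α at the price of a per-test threshold that shrinks as 1/m.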

62
Controlling false discovery fraction

We desire to control (bound) the ratio of false
discoveries over the total number of claims:
B/R = B/(B+S) ≤ q.
The level T(p) is then chosen accordingly.

63
Benjamini-Hochberg FDR control procedure
Among the procedures that accomplish this task,
one simple recipe was proposed by Benjamini &
Hochberg (JRSS-B (1995) 57:289-300)

choose your desired FDR q (don't ask too much!)

define c(m) = 1 if the p-values are independent or
positively correlated; otherwise c(m) = Σ_{j=1..m} (1/j)
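The recipe can be sketched in code. The remaining steps are not preserved in this transcript, so the sketch follows the published step-up procedure: sort the p-values, find the largest rank j for which p_(j) ≤ j·q / (m·c(m)), and reject the null hypothesis for all tests up to that rank. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q, independent=True):
    """Return the sorted indices of the tests declared discoveries.

    q: desired bound on the false discovery rate
    independent: True uses c(m) = 1 (independent or positively
    correlated p-values), False uses c(m) = sum of 1/j for j = 1..m.
    """
    m = len(p_values)
    c = 1.0 if independent else sum(1.0 / j for j in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under the sloped threshold.
        if p_values[index] <= rank * q / (m * c):
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, q=0.05))
print(benjamini_hochberg(p, q=0.05, independent=False))
```

Compared with the Bonferroni threshold q/m applied to every test, the threshold here grows with the rank, so more discoveries survive when several small p-values are present.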

64
LIGO-AURIGA: coincidence vs correlation
65
LIGO-AURIGA MoU

A working group for the joint burst search in
LIGO and AURIGA has been formed, with the purpose
to
develop methodologies for bar/interferometer
searches, to be tested on real data
time coincidence, trigger-based search on a
2-week coincidence period (Dec 24, 2003 - Jan 9,
2004)
explore coherent methods

Figure: best single-sided PSD
Simulations and methodological studies are in
progress.
66
White paper on joint analysis
Two methods will be explored in parallel

Method 1
IGEC style, but with a new definition of
consistent amplitude estimator in order to face
the radically different spectral densities of the
two kinds of detectors (interferometers and bars).
To fully exploit IGEC philosophy, as the
detectors are not parallel, polarization effects
should be taken into account (multiple trials on
polarization grid).

Method 2
No assumptions are made on direction or
waveform.
A CorrPower search (see poster) is applied to
the LIGO interferometers around the time of the
AURIGA triggers.
Efficiency for classes of waveforms and source
populations is estimated through Monte Carlo
simulation, LIGO-style (see talks by Zweizig,
Yakushin, Klimenko).
The accidental rate (background) is obtained
with unphysical time-shifts between data streams.

67
Summary of non-directional IGEC style
coincidence search

Detectors: PARALLEL, BARS
Shh: SIMILAR FREQUENCY RANGE
Search: NON DIRECTIONAL
Template: BURST ∝ δ(t)
The coincidence search is performed in a subset
of the data such that
the efficiency is at least 50% above the
threshold (HS)
significant false alarm reduction is
accomplished
The number of detectors in coincidence considered
is self-adapting
This strategy can be made directional
68
Cross-correlation search (naïve)

Detectors: PARALLEL
Shh: SAME FREQUENCY RANGE NEEDED
Search: NON DIRECTIONAL
Template: NO
Selection based on data quality can be
implemented before cross-correlating.
The efficiency is to be determined a posteriori
using Monte Carlo.
The information which is usually included in
cross-correlation takes into account statistical
properties of the data streams but not
geometrical ones, such as those related to antenna
patterns.

69
Comparison between IGEC style and
cross-correlation

IGEC style search was designed for template
searches. The template guarantees that it is
possible to have consistent estimators of signal
amplitude and arrival time. A
bank of templates may be required to cover
different classes of signals. Anyway, in burst
search we don't know how well the template fits
the signal

Some more work is needed to extend IGEC to the
case of template-less search among (spectrally)
different detectors. Hint: the amplitude
estimators should have spectral weights common to
all detectors, to be consistent without a
template. The trade-off will be between
efficiency loss and network gain (sky coverage
and false alarm rate)

A template-less IGEC style search can be easily
implemented in the case of detectors with equal
bandwidth. In fact it is possible to
define a consistent amplitude estimator
(Karhunen-Loeve, power).

Template search
Template-less search

Cross-correlation among identical detectors is
the most used method to cope with lack of
templates.

Cross-correlation in general is not efficient
with non-overlapping frequency bandwidths, even
for wide band signals.
