1
Network analysis and statistical issues
Lucio Baggio. An introductory seminar for ICRR's GW
group
2
Topics of this presentation
Gravitational wave burst networks
From the single detector to a worldwide network
IGEC (International Gravitational Event Collaboration)
Long-term search with four detectors: directional
search and statistical issues
Setting confidence intervals
From raw data to probability statements:
likelihood/Bayesian vs. frequentist methods
False discovery probability
Multiple tests and large surveys change the
overall confidence of the first detection
Miscellaneous topics
The LIGO-AURIGA white paper on joint data
analysis: problems with non-aligned or different
detectors; coherent data analysis.
3
Network analysis is unavoidable, as far as
background estimation is concerned
4
Gravitational wave burst events
For fast (1-10 ms) gw signals, the impulse
response of the optimal filter for the signal
amplitude is an exponentially damped oscillation.
Even at a very low amplitude, the signals from
astrophysical sources are expected to be rare.
A candidate event in the gravitational wave
channel is any single extreme value in a more or
less constant time window.
Background events come from the extreme-value
distribution for an (almost) Gaussian stochastic
process.
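As a rough illustration of the last point (a sketch that is not part of the original slides), the code below draws white Gaussian noise, keeps the largest absolute value in each fixed time window, and treats those maxima as background "events". The window length, the number of samples, and the function name `background_events` are hypothetical choices.

```python
import random

def background_events(n_samples=100_000, window=1000, seed=1):
    """Largest |x| in each window of unit-variance white Gaussian noise."""
    rng = random.Random(seed)
    events = []
    for _ in range(0, n_samples, window):
        block = [abs(rng.gauss(0.0, 1.0)) for _ in range(window)]
        events.append(max(block))
    return events

events = background_events()
# Every window yields one candidate event, although no signal is present.
print(len(events), min(events), max(events))
```

Even pure noise therefore produces a steady stream of candidates, whose amplitudes follow the extreme-value distribution of the underlying process.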
5
The background in practice (1)
Amplitude distribution of events: AURIGA, Jun
12-21, 1997
simulation (Gaussian)
vetoed ($\chi^2$ test)
L. Baggio et al., "$\chi^2$ testing of optimal filters
for gravitational wave signals: an experimental
implementation", Phys. Rev. D 61, 102001 (2000)
6
The background in practice (2)
Amplitude distribution of events: AURIGA, Nov.
13-14, 2004
7
The background in practice (3)
Cumulative power distribution of events: TAMA, Nov.
13-14, 2004, from the presentation at the 9th
Gravitational Wave Data Analysis Workshop
(December 15-18, 2004, Annecy, France)
8
The background in practice (4)
  • Environmental Monitoring
  • Try to eliminate locally all possible false
    signals
  • Detectors for many possible sources (seismic,
    acoustic, electromagnetic, muon)
  • Also trend (slowly-varying) information (tilts,
    temperature, weather)
  • Matched filter techniques for "known" signals →
    this can only decrease the background (no
    confidence for non-matched signals) but cannot
    increase the (unknown) confidence for the
    remaining signals.
  • Two good reasons for multiple detector analysis:
  • the rate of background candidates can be
    estimated reliably
  • the background rate of the network can be less
    than that of the single detector

Non-coherent methods: coincidences among
detectors (also non-GW, e.g. optical, γ-ray,
X-ray, neutrino)
Coherent methods: correlations, maximum likelihood
(e.g. weighted average)
9
M-fold coincidence search
10
M-fold coincidence search (2)
From IGEC 1997-2000: example of predicted mean
false alarm rates. Notice the dramatic
improvement when adding a third detector: the
occurrence of a 3-fold coincidence would
inevitably be interpreted as a gravitational wave
signal.
In practice, when no signal is detected in
coincidence, the upper limit is determined by the
total observation time.
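The two statements above can be made concrete with a minimal sketch, assuming independent Poisson triggers in each detector and a coincidence defined by all event times lying within one common window of each other. Under those assumptions the accidental M-fold rate is M·w^(M-1)·Πr_i, and with zero observed coincidences the Poisson upper limit on the rate is -ln(1-CL)/T. The trigger rate of 100 events/day and the function names are hypothetical, not IGEC values.

```python
import math

def accidental_rate(rates, window):
    """Mean rate of accidental M-fold coincidences for independent Poisson
    triggers, when all M event times lie within `window` of each other."""
    m = len(rates)
    return m * window ** (m - 1) * math.prod(rates)

def upper_limit_zero_counts(obs_time, confidence=0.90):
    """Poisson upper limit on a rate when no coincidence is observed."""
    return -math.log(1.0 - confidence) / obs_time

SECONDS_PER_YEAR = 365.25 * 86400
rate_per_s = 100.0 / 86400   # hypothetical: 100 triggers/day per detector
for m in (2, 3, 4):
    r = accidental_rate([rate_per_s] * m, window=0.1)
    print(m, "-fold accidentals per year:", r * SECONDS_PER_YEAR)
```

Each added detector multiplies the accidental rate by a factor of order rate × window, which is much smaller than one for rare triggers; the upper limit with no coincidences scales as 1/T.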
11
International networks of GW detectors
Interferometers. Operative: GEO600 (Germany/UK),
LIGO Hanford 2 km (USA), LIGO Hanford 4 km (USA),
LIGO Livingston 4 km (USA), TAMA300 (Japan).
Upcoming: VIRGO (Italy/France), CLIO (Japan)
Resonant bars: ALLEGRO (USA), AURIGA (Italy),
EXPLORER (CERN, Geneva), NAUTILUS (Italy)
[Map of detector sites: EXPLORER; GEO600; Virgo,
AURIGA, NAUTILUS; LIGO; TAMA300, CLIO100]
12
International networks of GW detectors
1969: Argonne National Laboratory and University of Maryland. J. Weber, Phys. Rev. Lett. 22, 1320-1324 (1969)
1973-1974: Phys. Rev. D 14, 893-906 (1976)
15 years of worldwide networks:
1989: 2 bars, 3 months. E. Amaldi et al., Astron. Astrophys. 216, 325 (1989)
1991: 2 bars, 120 days. P. Astone et al., Phys. Rev. D 59, 122001 (1999)
1995-1996: 2 detectors, 6 months. P. Astone et al., Astropart. Phys. 10, 83 (1999)
1989: 2 interferometers, 2 days. D. Nicholson et al., Phys. Lett. A 218, 175 (1996)
1997-2000: 2, 3, 4 resonant detectors, resp. 2 years, 6 months, 1 month. P. Astone et al., Phys. Rev. D 68, 022001 (2003)
2001: 2 detectors, 11 days. TAMA300-LISM collaboration, Phys. Rev. D 70, 042003 (2004)
2001: 2 detectors, 90 days. P. Astone et al., Class. Quant. Grav. 19, 5449 (2002)
2002: 3 detectors, 17 days. LIGO collaboration, B. Abbott et al., Phys. Rev. D 69, 102001 (2004)
GW detected? If NOT, why?
13
The International Gravitational Event
Collaboration
14
The International Gravitational Event
Collaboration
http://igec.lnl.infn.it
LSU group: ALLEGRO (LSU), http://gravity.phys.lsu.edu
Louisiana State University, Baton Rouge, Louisiana
AURIGA group: AURIGA (INFN-LNL), http://www.auriga.lnl.infn.it
INFN of Padova, Trento, Ferrara, Firenze, LNL; Universities of Padova, Trento, Ferrara, Firenze; IFN-CNR, Trento, Italy
ROG group: EXPLORER (CERN), NAUTILUS (INFN-LNF), http://www.roma1.infn.it/rog/rogmain.html
INFN of Roma and LNF; Universities of Roma, L'Aquila; CNR IFSI and IESS, Roma, Italy
NIOBE group: NIOBE (UWA), http://www.gravity.pd.uwa.edu.au
University of Western Australia, Perth, Australia
15
The IGEC protocol
The sources of IGEC data are different data
analyses applied to the individual detector outputs.
The IGEC members are only asked to follow a few
general guidelines in order to characterize in a
consistent way the parameters of the candidate
events and the detector status at any time.
Further data conditioning and background
estimation are performed in a coordinated way.
16
Exchanged periods of observation 1997-2000
[Figure: fraction of time in monthly bins for ALLEGRO, AURIGA, NAUTILUS, EXPLORER, NIOBE]
17
The exchanged data
Gaps
Events: amplitude and time of arrival
Minimum detectable amplitude (aka exchange
threshold)
[Figure axes: amplitude ($10^{-21}$ Hz$^{-1}$) vs. time (hours)]
18
M-fold coincidence search (revised)
A coincidence is defined when, for all pairs $i, j$
of the $M$ detectors, $|t_i - t_j| < \Delta t_{ij} \approx 0.1$ s
Coincidence windows $\Delta t_{ij}$ depend on timing error,
which is
To make things even worse, we would like the
sequence of event times to be described by a
(possibly non-homogeneous) Poisson point series,
which means rare and independent triggers, but
this was not the case.
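A minimal sketch of the pairwise time-coincidence test defined on this slide, for two detectors only. The function name and the example event times are hypothetical; an M-fold search would require the same condition for every detector pair.

```python
from bisect import bisect_left, bisect_right

def twofold_coincidences(times_a, times_b, window=0.1):
    """Return all pairs (ta, tb) with |ta - tb| < window (times in seconds)."""
    b = sorted(times_b)
    pairs = []
    for ta in sorted(times_a):
        lo = bisect_right(b, ta - window)   # first tb >  ta - window
        hi = bisect_left(b, ta + window)    # first tb >= ta + window
        pairs.extend((ta, tb) for tb in b[lo:hi])
    return pairs

a = [10.0, 50.0, 90.0]
b = [10.05, 49.8, 90.2, 120.0]
print(twofold_coincidences(a, b, window=0.1))
```

Sorting one list and using binary search keeps the cost close to N log N, which matters once the time-shifted resampling repeats the search many times.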
19
Timing error uncertainty (AURIGA, for $\delta$-like
bursts)
20
Auto- and cross-correlation of time series
(clustering)
  • Auto-correlation of time of arrival on
    timescales of ~100 s
  • No cross-correlation
AL = ALLEGRO, AU = AURIGA, EX = EXPLORER, NA =
NAUTILUS, NI = NIOBE. x-axis: seconds; y-axis: counts
21
Amplitude distributions of exchanged events,
normalized to each detector's threshold for trigger
search
Typical trigger search thresholds: SNR ≈ 3 for
ALLEGRO, NIOBE; SNR ≈ 5 for AURIGA, EXPLORER,
NAUTILUS. The amplitude range is much wider than
the expected extreme-value distribution: non-modeled
outliers dominate at high SNR.
22
False alarm reduction by amplitude selection
With a small increase of minimum amplitude, the
false alarm rate drops dramatically.
Corollary: selected events naturally have
consistent amplitudes.
23
Sensitivity modulation for directional search
[Figure, two panels: amplitude ($10^{-21}$ Hz$^{-1}$) vs. time (hours)]
24
A small digression: different antenna patterns
and the relevance of signal polarization
25
Introduction
  • At any given time, the antenna pattern $F$:
  • is a sinusoidal function of the polarization $\psi$,
    i.e. any gravitational wave detector is a linear
    polarizer
  • depends on declination $\delta$ and right ascension $\alpha$
    through the magnitude $A$ and the phase $\varphi$
  • In order to reconstruct the wave amplitude $h$, any
    measured amplitude has to be divided by $F$
  • This has been extensively used by IGEC: the first
    step is a data selection obtained by putting a
    threshold $\propto F^{-1}$ on each detector
  • We will characterize the directional sensitivity
    of a detector pair by the product of their
    antenna patterns $F_1$ and $F_2$
  • $F_1 F_2$ is inversely proportional to the square of
    the wave amplitude, $h^2$, in a cross-correlation search
  • $F_1 F_2$ is an extension of the AND logic of the
    IGEC 2-fold coincidence

26
Linearly polarized signals
For a linearly polarized signal, $\psi$ does not vary
with time. The product of antenna patterns as a
function of $\psi$ is given by
The relative phase $\varphi_1 - \varphi_2$ between detectors
affects the sensitivity of the pair.
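The formula from this slide did not survive in the transcript. As a hedged illustration only, the sketch below assumes each antenna pattern has the form F = A·cos(2ψ - phase), which is one common way to write a sinusoidal dependence on the polarization angle; the slide's own parametrization may differ. Under that assumption, the product averaged over ψ equals (A1·A2/2)·cos(phase1 - phase2), which shows how the relative phase controls the sensitivity of the pair. All names are hypothetical.

```python
import math

def antenna_pattern(psi, magnitude, phase):
    """Assumed form F = A * cos(2*psi - phase) for a linearly polarized wave."""
    return magnitude * math.cos(2.0 * psi - phase)

def pattern_product(psi, a1, phase1, a2, phase2):
    """Product F1*F2 of the two assumed antenna patterns at polarization psi."""
    return antenna_pattern(psi, a1, phase1) * antenna_pattern(psi, a2, phase2)

def mean_product(a1, phase1, a2, phase2, n=3600):
    """Average of F1*F2 over polarization angles uniform in [0, pi)."""
    step = math.pi / n
    total = sum(pattern_product(k * step, a1, phase1, a2, phase2) for k in range(n))
    return total / n

# Detectors in phase versus detectors a quarter cycle apart.
print(mean_product(1.0, 0.0, 1.0, 0.0))
print(mean_product(1.0, 0.0, 1.0, math.pi / 2))
```

With a relative phase of a quarter cycle the averaged product vanishes: whenever one detector is favourably oriented for the polarization, the other is not.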
27
AURIGA-TAMA sky coverage (1): linearly
polarized signal
[Figure panels: AURIGA$^2$, TAMA$^2$]
28
Circularly polarized signals
  • If
  • the signal is circularly polarized
  • the amplitude $h(t)$ varies on timescales
    longer than $1/f_0$
  • Then
  • the measured amplitude is simply $h(t)$;
    therefore it depends only on the magnitude of the
    antenna patterns. In case of two detectors
  • the effect of the relative phase $\varphi_1 - \varphi_2$ is
    limited to a spurious time shift $\Delta t$ which adds to
    the light-speed delay of propagation
    (Gursel and Tinto, Phys. Rev. D 40, 12 (1989))
29
AURIGA-TAMA sky coverage (2): circularly
polarized signal
[Figure panels: AURIGA$^2$, TAMA$^2$]
30
AURIGA-TAMA sky coverage
[Figure panels, AURIGA x TAMA: linearly polarized signal; circularly polarized signal]
31
IGEC (continued)
32
Data selection at work
Duty time is shortened at each detector in order
to have an efficiency of at least 50%.
A major false alarm reduction is achieved by
excluding low amplitude events.
[Figure axes: amplitude ($10^{-21}$ Hz$^{-1}$) vs. time (hours)]
33
Duty cycle cut: single detectors
Total time when the exchange threshold has been lower
than the gw amplitude
34
Duty cycle cut: network (1)
Galactic Center coverage
35
Duty cycle cut: network (2)
search threshold $6 \times 10^{-21}$/Hz
search threshold $3 \times 10^{-21}$/Hz
36
False dismissal probability
A coincidence can be missed because of
  • data conditioning.
  • The common search threshold $H_t$ guarantees that
    no gw signal in the selected data is lost
    because of poor network setup.
  • However, the efficiency of detection is still
    undetermined (it depends on the distribution of
    signal amplitude, direction, polarization).

Best choice for 1997-2000 data: false dismissal
in time coincidence less than 5-30%; no
amplitude consistency test
37
Resampling statistics by time shifts
[Figure axes: amplitude ($10^{-21}$ Hz$^{-1}$) vs. time (hours)]
We can approximately resample the stochastic
process by time shifts. Then, in the resampled data,
the gw sources are off, along with any correlated
noise. Ergodicity holds at least up to timescales
of the order of one hour. The samples are
independent as long as the shift is longer than
the maximum time window for coincidence search
(a few seconds).
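The time-shift resampling can be sketched as follows, for two detectors and with simulated uniform random event times standing in for real triggers. The circular wrap of the shifted stream, the event numbers, and the function names are assumptions made for illustration; coincidences straddling the wrap-around boundary are ignored.

```python
import random
from bisect import bisect_left, bisect_right

def count_coincidences(times_a, times_b, window):
    """Number of pairs with |ta - tb| < window."""
    b = sorted(times_b)
    return sum(bisect_left(b, t + window) - bisect_right(b, t - window)
               for t in times_a)

def shifted_counts(times_a, times_b, window, shifts, period):
    """Accidental coincidence counts after shifting stream B by each value
    in `shifts` (wrapping around the observation period)."""
    counts = []
    for s in shifts:
        shifted = [(t + s) % period for t in times_b]
        counts.append(count_coincidences(times_a, shifted, window))
    return counts

rng = random.Random(0)
period = 86400.0                                   # one day, in seconds
a = [rng.uniform(0.0, period) for _ in range(500)]
b = [rng.uniform(0.0, period) for _ in range(500)]
shifts = [10.0 * k for k in range(1, 101)]         # every shift >> window
counts = shifted_counts(a, b, window=0.5, shifts=shifts, period=period)
print(sum(counts) / len(counts))   # compare with 2 * window * Na * Nb / period
```

The histogram of `counts` is the resampled distribution of accidental coincidences, against which the unshifted count is then compared.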
38
Poisson statistics
For each pair of detectors and amplitude
selection, the resampled statistics allow us to
test the Poisson hypothesis for accidental
coincidences: verified.
Since a fairly large number of tests are performed
over all two-fold combinations, the overall
agreement of the histogram of p-levels with the
uniform distribution has the last word on the
goodness of fit.
39
Setting (frequentist) confidence intervals
40
Unified vs flip-flop approach (1)
Flip-flop method: experimental data $x$ → hypothesis
testing (CL) → either "null", giving an upper limit
$\mu_{up}(CL)$, or "claim", giving an estimation (with
error bars) $\mu(x) \pm k_{CL}\,\sigma_\mu$ → physical results
41
Unified vs flip-flop approach (2)
Unified approach: experimental data $x$ → confidence
belt → estimation (with confidence interval)
$\mu_{min}(CL) < \mu < \mu_{max}(CL)$ → physical results
42
Setting confidence intervals
  • The IGEC approach is:
  • Frequentist, in that it computes the confidence
    level or coverage as the probability that the
    confidence interval contains the true value
  • Unified, in that it prescribes how to set a
    confidence interval automatically leading to a gw
    detection claim or an upper limit
  • however, different from FC (Feldman and Cousins)

References: G. J. Feldman and R. D. Cousins, Phys.
Rev. D 57 (1998) 3873; B. Roe and M. Woodroofe,
Phys. Rev. D 63 (2001) 013009; F. Porter, Nucl.
Inst. Meth. A368 (1996),
http://www.cithep.caltech.edu/fcp/statistics/;
Particle Data Group,
http://pdg.lbl.gov/2002/statrpp.pdf
43
A few basics: confidence belts and coverage
[Figure: confidence belt; axes: experimental data and physical unknown]
44
A few basics (2)
[Figure: confidence interval and coverage; axes: experimental data and physical unknown]
45
Freedom of choice of confidence belt
Fixed frequentist coverage
Maximization of likelihood
Fine tuning of the false discovery probability
Non-unified approaches
Other requirements...
46
Confidence level, likelihood, maybe probability?
The term CL is often found associated with
equations like
likelihood integral
likelihood ratio relative to the maximum
likelihood ratio (hypothesis testing)
47
Confidence intervals from likelihood integral
48
Example: Poisson background, $N_b = 7.0$
49
Dependence of the coverage on the background
likelihood integral = 0.90
$N_b$ = 0.01, 0.1, 1.0, 3.0, 7.0, 10
50
From likelihood integral to coverage
Plot of the likelihood integral vs. minimum
(conservative) coverage $\min_N C(N)$, for sample
values of the background counts $N_b$, spanning the
range $N_b$ = 0.01-10
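The equations behind these plots are missing from the transcript. The sketch below is one possible reading, not the IGEC implementation: for observed counts n and known mean background Nb, it takes the Poisson likelihood of the signal counts Ns, selects the highest-likelihood range holding a given fraction of the likelihood integral, and then evaluates the frequentist coverage of that range for a chosen true Ns. Grid size, cut-offs, and function names are hypothetical.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of n counts for mean mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def likelihood_interval(n, nb, integral=0.90, ns_max=60.0, steps=3000):
    """Highest-likelihood range of signal counts Ns >= 0 holding the requested
    fraction of the integral of L(Ns) = P(n | Ns + nb)."""
    dx = ns_max / steps
    like = [poisson_pmf(n, (i + 0.5) * dx + nb) for i in range(steps)]
    total = sum(like)
    # Add grid cells in order of decreasing likelihood until the
    # requested fraction of the integral is reached.
    order = sorted(range(steps), key=lambda i: like[i], reverse=True)
    acc, lo, hi = 0.0, steps, -1
    for i in order:
        acc += like[i]
        lo, hi = min(lo, i), max(hi, i)
        if acc >= integral * total:
            break
    return lo * dx, (hi + 1) * dx

def coverage(ns_true, nb, integral=0.90, n_max=60):
    """Probability that the interval built from the observed n contains
    the true signal counts ns_true."""
    cov = 0.0
    for n in range(n_max + 1):
        lo, hi = likelihood_interval(n, nb, integral)
        if lo <= ns_true <= hi:
            cov += poisson_pmf(n, ns_true + nb)
    return cov

print(likelihood_interval(0, 7.0))   # with n = 0 the interval starts at Ns = 0
print(coverage(3.0, 7.0))
```

Scanning `coverage` over many values of `ns_true` and taking the minimum would give the conservative coverage that the slide calls $\min_N C(N)$, which in general differs from the likelihood integral itself.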
51
IGEC results (and what we learned from experience)
52
Setting confidence intervals on IGEC results
GOAL: estimate the number (rate) of gw detected
with amplitude $\ge H_t$
Example: confidence interval with coverage $\ge$ 95%
53
Uninterpreted upper limits
on the RATE of BURST GW from the GALACTIC CENTER
DIRECTION whose measured amplitude is greater
than the search threshold; no model is assumed
for the sources, apart from being a random time
series
[Figure: rate (year$^{-1}$) vs. search threshold (Hz$^{-1}$), one curve per ensured minimum coverage]
The true rate value is under the curves with a
probability equal to the coverage
54
Upper limits after amplitude selection
Systematic search on thresholds: many trials!
All results are upper limits but one.
Overall false alarm probability: 33% for at least one
detection in case NO GW are in the data
NULL HYPOTHESIS WELL IN AGREEMENT WITH THE
OBSERVATIONS
55
Multiple configurations/selection/grouping within
IGEC analysis
56
Resampling statistics of accidental claims
event time series
coverage   claims
0.90       0.866 (0.555)   1
0.95       0.404 (0.326)   1
Resampling → blind analysis!
57
False discovery rate: setting the probability of
a false claim of detection
58
Why FDR?
  • When should I care about multiple test procedures?
  • All-sky surveys: many source directions and
    polarizations are tried
  • Template banks
  • Wide-open-eyes searches: many analysis pipelines
    are tried altogether, with different amplitude
    thresholds, signal durations, and so on
  • Periodic updates of results: every new science
    run is a chance for a discovery. "Maybe the next
    one is the good one."
  • Many graphical representations or aggregations of
    the data: "If I change the binning, maybe the
    signal shows up better."

59
Preliminary (1): hypothesis testing
60
Preliminary (2): p-level
  • Assume you have a model for the noise that
    affects the measure $x$.

You derive a test statistic $t(x)$ from $x$.
$F(t)$ is the distribution of $t$ when $x$ is sampled
from noise only (off-signal). The p-level
associated with $t(x)$ is the value of the
distribution of $t$ at $t(x)$: $p = F(t) = P(t > t(x))$
  • Example: $\chi^2$ test → $p$ is the one-tail $\chi^2$
    probability associated with $n$ counts (assuming $d$
    degrees of freedom)

Usually, the alternative hypothesis is not known.
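When no analytic noise model is at hand, the p-level can be estimated from off-signal samples. The following is a minimal sketch under that assumption; the function name and the toy numbers are hypothetical.

```python
def p_level(t_obs, noise_only_stats):
    """Empirical p-level: fraction of noise-only (off-signal) values of the
    test statistic that exceed the observed value t_obs."""
    larger = sum(1 for t in noise_only_stats if t > t_obs)
    return larger / len(noise_only_stats)

# Toy example: statistic values obtained from noise-only data.
off_signal = [0.3, 1.1, 0.7, 2.4, 0.9, 1.8, 0.2, 3.1, 1.4, 0.6]
print(p_level(2.0, off_signal))
```

The resolution of such an estimate is limited to 1/N for N off-signal samples, so small p-levels need correspondingly large noise-only samples.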
61
Usual multiple testing procedures
  • For each hypothesis test, the condition $p < \alpha \Rightarrow$
    reject null leads to false positives with
    probability $\alpha$

In case of multiple tests (they need not be the
same test statistic, nor the same tested null
hypothesis), let $p = \{p_1, p_2, \dots, p_m\}$ be the set of
p-levels. $m$ is the trial factor.
We select discoveries using a threshold $T(p)$:
$p_j < T(p) \Rightarrow$ reject null.
  • Uncorrected testing: $T(p) = \alpha$
  • The probability that at least one rejection is
    wrong is
  • $P(B > 0) = 1 - (1 - \alpha)^m \sim m\alpha$
  • hence false discovery is guaranteed for $m$ large
    enough
  • Fixed total 1st type errors (Bonferroni):
    $T(p) = \alpha/m$
  • Controls the familywise error rate in the most
    stringent manner
  • $P(B > 0) \le \alpha$
  • This makes mistakes rare
  • but in the end the efficiency (2nd type errors)
    becomes negligible!!
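A short numerical check of the two regimes above, assuming m independent tests in which every null hypothesis is true. The function names and the values of alpha and m are illustrative only.

```python
def familywise_error(threshold, m):
    """P(B > 0): probability of at least one false rejection among m
    independent tests of true null hypotheses, each using `threshold`."""
    return 1.0 - (1.0 - threshold) ** m

def bonferroni_threshold(alpha, m):
    """Per-test threshold that keeps the familywise error rate below alpha."""
    return alpha / m

alpha, m = 0.05, 100
print(familywise_error(alpha, m))                           # uncorrected testing
print(familywise_error(bonferroni_threshold(alpha, m), m))  # Bonferroni
```

With 100 trials, uncorrected testing makes at least one false claim almost certain, while the Bonferroni threshold keeps that probability just below alpha at the price of a much stricter per-test cut.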

62
Controlling false discovery fraction
  • We desire to control (bound) the ratio of false
    discoveries over the total number of claims:
    $B/R = B/(B+S) \le q$.
  • The level $T(p)$ is then chosen accordingly.

63
Benjamini-Hochberg FDR control procedure
Among the procedures that accomplish this task,
one simple recipe was proposed by Benjamini and
Hochberg (JRSS-B (1995) 57:289-300)
  • choose your desired FDR $q$ (don't ask too much!)
  • define $c(m) = 1$ if p-values are independent or
    positively correlated; otherwise $c(m) = \sum_{j=1}^{m} 1/j$
64
LIGO-AURIGA: coincidence vs correlation
65
LIGO-AURIGA MoU
  • A working group for the joint burst search in
    LIGO and AURIGA has been formed, with the purpose
    to
  • develop methodologies for bar/interferometer
    searches, to be tested on real data
  • time-coincidence, trigger-based search on a
    2-week coincidence period (Dec 24, 2003 - Jan 9,
    2004)
  • explore coherent methods

[Figure: best single-sided PSD]
Simulations and methodological studies are in
progress.
66
White paper on joint analysis
Two methods will be explored in parallel
  • Method 1
  • IGEC style, but with a new definition of a
    consistent amplitude estimator in order to face
    the radically different spectral densities of the
    two kinds of detectors (interferometers and bars).
  • To fully exploit IGEC philosophy, as the
    detectors are not parallel, polarization effects
    should be taken into account (multiple trials on
    polarization grid).
  • Method 2
  • No assumptions are made on direction or
    waveform.
  • A CorrPower search (see poster) is applied to
    the LIGO interferometers around the time of the
    AURIGA triggers.
  • Efficiency for classes of waveforms and source
    populations is evaluated through Monte Carlo
    simulation, LIGO-style (see talks by Zweizig,
    Yakushin, Klimenko).
  • The accidental rate (background) is obtained
    with unphysical time-shifts between data streams.

67
Summary of non-directional IGEC style
coincidence search
  • Detectors: PARALLEL, BARS
  • $S_{hh}$: SIMILAR FREQUENCY RANGE
  • Search: NON DIRECTIONAL
  • Template: BURST, $\delta(t)$
  • The coincidence search is performed in a subset
    of the data such that
  • the efficiency is at least 50% above the
    threshold ($H_S$)
  • significant false alarm reduction is
    accomplished
  • The number of detectors in coincidence considered
    is self-adapting
  • This strategy can be made directional
68
Cross-correlation search (naïve)
  • Detectors: PARALLEL
  • $S_{hh}$: SAME FREQUENCY RANGE NEEDED
  • Search: NON DIRECTIONAL
  • Template: NO
  • Selection based on data quality can be
    implemented before cross-correlating.
  • The efficiency is to be determined a posteriori
    using Monte Carlo.
  • The information which is usually included in
    cross-correlation takes into account statistical
    properties of the data streams but not
    geometrical ones, such as those related to antenna
    patterns.
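For reference, a naive cross-correlation statistic of the kind meant here can be sketched as the normalized zero-lag correlation of two equally long data streams. This is a generic illustration with hypothetical names and toy data, not the CorrPower pipeline; as the last bullet notes, nothing in it knows about antenna patterns.

```python
import math

def correlation_coefficient(x, y):
    """Normalized zero-lag cross-correlation of two equally long streams."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data: the same short burst seen in two parallel detectors.
burst = [0.0, 0.5, 1.0, -1.0, -0.5, 0.0]
print(correlation_coefficient(burst, [2.0 * v for v in burst]))
```

The statistic is insensitive to the overall amplitude scale of each stream, which is why the efficiency has to be calibrated separately with simulated signals.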

69
Comparison between IGEC style and
cross-correlation
  • The IGEC style search was designed for template
    searches. The template guarantees that it is
    possible to have consistent estimators of signal
    amplitude and arrival time. A
    bank of templates may be required to cover
    different classes of signals. Anyway, in burst
    searches we don't know how well the template fits
    the signal.
  • Some more work is needed to extend IGEC to the case
    of a template-less search among (spectrally)
    different detectors. Hint: the amplitude
    estimators should have spectral weights common to
    all detectors, to be consistent without a
    template. The trade-off will be between
    efficiency loss and network gain (sky coverage
    and false alarm rate).
  • A template-less IGEC style search can be easily
    implemented in the case of detectors with equal
    bandwidth. In fact it is possible to
    define a consistent amplitude estimator
    (Karhunen-Loeve, power).

Template search
Template-less search
  • Cross-correlation among identical detectors is
    the most used method to cope with lack of
    templates.
  • Cross-correlation in general is not efficient
    with non-overlapping frequency bandwidths, even
    for wide band signals.