12/2/09

1
Unfolding Problem: A Machine Learning Approach
Nikolai Gagunashvili, School of Computing,
University of Akureyri, Iceland, nikolai_at_unak.is
2
Contents
  • Introduction
  • Basic equation
  • System identification
  • The unfolding procedure
  • A numerical example
  • Conclusions
  • References

3
Introduction
4
Introduction (cont.)
5
Introduction (cont.)
The unfolding problem is an underspecified
problem. Any approach to solving it
requires a priori information about the
solution. Different unfolding methods differ,
directly or indirectly, in how they use this
a priori information.
6
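Basic equation
We will use a linear model for the transformation of a true
distribution into the measured one:
$f = P\phi + \varepsilon$,   (1)
where $f = (f_1, f_2, \ldots, f_m)^T$ is the vector of the
experimentally measured histogram content,
$\phi = (\phi_1, \phi_2, \ldots, \phi_n)^T$ is the vector of
some true histogram content, and
$\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m)^T$
is the vector of random residual components with mean value
$E[\varepsilon] = 0$ and a diagonal variance matrix
$\mathrm{Var}[\varepsilon] = \Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_m^2)$,
where $\sigma_1, \ldots, \sigma_m$ are the statistical errors
of the measured distribution.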
7
$P = (p_{ij})$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, is the
transformation matrix that transforms the true distribution
into the experimentally measured distribution.
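The linear model can be illustrated with a small numerical sketch. The 4x3 matrix `P`, the true histogram `phi`, and the use of Poisson counting noise for the residuals are all assumptions chosen for illustration; none of them come from the slides.

```python
import numpy as np

def fold(P, phi, rng=None):
    """Apply f = P @ phi; if rng is given, add Poisson counting noise (an assumption)."""
    expected = P @ phi
    if rng is None:
        return expected, np.sqrt(expected)
    observed = rng.poisson(expected).astype(float)
    # Guard against zero errors in empty bins.
    sigma = np.sqrt(np.maximum(observed, 1.0))
    return observed, sigma

# Hypothetical transformation matrix: each column describes how one true bin
# is spread over the measured bins (column sums below 1 mimic limited acceptance).
P = np.array([
    [0.6, 0.1, 0.0],
    [0.2, 0.6, 0.1],
    [0.0, 0.2, 0.6],
    [0.0, 0.0, 0.2],
])
phi = np.array([1000.0, 500.0, 200.0])   # hypothetical true histogram

f_expected, _ = fold(P, phi)
f_measured, sigma = fold(P, phi, np.random.default_rng(0))
```

Here `f_expected` is the noise-free measured histogram and `f_measured` is one random realization with its statistical errors `sigma`.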
8
Basic equation (cont.)
9
Basic equation (cont.)
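A Least Squares Method can give an estimator for
the true distribution:
$\hat{\phi} = (P^T \Sigma^{-1} P)^{-1} P^T \Sigma^{-1} f$,
where $\Sigma$ is the diagonal variance matrix of the measured
distribution and $\hat{\phi}$, the estimator, is called the
unfolded distribution. The full matrix of errors
of the unfolded distribution is given by
$\mathrm{Var}[\hat{\phi}] = (P^T \Sigma^{-1} P)^{-1}$
according to the Least Squares Method.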
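A minimal sketch of this weighted least-squares estimator, assuming a diagonal variance matrix given by the vector of bin errors `sigma`. The function name `unfold` and the toy matrix are hypothetical and only serve to show the algebra.

```python
import numpy as np

def unfold(P, f, sigma):
    """Weighted least squares: return (phi_hat, full error matrix of phi_hat)."""
    w = 1.0 / sigma**2                    # diagonal of the inverse variance matrix
    A = P.T @ (w[:, None] * P)            # P^T Sigma^{-1} P
    cov = np.linalg.inv(A)                # full matrix of errors
    phi_hat = cov @ (P.T @ (w * f))       # (P^T Sigma^{-1} P)^{-1} P^T Sigma^{-1} f
    return phi_hat, cov

# Toy check with a hypothetical 4x3 matrix and a noise-free measurement.
P = np.array([
    [0.6, 0.1, 0.0],
    [0.2, 0.6, 0.1],
    [0.0, 0.2, 0.6],
    [0.0, 0.0, 0.2],
])
phi_true = np.array([1000.0, 500.0, 200.0])
f = P @ phi_true
sigma = np.sqrt(f)

phi_hat, cov = unfold(P, f, sigma)
errors = np.sqrt(np.diag(cov))            # per-bin statistical errors
```

With a noise-free `f` the estimator reproduces `phi_true`; the off-diagonal elements of `cov` carry the correlations between unfolded bins.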
10
There are two stages in solving the unfolding
(inverse) problem:
1. Investigation and calculation of the matrix P. This is
known as the system identification problem, which may be
defined as the process of determining a model of
a dynamic system using observed input-output data.
2. Solution of equation (1), which gives the
unfolded function $\hat{\phi}$ with the complete matrix of
statistical errors of $\hat{\phi}$.
11
System identification or calculation of matrix P
A Monte-Carlo simulation of the set-up can be
used to obtain input-output data (the training sample).
(Diagram: Monte-Carlo Simulations)
To regularize the solution of the unfolding
problem, let us use a training set of distributions
known a priori from theory or from other experiments.
12
System identification or calculation of matrix P
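Assume we have q input distributions in the
training sample, and present them as the matrix
$\Phi = \begin{pmatrix} \phi_{11} & \phi_{12} & \cdots & \phi_{1n} \\ \phi_{21} & \phi_{22} & \cdots & \phi_{2n} \\ \vdots & \vdots & & \vdots \\ \phi_{q1} & \phi_{q2} & \cdots & \phi_{qn} \end{pmatrix}$,
where each row represents one input training
histogram.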
13

System identification or calculation of matrix P
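For each i-th row of the matrix P we can write the equation
$f_i = \Phi p_i + \xi_i$,
where $p_i = (p_{i1}, p_{i2}, \ldots, p_{in})^T$ is the i-th row of P,
$\Phi$ is the matrix of training distributions,
$f_i = (f_{i1}, f_{i2}, \ldots, f_{iq})^T$ is the vector of i-th
components of the output distributions, where $f_{ij}$
is the i-th bin content of the output distribution for
the j-th training distribution, and $\xi_i$ is a vector
of random residuals with mean value $E[\xi_i] = 0$
and variance
$\mathrm{Var}[\xi_i] = \Sigma_i = \mathrm{diag}(\sigma_{i1}^2, \sigma_{i2}^2, \ldots, \sigma_{iq}^2)$,
where $\sigma_{ij}$ is the
statistical error of the i-th bin of the output
distribution for the j-th training distribution.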
14
System identification or calculation of matrix P
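The Least Squares Method gives an estimator for $p_i$:
$\hat{p}_i = (\Phi^T \Sigma_i^{-1} \Phi)^{-1} \Phi^T \Sigma_i^{-1} f_i$,
where $\Phi$ is the matrix of training distributions and
$\Sigma_i$ is the variance matrix of the i-th output bin over
the training sample. Columns of the matrix $\Phi$ can correlate
with each other. This means that the
transformation of the training distributions to
the i-th bin of the output distribution can be
parameterized by a subset of the elements of the row
$p_i$. There may be more than one subset that describes
this transformation sufficiently well.
Example: (No Transcript)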
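The row-by-row estimation can be sketched as follows. This is an illustrative toy, not the example from the slide: the training matrix `Phi`, the row `p_true`, and the helper `fit_row` are hypothetical, and elements of the row outside the chosen subset are simply fixed to zero.

```python
import numpy as np

def fit_row(Phi, f_i, sigma_i, columns):
    """Weighted least-squares estimate of one row of P using only `columns` of Phi.

    Returns the full-length row (zeros outside `columns`) and the
    weighted residual sum of squares of the fit.
    """
    w = 1.0 / sigma_i**2
    X = Phi[:, columns]
    A = X.T @ (w[:, None] * X)
    coef = np.linalg.solve(A, X.T @ (w * f_i))
    row = np.zeros(Phi.shape[1])
    row[columns] = coef
    chi2 = float(np.sum(w * (f_i - X @ coef) ** 2))
    return row, chi2

rng = np.random.default_rng(1)
q, n = 20, 4                                  # q training histograms, n true bins
Phi = rng.uniform(50.0, 150.0, size=(q, n))   # hypothetical training sample
p_true = np.array([0.7, 0.2, 0.0, 0.0])       # hypothetical row of P
f_i = Phi @ p_true                            # noise-free i-th output bin contents
sigma_i = np.ones(q)

row_sub, chi2_sub = fit_row(Phi, f_i, sigma_i, [0, 1])
row_full, chi2_full = fit_row(Phi, f_i, sigma_i, [0, 1, 2, 3])
```

Both the two-element subset and the full row describe this output bin equally well, which is the situation in which several candidate rows arise.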
15
System identification or calculation of matrix P
Thus for each i-th output bin we will have $N_i$
candidate rows, and for all output bins
$N_1 \times N_2 \times \cdots \times N_m$ candidate matrices P.
We need to choose a matrix P that is good, or optimal, in
some sense. The most convenient criterion in this case is
D-optimality, which minimizes the determinant of the full
matrix of errors of the unfolded distribution:
$\det (P^T \Sigma^{-1} P)^{-1} \to \min$,
where $\Sigma$ is the diagonal variance matrix of the measured
distribution.
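A hedged sketch of the D-optimality selection by exhaustive search over all combinations of candidate rows. Exhaustive search is an assumption made for clarity and is only feasible for very small problems; the candidate rows and errors below are invented for illustration.

```python
import itertools
import numpy as np

def log_det_error_matrix(P, sigma):
    """log det of (P^T Sigma^{-1} P)^{-1}; +inf if the matrix is singular."""
    w = 1.0 / sigma**2
    A = P.T @ (w[:, None] * P)
    sign, logdet = np.linalg.slogdet(A)
    return -logdet if sign > 0 else np.inf

def d_optimal(candidate_rows, sigma):
    """Pick one candidate row per output bin so that the determinant is minimal."""
    best_P, best_val = None, np.inf
    for rows in itertools.product(*candidate_rows):
        P = np.vstack(rows)
        val = log_det_error_matrix(P, sigma)
        if val < best_val:
            best_P, best_val = P, val
    return best_P, best_val

# Two output bins, two candidate rows each (hypothetical values).
candidate_rows = [
    [np.array([0.8, 0.0]), np.array([0.5, 0.3])],
    [np.array([0.0, 0.8]), np.array([0.3, 0.5])],
]
sigma = np.array([1.0, 1.0])

best_P, best_val = d_optimal(candidate_rows, sigma)
```

Working with the logarithm of the determinant avoids overflow or underflow when the number of bins grows; the ordering of candidates is unchanged.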
16
  • System identification or calculation of matrix P
  • Main advantages of D-optimization
  • Minimizes the volume of the confidence ellipsoid
    for an unfolded distribution.
  • There are many computer algorithms for
    optimization.

17
Basic equation solution
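A Least Squares Method gives an estimator for the
true distribution:
$\hat{\phi} = (P^T \Sigma^{-1} P)^{-1} P^T \Sigma^{-1} f$,
where $\Sigma$ is the diagonal variance matrix of the measured
distribution and $\hat{\phi}$, the estimator, is the unfolded
distribution. The full matrix of errors of the
unfolded distribution is given by
$\mathrm{Var}[\hat{\phi}] = (P^T \Sigma^{-1} P)^{-1}$
according to the Least Squares Method.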
18
The unfolding procedure
  • Initialization
      • Define a binning for the experimental data.
      • Define a binning for the unfolded distribution.
  • System identification
      • Choose a set of training distributions.
      • Calculate the D-optimal matrix P.
  • Basic equation solution
      • Calculate the unfolded distribution with the full matrix
        of errors.
  • Test of goodness of the unfolding
      • Fit the unfolded distribution and compare the
        experimental distribution with the reconstructed
        simulated distribution.
      • Use the holdout method, cross-validation method, or
        bootstrap method.

19
  • The unfolding procedure
  • Initialization
  • Define a binning for experimental data

20
  • The unfolding procedure
  • Initialization
  • Define a binning for the unfolded distribution

21
The unfolding procedure
Selection criteria for the set of training distributions
Each training distribution has a corresponding output
distribution that can be compared with the experimentally
measured distribution by a $\chi^2$ test. Let us select for
identification the training distributions whose corresponding
output distributions satisfy the selection criterion
$\chi^2 < a$ (the parameter $a$ defines a significance level
$p(a)$ for the comparison of two histograms).
Application of this selection criterion:
  • Increases the number of candidate matrices P
  • Decreases the value of the determinant of the full matrix
    of errors
  • Decreases the statistical errors of the unfolded
    distribution.
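The selection step can be sketched as below. The simple per-bin $\chi^2$ with independent errors, and the assumption that both histograms share the same normalization, are simplifications made here for illustration; the slides do not specify the exact form of the test. All names and numbers are hypothetical.

```python
import numpy as np

def chi2_histograms(f, sigma_f, g, sigma_g):
    """Chi-square distance between two histograms with independent bin errors."""
    return float(np.sum((f - g) ** 2 / (sigma_f**2 + sigma_g**2)))

def select_training(f_exp, sigma_exp, outputs, output_sigmas, a):
    """Indices of training distributions whose output satisfies chi2 < a."""
    return [
        j
        for j, (g, s) in enumerate(zip(outputs, output_sigmas))
        if chi2_histograms(f_exp, sigma_exp, g, s) < a
    ]

f_exp = np.array([100.0, 200.0, 150.0])      # hypothetical measured histogram
sigma_exp = np.sqrt(f_exp)
outputs = np.array([
    [105.0, 190.0, 155.0],                   # output close to the measurement
    [300.0, 50.0, 20.0],                     # output far from the measurement
])
output_sigmas = np.sqrt(outputs)

selected = select_training(f_exp, sigma_exp, outputs, output_sigmas, a=10.0)
```

Only the first training distribution passes the cut here; a looser cut `a` admits more training distributions.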
22
The unfolding procedure
Selection criteria for the set of training distributions
(Diagram: Experimental distribution, Monte-Carlo, Set of
training distributions, Set of output distributions)
23
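A numerical example
We take a true distribution $\phi(x)$ with parameters
$A_1$, $A_2$, $B_1$, $B_2$, $C_1$, $C_2$
(formula and parameter values: No Transcript).
An experimentally measured distribution $f(x)$ is
defined in terms of the acceptance $A(x)$ and the
detector resolution function $R(x, x')$ with $\sigma = 1.5$
(formulas: No Transcript).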
24
An example of the true distribution $\phi(x)$, the
acceptance function $A(x)$ and the resolution
function $R(x, 10)$
25
An example of the measured distribution $f$
26
(No Transcript)
27
A numerical example (cont.)
A histogram of the measured distribution was obtained by
simulating $10^4$ events. Random parameters are generated
uniformly on the intervals
  • [1, 3] for $A_1$
  • [0.5, 1.5] for $A_2$
  • [8, 12] for $B_1$
  • [10, 18] for $B_2$
  • [0.5, 1.5] for $C_1$
  • [0.5, 1.5] for $C_2$
which define a training distribution for identification.
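Drawing the random parameters can be sketched as follows. The intervals are those listed on the slide; the function name, the number of parameter sets, and the seed are assumptions. The functional form of the distribution itself is not reproduced here because it is not in the transcript.

```python
import numpy as np

# Intervals as listed on the slide.
PARAMETER_RANGES = {
    "A1": (1.0, 3.0),
    "A2": (0.5, 1.5),
    "B1": (8.0, 12.0),
    "B2": (10.0, 18.0),
    "C1": (0.5, 1.5),
    "C2": (0.5, 1.5),
}

def sample_parameters(q, rng):
    """Draw q parameter sets, each parameter uniform on its own interval."""
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAMETER_RANGES.items()}
        for _ in range(q)
    ]

training_parameters = sample_parameters(q=5, rng=np.random.default_rng(42))
```

Each dictionary in `training_parameters` defines one training distribution to be passed through the simulation.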
28
Training distributions generated for system
identification and an unfolded distribution for
different $\chi^2$ cuts


29
  • Conclusions
  • The proposed method uses a set of a priori
    distributions for identification to obtain a
    stable solution of the unfolding problem.
  • D-optimization and the application of the
    Least Squares Method make it possible to
    minimize the statistical errors of the
    solution.
  • The $\chi^2$ selection criterion makes it possible
    to decrease the possible bias of the procedure.
  • The procedure has no restriction due to the
    dimensionality of the problem.
  • The procedure can be applied to unfolding
    problems with smooth as well as non-smooth
    solutions.
  • Being based only on a statistical approach, the
    method has a good statistical interpretation.

30
  • References
  • V. Blobel, Unfolding methods in high-energy
    physics experiments, CERN 85-02 (1985).
  • V.P. Zhigunov, Improvement of resolution function
    as an inverse problem, Nucl. Instrum. Meth.
    216 (1983) 183.
  • A. Höcker, V. Kartvelishvili, SVD approach to data
    unfolding, Nucl. Instrum. Meth. A372 (1996) 469.
  • N.D. Gagunashvili, Unfolding of true distributions
    from experimental data distorted by detectors
    with finite resolutions, Nucl. Instrum. Meth.
    A451 (1993) 657.
  • N.D. Gagunashvili, Unfolding with system
    identification, Proceedings of PHYSTAT 05,
    Oxford, UK.