Title: OPTIMAL JET FINDER
1 OPTIMAL JET FINDER
- Ernest Jankowski
- University of Alberta
in collaboration with D. Grigoriev and F. Tkachov
2 Acknowledgements
- Alberta Ingenuity Fund
- J Gordin Kaplan Graduate Student Award
3 Plan of the talk
- introduction
- Optimal Jet Definition and its implementation
- comparison with cone and kT algorithms
- benchmark physics application test based on
- motivations behind OJD
4 Introduction
- final states of particle collisions in HEP experiments often consist of sprays of hadrons
- simple interpretation: in the collision, quarks with high kinetic energy are produced or released from the colliding particles
- but quarks interact very strongly at large distances and never appear as free particles (confinement)
- any attempt to separate free quarks results in the production of extra pairs of quarks and antiquarks
- which recombine into colorless states: hadrons
- if the collision energy is high enough, the hadrons appear in sprays, called jets
5 Hadronic jets: an example
6 Jet algorithms
- roughly, jets correspond to quarks and gluons produced in the hard scattering process
- a jet algorithm takes the final-state hadrons and assigns them to jets
- a jet's momentum is then computed from the momenta of the particles that belong to that jet (the so-called recombination scheme)
- the jet's momentum corresponds approximately to the momentum of quarks and gluons in the hard scattering process (partons in perturbative calculations)
- jet algorithms in use: the cone algorithm and successive recombination algorithms, such as Durham (kT)
7 Optimal Jet Definition
- I will present the so-called Optimal Jet Definition (OJD)
- proposed by Fyodor Tkachov
- a short introduction to the subject is
- Phys. Rev. Lett. 91, 061801 (2003)
- the FORTRAN 77 implementation of OJD,
- called Optimal Jet Finder (OJF), is described in
- hep-ph/0301226 (Comp. Phys. Commun., in print)
8 Recombination matrix $z_{aj}$
HEP event: list of particles (partons, hadrons, calorimeter cells, towers, preclusters)
recombination matrix: the 4-momentum $q_j$ of the j-th jet expressed by the 4-momenta $p_a$ of the particles
result: list of jets
9 Recombination matrix $z_{aj}$
- $z_{aj}$ describes the fraction of the a-th particle that belongs to the j-th jet (i.e. we split up particles between jets)
- conventional jet algorithms have $z_{aj}$ equal to 0 or 1, i.e. a particle either belongs entirely to some jet or does not belong to that jet at all
- fragmentation and hadronization are always the effect of the interaction of (at least) two hard partons evolving into two jets, so some hadrons that emerge in this process can belong partially to both jets
- but this is also very convenient algorithmically, because a jet configuration is described by a set of continuous numbers $z_{aj}$
10 Recombination matrix $z_{aj}$
the 4-momentum $q_j$ of the j-th jet expressed by the 4-momenta $p_a$ of the particles ($a = 1, 2, \ldots, n_{parts}$): $q_j = \sum_a z_{aj} p_a$
the fraction of the energy of the a-th particle that goes into the j-th jet cannot be negative: $z_{aj} \ge 0$
the fraction of the energy of the a-th particle that does not go into any jet: $\bar{z}_a = 1 - \sum_j z_{aj} \ge 0$
i.e. no more than 100% of each particle is assigned to jets
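A minimal sketch of how a recombination matrix turns a list of particles into jets, assuming the linear relation between jet and particle 4-momenta described on these slides. The function name `recombine` and the tuple layout (E, px, py, pz) are illustrative choices, not part of OJF.

```python
def recombine(particles, z):
    """Build jets from particles and a recombination matrix.

    particles: list of 4-momenta (E, px, py, pz); this layout is an assumption.
    z: z[a][j] is the fraction of particle a assigned to jet j.
    Returns (jets, soft_energy).
    """
    n_jets = len(z[0]) if z else 0
    for row in z:
        # every fraction is non-negative and at most 100% of a particle is used
        if any(x < 0.0 for x in row) or sum(row) > 1.0 + 1e-12:
            raise ValueError("invalid recombination matrix row: %r" % (row,))
    # jet 4-momentum: weighted sum of the particle 4-momenta
    jets = [
        tuple(sum(z[a][j] * p[k] for a, p in enumerate(particles)) for k in range(4))
        for j in range(n_jets)
    ]
    # energy of the particle fractions left outside all jets
    soft_energy = sum((1.0 - sum(row)) * p[0] for row, p in zip(z, particles))
    return jets, soft_energy


particles = [(1.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, -2.0)]
z = [[0.5, 0.25], [0.0, 1.0]]
jets, soft_energy = recombine(particles, z)
```

Here a quarter of the first particle is left outside both jets, so it contributes a quarter of its energy to `soft_energy`.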
11 Optimal Jet Definition
- any allowed value of the recombination matrix $z_{aj}$ describes some jet configuration
- the desired optimal jet configuration is the one that minimizes some function $\Omega(z_{aj})$
- details of $\Omega$ are different for CM lepton-lepton collisions (spherical kinematics) and collisions involving hadrons (cylindrical kinematics), where boost invariance along the beam axis should be maintained
12 Optimal Jet Definition: spherical kinematics
width of the j-th jet
energy outside jets
$\theta_{aj}$ is the angle between the a-th particle and the j-th jet
$E_a$ is the energy of the a-th particle
$R > 0$ is a parameter with a similar meaning to the cone radius
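The formula itself did not survive on this slide. The sketch below combines the two labelled terms under an assumed form, recalled from the published OJD description rather than taken from this transcript: the criterion is (2/R^2) times the sum of jet widths plus the energy outside jets, with the width of a jet being the sum over particles of z_aj E_a (1 - cos theta_aj). The normalization of the width term in particular is an assumption. For massless particles the width of a jet equals E_j - |q_j| (jet energy minus the magnitude of the jet 3-momentum), which is what the code evaluates; the function name is hypothetical.

```python
import math


def omega_spherical(particles, z, R):
    """Assumed criterion: (2 / R^2) * (sum of jet widths) + energy outside jets.

    particles: massless 4-momenta (E, px, py, pz); z[a][j] as on the slides.
    """
    n_jets = len(z[0]) if z else 0
    width = 0.0
    for j in range(n_jets):
        e = sum(row[j] * p[0] for row, p in zip(z, particles))
        qx = sum(row[j] * p[1] for row, p in zip(z, particles))
        qy = sum(row[j] * p[2] for row, p in zip(z, particles))
        qz = sum(row[j] * p[3] for row, p in zip(z, particles))
        # for massless particles this equals sum_a z_aj E_a (1 - cos theta_aj)
        width += e - math.sqrt(qx * qx + qy * qy + qz * qz)
    soft = sum((1.0 - sum(row)) * p[0] for row, p in zip(z, particles))
    return (2.0 / R ** 2) * width + soft


back_to_back = [(1.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, -1.0)]
two_jets = omega_spherical(back_to_back, [[1.0, 0.0], [0.0, 1.0]], 1.0)
no_jets = omega_spherical(back_to_back, [[0.0, 0.0], [0.0, 0.0]], 1.0)
one_wide_jet = omega_spherical(back_to_back, [[1.0], [1.0]], 1.0)
```

In this toy event, putting each particle in its own jet gives the smallest value, leaving everything outside jets costs the full event energy, and forcing both particles into one jet is penalized by the width term.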
22 Optimal Jet Finder: fixed number of jets
- the desired optimal jet configuration corresponds to the minimum of $\Omega(z_{aj})$
- the program finds a minimum iteratively using a simple gradient-based method
- facilitated by analytical formulas for the gradient
- start with some candidate minimum (the initial jet configuration) and descend into a local minimum in subsequent iterations
- the initial jet configuration may be completely random
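The descent described above can be illustrated with a small sketch. It is not the OJF minimization routine: it swaps in a plain gradient step with step halving and a clip-and-rescale heuristic to stay inside the allowed region. It also assumes massless particles and the criterion (2/R^2) * sum_j (E_j - |q_j|) + soft energy, whose form and normalization are assumptions. All function names are hypothetical.

```python
import math
import random


def omega(particles, z, R):
    """Assumed criterion: (2/R^2) * sum_j (E_j - |q_j|) + energy outside jets."""
    total = sum((1.0 - sum(row)) * p[0] for row, p in zip(z, particles))
    for j in range(len(z[0])):
        q = [sum(row[j] * p[k] for row, p in zip(z, particles)) for k in range(4)]
        total += (2.0 / R ** 2) * (q[0] - math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2))
    return total


def gradient(particles, z, R):
    """Analytical derivative: (2/R^2) * (E_a - p_a . n_j) - E_a, n_j = jet axis."""
    axes = []
    for j in range(len(z[0])):
        q = [sum(row[j] * p[k] for row, p in zip(z, particles)) for k in (1, 2, 3)]
        norm = math.sqrt(sum(c * c for c in q))
        axes.append([c / norm for c in q] if norm > 0.0 else [0.0, 0.0, 0.0])
    return [[(2.0 / R ** 2) * (p[0] - sum(p[k + 1] * n[k] for k in range(3))) - p[0]
             for n in axes] for p in particles]


def make_feasible(z):
    """Clip negative entries and rescale rows whose sum exceeds 1 (a heuristic)."""
    out = []
    for row in z:
        row = [max(0.0, x) for x in row]
        s = sum(row)
        out.append([x / s for x in row] if s > 1.0 else row)
    return out


def descend(particles, z, R, step=0.1, iterations=200):
    """Gradient steps; a step is accepted only if it lowers the criterion."""
    value = omega(particles, z, R)
    for _ in range(iterations):
        g = gradient(particles, z, R)
        trial = make_feasible([[x - step * gx for x, gx in zip(row, grow)]
                               for row, grow in zip(z, g)])
        trial_value = omega(particles, trial, R)
        if trial_value < value:
            z, value = trial, trial_value
        else:
            step *= 0.5  # uphill or flat: shrink the step and try again
    return z, value


rng = random.Random(1)
particles = [(1.0, 0.0, 0.0, 1.0),
             (1.0, 0.0, math.sin(0.1), math.cos(0.1)),
             (2.0, 0.0, 0.0, -2.0)]
z_start = make_feasible([[rng.random() for _ in range(2)] for _ in particles])
omega_start = omega(particles, z_start, 1.0)
z_final, omega_final = descend(particles, z_start, 1.0)
```

Because only downhill steps are accepted, the final value can never exceed the starting one, but the point reached is in general only a local minimum.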
23 Minimization: 2 jets
a point on the boundary of the simplex
a point inside the simplex
24-30 Minimization: iterations 0 to 6 (figures only)
31 Local minima of $\Omega$
- each time the program starts with a random configuration, it finds a local minimum, which need not be the global minimum of $\Omega(z_{aj})$
- this is similar to how the result of the cone algorithm depends on the initial position for the cone iterations
- but here, in contrast with the cone algorithm, we know which local minimum we should choose: the one that gives the smallest value of $\Omega$
32 Optimal Jet Finder: number of tries
- in order to find the global minimum, or increase the probability of finding it, the program tries several different random initial configurations
- OJF parameter: number of tries, ntries
- it was sufficient to take ntries ~ 10
- (even ntries ~ 3) in the cases we studied
- compromise between the quality of the jets found and the computing time used
33 OJF: number of jets to be determined
1. assume some small positive parameter $\omega_{cut}$, which is analogous to the jet resolution parameter $y_{cut}$ in binary recombination algorithms
2. start with (for example) njets = 1
3. find the best configuration with the number of jets equal to njets
   - as described previously (starting from ntries different initial configurations and choosing the best configuration)
4. check if $\Omega < \omega_{cut}$
5. if so, this is the final jet configuration
   - and the final number of jets is njets
6. if not, increase njets by 1 and go to point 3
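The loop over the number of jets, combined with the ntries restarts, can be sketched as follows. The callback `minimize_once(njets, rng)` is a hypothetical stand-in for one descent from a random initial configuration, returning the value of the minimized function and the configuration found. The stopping test "value below the cut" is an assumption about the condition missing from the slide, and `max_jets` is an added safety limit.

```python
import random


def find_jets(minimize_once, omega_cut, ntries=10, max_jets=20, seed=0):
    """Increase njets until the best of ntries minima falls below omega_cut."""
    rng = random.Random(seed)
    best = None
    for njets in range(1, max_jets + 1):
        # several random starts; keep the one with the smallest value
        results = [minimize_once(njets, rng) for _ in range(ntries)]
        best = min(results, key=lambda r: r[0])
        if best[0] < omega_cut:
            break
    return njets, best


def toy_minimizer(njets, rng):
    """Toy stand-in: the value falls with njets, plus start-dependent noise."""
    return 1.0 / njets + 0.01 * rng.random(), None


njets_found, best_found = find_jets(toy_minimizer, omega_cut=0.3)
```

With the toy minimizer the value is at least 1/njets, so the loop cannot stop before njets = 4, where every try lands between 0.25 and 0.26.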
34 Cone algorithm
- the cone algorithm defines a jet as all particles within a cone of a certain radius in $\eta$-$\varphi$ space
- $\eta_j$, $\varphi_j$ are the coordinates of the geometrical center of the cone; $\eta_a$, $\varphi_a$ are the coordinates of the particles
- $\eta_j$, $\varphi_j$ are found from the requirement that the $E_\perp$-weighted centroid computed from all particles in the cone coincides with the geometrical center of the cone
35 Cone algorithm
- $\eta_j$, $\varphi_j$ are found in an iterative procedure
1. start with some trial position $\eta_j$, $\varphi_j$ for a cone
2. for all particles within the cone compute the $E_\perp$-weighted centroid (usually different from $\eta_j$, $\varphi_j$)
3. use the centroid as the new position of the center of the cone
4. the content of the cone will change accordingly; update it and go to step 2
- the procedure ends when the cone center and the $E_\perp$-weighted centroid for all particles in the cone coincide: a stable cone
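The iteration towards a stable cone can be sketched as below. This is an illustrative version, not the procedure of any particular experiment: particles are assumed to be (E_T, eta, phi) tuples with positive transverse energy, and the function names, tolerance and iteration limit are hypothetical.

```python
import math


def delta_phi(a, b):
    """Azimuthal difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def stable_cone(particles, eta0, phi0, R, max_iter=100, tol=1e-9):
    """Iterate a cone of radius R from (eta0, phi0) until it stops moving.

    Returns (eta, phi, particles_inside), or None if the cone is empty
    or the iteration does not converge.
    """
    eta_c, phi_c = eta0, phi0
    for _ in range(max_iter):
        inside = [p for p in particles
                  if math.hypot(p[1] - eta_c, delta_phi(p[2], phi_c)) < R]
        if not inside:
            return None
        et = sum(p[0] for p in inside)
        # transverse-energy-weighted centroid of the cone content
        eta_new = sum(p[0] * p[1] for p in inside) / et
        phi_new = phi_c + sum(p[0] * delta_phi(p[2], phi_c) for p in inside) / et
        if math.hypot(eta_new - eta_c, delta_phi(phi_new, phi_c)) < tol:
            return eta_new, phi_new, inside
        eta_c, phi_c = eta_new, phi_new  # move the cone and update its content
    return None


towers = [(10.0, 0.0, 0.0), (10.0, 0.4, 0.0), (5.0, 3.0, 2.0)]
cone = stable_cone(towers, 0.1, 0.0, 0.7)
empty = stable_cone(towers, -3.0, -2.0, 0.7)
```

Starting near the two close towers, the cone settles midway between them; starting in an empty region gives no cone at all, which is why the choice of starting positions matters.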
36 Cone algorithm: seeds
- the iterative procedure described above needs some initial cone position
- look for stable cones starting everywhere
- (i.e. at every tower or cell): very expensive computationally
- start only at energetic towers, so-called seeds
37 Problems with seeds
- problems arise when we compare theory and experimental data
- when we apply the cone algorithm involving seeds to partons beyond LO, soft radiation or collinear splitting of partons may change the jet configuration significantly
- whereas the experimental jet configuration remains unchanged
38 Soft radiation at NLO
with a soft gluon as a seed, the event is reconstructed into 1 jet
R < (angle between two partons) < 2R: the event is reconstructed into 2 jets
39 Collinear splitting at NNLO
the red parton splits into 2 collinear pieces; now the blue parton is the most energetic and the event is reconstructed into 2 jets
R < (angle between two partons) < 2R: the event is reconstructed into 1 jet using the most energetic parton (red) as the seed
40 Collinear splitting in the experiment
the particle in the center hits a single calorimeter cell; the cell has sufficient energy to be a seed
the particle in the center hits two separate cells; neither of the cells has enough energy to be a seed
41 Cone algorithm: overlapping cones
- after all stable cones are found, deal with overlapping cones
- if the fraction f of overlapping energy (with respect to the smaller-energy jet) exceeds some threshold (e.g. f > 50%), merge the two jets
- otherwise split the two jets, assigning particles to the closest jet
- if there are more than two jets overlapping, the final result may depend on the ordering of this procedure (so some standard order has to be specified)
- it is difficult to take account of the merging/splitting procedure in theoretical calculations
42 Cone algorithm: overlapping cones
experiment
theory
43 Solution to seed problems
- the cone algorithm involving seeds becomes unstable in higher-order perturbative calculations
- seedless algorithm: look for jets everywhere
- Improved Legacy Cone Algorithm (ILCA)
- midpoints $p_{ab} = p_a + p_b$, $p_{abc} = p_a + p_b + p_c$, ... are used as seeds
- still a problem with overlapping cones
- successive recombination algorithms (binary algorithms), such as JADE, Durham (kT)
44 is an infrared and collinear safe
45 Successive recombination algorithms
- for each pair of particles compute the distance $d_{ab}$ between the particles in the pair, for example
- choose the pair with the smallest distance and
- if $\min(d_{ab}) < y_{cut}$ ($y_{cut}$ is the jet resolution parameter), combine the two particles into one particle using $p_{new} = p_a + p_b$ and go back to the beginning (now having the number of particles decreased by one)
- if $\min(d_{ab}) \ge y_{cut}$, then stop (we have achieved the final jet configuration)
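The steps above can be sketched as a short loop. The example distance did not survive on the slide; the sketch uses the standard Durham measure, 2 min(E_a^2, E_b^2) (1 - cos theta_ab) / E_vis^2, as an assumption about what was shown. Function names and the handling of zero-momentum objects are illustrative choices.

```python
import math


def durham_distance(pa, pb, e_vis):
    """Durham (kT) measure between two 4-momenta (E, px, py, pz)."""
    na = math.sqrt(pa[1] ** 2 + pa[2] ** 2 + pa[3] ** 2)
    nb = math.sqrt(pb[1] ** 2 + pb[2] ** 2 + pb[3] ** 2)
    if na == 0.0 or nb == 0.0:
        return math.inf  # direction undefined; never merged (illustrative choice)
    cos_ab = (pa[1] * pb[1] + pa[2] * pb[2] + pa[3] * pb[3]) / (na * nb)
    return 2.0 * min(pa[0], pb[0]) ** 2 * (1.0 - cos_ab) / e_vis ** 2


def cluster(particles, ycut):
    """Merge the closest pair repeatedly until every distance is >= ycut."""
    jets = [tuple(p) for p in particles]
    e_vis = sum(p[0] for p in jets)
    while len(jets) > 1:
        d, a, b = min((durham_distance(jets[a], jets[b], e_vis), a, b)
                      for a in range(len(jets))
                      for b in range(a + 1, len(jets)))
        if d >= ycut:
            break
        # recombination: the new particle carries the summed 4-momentum
        merged = tuple(x + y for x, y in zip(jets[a], jets[b]))
        jets = [p for i, p in enumerate(jets) if i not in (a, b)] + [merged]
    return jets


event = [(1.0, 0.0, 0.0, 1.0),
         (1.0, 0.0, math.sin(0.1), math.cos(0.1)),
         (2.0, 0.0, 0.0, -2.0)]
jets_loose = cluster(event, 0.01)
jets_tight = cluster(event, 1e-6)
```

Each pass recomputes every pairwise distance, which is the origin of the steep growth of computing time with the number of input particles in a direct implementation like this one.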
46 OJF and kT
- kT merges only 2 particles at a time (binary recombination, 2→1)
- other recombination schemes also considered: 3→2, m→n
- OJF takes into account the global structure of the energy flow in the event, i.e.
- the jet configuration is found from the momenta of all particles in the event (corresponds to m→n)
- OJF finds more regular jets than kT (still, a jet is not a cone)
47 OJF and kT
- OJF is much faster than kT if a large number of particles has to be analyzed
- average time per event
- ~ $n_{parts}$ for OJF
- ~ $n_{parts}^3$ for kT
- this does not matter when we apply the algorithm to theoretical calculations, but in experiments we have to analyze a large number of calorimeter cells
- D0 ~ 45 000 cells, ATLAS ~ 200 000 cells
48 OJF and kT
- the cubic dependence determines the way kT has to be used in experiments
- it cannot be applied directly at the level of cells
- a preclustering step is needed to reduce the input data to ~200 preclusters (D0 and CDF practice)
- how does the preclustering affect measurements?
- it is difficult to account for the preclustering in theoretical calculations
- the preclustering is a procedure completely independent of the kT algorithm itself
49 Benchmark test: W-boson mass extraction
- benchmark test based on the W-boson mass extraction from the process
- modeled on the OPAL analysis (CERN-EP-2000-099)
- we compared OJF with JADE and with the Durham (kT) algorithm (the best algorithm used by the OPAL collaboration)
- we obtained the same accuracy as Durham (still, we did not explore all possibilities)
- we studied the speed of OJF and found that it is much faster than Durham when a large number of calorimeter cells need to be analyzed
50 Quality of jets
Statistical error of the W-boson mass in MeV (3), corresponding to 1000 experimental events, based on Fisher's information:
ALGORITHM      ERROR (MeV)
Durham (kT)    105
JADE           118
OJF            106
51 average time per event
(plot: time in seconds vs. number of particles / cells, for kT, OJF with ntries = 10, and OJF with ntries = 5)
52 average time per event
(plot: time in seconds vs. number of particles / cells, for kT and for OJF with ntries = 10 and ntries = 5)
53 Method of moments
- parameter estimation, a fundamental problem of mathematical statistics
- for an event P, theory gives the probability density $\pi_M(P)$ that depends on some parameter M, e.g. $M_W$ or $\alpha_s$
- given an experimental sample of events, we want to estimate the best value of the parameter M with the smallest possible statistical error
- method of maximal likelihood
- reinterpreted as method of moments
54 Method of moments
depends on M
computed from the experimental sample
55 Method of moments
informativeness of the observable f, based on the statistical error that f gives
Fisher's information
Rao-Cramer inequality
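The formulas on the last two slides were not preserved. The block below writes out the textbook relations that the labels appear to refer to, as a sketch under stated assumptions: the density is written $\pi_M(P)$, N is the number of events in the sample, and the notation ($\langle f \rangle_{\mathrm{th}}$, $\langle f \rangle_{\mathrm{exp}}$, $f_{\mathrm{opt}}$, $I(M)$) is chosen here for illustration. The parameter M is estimated by equating the theoretical mean of an observable f, which depends on M, to its mean over the experimental sample. The choice of f that saturates the Rao-Cramer bound is the logarithmic derivative of the density, which is how maximum likelihood reappears as a method of moments.

```latex
\begin{align}
\langle f \rangle_{\mathrm{th}}(M) &= \int dP \, \pi_M(P) \, f(P), &
\langle f \rangle_{\mathrm{exp}} &= \frac{1}{N} \sum_{i=1}^{N} f(P_i), \\
f_{\mathrm{opt}}(P) &= \frac{\partial \ln \pi_M(P)}{\partial M}, \\
\sigma_M^2 &\ge \frac{1}{N \, I(M)}, &
I(M) &= \int dP \, \pi_M(P) \left( \frac{\partial \ln \pi_M(P)}{\partial M} \right)^2 .
\end{align}
```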
56 Event representation: basic shape observables
event: particles
event as energy flow: collinearly invariant
basic shape observables
f: function of a direction only
57 Factorial estimate
- event ↔ values of all basic shape observables f(P)
- the optimal observable for the measurement of some parameter can be expressed as a combination of basic shape observables
- applying a jet algorithm, we reduce the available information about the event P to make the analysis computationally manageable
- we use values of all basic shape observables f(Q), taken on the jet configuration
58 Hadronic event approximated by jets
- good approximations for f(P) could exist among functions that depend only on Q, which is a parameterization of P in terms of a few jets, found from the condition
- modeled after
- (q is the partonic structure of the event) this expresses the fact that most of the information about the event is inherited from its gluon-and-quark structure
59 Factorial estimate
for any f: $C_{f,R}$ depends only on f; it does not depend on the event or the jet configuration
$\Omega$ depends only on the event P and the jet configuration Q
we choose the jet configuration so that the loss of information about the event is minimal generically for all f (we minimize $\Omega$); this is essentially the Optimal Jet Definition
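The inequality that these labels annotate is missing from the transcript. As recalled from the published OJD description, and therefore an assumption here, it bounds the information lost when the event P is replaced by the jet configuration Q. The bound is a product of a factor that depends only on the observable and a factor that depends only on the event and its jets:

```latex
\[
\bigl| f(P) - f(Q) \bigr| \;\le\; C_{f,R} \, \Omega_R[P, Q]
\]
```

Since the first factor cannot be influenced by the choice of jets, making the second factor as small as possible tightens the bound for every observable f at once.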
60 Summary
- I presented the Optimal Jet Finder
- based on the global energy flow in the event
- infrared and collinear safe: no seed-related problems
- no overlapping jets
- returns additional numerical characteristics of the jet configuration found (the so-called dynamical width and soft energy), which may be helpful in the construction of (quasi-)optimal observables in statistical problems
- much faster than kT for a large number of input cells
62 Optimal Jet Definition: cylindrical kinematics