Title: New Directions in Data Analysis
1New Directions in Data Analysis
- Pushpalatha Bhat
- Fermilab
DPF2000 Columbus, Ohio August 11, 2000
A reasonable man adapts himself to the world. An
unreasonable man tries to adapt the world to
himself. So, all progress depends on the
unreasonable one.
2Outline
- Intelligent Detectors
- Moving intelligence closer to action
- Multivariate Methods
- Neural Networks: The New Paradigm
- New Searches and Precision Measurements: Some Examples
- Measuring the Top Quark Mass
- Discovery Reach for the Higgs
- More Sophisticated Approaches
- Probabilistic Approach to Analysis: Exploring Models
- Summary
4Intelligent Detectors
- Data analysis starts when a high-energy collision/event occurs
- Transform electronic data into useful physics information in real time
- Move intelligence closer to action!
- Algorithm-specific hardware
- Neural Network chips, for example
- Configurable hardware
- FPGAs, DSPs
- Innovative data management on-line and smart algorithms in hardware
- Data in RAM disk and AI algorithms in FPGAs
- Expert systems for control and monitoring
- Trouble-shooting, diagnosis and fix
5Smart Triggers
- There are already success stories: the H1 Level-2 Trigger
- Trigger on rare ep collisions in an overwhelming beam-gas background
- NN hardware: the CNAPS 1064 chip
- 12 independent neural nets, each one trained for a specific physics process, running on a total of 960 digital processors
- Successful operation since 1996
6Multivariate Methods
Keep it simple. As simple as possible. Not any simpler. (Einstein)
7Multivariate Methods
- The measurements being multivariate, the optimal methods of analysis are necessarily multivariate
- Many Applications
- Particle Identification
- e-ID, τ-ID, b-ID, e/γ, q/g
- Signal/Background Event Classification
- New physics
- Signals of new physics are rare and small
- (Finding a jewel in a hay-stack)
- Parameter Estimation
- t mass, H mass, track parameters, for example
- Function Approximation
- Parametric methods
- Fisher discriminant, Kernel methods
- Non-parametric Methods
- Adaptive/AI methods
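As an illustration of the simplest method named in the list above, the Fisher discriminant projects each event onto the direction w = S_W^-1 (mu_s - mu_b), where mu_s and mu_b are the class means and S_W is the summed within-class scatter matrix. The sketch below is a minimal two-variable version in pure Python. The toy Gaussian samples and all function names are assumptions made for this example and do not come from the talk.

```python
import random

def mean2(points):
    """Mean of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, mu):
    """Elements (sxx, sxy, syy) of the 2x2 scatter matrix about the mean mu."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction(signal, background):
    """Fisher direction w = S_W^-1 (mu_s - mu_b) for two variables."""
    mu_s, mu_b = mean2(signal), mean2(background)
    s_sig, s_bkg = scatter2(signal, mu_s), scatter2(background, mu_b)
    sxx, sxy, syy = (s_sig[0] + s_bkg[0], s_sig[1] + s_bkg[1], s_sig[2] + s_bkg[2])
    det = sxx * syy - sxy * sxy
    dx, dy = mu_s[0] - mu_b[0], mu_s[1] - mu_b[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

def fisher_score(w, point):
    """Discriminant value of one event: its projection onto w."""
    return w[0] * point[0] + w[1] * point[1]

def toy_sample(n, center, rng):
    """Hypothetical toy events: unit-width Gaussians around the given center."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]
```

With these definitions, signal events score higher on average than background events, so a single cut on the score replaces separate cuts on each variable.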
8Optimal Event Selection
Conventional cuts
9Discriminant Approximation with Neural Networks
The output of a feed-forward neural network can approximate the Bayesian posterior probability $p(s|x,y)$.
10Calculating the Discriminant
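Consider the sum
$E(\theta) = \sum_i [n(x_i, y_i, \theta) - d_i]^2$
where $d_i = 1$ for signal and $0$ for background, and $\theta$ is the vector of parameters. Then, minimizing $E(\theta)$ gives
$n(x, y, \theta) \to p(s|x,y)$
in the limit of large data samples and provided that the function $n(x, y, \theta)$ is flexible enough.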
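The statement above can be checked numerically on a toy problem. The sketch below assumes a single variable x drawn from a unit-width Gaussian centered at +1 for signal and -1 for background, with equal numbers of each, so that the exact posterior is 1 / (1 + exp(-2x)). It fits n(x) = sigmoid(a*x + b) by minimizing the squared error against targets of 1 and 0. The toy model, the one-unit "network", the learning rate, and all names are assumptions chosen for illustration only.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_sample(n_per_class, rng):
    """Toy events (x, d): x ~ N(+1, 1) with d = 1 for signal, x ~ N(-1, 1) with d = 0 for background."""
    events = [(rng.gauss(1.0, 1.0), 1.0) for _ in range(n_per_class)]
    events += [(rng.gauss(-1.0, 1.0), 0.0) for _ in range(n_per_class)]
    return events

def fit_least_squares(events, steps=600, rate=2.0):
    """Minimize the mean of [n(x) - d]^2 for n(x) = sigmoid(a*x + b) by gradient descent."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, d in events:
            n = sigmoid(a * x + b)
            common = 2.0 * (n - d) * n * (1.0 - n)  # derivative of (n - d)^2 w.r.t. a*x + b
            grad_a += common * x
            grad_b += common
        a -= rate * grad_a / len(events)
        b -= rate * grad_b / len(events)
    return a, b

def true_posterior(x):
    """Exact p(s|x) for this toy model with equal priors."""
    return sigmoid(2.0 * x)
```

After fitting on a sample from `make_sample`, the fitted `sigmoid(a * x + b)` should track `true_posterior(x)` closely, with the agreement improving as the sample grows.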
11Neural Networks The New Paradigm
- Neural Networks (NN) are mathematical, adaptive systems (algorithms).
- The hidden transformation functions, g, adapt themselves to the data as part of the training process. The number of such functions needs to grow only as the complexity of the problem grows.
- An NN estimates a mapping function without requiring a mathematical description of how the output formally depends on the input.
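To make the role of the hidden functions g concrete, here is a minimal sketch of the forward pass of a feed-forward network with one hidden layer, n(x) = sigmoid(sum_j v_j * g(w_j . x + b_j) + c). The choice of tanh for g, the sigmoid output, and the parameter names are assumptions for illustration; the talk does not specify them.

```python
import math

def feed_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Output of a one-hidden-layer network with tanh hidden units and a sigmoid output."""
    hidden = []
    for w_j, b_j in zip(hidden_weights, hidden_biases):
        z = sum(w * xi for w, xi in zip(w_j, x)) + b_j
        hidden.append(math.tanh(z))  # hidden transformation function g
    z_out = sum(v * h for v, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-z_out))  # squashed into (0, 1)
```

Training adjusts the weights and biases. Adding hidden units (more rows in `hidden_weights`) is how the network gains flexibility for more complex problems.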
12Measuring the Top Quark Mass
Figure: discriminant variables (shaded: top) and the resulting discriminants
13NN Discriminant (DNN vs mfit)
Background
Signal (170 GeV/c2)
14Measuring the Top Quark Mass: DØ Lepton+jets
Background-rich
Signal-rich
mt = 173.3 ± 5.6 (stat.) ± 6.2 (syst.) GeV/c2
15Strategy for Discovering the Higgs Boson at the
Tevatron
P.C. Bhat, R. Gilmartin, H. Prosper, PRD 62 (2000), hep-ph/0001152
16Hints from the Analysis of Precision Data
MH GeV/c2
MH < 225 GeV/c2 at 95% C.L.
LEP Electroweak Group, http://www.cern.ch/LEPEWWG/plots/summer99
17Event Simulation
- Signal Processes
- Backgrounds
- Event generation
- WH, ZH, ZZ and Top with PYTHIA
- Wbb, Zbb with CompHEP, fragmentation with PYTHIA
- Detector modeling
- SHW (http://www.physics.rutgers.edu/jconway/soft/shw/shw.html)
- Trigger, Tracking, Jet-finding
- b-tagging (double b-tag efficiency 45%)
- Di-jet mass resolution 14%
(Scaled down to 10% for Run II Higgs Studies)
18WH Results from NN Analysis
MH = 100 GeV/c2
19WH (110 GeV/c2) NN Distributions
20WH Results: Is it worth it?
21Combined Results (WH+ZH)
22Results, Standard vs. NN
NN analyses require about half the luminosity of conventional analyses for the same discovery reach. There is a good chance of discovery up to MH = 130 GeV/c2 with 20-30 fb-1.
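The luminosity comparison can be related to selection performance with a rough scaling argument. Assuming the signal and background yields both grow linearly with integrated luminosity L, and using the simple estimate S/sqrt(B) for significance, the luminosity needed to reach a significance Z is L = Z^2 * b / s^2, where s and b are the yields per fb-1. Halving the required luminosity then corresponds to a factor sqrt(2) gain in S/sqrt(B). The function below is a sketch under these assumptions; its name and the yields used in the usage note are hypothetical and are not results from the talk.

```python
def required_luminosity(signal_per_fb, background_per_fb, target_significance=5.0):
    """Luminosity (fb^-1) at which S/sqrt(B) reaches the target, with S = s*L and B = b*L."""
    return target_significance ** 2 * background_per_fb / signal_per_fb ** 2
```

For example, with hypothetical yields of 4 signal and 16 background events per fb-1, `required_luminosity(4.0, 16.0)` gives 25.0 fb-1. A selection that keeps the same signal with half the background, `required_luminosity(4.0, 8.0)`, needs 12.5 fb-1.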
23Improving the Higgs Mass Resolution
- Use mjj and HT (= Σ ET of the jets) to train a neural network to predict the Higgs boson mass
Network-improved Higgs Mass
13.8
12.2
13.1
11.3
13
11
24Newer Approaches: Ensembles of Networks
25Committees of Networks
Input X is fed to networks NN1, NN2, NN3, ..., NNM, which produce outputs y1, y2, y3, ..., yM.
The decision of a committee has a lower error than those of its individual members. The performance of a committee can be better than the performance of the best single network used in isolation.
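A small simulation illustrates why averaging helps. The sketch below assumes the simplest committee, an unweighted average, and assumes each member's output equals the true value plus independent Gaussian noise. Under that assumption the committee's mean squared error is roughly 1/M of the members' average error. In practice the members' errors are correlated, so the gain is smaller, although the squared error of the average can never exceed the average squared error of the members. All names and numbers here are hypothetical.

```python
import random

def committee_output(member_outputs):
    """Simple committee: the unweighted average of the member outputs."""
    return sum(member_outputs) / len(member_outputs)

def compare_errors(n_members=5, n_events=2000, noise=0.2, seed=7):
    """Return (mean squared error averaged over members, committee mean squared error)."""
    rng = random.Random(seed)
    member_sq_err = 0.0
    committee_sq_err = 0.0
    for _ in range(n_events):
        target = rng.random()  # hypothetical true value for this event
        outputs = [target + rng.gauss(0.0, noise) for _ in range(n_members)]
        member_sq_err += sum((y - target) ** 2 for y in outputs) / n_members
        committee_sq_err += (committee_output(outputs) - target) ** 2
    return member_sq_err / n_events, committee_sq_err / n_events
```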
27Probabilistic Approach to Data Analysis
(The Wave of the future)
28Bayesian Analysis
Posterior ∝ Likelihood × Prior
M = model, A = uninteresting parameters, p = interesting parameters, d = data
"Bayesian Analysis of Multi-source Data", P.C. Bhat et al., Phys. Lett. B 407 (1997) 73
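Written out with the symbols defined on this slide, Bayes' theorem and the marginalization over the uninteresting parameters take the following form. The letters L, \pi and P for the likelihood, prior and posterior are notational assumptions made here, not necessarily those of the original slide.

```latex
\[
P(p, A \mid d, M) =
  \frac{L(d \mid p, A, M)\,\pi(p, A \mid M)}
       {\int L(d \mid p', A', M)\,\pi(p', A' \mid M)\,dp'\,dA'},
\qquad
P(p \mid d, M) = \int P(p, A \mid d, M)\,dA .
\]
```

The second expression is what delivers probabilistic information on the interesting parameters alone, with the uncertainty on A propagated automatically.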
29Higgs Mass Fits
S = 80 WH events; assume the background distribution is described by Wbb. Results:
S/B = 1/10: Mfit = 114 ± 11 GeV/c2
S/B = 1/5: Mfit = 114 ± 7 GeV/c2
30Solar Neutrino Problem
Solar Neutrino Data 1998
- Electron neutrinos from the Sun seem to be lost en route to the Earth. That loss is described by the neutrino survival probability, P(E).
- We have used solar neutrino data and standard solar model predictions to extract P(E) and its uncertainties.
31Bayesian Analysis
Modeling the Survival Probability
C. Bhat, P.C. Bhat, M. Paterno, H.B. Prosper,
Phys. Rev. Lett. 81, 5056 (1998)
32Neutrino Survival Probability
C. Bhat et al.
33Advantages of Bayesian Approach
- Provides probabilistic information on each parameter of a model (SUSY, for example) via marginalization over other parameters
- Bayesian method enables straightforward and meaningful model comparisons.
- Bayesian approach allows treatment of all uncertainties in a consistent manner.
- Mathematically linked to adaptive algorithms such as Neural Networks (NN)
- Hybrid methods involving NN for probability density estimation and Bayesian treatment can be very powerful
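Marginalization, the first advantage listed above, is easy to sketch numerically. The example below assumes a hypothetical counting experiment: n_obs events are observed, the expected count is s + b, the signal s has a flat prior, and the background b has a prior given as weights on a grid. The posterior for s is obtained by summing the likelihood times the background prior over b. The model, the grids, and all function names are assumptions for illustration, not the analysis described in the talk.

```python
import math

def poisson(n, mean):
    """Poisson probability of observing n events when 'mean' are expected."""
    return math.exp(-mean) * mean ** n / math.factorial(n)

def gaussian_weights(grid, center, sigma):
    """Normalized Gaussian prior weights evaluated on a grid."""
    raw = [math.exp(-0.5 * ((g - center) / sigma) ** 2) for g in grid]
    total = sum(raw)
    return [r / total for r in raw]

def marginal_posterior(n_obs, s_grid, b_grid, b_prior):
    """Posterior for s on a grid (flat prior on s), with the background b summed out."""
    unnormalized = []
    for s in s_grid:
        total = sum(poisson(n_obs, s + b) * w for b, w in zip(b_grid, b_prior))
        unnormalized.append(total)
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized]
```

For instance, with 12 observed events and a background prior centered at 5 with width 1, the posterior for s peaks near 7, and its width reflects both the statistical fluctuation and the background uncertainty.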
34Summary
- We are building very sophisticated equipment and will record unprecedented amounts of data in the coming decade
- Use of advanced, optimal analysis techniques will be crucial to achieve the physics goals
- Multivariate methods, particularly Neural Network techniques, have already made an impact on discoveries and precision measurements and will be the methods of choice in future analyses
- Hybrid methods combining intelligent algorithms and a probabilistic approach will be the wave of the future