Title: Support Vector Machines: Get more Higgs out of your data Daniel Whiteson UC Berkeley
1Support Vector Machines
Get more Higgs out of your data
Daniel Whiteson
UC Berkeley
2Multivariate algorithms
Square cuts may work well for simpler tasks, but
as the data are multivariate, the algorithms also
must be.
3Multivariate Algorithms
- HEP overlaps with Computer Science, Mathematics
and Statistics in this area
- How can we construct an algorithm that can be
taught by example and generalize effectively?
- We can use solutions from those fields
- Neural Networks
- Probability Density Estimators
- Support Vector Machines
4Neural Networks
- Constructed from a very simple object, they can
learn complex patterns.
- Decision function learned using freedom in hidden
layers.
- Used very effectively as signal discriminators,
particle identifiers and parameter estimators
- Fast evaluation makes them suited to triggers
5Probability Density Estimation
If we knew the distributions of the signal fs(x)
and the background fb(x),
then we could calculate a discriminant from them
and use it to discriminate.
Figure: example discriminant surface
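The formula on this slide did not survive. A common choice, assumed here rather than taken from the talk, is D(x) = fs(x) / (fs(x) + fb(x)). The sketch below uses two hypothetical one-dimensional Gaussians in place of the real signal and background distributions.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Normal density, standing in for a known analytical distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical 1-D distributions (illustrative only, not from the talk).
def f_s(x):
    return gaussian_pdf(x, mean=2.0, sigma=1.0)

def f_b(x):
    return gaussian_pdf(x, mean=0.0, sigma=1.0)

def discriminant(x):
    """Assumed form D(x) = f_s / (f_s + f_b); values lie between 0 and 1."""
    s, b = f_s(x), f_b(x)
    return s / (s + b)

for x in (-1.0, 1.0, 3.0):
    print(f"x = {x:+.1f}  D(x) = {discriminant(x):.3f}")
```

Values of D near 1 are signal-like and values near 0 are background-like; a cut on D then plays the role of the decision surface.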
6Probability Density Estimation
Of course we do not know the analytical
distributions.
- Given a set of points drawn from a
distribution, put down a kernel centered at each
point.
- With high statistics, this approximates a
smooth probability density.
Surface with many kernels
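As a minimal sketch of the kernel idea, the following estimates a one-dimensional density by averaging Gaussian kernels centered on the training points. The kernel width h, the sample size and the test distribution are all illustrative assumptions.

```python
import math
import random

def kde(x, points, h):
    """Average of Gaussian kernels of width h centered on each training point."""
    norm = 1.0 / (len(points) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points)

random.seed(0)
# Hypothetical training sample: 2000 points drawn from a unit Gaussian.
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]

true_density_at_0 = 1.0 / math.sqrt(2.0 * math.pi)
estimate_at_0 = kde(0.0, sample, h=0.3)
print(f"true {true_density_at_0:.3f}  estimated {estimate_at_0:.3f}")
```

Note that every call to kde loops over the whole training sample, which is the evaluation cost that motivates the SVM discussion later in the talk.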
7Probability Density Estimation
- Simple techniques have advanced to more
sophisticated approaches
- Adaptive PDE
- varies the width of the kernel for smoothness
- Generalized for regression analysis
- Measure the value of a continuous parameter
- GEM
- Measures the local covariance and adjusts the
individual kernels to give a more accurate
estimate.
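One way to vary the kernel width, sketched here under an assumption of my own, is to give each training point a width equal to the distance to its k-th nearest neighbour, so kernels are narrow in dense regions and wide in sparse ones. This is not necessarily the scheme used by the adaptive PDE or GEM methods named above, and it does not attempt the local covariance measurement.

```python
import math

def adaptive_widths(points, k):
    """Width for each point: distance to its k-th nearest neighbour.

    Assumes no more than k-1 exact duplicates of any point, otherwise a
    width of zero would result.
    """
    widths = []
    for i, p in enumerate(points):
        dists = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        widths.append(dists[k - 1])
    return widths

def adaptive_kde(x, points, widths):
    """Average of Gaussian kernels, each with its own width."""
    total = 0.0
    for p, h in zip(points, widths):
        total += math.exp(-0.5 * ((x - p) / h) ** 2) / (h * math.sqrt(2.0 * math.pi))
    return total / len(points)

# Hypothetical sample: a dense cluster near 0 and one isolated point at 5.
points = [0.0, 0.1, 0.2, 0.3, 5.0]
widths = adaptive_widths(points, k=2)
print(widths)
print(adaptive_kde(0.15, points, widths), adaptive_kde(5.0, points, widths))
```

The isolated point receives a much wider kernel than the clustered points, which smooths the estimate where statistics are low.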
8Support Vector Machines
- PDEs must evaluate a kernel at every training
point for every classification of a data point.
- Can we build a decision surface that only uses
the relevant bits of information, the points in
the training set that are near the signal-background
boundary?
For a linear, separable case, this is not too
difficult. We simply need to find the hyperplane
that maximizes the separation.
9Support Vector Machines
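- To find the hyperplane that gives the highest
separation (lowest energy), we maximize the
Lagrangian w.r.t. $\alpha_i$:
$$L = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, x_i \cdot x_j$$
where $(x_i, y_i)$ are the training data and $\alpha_i$ are
non-negative Lagrange multipliers.
The solution is
$$w = \sum_i \alpha_i y_i x_i$$
where $\alpha_i = 0$ for non-support vectors.
(images from applet at http://svm.research.bell-labs.com/)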
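The maximization can be sketched with a simple coordinate ascent on the standard dual, sum(a_i) - 0.5 * |sum(a_i y_i x_i)|^2 with a_i >= 0. To keep the example short it assumes a hyperplane through the origin (no bias term), which removes the usual equality constraint on the multipliers; the data set and the function name train_hard_margin are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def train_hard_margin(X, y, sweeps=50):
    """Coordinate ascent on the dual, with every multiplier kept >= 0.

    Assumes separable data and a hyperplane through the origin (no bias).
    """
    alpha = [0.0] * len(X)
    w = [0.0] * len(X[0])
    for _ in range(sweeps):
        for i, (xi, yi) in enumerate(zip(X, y)):
            gradient = 1.0 - yi * dot(w, xi)          # dL / d(alpha_i)
            new_alpha = max(0.0, alpha[i] + gradient / dot(xi, xi))
            step = new_alpha - alpha[i]
            # Keep w = sum_i alpha_i y_i x_i up to date.
            w = [wk + step * yi * xk for wk, xk in zip(w, xi)]
            alpha[i] = new_alpha
    return alpha, w

# Hypothetical separable training set.
X = [(2, 0), (3, 3), (0, -2), (-3, -2)]
y = [+1, +1, -1, -1]

alpha, w = train_hard_margin(X, y)
support = [X[i] for i, a in enumerate(alpha) if a > 1e-9]
print("alpha:", alpha)
print("w:", w)
print("support vectors:", support)
```

Only the points lying on the margin end up with non-zero multipliers; the others drop out of the sum that defines w, which is why classification needs only the support vectors.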
10Support Vector Machines
But not many problems of interest are linear.
Map the data to a higher dimensional space where
separation can be made by hyperplanes.
We want to work in our original space. Replace
the dot product with a kernel function.
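A small check of why the replacement works: for the quadratic kernel K(x, z) = (x . z)^2 in two dimensions, the kernel value equals the dot product of the points after an explicit map into three dimensions. The particular kernel, the map phi and the test points are illustrative choices, not taken from the slide.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit map from 2-D to 3-D: (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, evaluated entirely in the original 2-D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
print("dot product in mapped space:", dot(phi(x), phi(z)))
print("kernel in original space:   ", poly_kernel(x, z))
```

The two numbers agree up to rounding, so the training and decision formulas can use the kernel directly without ever constructing the higher dimensional vectors.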
For these data, we need
11Support Vector Machines
Problems that are not entirely separable are not
very difficult either.
- Allow an imperfect decision boundary, but add a
penalty.
- Training errors, points on the wrong side of the
boundary, are indicated by crosses.
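The slide does not spell out the penalty. The sketch below assumes the standard soft-margin form, 0.5 * |w|^2 + C * sum(slack_i) with slack_i = max(0, 1 - y_i (w . x_i + b)), and evaluates it for a hypothetical boundary and data set containing one point on the wrong side.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slacks) for the assumed soft-margin penalty."""
    slacks = [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]
    return 0.5 * dot(w, w) + C * sum(slacks), slacks

# Hypothetical data; the last point is labeled +1 but sits on the -1 side.
X = [(2, 0), (3, 3), (0, -2), (-3, -2), (-1, -1)]
y = [+1, +1, -1, -1, +1]

objective, slacks = soft_margin_objective((0.5, 0.5), 0.0, X, y, C=1.0)
# A slack above 1 means the point is on the wrong side of the boundary.
errors = [xi for xi, s in zip(X, slacks) if s > 1.0]
print("slacks:", slacks)
print("objective:", objective)
print("training errors:", errors)
```

The constant C sets the trade-off: a large C punishes training errors heavily, while a small C favors a wide margin.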
12Support Vector Machines
We are not limited to linear or
polynomial kernels.
A Gaussian kernel gives a highly flexible SVM.
- Gaussian kernel SVMs outperformed PDEs in
recognizing handwritten numbers from the USPS database.
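A minimal sketch of a Gaussian kernel and the resulting decision function f(x) = sum_i a_i y_i K(x_i, x) + b. The kernel width sigma, the support vectors and the multipliers are hypothetical values standing in for the output of a training step that is not shown.

```python
import math

def gaussian_kernel(x, z, sigma):
    """K(x, z) = exp(-|x - z|^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq / (2.0 * sigma ** 2))

def decision(x, support_vectors, labels, alphas, b, sigma):
    """Sum over support vectors only; the sign of the result gives the class."""
    return b + sum(a * yi * gaussian_kernel(sv, x, sigma)
                   for sv, yi, a in zip(support_vectors, labels, alphas))

# Hypothetical support vectors and multipliers, as if already trained.
svs = [(0.0, 0.0), (2.0, 2.0)]
labels = [+1, -1]
alphas = [1.0, 1.0]

near_signal = decision((0.1, 0.0), svs, labels, alphas, b=0.0, sigma=1.0)
near_background = decision((2.0, 1.9), svs, labels, alphas, b=0.0, sigma=1.0)
print(near_signal, near_background)
```

Unlike a kernel density estimate, the sum runs only over the support vectors, and their positions are selected by the training rather than placed at every training point.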
13Comparative study for HEP
Signal: Wh to bb
Backgrounds: Wbb, tt, WZ
2-dimensional discriminant with variables Mjj and Ht
Figure: discriminator value distributions for Neural Net, PDE and SVM
14Comparative study for HEP
Figure: signal to noise enhancement vs. discriminator threshold (panel labels: Efficiency 43, Efficiency 50, Efficiency 49)
All of these methods provide powerful signal
enhancement
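How such a curve can be built from discriminator outputs is sketched below. The scores are invented for illustration, and the definition of enhancement used here (signal efficiency divided by background efficiency, that is, S/B after the cut relative to S/B before) is an assumption, since the slide does not define it.

```python
def efficiency(scores, threshold):
    """Fraction of events whose discriminator value exceeds the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)

# Hypothetical discriminator outputs (not from the study).
signal_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
background_scores = [0.7, 0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05]

for threshold in (0.25, 0.45, 0.65):
    eff_s = efficiency(signal_scores, threshold)
    eff_b = efficiency(background_scores, threshold)
    # Assumed definition: (S/B after the cut) / (S/B before the cut).
    enhancement = eff_s / eff_b
    print(f"threshold {threshold:.2f}: signal eff {eff_s:.2f}, "
          f"background eff {eff_b:.2f}, enhancement {enhancement:.2f}")
```

Raising the threshold trades signal efficiency for background rejection, so methods are best compared at a similar signal efficiency.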
15Algorithm Comparisons
Algorithm | Advantages | Disadvantages
Neural Nets | Very fast evaluation | Build structure by hand; black box; local optimization
PDE | Transparent operation | Slow evaluation; requires high statistics
SVM | Fast evaluation; kernel positions chosen automatically; global optimization | Complex; training can be time intensive; kernel selection by hand
16Conclusions
- Difficult problems in HEP overlap with those in
other fields. We can take advantage of our
colleagues' years of thought and effort.
- There are many areas of HEP analysis where
intelligent multivariate algorithms like NNs,
PDEs and SVMs can help us conduct more powerful
searches and make more precise measurements.