Title: The Seattle Single Decision Tree
1. The Seattle Single Decision Tree
- Toby Burnett, University of Washington
2. The single tree definition
- We originally copied the NN procedure of training on two subsets of the background, then using a 2-D binned likelihood in the two output variables (a sketch follows this list).
- We get to use all the tools developed for this; just change a few file names.
- Gordon is presenting the results using this analysis.
- But we asked: isn't this a bit of a kluge? Why not just apply the classification tree technology to all the background at once?
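For reference, here is a minimal sketch of that kind of 2-D binned likelihood, assuming two per-event classifier outputs in [0, 1] (one trained against each background subset). The names (`binned_likelihood_2d`, `sig_xy`, `bkg_xy`) are placeholders, not the actual analysis code.

```python
import numpy as np

def binned_likelihood_2d(sig_xy, bkg_xy, nbins=20):
    """Per-bin discriminant s/(s+b) built from 2-D histograms of the two
    classifier outputs for signal MC (sig_xy) and background MC (bkg_xy)."""
    edges = np.linspace(0.0, 1.0, nbins + 1)
    s, _, _ = np.histogram2d(sig_xy[:, 0], sig_xy[:, 1], bins=[edges, edges])
    b, _, _ = np.histogram2d(bkg_xy[:, 0], bkg_xy[:, 1], bins=[edges, edges])
    # Empty bins get a neutral value of 0.5 instead of dividing by zero.
    disc = np.where(s + b > 0, s / np.where(s + b > 0, s + b, 1.0), 0.5)
    return disc, edges

def apply_discriminant(disc, edges, xy):
    """Look up the binned-likelihood value for each event's output pair."""
    ix = np.clip(np.searchsorted(edges, xy[:, 0], side="right") - 1, 0, len(edges) - 2)
    iy = np.clip(np.searchsorted(edges, xy[:, 1], side="right") - 1, 0, len(edges) - 2)
    return disc[ix, iy]
```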
3. Classification tree training
- Use the Gini criterion.
- Quit when a node has fewer than 100 entries.
- One pass: no boosting or averaging over subsets.
- Train on half the data, test with the rest (a minimal sketch follows this list).
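A minimal sketch of this training recipe, using scikit-learn as a stand-in for the actual analysis code; the synthetic X and y below are placeholders for the event variables and signal/background labels from the MC ntuples.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder for the MC ntuples: rows are events, columns are the input
# variables; y is 1 for signal events, 0 for background events.
rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=20000) > 0).astype(int)

# Train on half the events, test with the rest.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# One pass: a single tree, no boosting, no averaging over subsets.
tree = DecisionTreeClassifier(
    criterion="gini",       # Gini splitting criterion
    min_samples_split=100,  # do not split a node with fewer than 100 entries
    random_state=0,
)
tree.fit(X_train, y_train)

# Leaf purity (signal fraction in the terminal node) for each test event.
purity = tree.predict_proba(X_test)[:, 1]
```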
4. The variables used (just copied from the NN analysis)
[Table of the Gini improvement for each variable]
5. Derived statistical limits vs. cut on purity
- Easy application of top_statistics.
- Summed over the 1-tag and 2-tag channels.
- Make the cut at 0.7 efficiency for the following slides (see the sketch below).
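The limit itself comes from top_statistics and is not reproduced here; the fragment below, continuing the training sketch above, only shows a scan over purity cuts and how a working point near 0.7 efficiency could be picked, assuming "efficiency" means signal efficiency on the test sample.

```python
import numpy as np

# Tree output (leaf purity) for the signal events in the test sample,
# reusing `purity` and `y_test` from the training sketch above.
purity_sig = purity[y_test == 1]

# Scan cuts on the purity and record the signal efficiency at each cut.
cuts = np.linspace(0.0, 1.0, 101)
eff = np.array([(purity_sig >= c).mean() for c in cuts])

# Working point used in the following slides: the cut closest to 0.7 efficiency.
cut_at_07 = cuts[np.argmin(np.abs(eff - 0.7))]
print(f"purity cut for ~70% signal efficiency: {cut_at_07:.2f}")
```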
6. Now, Systematics!
- Very much harder; could not do it in time.
- But it is easy to just look at the derivatives which are used as input to the procedure. I think that what it does is:
  - Assume an a priori Gaussian distribution in the various systematic parameters (JES, TRF, TRIG, ...).
  - Create new MC files with a + or - one sigma change in each value.
  - Recalculate new estimates of the single top content using the varied background and signal.
  - Assume this is linear and propagate the errors to the estimate; fudge a little and take the larger of the two deviations from nominal.
- A question is: how linear are the estimators as functions of the systematic parameters? (A sketch of this propagation follows the list.)
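A minimal sketch of that procedure for a single systematic source such as JES; `estimate` stands for rerunning the single-top content estimate on a given set of MC files, and all three sample arguments (nominal, plus, minus) are assumed inputs, not real file handles.

```python
def systematic_uncertainty(estimate, nominal_mc, plus_mc, minus_mc):
    """Propagate one systematic source (e.g. JES) through the estimate.

    `estimate(sample)` reruns the single-top content estimate on a set of
    MC files; `plus_mc` / `minus_mc` are the samples regenerated with the
    systematic parameter shifted by +/- one sigma.
    """
    nominal = estimate(nominal_mc)
    d_plus = estimate(plus_mc) - nominal    # deviation for a +1 sigma shift
    d_minus = estimate(minus_mc) - nominal  # deviation for a -1 sigma shift

    # Linearity assumption: the estimate responds linearly to the parameter,
    # so a one-sigma shift in the parameter maps onto a one-sigma uncertainty
    # on the estimate.  The "fudge" is to take the larger of the two
    # deviations as the quoted uncertainty.  A strong +/- asymmetry (as seen
    # for wjj JES) signals that the linearity assumption is breaking down.
    return max(abs(d_plus), abs(d_minus))
```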
7. Linearity? Something weird with the wjj JES
The JES differences on all other background sources look symmetric.
8. OK, what is unique about JES and/or wjj?
A quick look at histograms of wjj (bad) and wbb (good).
9. (Tentative) conclusions and plans
- This seems to invalidate the JES systematic calculation, since it badly fails the linearity assumption used to compute its contribution to the limit.
- Rerun the training on the systematic files?
- Need to optimize the training (see the sketch after this list):
  - Try all variables: the classifier only selects the ones that help, and ranks them.
  - Allow smaller final nodes.
  - Try boosting, etc.
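A sketch of what those training variations could look like in the same scikit-learn stand-in used earlier, reusing the placeholder `X_train` / `y_train`; the specific settings are illustrative, not the planned configuration.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Allow smaller final nodes by relaxing the 100-entry stopping rule.
small_node_tree = DecisionTreeClassifier(
    criterion="gini", min_samples_split=20, random_state=0)
small_node_tree.fit(X_train, y_train)

# Feed in all candidate variables and let the tree rank them:
# feature_importances_ is the total (normalized) Gini improvement per variable.
ranking = np.argsort(small_node_tree.feature_importances_)[::-1]

# Boosting: combine many shallow trees instead of a single deep one.
# (scikit-learn >= 1.2 uses `estimator`; older versions used `base_estimator`.)
boosted = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=3),
    n_estimators=200,
    random_state=0,
)
boosted.fit(X_train, y_train)
```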