Title: Top Quark Mass and Width using the Template Method
1Top Quark Mass and Width using the Template Method
2Why study tops?
- So far, the top quark is the heaviest fundamental (point) particle observed in nature
- Decays before hadronizing
- Helps pin down the mass of the Higgs boson
- Can constrain new physics
- Because we can
3The Tevatron
The Tevatron (center-of-mass energy 1.96 TeV)
- 36 bunches of 10^11 protons
- 36 bunches of 10^10 antiprotons
- Bunches cross 2.5 million times per second
[Figure: the Tevatron ring with the CDF and DZero detectors; Berkeley is 2000 miles away]
4CDF
- SVX: silicon, important for good tracking, necessary for b-tagging
- Leptons: best-measured objects in events
- COT: drift chamber with coverage |η| < 1, σ(PT)/PT ≈ 0.15% × PT
- 1.4 Tesla superconducting solenoid outside tracking system
- EM cal: σE/E ≈ 14%/√E
- Hadronic calorimeter crucial for difficult jet measurements
- Had cal: σE/E ≈ 80%/√E
- Muons: scintillator + chamber, coverage out to |η| = 1.5
- Need entire detector to get a good measurement of momentum imbalance
[Figure: CDF detector with labels SVX, COT, EM cal, Had cal, Muon]
5Tevatron performance at CDF
Top mass analysis: 1.7 fb^-1 of data
6Top quark phenomenology
- Tops always decay via t → Wb
- Event topology then depends on W decays
- Hadronic (quarks)
- Leptonic (electron or muon + neutrino)
- This analysis uses the Lepton+Jets channel
- One W decays to hadrons, the other to leptons
- Signature: 4 quarks, 1 charged lepton + undetected neutrino
- OK, so just take the invariant mass and you're done. Right?
- Not so simple
7Why Mtop is difficult
- With 4 (and only 4!) jets, there are 12 different ways of assigning jets to partons at hard scattering
- Neutrino from W decay
- Non-negligible backgrounds
- Jets are difficult
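The count of 12 can be checked by brute force. The sketch below is only illustrative: the function name and the `b_tagged` argument are made up here, and it assumes that a tagged jet may only be assigned to one of the two b quarks.

```python
from itertools import permutations

def jet_parton_assignments(n_jets=4, b_tagged=()):
    """List distinct (b_had, b_lep, {q1, q2}) assignments of jet indices.

    b_tagged: indices of jets that must be assigned to a b quark
    (an illustrative constraint, assumed here).
    """
    found = set()
    for b_had, b_lep, q1, q2 in permutations(range(n_jets), 4):
        if any(t in (q1, q2) for t in b_tagged):
            continue  # a tagged jet may not be a W daughter
        # Swapping the two W daughters does not give a new assignment.
        found.add((b_had, b_lep, frozenset((q1, q2))))
    return list(found)

print(len(jet_parton_assignments()))                 # 12 with no tags
print(len(jet_parton_assignments(b_tagged=(0,))))    # 6 with one tag
print(len(jet_parton_assignments(b_tagged=(0, 1))))  # 2 with two tags
```

The factor of 2 relative to 4! = 24 comes from the two W daughter jets being interchangeable.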
8Jet Energy Scale
σc: unit of combined nominal CDF JES calibration uncertainty
9Top Quark Mass Some handles
- Instead of taking the invariant mass of the system, we will have to make a measurement by comparing data to Monte Carlo simulation
- Find the parent top mass distribution most consistent with our data
- We want to measure a variable that's correlated to the top mass
- System is over-constrained (helps choose from 12 possible jet-parton assignments)
10Event Selection
- Use b-tagging in SVX to reduce combinatorics and increase S/B
- Divide events into 2 exclusive subsamples with different S/B and different reconstructed mass shapes
Top event tag efficiency: 60%
False tag rate (per jet): 0.5%
11Reconstructed mass templates
Reconstructed mass is correlated to (but not the
same thing as) the true top quark mass
2-tag templates more sharply peaked than 1-tag
templates
12In-situ calibration using dijet mass templates
- Kinematic fitter works well, but distributions are highly correlated not only to top quark mass, but also to calibration of jets in the detector
- Introduce the dijet mass from the hadronically decaying W
- Use as in situ calibration of jet energy scale (JES)
- We use the dijet mass closest to the well-known W mass from any pair of untagged jets among leading 4 jets (studied other choices, this was best)
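A minimal sketch of the dijet mass choice described above. The function names, the (E, px, py, pz) jet format and the W mass value of 80.4 GeV/c^2 are assumptions for illustration, not the analysis code.

```python
import math
from itertools import combinations

M_W = 80.4  # GeV/c^2, approximate W mass (assumed value for this sketch)

def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz) in GeV."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def best_dijet_mass(jets, tagged):
    """Dijet mass closest to M_W among pairs of untagged jets.

    jets:   four-vectors of the leading 4 jets
    tagged: one bool per jet, True if the jet is b-tagged
    Returns None if fewer than two jets are untagged.
    """
    untagged = [j for j, t in zip(jets, tagged) if not t]
    masses = [invariant_mass(a, b) for a, b in combinations(untagged, 2)]
    return min(masses, key=lambda m: abs(m - M_W)) if masses else None

# Toy event: two back-to-back massless jets that reconstruct the W exactly.
jets = [(40.2, 40.2, 0, 0), (40.2, -40.2, 0, 0), (50, 0, 50, 0), (30, 0, 0, 30)]
tagged = [False, False, True, False]
print(best_dijet_mass(jets, tagged))
```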
13How do we use the templates?
- How do we get the probability to observe an event with mtreco and mjj?
- Previously, assume the two observables are uncorrelated, and parameterize as a function of mtop and shifts in JES
- Near-impossible to account for correlation between observables
- Parameterizations difficult, mathematically bad
New: use a Kernel Density Estimate (KDE)-based approach to form PDFs that are two-dimensional in observables
14A 1d KDE pictorial tutorial
[Figure: 1d KDE illustration, probability vs. mtreco]
15Adaptive KDE
- Want smoothing to depend on statistics within sample
- First pass: fixed kernel width
- Second pass: varying width kernels
- Narrower kernel in high-stats region → sharper peak
- Wider kernel in low-stats region → smoother tail
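The two passes can be sketched in one dimension as follows. This is an illustration under assumptions, not the analysis implementation: it uses Gaussian kernels, a Silverman-style rule of thumb for the first-pass width, and per-point widths that scale as 1/sqrt(pilot density). All function names are hypothetical.

```python
import numpy as np

def gaussian_kde_1d(x_eval, sample, h):
    """Gaussian KDE at x_eval; h is a scalar or one width per sample point."""
    h = np.broadcast_to(np.asarray(h, dtype=float), sample.shape)
    z = (x_eval[:, None] - sample[None, :]) / h[None, :]
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * h[None, :])
    return kernels.mean(axis=1)

def adaptive_kde_1d(x_eval, sample):
    """Two-pass adaptive KDE; returns (density at x_eval, per-point widths)."""
    n = sample.size
    # First pass: fixed width from a rule of thumb (assumed choice).
    h0 = 1.06 * sample.std() * n ** (-0.2)
    pilot = gaussian_kde_1d(sample, sample, h0)
    # Second pass: width per point, narrower where the pilot density is high.
    g = np.exp(np.mean(np.log(pilot)))  # geometric mean of the pilot density
    h_i = h0 * np.sqrt(g / pilot)
    return gaussian_kde_1d(x_eval, sample, h_i), h_i

rng = np.random.default_rng(0)
sample = rng.normal(172.0, 12.0, 500)   # toy "reconstructed mass" values
grid = np.linspace(-100.0, 450.0, 4001)
density, widths = adaptive_kde_1d(grid, sample)
print(widths.min(), widths.max())       # core widths are smaller than tail widths
```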
162d KDE
17Backgrounds
- Must deal with non-negligible backgrounds arising mostly from W+jets (real heavy flavor and mistags)
- Estimate W+jets, QCD normalizations from data
- Wbb/Wcc/Wc fractions from MC
- MC estimates for single-top and dibosons
18Backgrounds (1d)
19Backgrounds (2d)
More correlation between observables for
background events
20Likelihood Fit
- Maximize likelihood for expected number of 1-tag and 2-tag signal and background events in a grid of over 2000 Mtop-JES points
- Fit a 2d parabola (including correlation cross-term) to the minimized negative log(likelihood) values
Example pseudoexperiment
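A sketch of the 2d parabola fit, assuming the minimized -ln L values are available on a grid of (Mtop, JES) points. The function name and the toy inputs are hypothetical. The minimum and the covariance follow from the fitted quadratic coefficients, since for -ln L the covariance matrix is the inverse of the Hessian.

```python
import numpy as np

def fit_paraboloid(m, j, nll):
    """Least-squares fit of nll to a 2d quadratic with cross-term.

    m, j, nll: 1d arrays of Mtop, JES and -ln L at each grid point.
    Returns (best-fit [Mtop, JES], 2x2 covariance matrix).
    """
    m0, j0 = m.mean(), j.mean()          # center the grid for numerical stability
    dm, dj = m - m0, j - j0
    A = np.column_stack([np.ones_like(dm), dm, dj, dm**2, dm * dj, dj**2])
    c, *_ = np.linalg.lstsq(A, nll, rcond=None)
    hessian = np.array([[2.0 * c[3], c[4]], [c[4], 2.0 * c[5]]])
    shift = np.linalg.solve(hessian, -np.array([c[1], c[2]]))  # gradient = 0
    return shift + np.array([m0, j0]), np.linalg.inv(hessian)

# Toy check: an exactly quadratic -ln L with a known minimum and covariance.
mu = np.array([171.6, 0.3])
cov_true = np.array([[4.0, -0.5], [-0.5, 0.25]])
mg, jg = np.meshgrid(np.linspace(160, 185, 26), np.linspace(-3, 3, 25))
m, j = mg.ravel(), jg.ravel()
d = np.stack([m - mu[0], j - mu[1]])
nll = 0.5 * np.einsum("ik,ij,jk->k", d, np.linalg.inv(cov_true), d) + 10.0
best, cov = fit_paraboloid(m, j, nll)
print(best, np.sqrt(np.diag(cov)))
```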
21Tests of machinery
Machinery works and shows no significant bias
22Systematics
Even with in situ calibration, JES systematic
dominates
23Measurement!
Mtop = 171.6 ± 2.0 GeV/c^2
24Differential pulls
Scale all uncertainties by 4.7%
25Cross-checks
All errors uncorrected
would have 3 GeV JES systematic
261d projections
27Plans for top mass measurement
- Add more data (2 fb^-1 measurement) for publication
- Our group also has a template-based measurement in the dilepton channel using KDE
- Will be extended to two observables and 2d KDE for better statistical power
- We plan on combining the measurements in the same likelihood
- Will move away from fitting a 2d quadratic and instead smooth out the likelihood using more non-parametric statistical tricks (local polynomial smoothing)
28Switching gears
29Top quark decays
- In SM, tops decay with a lifetime of 4×10^-25 seconds
- Vtb ≈ 1, and Mtop >> MW + Mb
- Total width ≈ 1.5 GeV, ∝ (Mtop)^3 and calculated to 1%
- Only measurement so far is unpublished CDF result: cτ < 52.5 μm at 95% CL
- Lower limit on top quark width, Γt > 0.002 eV
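The quoted lifetime follows from the width through τ = ħ/Γ. A quick arithmetic check, where the function name is illustrative and the value of ħ is the standard one in GeV·s:

```python
HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV*s

def lifetime_from_width(width_gev):
    """Mean lifetime in seconds for a total decay width given in GeV."""
    return HBAR_GEV_S / width_gev

print(lifetime_from_width(1.5))  # roughly 4.4e-25 s
```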
30The idea
- Use the kinematic fitter with templates that vary not as a function of top quark mass, but as a function of top quark width
- A likelihood fit gives the measured top quark width
- Use the likelihood output to set a 95% CL limit on the top quark width
31In practice
- Selection is the same EXCEPT:
- No χ^2 cut (was found to reduce sensitivity)
- Allow extra jets in 1-tag events
- Use only 1 fb^-1 of data
32Trusting the Monte Carlo?
[Figure: curves for Γtop = 1.5 GeV, 30 GeV and 50 GeV, with MW + Mb marked]
General consensus from theorists: can trust MC to 30 GeV
33More on the Monte Carlo
Parton level mean, before event selection
Parton level RMS, before event selection
34Reconstructed mass templates
35Likelihood fit output for Γt
How might we use these to set limits?
36How to set upper limits
- Let's say we want to set a 95% upper limit on the top quark width. We have likelihood output from MC samples with varying Γt = 1.5, 5.0, 10.0, ... (GeV)
[Figure: probability vs. ΓL-fit for MC samples with different Γtrue; the data value and 95% of the curve are marked]
37How to set lower limits
- Let's say we want to set a 95% lower limit on the top quark width. We have likelihood output from MC samples with varying Γt = 1.5, 5.0, 10.0, ... (GeV)
[Figure: probability vs. ΓL-fit for MC samples with different Γtrue; the data value and 95% of the curve are marked]
38Does setting limits always work?
- Sometimes with this technique, you don't set a limit!
- And what about two-sided limits?
[Figure: ΓL-fit distribution for a given Γtrue, with the data value marked]
39Feldman-Cousins machinery
- Frequentist approach to setting limits that guarantees that we can always make a measurement
- Tells us how to choose our confidence bands (aka an ordering principle)
- Choose bands based on likelihood ratios. For every MC point, define a likelihood ratio
R(x) = P(x | Γi) / P(x | Γbest)
x: output of likelihood fit
Γi: true width being examined
Γbest: width with max prob at x
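The likelihood-ratio ordering can be sketched on a discrete grid. Everything here is a toy under assumptions: `prob[i, k]` stands for the probability of fit output x_k given true width i, the Gaussian response with resolution 3 GeV is invented for the example, and the function name is hypothetical.

```python
import numpy as np

def fc_acceptance(prob, cl=0.95):
    """Acceptance regions from likelihood-ratio ordering.

    prob[i, k]: P(x_k | width_i), each row normalized to 1.
    Returns a boolean mask accept[i, k] with coverage >= cl in every row.
    """
    best = prob.max(axis=0)  # probability at the most likely width for each x
    ratio = np.divide(prob, best, out=np.zeros_like(prob), where=best > 0)
    accept = np.zeros(prob.shape, dtype=bool)
    for i in range(prob.shape[0]):
        order = np.argsort(-ratio[i], kind="stable")  # largest ratio first
        cum = np.cumsum(prob[i, order])
        n_keep = np.searchsorted(cum, cl) + 1         # smallest set reaching cl
        accept[i, order[:n_keep]] = True
    return accept

# Toy response: fit output x is Gaussian around the true width (assumed).
widths = np.arange(0.0, 30.5, 0.5)
x = np.arange(-10.0, 40.5, 0.5)
prob = np.exp(-0.5 * ((x[None, :] - widths[:, None]) / 3.0) ** 2)
prob /= prob.sum(axis=1, keepdims=True)
accept = fc_acceptance(prob)

# Confidence interval for an observed x: all widths whose region contains it.
k_obs = np.argmin(np.abs(x - 4.0))
allowed = widths[accept[:, k_obs]]
print(allowed.min(), allowed.max())
```

In this toy every value of x is accepted by at least one width, so an interval always exists, including for x below the physical boundary at zero width.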
40Use of likelihood ratio functions
- Use the likelihood ratio to select the 95% confidence region for a particular true width
- Order (select) by the largest likelihood ratio
41Likelihood ratio functions
- We parameterize the likelihood output so that we have a likelihood ratio and confidence band for arbitrary Γt
42Does it work?
43Systematics
- Move to Bayesian approach, as is typical
- Easy to incorporate into Feldman-Cousins
- Systematics change our confidence bands
- Unfortunately, systematics in limit-setting procedures are a bit more tricky than in simple measurements (such as the top quark mass)
- Systematics study of the unknown
44Reiterating Systematics
- Typically, we have a PDF of the systematic parameter (such as scale of jets, background fraction, etc.), which we assume is Gaussian
- Typically, we assume that linear shifts in the systematic parameter cause linear shifts in the parameter we are measuring
Jet Energy Scale
45But
- What if linear shifts in JES do not cause linear
shifts in the top quark width out of the
likelihood?
This is the function we use to smear out the likelihood output (it's Gaussian for normal systematics)
46Systematics summary
Jet resolution: smear jets to worsen resolution by an extra 5%
Systematics studied at different top widths when possible; systematic taken to be conservative
Non-Gaussian systematics
47Smeared likelihood output
[Figure: original PDF compared with the PDF with systematics included]
Make new parameterizations, new L ratios all over again after convolving with systematics
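The smearing step amounts to a convolution of the likelihood-output PDF with the systematic smearing function. A minimal sketch on a uniform grid, with hypothetical names and toy Gaussian inputs; the kernel can be any shape, which is the point when the systematic is non-Gaussian.

```python
import numpy as np

def smear_pdf(pdf, kernel, dx):
    """Convolve a sampled PDF with a smearing kernel and renormalize.

    pdf, kernel: values on uniform grids with the same spacing dx.
    kernel must have odd length (centered on zero shift) and be
    shorter than pdf.
    """
    smeared = np.convolve(pdf, kernel, mode="same") * dx
    return smeared / (smeared.sum() * dx)

# Toy check: Gaussian (sigma 2) smeared by Gaussian (sigma 1.5) has sigma 2.5.
dx = 0.1
x = np.linspace(-40.0, 40.0, 801)
pdf = np.exp(-0.5 * ((x - 5.0) / 2.0) ** 2)
pdf /= pdf.sum() * dx
shift = np.linspace(-10.0, 10.0, 201)
kernel = np.exp(-0.5 * (shift / 1.5) ** 2)
kernel /= kernel.sum() * dx
smeared = smear_pdf(pdf, kernel, dx)
mean = np.sum(x * smeared) * dx
sigma = np.sqrt(np.sum((x - mean) ** 2 * smeared) * dx)
print(mean, sigma)
```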
48Likelihood fit!
49Likelihood fit!
50Confidence bands with data
Γtop < 12.7 GeV at 95% CL
51What about assuming Mtop = 175?
- All our MC was generated with Mtop = 175 GeV/c^2, but world average is Mtop = 170.9 ± 1.8 GeV/c^2
- (How) does this affect our measurement?
[Figure: probability vs. mtreco for Mass = 168, 171 and 175]
To accommodate a different mass, likelihood fit
prefers a larger width - we conservatively do NOT
take a systematic
52Plans for top width measurement
- World's first direct limit on the top quark width
- Set an upper limit with 95% confidence only 1 order of magnitude from the SM prediction
- Still statistics limited
- Result to be published: PRL going through internal review process
Γtop < 12.7 GeV at 95% CL
53Conclusion
Mtop = 171.6 ± 2.1 (stat+JES) ± 1.1 (syst) GeV/c^2
Γtop < 12.7 GeV at 95% CL
54Backup
55The top quark
- The top quark
- Discovered only 13 years ago at the Tevatron
- Weak isospin partner of the bottom quark
- Charge +2/3 (-2/3 for anti-top)
56Boundary cuts
- KDE doesn't know about hard/soft cutoffs in the observables
- Probability leaks into unpopulated regions
- Easiest fix is to explicitly set boundaries and force kernels to stay inside via renormalization inside the boundary
- Amounts to extra selection cut on mtreco, mjj
- Efficiency high for signal events passing χ^2 cut, somewhat worse for background events
57Fit for JES (just a cross-check)
58Did we get lucky?
p-value = 23.4%
59Likelihood fit output mean vs Γt
60On adaptive density estimates
- The adaptive density estimates have smoothing that varies from MC point to MC point
- Tails have larger smoothing, core of distribution has less smoothing
- Intuitively makes some sense, but can lead to trouble: if function changes faster than sqrt(x), can have bizarre nonlocal behavior
Evaluating estimate for f(x): MC point at x1 is farther away than MC point at x0, yet the point at x1 contributes weight but the point at x0 does not! Counterintuitive.
[Figure: x axis with points x0 and x1 marked]
61Clipped adaptive density estimates
- Try to remove some of the non-locality of the adaptive estimates by not allowing the smoothing to get too large
- Previously, h could be larger than the entire width of the template!
- Clip the pilot density estimates: fpilot(xi) → max[0.1 × fpilot(x0), fpilot(xi)]
- fpilot(x0) is the pilot density estimate with maximum value
- Equivalent to setting hmax = sqrt(10) × hmin
- Don't let h get too large!
- Recommended in one of the very early adaptive density estimate papers
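The equivalence between clipping the pilot density and capping h can be seen in a few lines. This sketch assumes the kernel width scales as 1/sqrt(pilot density) and takes hmin to be the width at the pilot maximum; the function name and toy values are illustrative.

```python
import numpy as np

def clipped_bandwidths(pilot, h_min, clip=0.1):
    """Per-point kernel widths with the pilot density clipped from below.

    pilot: pilot density estimate at each MC point
    h_min: width assigned to the point with the largest pilot density
    clip:  floor on the pilot density, as a fraction of its maximum
    """
    f_max = pilot.max()
    clipped = np.maximum(clip * f_max, pilot)
    return h_min * np.sqrt(f_max / clipped)  # assumed 1/sqrt(f) scaling

pilot = np.array([1.0, 0.5, 0.1, 0.01, 1e-5])
h = clipped_bandwidths(pilot, h_min=2.0)
print(h.max() / h.min())  # sqrt(10), however small the pilot density gets
```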
62Other weird systematics
63Are we statistics or systematics limited?
Limited by statistics
64An outline of the steps ahead
- Start out with signal and background Monte Carlo
- Run MC through kinematic fitter to get an estimator of top quark mass
- Same reconstructed top quark mass as before
- Parameterize discrete Monte Carlo to form density functions of reconstructed mass at any arbitrary top quark width
- Run Monte Carlo through an extended maximum likelihood fit just as before in mass analysis
- Measured top quark width: one number per event
- Parameterize likelihood output to get expected distribution of measured top quark width at any arbitrary top quark width
- Defines the limit-setting machinery
- Run data through kinematic fitter and likelihood fit
- Set limit!
65ISR/FSR
- In Run I, switch ISR on/off using PYTHIA: ΔMtop = 1.3 GeV
- In Run II, systematic approach
- ISR/FSR effects are governed by DGLAP evolution equation
- <Pt> of the DY (ll) as a function of Q^2
[Figure: qq → tt vs. μ+μ−; Pt(tt) at generator; axis log(M^2), with (2Mt)^2 marked]
66Ensuring continuity
- Sometimes (due to fluctuations in tails), can
have discontinuities in confidence bands. Be
conservative and ensure that things stay
continuous