SUSY 1lepton background multidimensional fits CSC Note 1


Slides: 36

Provided by: ajko


1
SUSY 1-lepton background multidimensional fits
CSC Note 1/2, ATLAS SUSY WG

A. Koutsman, W. Verkerke, NIKHEF
23 August 2007

2
One lepton mode SUSY
[Figure: effective mass (GeV) distribution, 1 fb^-1]

Dominant backgrounds:
Top pair
W+jets
QCD
Z+jets

GOAL: estimate and understand backgrounds from data
TARGET: develop methods to discover/exclude SUSY with 1 fb^-1

3
Multidimensional method

MT method:
extrapolate W+jets/ttbar background from control region (low MT) to signal region (high MT)
Main idea: improve the MT method
- Try to use additional observables for extrapolation (e.g. mtop)
- Explicitly account for SUSY contamination in control region

[Figure annotation: overestimated by factor 2.5]
Key issues to understand:
- Amount and type of correlations between observables
- Amount and shape of SUSY in control region
4
Fitting the background w/o correlations

In absence of correlations, we can construct relatively simple multi-dimensional models to describe background data
E.g. $P_{t\bar{t}}(M_T, E_T, m_{top}) = P_1(M_T) \cdot P_2(E_T) \cdot P_3(m_{top})$
New observable: reconstructed hadronic top mass mtop
Defined as invariant mass of the 3-jet system with highest sum pT
Next step: write a model that describes the combined background in the control region and use that to extrapolate to the signal region
$P_{total}(M_T, E_T, m_{top}) = N_{tt1l} \cdot P_{tt1l}(M_T, E_T, m_{top})
    + N_{tt2l} \cdot P_{tt2l}(M_T, E_T, m_{top})
    + N_{wnj} \cdot P_{wnj}(M_T, E_T, m_{top})
    + N_{susy} \cdot P_{susy}(M_T, E_T, m_{top})$
(the SUSY term is an Ansatz model)
Idea: hope for improved determination of SM backgrounds due to
- Additional observables used in procedure
- Generic SUSY component included in fit to account for non-zero SUSY contamination in control region
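
As a rough illustration of the mtop definition on this slide, the following Python sketch picks the three-jet combination with the highest summed pT and returns its invariant mass. The jet representation (pt, eta, phi), the massless-jet approximation, the function names and the `vector_sum` switch are assumptions of this sketch; the slide only says "highest sum pT" and does not state whether a vector or a scalar sum is meant.

```python
import math
from itertools import combinations


def four_vector(pt, eta, phi):
    """(E, px, py, pz) of a massless jet; masslessness is an assumption here."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)


def summed_pt(trio, vector_sum):
    """pT of the three-jet system (vector sum) or the scalar sum of jet pT."""
    if not vector_sum:
        return sum(pt for pt, _, _ in trio)
    px = sum(pt * math.cos(phi) for pt, _, phi in trio)
    py = sum(pt * math.sin(phi) for pt, _, phi in trio)
    return math.hypot(px, py)


def hadronic_top_mass(jets, vector_sum=True):
    """Invariant mass of the 3-jet combination with the highest summed pT.

    jets: list of (pt, eta, phi) tuples.
    """
    if len(jets) < 3:
        raise ValueError("need at least three jets")
    best = max(combinations(jets, 3), key=lambda trio: summed_pt(trio, vector_sum))
    # Add the four-vectors of the chosen jets component by component.
    e, px, py, pz = (sum(c) for c in zip(*(four_vector(*jet) for jet in best)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
```

With a scalar sum the choice reduces to the three leading jets, so the two readings can give quite different masses in events with more than three jets.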

5
First iteration of combined background fit

Start out with simplest exercise: shapes of components fixed
Determined from fits to individual background MC samples
Shapes chosen for various backgrounds:
TTbar Semileptonic (1+1+2+1+4+1 = 10 parameters)
- exponential in missing ET
- exponential + gauss in mtrans
- landau + gauss in mtop
TTbar Dileptonic (1+2+2 = 5 parameters)
- exponential in missing ET
- gauss in mtrans
- landau in mtop
W+jets (1+1+2+1+2 = 7 parameters)
- exponential in missing ET
- exponential + gauss in mtrans
- landau in mtop

[Figures: fitted shapes for TTbar Semileptonic, TTbar Dileptonic, W+jets]
6
First iteration of combined background fit w/o
SUSY

Now fit model for combined background with fixed shapes to mix of background samples and see if
- We have enough information in fit to constrain various fractions
- We find back the fractions of background that went into the fit (no bias etc.)
Fits on 1 fb^-1 of data

          Fit          Truth
Ndi       123 ± 17     141
Nsemi     567 ± 40     578
Nwjets    168 ± 35     140
OK!
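
One quick way to read the fit-versus-truth comparison on this slide is to turn each row into a pull, (fit - truth) / error. The short Python sketch below does this with the numbers quoted on the slide; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# (fitted yield, fit error, MC truth) as quoted on the slide.
results = {
    "Ndi": (123.0, 17.0, 141.0),
    "Nsemi": (567.0, 40.0, 578.0),
    "Nwjets": (168.0, 35.0, 140.0),
}


def pull(fit, error, truth):
    """Deviation of the fitted yield from truth in units of the fit error."""
    return (fit - truth) / error


pulls = {name: pull(*values) for name, values in results.items()}
for name, value in pulls.items():
    print(f"{name}: pull = {value:+.2f}")
```

All three pulls stay within roughly one standard deviation, which is consistent with the "OK!" on the slide.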
7
Next iteration of combined background fit

Include generic SUSY contribution in fit (flat in ET, gentle slope in MT, landau in mtop) and fit to data with SUSY SU3 contamination
Combined fit with SUSY on 1 fb^-1 of data

          Fit          Truth
Ndi       146 ± 25     141
Nsemi     557 ± 43     578
Nwjets    168 ± 43     140
Nsu3      216 ± 24     228
OK
8
Combined background fit cross checks

Cross check 1: fit model with floating SUSY component to data w/o SUSY

          Fit          Truth
Ndi       125 ± 18     141
Nsemi     565 ± 40     578
Nwjets    172 ± 35     140
Nsu3      -3 ± 4.7     0
OK

Cross check 2: fit model w/o SUSY component to data with SUSY contamination

          Fit          Truth
Ndi       331 ± 26     141
Nsemi     427 ± 37     578
Nwjets    329 ± 37     140
Nsu3      0 (fixed)    228
OK
9
How well does the generic SUSY shape work?

Are we sure the fit is not biased? Run fit 1000 times on toy MC samples drawn from combined background p.d.f. fitted to MC-data and look at pull distributions
In the fit we have taken a generic shape for the SUSY component. How well does it portray other SUSY points?
Fit to pull distributions of the SUSY fraction in the combined fit:

        mean                sigma
SU1     -0.027 ± 0.052      0.993 ± 0.037
SU2     -0.067 ± 0.057      1.040 ± 0.037
SU3     -0.0006 ± 0.051     0.958 ± 0.032

1000 toy MCs output
Fits are unbiased
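
The toy study on this slide can be mimicked in miniature. The Python sketch below is a simplified stand-in, not the fit used in the slides: it replaces the three-dimensional model by a one-dimensional mixture of a flat component (playing the role of the generic SUSY shape) and a truncated exponential background on [0, 1], fits the flat fraction by maximum likelihood in each toy, and summarises the pulls. The slope, the fixed number of events per toy (no Poisson fluctuation) and all function names are assumptions made for illustration.

```python
import math
import random
import statistics

SLOPE = 3.0  # hypothetical slope of the background exponential on [0, 1]
NORM = 1.0 - math.exp(-SLOPE)


def background_pdf(x):
    """Exponential pdf truncated and normalised on [0, 1]."""
    return SLOPE * math.exp(-SLOPE * x) / NORM


def generate_toy(rng, n_events, f_true):
    """Draw events from f_true * flat + (1 - f_true) * background."""
    events = []
    for _ in range(n_events):
        if rng.random() < f_true:
            events.append(rng.random())
        else:  # inverse-CDF sampling of the truncated exponential
            events.append(-math.log(1.0 - rng.random() * NORM) / SLOPE)
    return events


def fit_fraction(events):
    """Maximum-likelihood estimate of the flat fraction and its error."""
    b = [background_pdf(x) for x in events]

    def score(f):  # derivative of the log-likelihood, decreasing in f
        return sum((1.0 - bi) / (bi + f * (1.0 - bi)) for bi in b)

    lo, hi = 0.0, 1.0
    if score(lo) <= 0.0:      # maximum sits at the lower boundary
        hi = lo
    elif score(hi) >= 0.0:    # maximum sits at the upper boundary
        lo = hi
    for _ in range(30):       # bisection on the score function
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    f_hat = 0.5 * (lo + hi)
    information = sum(((1.0 - bi) / (bi + f_hat * (1.0 - bi))) ** 2 for bi in b)
    return f_hat, 1.0 / math.sqrt(information)


def pull_study(n_toys, n_events, f_true, seed=0):
    """Mean and width of the pull distribution over n_toys toy samples."""
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        f_hat, error = fit_fraction(generate_toy(rng, n_events, f_true))
        pulls.append((f_hat - f_true) / error)
    return statistics.mean(pulls), statistics.stdev(pulls)
```

A call such as `pull_study(1000, 300, 0.3)` returns the two numbers that the slide tabulates per SUSY point: a mean compatible with 0 and a width compatible with 1 indicate an unbiased fit with correctly estimated errors.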
10
Summary simple fit with generic SUSY component

Have enough information in MT,ET,mtop to
constrain individual background components
(tt1l,tt2l,Wjets)
Can account for unknown SUSY contribution in
control region with generic SUSY component in fit
In current simplified approach the generic SUSY
component in fit allows unbiased determination of
amount of SM background in presence of unknown
amount of SUSY in data
Have checked with multiple SUSY data points that
procedure essentially works for all SUSY points

11
How to deal with correlations?

2D histograms give a clue, but no quantitative account
In the previous iteration we dealt with simple factorizable PDFs
What if ET and MT have a small correlation?
How can we understand it?
How do we model it?
CONDITIONAL PDFs
Model the dependence of ET and MT as functions of each other

12
Are ET, MT correlated in signal/bkg?

Procedure
Slice sample in bins of MT and look at ET distribution
Make fit to distribution in each slice, see if fit parameter changes vs MT
Make sure the fit model can describe the data in every slice

[Figure: W+jets missing ET distributions in 12 slices of MT, each 10 wide, from 0 < MT < 10 to 110 < MT < 120]
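
The slicing procedure on this slide can be sketched in a few lines of Python. The sketch assumes an untruncated exponential in ET above a threshold `et_min`, for which the maximum-likelihood slope has a closed form; the slides use full fits per slice instead. The event format, the function name and the `min_events` cut are hypothetical.

```python
import math
from collections import defaultdict


def slope_per_slice(events, slice_width, et_min=0.0, min_events=20):
    """Estimate the exponential ET slope in each MT slice.

    events: iterable of (mt, et) pairs.
    Returns {lower edge of MT slice: (slope, error)} for a pdf ~ exp(slope * ET).
    Slices with fewer than min_events entries are skipped.
    """
    slices = defaultdict(list)
    for mt, et in events:
        slices[int(mt // slice_width) * slice_width].append(et)

    result = {}
    for low_edge, ets in sorted(slices.items()):
        n = len(ets)
        if n < min_events:
            continue
        mean_excess = sum(ets) / n - et_min
        slope = -1.0 / mean_excess          # closed-form ML estimate
        result[low_edge] = (slope, abs(slope) / math.sqrt(n))
    return result
```

If the slopes returned for different MT slices differ by more than their errors, ET and MT are correlated and a factorised model is not sufficient.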
13
Model the correlation

First: look at ET slope dependence vs MT slice
Conclusion: there is a correlation because the slope is not constant

Next step: try to model this dependence by replacing the constant ET slope with a slope expressed as a polynomial in MT
Fit 2D distribution with 2D conditional product PDF and see if ET slope dependence on MT is correctly described

For MT > 160 statistics is the bottleneck and fits are not trustworthy
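
A minimal Python sketch of such a conditional PDF is given below: an exponential in ET whose slope is a polynomial in MT, normalised over the ET range separately for every value of MT, and multiplied by a PDF in MT to form the 2D conditional product. The function names, the coefficient layout and the finite ET range are assumptions of this sketch.

```python
import math


def slope_of_mt(mt, coeffs):
    """Polynomial slope c(MT) = coeffs[0] + coeffs[1] * MT + ... (Horner scheme)."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * mt + a
    return result


def conditional_exp_pdf(et, mt, coeffs, et_lo, et_hi):
    """P(ET | MT): exponential in ET, normalised on [et_lo, et_hi] for this MT."""
    if et < et_lo or et > et_hi:
        return 0.0
    c = slope_of_mt(mt, coeffs)
    if abs(c) < 1e-12:                      # flat limit of the exponential
        return 1.0 / (et_hi - et_lo)
    norm = (math.exp(c * et_hi) - math.exp(c * et_lo)) / c
    return math.exp(c * et) / norm


def joint_pdf(et, mt, coeffs, et_lo, et_hi, mt_pdf):
    """Conditional product P(ET | MT) * P(MT); mt_pdf is any normalised pdf in MT."""
    return conditional_exp_pdf(et, mt, coeffs, et_lo, et_hi) * mt_pdf(mt)
```

The key point is the per-MT normalisation: because the conditional factor integrates to one over ET for every MT, the product stays normalised whenever `mt_pdf` is.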
14
Fit the 2D distribution

Now fit the 2D model with conditional dependence
W+jets:
- conditional exponential in ET
- exponential + gauss in MT
Check the results of the fit by comparing the conditional PDF with data → very good agreement

[Figure: 2D fit result compared with sliced data]
OK
15
Alternate correlation coordinates

What if we turn the observables around?
Slice sample in bins of ET and look at MT
distribution
W+jets: exponential + gauss in MT
4 parameters: slope (1) of the exponential; mean (2), sigma (3) and fraction (4) of the gaussian
Multiple parameters cause more difficulty; low statistics make fits unstable
Thus we set the mean of the gaussian constant and float all other parameters

[Figure: W+jets MT distributions in 16 slices of ET, each 10 wide, from 80 < ET < 90 to 230 < ET < 240; the MT peak portrays the W mass]
16
Is MT dependent on ET?

Make a plot of each parameter of MT as a function
of ET

Statistics insufficient for ET > 300 (less than 20 events)
Sigma of gaussian: constant (no dependence)
Fraction of gaussian: gentle slope
Exponential slope: very gentle slope

Now we can fit

17
Double conditional pdf (Wjets)

But we're really after:
Shape of MT depends on ET and shape of ET depends on MT
W+jets try-out:
Model the development of the shapes by polynomials
Fit the W+jets model in ET, MT to the data with two simple conditional dependences
Check if correlations come out of the fit correctly

[Figures: conditional exponential in ET; conditional exponential + gauss in MT]
N.B. binned fits shown here

OK
18
Triple conditional dependence

Now that the double conditional fit works for ET, MT, we also studied the correlations with the 3rd variable, mtop, using the same procedure
Idea: once we have studied all the correlations, we want to fit our model in all three variables (ET, MT, mtop). If necessary, every parameter of the plain model is given a dependence on the other variables. So we get a triple conditional pdf to describe each individual background sample.
Example
How well does the triple conditional correlation model fit our data?
Checks done for every background sample:
- Make sure that all correlation coefficients are significant
- Global correlation of each coefficient should be reasonable
Keep all the correlations that pass the checks and move on to the combined fit

OK
Wjets
19
Combined background fit

Summary:
for every background sample (W+jets, tt → lνlν, tt → lνqq) we now have a conditionally dependent multi-dimensional model that fits the data
Next step:
Construct a combined model that describes the combined background and a non-zero contamination from SUSY in the control region
$P_{data}(M_T, E_T, m_{top}) = N_{tt1l} \cdot P_{tt1l}(M_T, E_T, m_{top})
    + N_{tt2l} \cdot P_{tt2l}(M_T, E_T, m_{top})
    + N_{wjets} \cdot P_{wjets}(M_T, E_T, m_{top})
    + N_{susy} \cdot P_{susy}(M_T, E_T, m_{top})$
(the SUSY term is an Ansatz model)
How well does the new correlated model fit the data?
Is it better than the simplified model?
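
The combined model on this slide is a sum of component PDFs weighted by their yields, which is naturally fitted with an extended likelihood. The Python sketch below shows that structure only; the shape helpers, the component dictionary and all parameter values a caller would pass are hypothetical, and the shapes here are simple factorised ones rather than the conditional PDFs of the slides.

```python
import math


def exp_pdf(x, tau):
    """Exponential pdf on [0, inf) with mean tau."""
    return math.exp(-x / tau) / tau if x >= 0.0 else 0.0


def gauss_pdf(x, mean, sigma):
    """Gaussian pdf."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def make_factorised(p_mt, p_et, p_mtop):
    """Build P(MT, ET, mtop) = P1(MT) * P2(ET) * P3(mtop) from 1D pdfs."""
    return lambda mt, et, mtop: p_mt(mt) * p_et(et) * p_mtop(mtop)


def total_density(point, components):
    """Sum over components of N_i * P_i(point): the expected event density."""
    return sum(n * pdf(*point) for n, pdf in components.values())


def extended_nll(events, components):
    """Extended negative log-likelihood for yields N_i and fixed shapes P_i.

    components: {name: (yield, pdf)}; events: list of (mt, et, mtop) tuples.
    """
    n_expected = sum(n for n, _ in components.values())
    return n_expected - sum(math.log(total_density(ev, components)) for ev in events)
```

Minimising `extended_nll` with respect to the yields, with the shapes held fixed, corresponds to the "shapes of components fixed" exercise described in the slides; any general-purpose minimiser could be used for that step.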

20
First iteration of correlated combined bkg fit

Start out with simplest exercise: shapes of components fixed
Determined from fits to individual background MC samples
Shapes chosen for various backgrounds:
TTbar Semileptonic
- Correlated model: 15 parameters
- Simple model: 10 parameters
TTbar Dileptonic
- Correlated model: 7 parameters
- Simple model: 5 parameters
- low statistics make correlation studies difficult
W+jets
- Correlated model: 15 parameters

[Figures: TTbar Semileptonic, TTbar Dileptonic, W+jets, SU3]
21
Combined background fit w/o SUSY

First fit model for combined background w/o SUSY component with fixed shapes to mix of background samples
Do we find back the fractions of background that went into the fit?
How do the fractions compare to the fit with the simplified model?
Fit on 1 fb^-1 of background data

          Correlated Fit    Plain Fit     Truth
Ndi       220 ± 26          235 ± 25      229
Nsemi     1073 ± 62         1074 ± 63     1072
Nwjets    416 ± 61          401 ± 61      408
OK
N.B. following results shown for release-11, but comparable with release-12
22
Combined background fit with SUSY

Now we include SU3 contamination into our data and a generic SUSY contribution into our model (Ansatz model for SUSY: flat in ET, gentle slope in MT, landau in mtop)
Fit with shapes of components fixed on 1 fb^-1 of data

          Correlated Fit    Plain Fit     Truth
Ndi       180 ± 42          158 ± 39      229
Nsemi     1095 ± 65         1127 ± 67     1072
Nwjets    434 ± 66          382 ± 68      408
Nsu3      379 ± 36          420 ± 36      378
Correlated fit a bit better
23
Combined background fit cross checks

Cross check 1: fit model with floating SUSY component to data w/o SUSY

          Correlated Fit    Plain Fit     Truth
Ndi       239 ± 32          219 ± 31      229
Nsemi     1066 ± 62         1080 ± 64     1072
Nwjets    421 ± 61          396 ± 61      408
Nsu3      -17 ± 16          14 ± 18       0
OK

Cross check 2: fit model w/o SUSY component to data with SUSY contamination

          Correlated Fit    Plain Fit     Truth
Ndi       592 ± 38          623 ± 38      229
Nsemi     972 ± 61          979 ± 61      1072
Nwjets    524 ± 64          484 ± 64      408
Nsu3      0 (fixed)         0 (fixed)     378
OK
24
How well does the generic SUSY shape work?

Are we sure the correlated fit is not biased?
Again run fit 1000 times on toy MC samples drawn from combined background p.d.f. fitted to MC-data and look at pull distributions
How well does the combined correlated fit portray other SUSY points?
Fit to pull distributions of the SUSY fraction in the combined fit:

        mean                sigma
SU1     -0.096 ± 0.056      1.005 ± 0.036
SU2     -0.021 ± 0.055      1.031 ± 0.037
SU3     -0.023 ± 0.055      1.030 ± 0.036

1000 toy MCs output
Fits are unbiased
25
Summary on fits with correlations

Shown that we can determine the amount of individual background components and a generic SUSY contamination correctly using a simplified model w/o correlations in a multidimensional fit
Possible correlations of parameters in ET, MT and mtop have been studied for all background samples
Non-negligible correlations between variables have been introduced into our model using triple conditional pdfs
The correlated model has enough information in MT, ET, mtop to constrain individual background components (tt1l, tt2l, W+jets) and account for unknown SUSY contamination in control region
Effect of correlations is not huge → aim to introduce a subset of most important correlations in final model
Next:
Exclude signal region from fit and test the procedure of extrapolation from control region

26
Extrapolation control region → signal region

Have shown that we can distinguish amounts of W+jets, tt(1l) and tt(2l) background from data using full observables space
Now repeat exercise without signal region:
- Fit W + tt(1l) + tt(2l) + generic SUSY background to data in control region
- Extrapolate amount of W, tt(1l), tt(2l) to signal region
- Compare predicted amounts in signal region to MC truth
Shapes still fixed (releasing shape parameters is the last step of the whole exercise)

27
First iteration fit extrapolation

Define two sidebands in MT, ET
Example:
SB1: 0 < MT < 120, ET full range
SB2: 120 < MT < 300, 0 < ET < 300
Fit the combined model in both ranges
While extrapolating make sure fractions are correctly defined
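
The remark about fractions can be made concrete with a small Python sketch. It assumes, purely for illustration, that each component is a one-dimensional exponential in MT with its own slope; the yield fitted in one range is then scaled by the ratio of PDF integrals to predict the yield in another range. Because the shapes differ, the component fractions change from region to region, which is why they must be defined per region. Component names, yields and slopes are hypothetical.

```python
import math


def exp_integral(lo, hi, tau):
    """Integral of the pdf exp(-x / tau) / tau over [lo, hi]."""
    return math.exp(-lo / tau) - math.exp(-hi / tau)


def extrapolate(n_control, control, signal, tau):
    """Predict the signal-region yield from the yield fitted in the control region."""
    return n_control * exp_integral(*signal, tau) / exp_integral(*control, tau)


def fractions(yields):
    """Turn per-component yields into fractions of the total in one region."""
    total = sum(yields.values())
    return {name: value / total for name, value in yields.items()}


# Hypothetical control-region yields and MT slopes, for illustration only.
control, signal = (0.0, 120.0), (120.0, 300.0)
fitted = {"bkg_a": (800.0, 30.0), "bkg_b": (200.0, 80.0)}

in_control = {name: n for name, (n, _) in fitted.items()}
in_signal = {name: extrapolate(n, control, signal, tau) for name, (n, tau) in fitted.items()}

print(fractions(in_control))
print(fractions(in_signal))
```

In this toy setup the component with the longer MT tail makes up a larger share of the second region than of the first, even though the same fitted model describes both.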

28
Combined fit extrapolation

First try-out: do extrapolation only in MT
SB1: 0 < MT < 70
SB2: 70 < MT < 150
Do extrapolation in both observables
SB1: 0 < MT < 70, 100 < ET < 200
SB2: 70 < MT < 150, 200 < ET < 250

          Fit            Truth
Ndi       51 ± 12        49
Nsemi     4.6 ± 0.7      4
Nwjets    4.5 ± 1.2      3
Nsu3      114 ± 12       118
OK

          Fit            Truth
Ndi       2.6 ± 2        5
Nsemi     0.1 ± 0.04     0
Nwjets    1e-4 ± 0.03    1
Nsu3      67 ± 2         64
OK
29
Summary and outlook

We can correctly determine the amount of background in the control region using only a part of the observables space
The extrapolation to the signal region works accurately using fit results from the control region
→ Control region is enough to determine the amounts of W, tt(1l) and tt(2l) background
→ We do not need the signal region to determine the amount of SUSY contamination in the control region
LAST STEP: float as many shape parameters of the distribution as possible and show that the fitting procedure still works

30
Back-up slides
31
Method-1 (S.Asai, K.Oe) CSC Note1/2

Main idea: separate data in two sets
MT > 100: signal region
MT < 100: control region
Assumption 1: the shape of BG in control region is same as shape of BG in signal region → just need to scale with events
Assumption 2: SUSY is negligible in control region

[Figures: Top + W+n jets; Top + W+n jets + SUSY. Legend: Estimated signal region X scaling]
Works without SU3 in the game → Assumption 1 is fairly good
Actual BG can be estimated correctly
Problems with SU3 → Assumption 2 is no good
Estimated BG over-estimated by factor 2
(Kenta Oe, Shoji Asai)
32
Release 12 Samples

W + 0,1,2,3,4,5 partons
Wenu + nJets (n = 2..5): 5223-5226
Wmunu + nJets (n = 3..5): 8203-8205
Wtaunu + nJets (n = 2..5): 8208-8211
T1 (MC@NLO): 5200
separate at truth level between
- semi-leptonic (e, mu, tau)
- di-leptonic (ee, mumu, tautau, emu, etau, mutau)
SUSY
SU1: 5401
SU2: 5402
SU3: 5403
SU4: 6400
SU6: 5404
SU8: 5406

All samples normalized to 1 fb^-1
All following plots for ELECTRONS

33
Top mass correlation
[Figure: top sliced in ET]

Studying the correlations we came across one unexpected result concerning the reconstructed hadronic top mass: it seemed to have a dependence on ET
To see if this was an effect of the reconstruction software, we repeated the study with an extra requirement: the reconstructed hadronic top had to match a truth top (ΔR < 0.2)
The result of this study shows that the top mass is indeed dependent on ET, even if you match to truth
MORE STUDIES NEED TO BE DONE BEFORE A CONCLUSION CAN BE DRAWN!!!

[Figure: matched top sliced in ET]
34
Are ET, mtop correlated?

See if mtop shape depends on ET
mtop is described by a landau in our model → 2 parameters

[Figure: mtop sliced in ET]
- Sigma of landau in mtop has a very gentle slope
- Mean of landau in mtop is correlated
- Model correlation as a 1st-order polynomial
35
Are ET, mtop correlated?

Procedure as before:
Slice 2D sample in bins of mtop and look at ET distribution
Make fit to distribution in each slice, see if fit parameter changes vs mtop

[Figure: fitted slope per slice vs. mtop]
- Clear correlation between slope of exponential in ET and mtop, as the slope is not constant vs mtop
- Model correlation as a 2nd-order polynomial
