Title: MINOS Batch Processing, Monte Carlo Generation and Oscillation Appearance Analysis
- Alexandre Sousa
- Tufts University
- Tufts HEP Department Of Energy Review
- 10/26/2005
2 MINOS Batch Processing
- Work started in 2002 with J. Urheim (IU), H. Rubin (IIT) and AS (Tufts)
- Objective: develop an infrastructure for centralized, standardized MINOS production data processing
- Runs on data files as they become available, with minimal human intervention
- In parallel, handles other processing requests: Alignment, CalDet, Monte Carlo reconstruction, Mock Data Challenge reconstruction
- Uses the Fermilab fixed-target computer farm for data processing and Fermilab's ENSTORE tape archive for storage
- Since March 2005, AS responsibilities have been gradually transferred to the Harvard group; AS continues to contribute in a consultant role
- The Tufts HEP Group's main contributions included:
  - Batch Processing infrastructure design and original implementation
  - MINOS software installation, maintenance and testing at the farm
  - Batch farm MINOS database installation, configuration and administration
  - Assembly of all the reconstruction scripts used in data processing
  - Creation and maintenance of a Production software package making these scripts available to the collaboration at large
7 MINOS MC Gen. (MDC)
- Mock Data Challenge
  - Issued by the MINOS spokesmen in January 2004
  - Generate MC and Mock Data (MC with hidden truth information) samples
  - Test the overall MINOS software framework and data handling
  - Test the Near Detector and Far Detector reconstruction chains
  - Test readiness of analysis groups for real beam data in early 2005
  - In conjunction with its necessary batch processing reconstruction, was fundamental in identifying software bugs, reconstruction deficiencies and analysis shortcomings
- Challenge set
  - Truth information removed
  - FarDet: 3 yr at 2.4e13 POT/spill
  - NearDet: 25 x FarDet statistics in target region
- MC set
  - Truth information retained
  - FarDet: 10 x challenge set statistics (100k events)
  - Auxiliary flavor-changed νe and νμ files
  - NearDet: same statistics as challenge set
8 MINOS Monte Carlo Generation
- Originally motivated by the Mock Data Challenge (sample generation would take 2 months)
- Configured and used the Tufts Linux Research Cluster for Monte Carlo file production in April 2004; MDC sample generated in 9 days
- In January 2005, contributed to the successful inclusion of the College of William & Mary cluster in the Off-Site MC Generation effort
- Participated in the development and testing of standard Monte Carlo generation scripts to be used by new contributing institutions (Rutherford Appleton Lab cluster coming up to speed)
- Generated sets
  - 75% of the total Mock Data Challenge sample
  - 50% of the existing Near Detector Low Energy sample
  - 100% of the Near Detector pseudo Medium Energy sample
  - 100% of the Near Detector pseudo High Energy sample
9 νe Appearance Analysis
- One of the most relevant MINOS goals is to search for sub-dominant νμ → νe oscillations
- MINOS can potentially improve the limit on θ13 set by CHOOZ
- The Tufts HEP Group is a very active participant in the MINOS νe Appearance Analysis Group
  - Developed an alternate shower reconstruction chain, based on the clustering and fitting of 3D hits assembled from the strip information for each event
  - Built a common analysis framework, NueAna, now part of the MINOS software, in collaboration with the Harvard, UCL and Stanford MINOS groups
  - Applied a Multivariate Discriminant Analysis (MDA) method to νe CC classification of MINOS events and completed the Mock Data Challenge
10 Angular Clustering Algorithm
- Use a seedless nearest-neighbor method to cluster hits in a spherical-coordinate event representation
- Calculate the distance from each hit to all other hits, omitting the reciprocal distances
- Aggregate all hits within some radius of a given hit, discarding already-used distances as we go (caveat: hits can be assigned multiple times to overlapping aggregates)
- Histogram the centroids of all aggregates and compute the bounds of each high-density region via a recursive algorithm
- From all the aggregates whose centroid falls within these bounds, select the one with the most hits (the cluster)
- Tunable parameters: Radius, Noise
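The steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the MINOS implementation. It assumes each hit is reduced to an angular pair (theta, phi) and that a plain Euclidean distance in that plane is good enough. It also builds one aggregate per hit without the "discard already-used distances" bookkeeping, and it uses a recursive flood fill over histogram bins to stand in for the recursive bounds search. The function name `angular_clusters` and the `bins` parameter are hypothetical; `radius` and `noise` play the role of the two tunable parameters.

```python
import numpy as np

def angular_clusters(angles, radius=0.2, noise=1, bins=8):
    """Cluster hits given as (theta, phi) pairs; returns a list of hit-index arrays."""
    angles = np.asarray(angles, dtype=float)
    n = len(angles)

    # Distance of each hit to all others, computed once per pair (i < j)
    # and mirrored, so reciprocal distances are never recomputed.
    iu, ju = np.triu_indices(n, k=1)
    dist = np.zeros((n, n))
    dist[iu, ju] = np.linalg.norm(angles[iu] - angles[ju], axis=1)
    dist = dist + dist.T

    # One aggregate per hit: all hits within `radius` (the hit itself included).
    # Overlapping aggregates may share hits, as the slide's caveat notes.
    aggregates = [np.flatnonzero(dist[i] <= radius) for i in range(n)]
    centroids = np.array([angles[a].mean(axis=0) for a in aggregates])

    # Histogram the centroids; bins with more than `noise` entries are dense.
    hist, t_edges, p_edges = np.histogram2d(
        centroids[:, 0], centroids[:, 1], bins=bins)
    dense = hist > noise
    labels = np.zeros(dense.shape, dtype=int)

    def fill(i, j, label):
        # Recursive flood fill: grows one high-density region bin by bin.
        if not (0 <= i < dense.shape[0] and 0 <= j < dense.shape[1]):
            return
        if not dense[i, j] or labels[i, j]:
            return
        labels[i, j] = label
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            fill(i + di, j + dj, label)

    n_regions = 0
    for i, j in zip(*np.nonzero(dense)):
        if labels[i, j] == 0:
            n_regions += 1
            fill(i, j, n_regions)

    # Map every centroid to its histogram bin, hence to a region (0 = none).
    ti = np.clip(np.searchsorted(t_edges, centroids[:, 0], side="right") - 1, 0, bins - 1)
    pj = np.clip(np.searchsorted(p_edges, centroids[:, 1], side="right") - 1, 0, bins - 1)
    region_of = labels[ti, pj]

    # Per region, keep the aggregate with the most hits: that is the cluster.
    clusters = []
    for r in range(1, n_regions + 1):
        members = np.flatnonzero(region_of == r)
        if len(members):
            best = max(members, key=lambda k: len(aggregates[k]))
            clusters.append(aggregates[best])
    return clusters
```

A larger `radius` merges nearby showers into one aggregate, while a larger `noise` threshold suppresses regions populated by only a few centroids.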
11 Angular Clustering example (QE CC νe)
12 Angular Clustering example (QE CC νe)
[Figure panels: Transverse Shower fitting; Longitudinal Shower fitting]
13 NueAna Framework
- Originated as the Harvard/Tufts NueAnalysis framework; rewritten and expanded, in collaboration with UCL and Stanford, to become a MINOS Offline Software package
14 Multivariate Discriminant Analysis
- Define a set of variables that appropriately describes the data sample
- Calculate the covariance matrix for each class
- Determine the Mahalanobis distance to each class for each event
- Compute the score for an event to belong to each class
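The four steps above can be illustrated with a short sketch. It assumes Gaussian class densities, so that the score is the usual quadratic-discriminant posterior built from the squared Mahalanobis distance plus the log-determinant of the class covariance. The slide does not say how the score is defined, so this form is an assumption. The function names and the equal default priors are also illustrative.

```python
import numpy as np

def fit_classes(samples):
    """samples: dict mapping class name -> (n_events, n_vars) training array."""
    model = {}
    for name, x in samples.items():
        x = np.asarray(x, dtype=float)
        cov = np.atleast_2d(np.cov(x, rowvar=False))  # per-class covariance matrix
        _, log_det = np.linalg.slogdet(cov)
        model[name] = (x.mean(axis=0), np.linalg.inv(cov), log_det)
    return model

def mahalanobis_sq(event, mean, inv_cov):
    """Squared Mahalanobis distance of one event to a class mean."""
    diff = np.asarray(event, dtype=float) - mean
    return float(diff @ inv_cov @ diff)

def scores(event, model, priors=None):
    """Posterior-like score for the event to belong to each class (sums to 1)."""
    names = list(model)
    priors = priors or {n: 1.0 / len(names) for n in names}
    log_s = np.array([
        np.log(priors[n])
        - 0.5 * (mahalanobis_sq(event, model[n][0], model[n][1]) + model[n][2])
        for n in names
    ])
    log_s -= log_s.max()          # stabilise the exponentials
    p = np.exp(log_s)
    return dict(zip(names, p / p.sum()))
```

An event is then assigned to the class with the highest score, or kept as signal only if its signal score exceeds a chosen threshold.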
15 Analysis Backgrounds
- Neutral Current events: most significant contamination (example: νμ N → νμ p π0)
- High-y CC νμ: muon track is invisible (example: νμ N → μ- n 2p π- π0)
- Intrinsic beam νe from kaon/muon decay, normally at higher energy than the signal peak (example: νe N → p e-)
- ντ CC: tau decays into an electron (example: ντ N → p τ-)
16 Samples and Cuts
- Far samples
  - Test sample composed of 19 nue, 19 numu and 19 nutau MDC Far MC files processed with R1.12 (each file corresponds to 6.5x10^20 POT)
  - Training sample with identical size and proportions, no overlapping events
  - Test on MDC Far Mock Data file also processed with R1.12 (7.4x10^20 POT)
- Near samples for testing
  - 229 MDC Near MC files (1.33x10^16 POT each)
  - 246 MDC Near Mock Data files (1.33x10^16 POT)
- Sample cuts
  - Fiducial and containment cuts
    - Vertex contained in fiducial volume
    - Full event containment in Far Det
    - Full Z containment in Near Det
  - Prong cut: individual track or shower pulse height > 5000 sigcor (300 MeV)
  - High energy cut: total reconstructed event energy < 150 MEU (6 GeV)
  - Track length cut: events with track length < 18 planes
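As an illustration only, the sample cuts listed above could be applied to a per-event record as in the sketch below. The `Event` fields are hypothetical names, not NueAna variables. The prong cut is read here as "the largest single track or shower must exceed 5000 sigcor", which is one possible interpretation of the slide. The `contained` flag stands for full event containment in the Far Detector or full Z containment in the Near Detector.

```python
from dataclasses import dataclass

# Thresholds quoted on the slide.
PRONG_MIN_SIGCOR = 5000.0   # about 300 MeV
ENERGY_MAX_MEU = 150.0      # about 6 GeV
TRACK_MAX_PLANES = 18

@dataclass
class Event:
    vertex_in_fiducial: bool     # vertex inside the fiducial volume
    contained: bool              # full (Far) or Z (Near) containment
    max_prong_sigcor: float      # largest track/shower pulse height (assumed meaning)
    energy_meu: float            # total reconstructed event energy
    track_planes: int            # track length in planes (0 if no track)

def passes_cuts(ev: Event) -> bool:
    """True if the event survives all preselection cuts."""
    return (
        ev.vertex_in_fiducial
        and ev.contained
        and ev.max_prong_sigcor > PRONG_MIN_SIGCOR
        and ev.energy_meu < ENERGY_MAX_MEU
        and ev.track_planes < TRACK_MAX_PLANES
    )
```

Keeping the thresholds as named constants makes it easy to rerun the selection with looser or tighter values.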
17 Variable Selection
- Variable selection is performed using the SAS stepwise discriminant procedure
- Preliminary selection from 350 available variables
- 140 sorted by discriminating power
- Run the classification method for the 19 variables with the highest discriminating power
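To give a feel for what a stepwise discriminant selection does, here is a simplified forward-only sketch that ranks variables by Wilks' lambda (within-class scatter determinant over total scatter determinant; smaller means better class separation). It is not the SAS procedure itself, which also applies significance tests and can remove variables again. The function names are illustrative.

```python
import numpy as np

def wilks_lambda(x, y, cols):
    """Wilks' lambda for the variable subset `cols`; lower = more discriminating."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    sub = x[:, cols]
    # Scatter matrix = N * biased covariance.
    total = np.atleast_2d(np.cov(sub, rowvar=False, bias=True)) * len(sub)
    within = np.zeros_like(total)
    for c in np.unique(y):
        xc = sub[y == c]
        within += np.atleast_2d(np.cov(xc, rowvar=False, bias=True)) * len(xc)
    return np.linalg.det(within) / np.linalg.det(total)

def forward_select(x, y, n_keep):
    """Greedily add the variable that lowers Wilks' lambda the most."""
    x = np.asarray(x, dtype=float)
    chosen, remaining = [], list(range(x.shape[1]))
    while remaining and len(chosen) < n_keep:
        best = min(remaining, key=lambda j: wilks_lambda(x, y, chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

The returned list is ordered by selection step, so truncating it gives the "top N" variables in the sense used on this slide.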
18 MDA Classification Results
- Table summarizing FOM = sig/sqrt(Σbg) for different oscillation parameters
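The figure of merit is the number of selected signal events divided by the square root of the summed background. A small helper makes the arithmetic explicit. The function name, the background labels and the event counts in the usage line are made up for illustration and are not results from the analysis.

```python
import math

def figure_of_merit(signal, backgrounds):
    """FOM = signal / sqrt(sum of all background contributions)."""
    total_bg = sum(backgrounds.values())
    if total_bg <= 0:
        raise ValueError("total background must be positive")
    return signal / math.sqrt(total_bg)

# Illustrative counts only: 10 signal events over 16 + 9 background events.
example_fom = figure_of_merit(10.0, {"NC": 16.0, "numu CC": 9.0})
```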
19 Mock Data Challenge Results
- Using CC group best fit oscillation parameters: sin^2(2θ23) = 0.925, Δm^2_32 = 2.175x10^-3 eV^2
- And |Ue3|^2 = 0.01
- FOMs at |Ue3|^2 = 0.01, CC best fit, 7.4e20 POT
  - MDA: 0.532
  - NN: 0.510
  - BDT: 0.441
  - NuMI-714: 0.43
20 Mock Data Challenge Results
- MDC truth values
  - sin^2(2θ13) = 0.151
  - sin^2(2θ23) = 0.925, Δm^2_32 = 2.175x10^-3 eV^2
- Showing result summary plots for all three analyses with 90% and 99% confidence intervals for 7.4x10^20 and 22.2x10^20 POT
21 Conclusions
- The Tufts HEP Group continued to make fundamental contributions to MINOS batch processing and Monte Carlo sample generation
- Made significant contributions to the MINOS νe Appearance Analysis Group, in the form of an alternate shower reconstruction method
- Was a leading participant in the successful completion of the Mock Data Challenge. Results obtained compared favorably with other analysis methods
  - Largest improvement over CHOOZ 90% CL sensitivity
  - Inconsistent with sin^2(2θ13) = 0 at 90% CL
  - True MDC value of sin^2(2θ13) well within the 90% region around the best fit
- Immediate future work includes thesis defense and completion of knowledge transfer