Title: CHEP 2003 General Summary
1. CHEP 2003 General Summary
- Torre Wenaus, BNL/CERN
- CHEP 2003, UC San Diego, La Jolla
- March 28, 2003
2.
- I agree with all the other summaries.
- Thank you to the organizers,
- and have a safe journey home
3. Outline: The CHEP03 Zeitgeist
- Themes and observations
- Rising trends
- Important developments
- Receding trends
- Underrepresented
- Open questions
- Concerns
- Major challenges
- Conclusions
- Thanks
zeitgeist. Pronunciation: 'tsIt-"gIst, 'zIt. Function: noun. Etymology: German, from Zeit (time) + Geist (spirit). Date: 1884. Meaning: the general intellectual, moral, and cultural climate of an era.
Google zeitgeist: http://www.google.com/press/zeitgeist.html
4. Themes and observations
- Lesson from the past: make it simple (R. Brun)
- No more complex than necessary
- Users want consolidation, ease of use, and stability
- Must also consider the needs of the future: the longer view of maintainability and evolution
- In the interests of long-term stability
- OO and C++ is the accepted paradigm
- No major OO/C++ migration or usage angst at this conference; it is done and accepted
- Offline and online: triumph of C++ for HEP DAQ confirmed (DAQ summary)
- Now we are hearing reports on Nth-generation C++ software
- L. Sexton-Kennedy, CDF: every component has been rewritten at least once. Implementations have now stabilized such that every new arrival doesn't start by discarding and rewriting software
- Many more talks about redesign than about design (data management summary)
- And on the maturation and emergence of tools as broad standards, after years of development and refinement
- e.g. Geant4, ROOT I/O
5. Themes and observations
- The tyranny of Moore's Law
- Wolbers: it is not a substitute for more efficient, faster code and smaller data size
- It works against thinking before doing
- Optimize wherever possible
- Addressing the digital divide in networking (H. Newman)
- HEP is obligated as a community to work on this
- A world problem in which our field can have visible impact
- Farm challenges
- Don't underestimate farm installation and operations (R. Divia)
- Big issues are power, cooling, space! (S. Wolbers)
- Watts/$ steadily rising (R. Mount)
- The tape-disk random access performance gap in analysis is receding as an issue, but the disk-memory gap is hardly being addressed (R. Mount)
6. R. Mount
7. Rising trends
- ROOT
- For analysis, I/O, and much else
- Now fully supported at CERN (EP/SFT section)
- Close interaction with experiments on new developments
- Run II, RHIC, ALICE, LCG, BaBar, ...
- Foreign classes, PROOF, geometry, grid integration, ...
- Mentioned in 47 talks at this conference
- Open source databases (MySQL, Postgres, ...)
- Metadata, distributed computing, conditions, ...
- Empowering software: easy and potent
- MySQL mentioned in 37 talks! Postgres in 8, Oracle in 27
- Online-offline continuum
- Similar Linux farm environments, attainable time budgets
- Same framework, maybe same algorithms, in HLT as in offline (V. Boisvert, ATLAS)
- Stringent performance/robustness requirements on software
8. Rising trends
- Common projects
- Joint projects are one of the CDF/D0 successes (Wolbers)
- But hard to align running experiments with LHC
- LHC Computing Grid project
- Grid projects in general
- Laudable but difficult; increasingly forced by the circumstances
- Resource constraints and increasing scale and complexity make go-it-alone N times too costly
- cf. comments in the online/DAQ context by G. Dubois-Felsmann today: somewhat less success in online, where it is even harder than offline, but possible LHC inroads
- Related is software reuse
- Respect what we know about long software development timescales
9. René's time-to-develop plot
LCG?
10. LCG must effectively re-use and leverage existing software, or fail
This is the approach taken; cf. POOL, SEAL talks.
Time will tell! cf. next CHEP
LCG?
11. Rising trends
- Modular component architectures
- Many examples in offline; also in online/DAQ (XDAQ, CMS); also in open source
- Associated infrastructure: white boards, centrality of dictionary, plug-ins, ...
- XML
- The no-brainer for small-scale structured data storage and exchange
- The more humane applications leave the XML generation to the computer and not the humans
- ASCII lovers (count me in) now have their standard
- Many talks in many areas involving XML applications
- Detector description, conditions info, configuration, monitoring, graphics, object models, data/object interchange, dictionary generation, not to mention layered apps (e.g. SOAP)
- 37 talks mention XML (same 37 as MySQL?)
- But XML in itself does not define a common format/schema, and much divergence and duplication exists in how XML is used
- e.g. detector description
- We heard (I. Foster) about an OGSA community clearing house; we have similar things ourselves (CLHEP, FreeHEP); maybe we need one for XML applications
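As an illustration of leaving XML generation to the computer, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The record and its field names (run, detector, hv_volts) are hypothetical and not taken from any experiment discussed at the conference; the point is only that the program writes and reads the markup, and that the schema is whatever the application chooses.

```python
import xml.etree.ElementTree as ET

# Hypothetical conditions record; the field names are illustrative only.
conditions = {"run": 1234, "detector": "tracker", "hv_volts": 1500.0}

def to_xml(record, tag="conditions"):
    """Build an XML element with one child element per dictionary entry."""
    root = ET.Element(tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return root

def from_xml(element):
    """Read the child elements back into a dictionary of strings."""
    return {child.tag: child.text for child in element}

# Serialize to text, then parse the text back again.
xml_text = ET.tostring(to_xml(conditions), encoding="unicode")
round_trip = from_xml(ET.fromstring(xml_text))
print(xml_text)
```

Values come back as strings, because XML alone carries no type information. This is one small instance of the point above that XML in itself does not define a common format or schema.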
12. Rising trends
- Open source in general
- Open source, please. Your interests are rarely in commercial vendors' interests (M. Purschke, PHENIX)
- In the CDF/D0 success column; similarly all over
- DBs, Qt, utility libraries, and Linux, it goes without saying
- Extraordinary capability and quality
- Java, to a degree
- Important limitations being addressed, e.g. manageable C++ interoperability (JACE autogeneration of interface)
- JAS, NLC software, IceCube, CDMS DAQ, ...
- But not broadly competing with C++ in usage so far
- HENP as CS partner and collaborator
- To our mutual benefit in the Grid and in networking
13. Rising trends
- New simulation engines: Geant4, FLUKA
- Geant4 as a production tool
- In production in BaBar: EM validation in hand, hadronic beginning; robust and reasonably fast
- ATLAS on the way to completing G4 transition after two years of physics validation
- CMS, LHCb also transitioning over the next year
- GLAST using LHCb/Gaudi Geant4 interface
- FLUKA: not new (established and widely used), but new integration efforts as a detector simulation engine for the four LHC experiments
- FLUGG interface to G4 geometry
- ALICE Virtual Monte Carlo as uniform interface to multiple engines (FLUKA, Geant4, Geant3)
- Interest from other experiments; joint LCG project starting
- Used for Geant4 testing
- FLUKA integration in progress
14. Rising trends
- Automation in software development/management
- Heard about several automated tools for code building and testing, release integration, tag management, configuration management
- Popular new software web portal at CERN: LCG/SPI
- http://savannah.cern.ch
- Automated textual and statistical analysis of test outputs
15. Rising trends: The Grid
- The central importance of distributed computing to future (increasingly, present) HENP is long known
- The Grid as the means to that is now established
- Major, broad successes in funding and in attracting collaboration with CS
- F. Berman, Grid 2003: HEP has set a model for integration, focus, coordination
- Progress in applying Grid software and infrastructure to real problems
- Batch production
- Clearly the chosen path; success to be proven, but has promise and broad commitment
16. The Grid
- F. Berman, Grids on the horizon
- Must be useful, usable, stable, supported
- More cooperative than competitive
- Not always the case today!
- Applications are key to success
- Not a "Field of Dreams" (build it and they will come) R&D field any more
- Grid killer app: a focus on data. Good match to us
- Still a long way to go
17. The Grid
- Miron Livny
- Benefit to science: democratization of computing
- Still very manpower intensive: when the support team goes on holiday, so does the Grid (CMS testbed in Dec)
- Best practice middleware requires
- True collaboration, open minds (cf. Berman)
- Testing, deployment/adoption, evaluation metrics, robustness, professional support, longevity, responsiveness to show stoppers, ...
- Much to do and improve, but important progress
- E.g. VDT as standard middleware suite
18. Important developments
- Community consensus on a C++ object store: ROOT I/O
- Though many approaches to its use
- Combined with RDBMS for physics data storage
- CDF, RHIC, LHC, BaBar, GLAST, ...
- Software engineering is catching up to us (F. Carminati)
- High ceremony processes are not an obvious success
- And we are not alone
- Agile methodologies, such as Extreme Programming (XP), are SE's response
- Extremely close to a successful HENP working model
- Adaptive, simple, incremental, tight iterations, plan for change, adjust the methodology for your environment
- "I just learned we use XP" (comment from CDF)
- Means of responsibly formalizing and addressing, in a useful way, software engineering in HEP, and software management
- Both must be effective and lightweight: Agile
19. Important developments
- Major strides in networking
- HENP a leading applications driver and a co-developer of global networks (H. Newman)
- Require rapid global access to event samples and analyzed physics results drawn from massive data stores
- PB by 2002, 100 PB by 2007, 1 Exabyte by 2012
- Rate of progress >> Moore's Law
- Factor of 1M in 1985-2005 (5k during 1995-2005) in global HENP network bandwidth
- Factor of 25-100 gain in max sustained throughput in 15 months on some US-transatlantic routes
- Network providers see us as an opportunity because we push real production applications
- Future promise: OptIPuter (P. Papadopoulos)
- Key driving applications keep the IT research focused
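To make the comparison with Moore's Law concrete, the quoted bandwidth growth factors can be converted into doubling times. The sketch below does that arithmetic; the 18-month doubling period used for Moore's Law is an assumption (a commonly quoted figure), not a number from the talk.

```python
import math

def doubling_time_months(growth_factor, years):
    """Months per doubling implied by a total growth factor over a period."""
    doublings = math.log2(growth_factor)
    return years * 12 / doublings

# Growth factors quoted on the slide for global HENP network bandwidth.
net_20y = doubling_time_months(1e6, 20)   # 1985-2005
net_10y = doubling_time_months(5e3, 10)   # 1995-2005

# Assumption (not from the talk): Moore's Law taken as doubling every 18 months.
MOORE_DOUBLING_MONTHS = 18
moore_factor_10y = 2 ** (10 * 12 / MOORE_DOUBLING_MONTHS)

print(f"1985-2005: doubling every {net_20y:.1f} months")
print(f"1995-2005: doubling every {net_10y:.1f} months")
print(f"Moore's Law over 10 years: factor of about {moore_factor_10y:.0f}")
```

Under that assumption, the quoted factors correspond to bandwidth doubling roughly every 12 months over 1985-2005 and roughly every 10 months over 1995-2005, while an 18-month doubling would give only about a factor of 100 in ten years, against the quoted 5k.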
20. Important developments
- The LHC Computing Grid Project
- Major new internationally supported effort to build the distributed computing environment of the LHC
- Encompasses
- the distributed computing facility
- Site fabrics (facilities), middleware selection, integration, testing, deployment at distributed sites, operations and support, ...
- the common physics applications software
- Persistency, core libraries and services, physics analysis interfaces, simulation and other frameworks, all in a distributed environment
- Must succeed if LHC computing is to succeed!
- An impressive effort by the experiments together with CERN to work in accord across the scope of computing
- Managed so as to ensure comprehensive oversight by the experiments
- First testbed deployment is this summer (LCG-1)
- Including the first major applications deployment, the POOL persistency framework (ROOT I/O + MySQL hybrid)
21. Important developments
- Success of mass stores
- Castor reliable and effective (ALICE)
- D0/CDF convergence on successful Enstore/SAM
- HPSS successful at RHIC
- Exciting new generation of specialized lattice gauge computers (B. Sugar)
- Two tracks
- QCD on a chip: QCDOC, a technical marvel, project with IBM
- $1M/Tflop, aiming at 10 Tflop at BNL in 2004
- Optimized commodity clusters
- Pentium 4, Myrinet/Gbit Ethernet
- 10 Tflop at FNAL and JLAB by 2006
- SciDAC grant to improve software usability
22. Receding trends
- Objectivity and ODBMS in general
- Jury still out at CHEP 2000 (P. Sphicas), but now clear
- Objectivity dropped or being phased out by LHC experiments, COMPASS, BaBar event store
- In PHENIX becoming a liability (compiler issues); augmented with RDBMSs
- Not due to technical failure but a mix of technical problems, commercial concerns, manpower costs, availability of an alternative
- Its replacements are not other ODBMSes but files (often ROOT) + RDBMS (MySQL, Oracle, Postgres) for metadata
- Magnetic tape (apart from archival)
- PASTA: unlimited multi-PB disk caches technically possible, but true cost is unclear (reliability, manageability)
- File system access under urgent investigation
- Tapes as random access device no longer a viable option; large disk caches needed for LHC analysis
23. Receding trends
- Commercial software? No
- Some in decline (Objy, LHC++), but new prospects opening (IBM, Sun, MS, ...) in Grid
- Open source now has an important commercial element we derive great benefit from (even post-.com crash)
- Red Hat, MySQL, Qt, ...
24. Underrepresented
- Collaborative tools
- Was represented this week, but only lightly
- Vital for distributed collaboration on software development and physics analysis
- H. Newman: need a culture of collaboration
- Distributed and remote collaboration should be the norm
- Not solely, or even predominantly(?), a matter of tool development in the community
- How is the exponential commercial side evolving, and how can we leverage it?
- What is the evolutionary path, strategy, role for community-developed tools such as VRVS?
- Why is the user experience often poor?
- Poor physical facilities/configurations, instabilities, heterogeneous tools/protocols, support issues, ...
- Current experience sometimes competes unsuccessfully with the telephone, despite all the shortcomings
25. Open questions
- Distributed analysis
- What will it look like? What development line(s) are taking us there? Still very much R&D, pursued in multiple directions
- Several models (e.g. R. Brun) with varying degrees of Grid exploitation/distributed character
- H. Newman: where is the comprehensive global system architecture? M. Livny: have to proceed incrementally, step by step, from the bottom up
- Some efforts were reported which are incrementally extending established analysis tools into Grid-based analysis
- PROOF, JAS
- Others working from various starting points
- Genius, Ganga, Clarens, ...
26. Open questions
- Distributed analysis, continued
- Production environments: better defined, tools more advanced, a few in production, varying levels of middleware usage
- AliEn (ALICE), SAM-Grid (Run 2), CMS tool suite, GRAT, Magda (ATLAS), DIRAC (LHCb)
- Not a lot of sharing/collaboration above the middleware level!!
- Necessary precursor to the more complex analysis environment, and hard in itself
- What analysis improvements will the Grid really provide? (panel discussion)
27. What analysis improvements will the Grid really provide? (panel discussion)
- Some of the comments (what I heard, not what was said)
- Murphy's Law needs to be beaten, not Moore's Law (V. Innocente)
- From a technical point of view, the realization of a successful grid will be a single integrated distributed computing center (R. Mount)
- But beyond the technical, a successful grid will grow human resources, drawing in distributed people not otherwise involved, as well as material resources (M. Kasemann)
- The grid is more than this. The LHC will build the first global collaboration, reaching out to uninvolved countries. This places an obligation on us. Through the grid we must make their participation possible and their resources useful. (H. Newman)
- It is an unprecedented opportunity to screw up. But we have no choice, we cannot put it all in one place. Focus on reliability.
28. Grid panel 2
- The grid is something new. We can't let one virtual computing center be the dominating thing. There should be no dominant force and we should avoid centralized decision making. This will help analysis. (L. Robertson)
- Grids enable collaboration at a scale not attempted before. Distributed efforts are motivated to compete with one another and with the central site, and this brings benefits and resources. Analysis groups are teams, spread across continents and time zones. How do they collaborate? The grid should provide the solution. Also, provenance is largely overlooked, but it is key to analysis. (P. Avery)
- We have no model for how 5000 users will use a globally distributed facility. System issues must be addressed now. (H. Newman)
- Physicists should not see the grid at all. It should be transparent. (P. Mato)
29. Grid panel 3
- The grid will be successful if we make it simple. It will force some coherence in the development of distributed analysis tools. Too much process will kill the process. There is not enough prototyping going on. (R. Brun)
- Agree, we need more prototyping. We need candidate strategies, then build prototypes, and see what works. You have to do this before you will be able to abstract from experience and automate and make transparent the approaches that work. (H. Newman)
- Funding agencies, computer scientists, other sciences are excited by the HEP grid work, e.g. on provenance. Possibility of infusion of funding. Could pursue a Google-like response to what now takes 3 months. (R. Mount)
- The grid will enable collaborative work and harness distributed brainpower. It will allow HENP to be more present as a field at the home university. This is important for the health of our field. (H. Newman)
- There is lots to learn from existing experiments. (R. Brun)
30. Open questions
- Impact of facility security on Grid computing
- Site security in the grid era (Dane Skow)
- Avoid complexity in designing security; it is the bane of secure systems
- Must be agile in the face of change, resistant to attack
- Risk management, not elimination; must accept some risk to carry on work
- No clear answers in the bottom line; there is much yet to be resolved and understood, and many are working on it
- Workable resolution is vital, since you don't have a usable grid if the walls don't have sockets
31. Open questions
- Impact of OGSA migration (Globus) on middleware
- Open Grid Services Architecture
- Leveraging industry-standard web services
- Much industry involvement
- IBM, Sun, NEC, Oracle, ...
- Attention given to backward compatibility
- Promising approach; may the migration go well!
- Alpha is under test; production release in June
- Major dependency, given Globus's foundation role in our middleware
- Current Globus2 will be supported for some time, but we will be interested in new functionality
32. Open questions
- Utility and practicality of generate-on-demand virtual data (virtual data by materialization)
- Networking going well; cost/complexity equation favors copying
- Interesting talk (C. Jones) on successful implementation and use for many years in CLEO
- Relies on user discipline to ensure regenerated data is trustworthy
- Utility of data provenance management, needed for secure trust of on-demand data, is a separate question
- Should have important utility, not only for virtual data (reproducibility, trust) but as a communication mechanism in widely distributed collaborations
- Cannot allow reliance on hallway conversations with production gurus
33. Concerns
- Data analysis as the last wheel of the car (R. Brun)
- Clear message from current generation (e.g. Run 2, BaBar): don't leave data analysis systems and infrastructure too late; it will lead to problems
- Vastly more true when we are talking about doing globally distributed analysis, for the first time
- with unprecedented volume and complexity, e.g. Terabyte scale at the LHC
- Making distributed analysis both very difficult and mandatory
- We cannot bootstrap ourselves into a global analysis system; it will take long incremental work, so we had better be working in a coordinated, effective way now
- R. Brun: will not converge on one system; there will be multiple competing systems, and that will not be bad; hopefully a small number
34. Concerns
- Are we doing enough to ensure senior people can contribute directly to physics analysis?
- How do we interpret the fact (R. Brun) that PAW usage is still rising?
- Has everyone bought the C++/OO paradigm shift?
- Are we developing and/or providing the right tools?
- Is there enough engagement of senior physicists in the (limited) exploratory work being done on future physics analysis environments?
- Almost certainly no, and it may be difficult to attract their attention unless/until attractive prototypes can be turned loose on them
35. Major Challenges
- Storage architecture: possibly biggest challenge for LHC (PASTA)
- Seamless integration from CPU caches to deep archive
- Currently very poor data management tools for storage systems
- More architectural work needed in next 2 years
36. Future ALICE Data Challenges
- New technologies
- CPUs
- Servers
- Network
R. Divia
37. Conclusions (1)
- Coming experiments must learn from prior generations: give early (i.e. for LHC, immediate) attention to data analysis
- It will take generations of incremental iterations of design, prototyping and stressful deployment to get it right
- Particularly in the unprecedented global collaborative environment of the LHC
- C++ is a mature and accepted standard
- Several generations of C++ code in production experiments (BaBar, Run 2, ...)
- Maturation of tools into broad usage (Geant4, ROOT I/O)
- No sign of a major new language migration so far, thank goodness
- But beware excessive complexity and remember the promise of accessible, usable software
38. Conclusions (2)
- Grids and networking are making great strides
- HENP is a successful and valued partner with CS
- We provide a community focused on challenging large-scale deployments in real research settings
- But Murphy's Law is a potent adversary: today far from robust transparency, and much, much more to do
- Global collaborative computing must become a successful norm for us
- Down to the global researcher at the home institute
- Rich leadership potential for our field
- Important new common endeavours like the Grid and LCG have much invested in their success; it will be interesting to measure the degree of success at next CHEP
39. Thanks
- Thanks to Jim Branson and his team of organizers for giving us
- A stimulating program and comfortable schedule
- More-than-pleasant facilities and surroundings
- Terrific banquet, I hear!
- A very successful conference.
- I for one will return to La Jolla any time
40.
- I agree with all the other summaries.
- Thank you to the organizers,
- and have a safe journey home