Title: Software Packaging with DAR
1 Software Packaging with DAR
- Natalia Ratnikova, Anzar Afaq, Greg Graham
- Fermilab
- Tony Wildish, Veronique Lefebure
- CERN
2 Introduction - Motivation
- The Compact Muon Solenoid (CMS) is a high-energy physics (HEP) experiment that will run on the LHC accelerator at CERN.
- CMS uses GRID technologies to exploit the available computing resources for worldwide distributed Monte Carlo event production.
- To make this possible, CMS software applications must be brought to the production sites.
3 Introduction - Scope
- CMS software includes a wide range of inter-related projects and external tools managed by the Software Configuration, Release and Management tool (SCRAM).
- A complete installation of the CMS software and environment on remote sites is not an easy task, and it is not actually required in order to run ready-made applications.
4 Introduction - Goal
- The goal of the USCMS software and computing project was to move CMS MC production in the US completely onto GRID computing resources.
- We wanted an automated way to create self-consistent distributions of the applications, based on the software released at CERN.
- The Distribution After Release (DAR) tool was developed at Fermilab for quick and easy deployment of CMS software applications that can run on systems with no pre-existing CMS environment.
5 DAR concept
- DAR automatically creates and installs software applications based on the runtime environment.
- An application is a complete, self-contained software program, including all required shared libraries and other files, that can be executed in a particular environment to accomplish a particular computing task.
- The runtime environment is the set of UNIX shell environment variables used by the program at run time.
6 Concept - Choices, Decisions
- There is a class of tools and utilities, such as the operating system kernel and the loader, that are needed by the applications but are usually already present on the remote computing node.
- It is hard to draw a clear border between the application and the operating system, so one sometimes has to decide what to include in the distribution and what must be pre-installed by the local system administrator.
- In CMS software these issues are controlled through the project configuration, which specifies the required tools and the corresponding environment.
7 Concept - Conditionals
- Application software must be relocatable.
- This is the most important and most natural requirement. Most real, quality software products are relocatable, and the location is usually controlled through a shell environment variable.
- No hard-coded absolute paths in the program or in the shared libraries (except those referring to the system area).
- All executables are found in the PATH.
- DAR distributions rely on system compatibility.
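The requirement that all executables be found in the PATH can be checked before packaging. The following is a minimal sketch, not part of DAR itself; the function name `missing_executables` and its arguments are hypothetical and only illustrate the idea.

```python
import shutil


def missing_executables(names, search_path):
    """Return the executable names that cannot be resolved via search_path.

    Hypothetical helper for illustration: `names` is a list of program
    names the application is expected to call, and `search_path` is a
    PATH-style string of directories.
    """
    return [
        name for name in names
        if shutil.which(name, path=search_path) is None
    ]
```

An empty result means every listed program can be resolved through the given search path alone.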
8 DAR Implementation
- DAR is implemented in scripting languages; no compilation is required.
- The core of the DAR code is written in Perl.
- Interfaces and extensions are written in Python.
- The DAR code can simply be downloaded from the CVS repository or from the web and used immediately.
- In the CMS environment:
  dar -c <top release directory> <temporary directory>
- On the remote site:
  dar -i <distribution darball> <installation directory>
9 Implementation - Shared Libraries
- DAR walks through the directories specified in the LD_LIBRARY_PATH environment variable and packages all libraries found there into the distribution. It ensures that, upon installation, the runtime environment scripts set LD_LIBRARY_PATH correctly and in the right order.
- DAR does not rely on the output of the command
  ldd <executable>
  as this is considered unsafe in the case of dynamically loaded libraries.
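The directory walk described above can be sketched as follows. This is an illustration only, not DAR's actual Perl code; the function name `collect_libraries` and the rule "a file whose name contains .so is a library" are assumptions made for the example.

```python
import os
from pathlib import Path


def collect_libraries(ld_library_path):
    """Return (directory, [library file names]) pairs in search-path order.

    Assumption for this sketch: any regular file whose name contains
    ".so" counts as a shared library.
    """
    seen = set()
    result = []
    for entry in ld_library_path.split(os.pathsep):
        if not entry or entry in seen:
            continue  # skip empty and repeated entries, keep first position
        seen.add(entry)
        directory = Path(entry)
        if not directory.is_dir():
            continue  # entries that do not exist contribute nothing
        libs = sorted(
            p.name for p in directory.iterdir()
            if p.is_file() and ".so" in p.name
        )
        result.append((entry, libs))
    return result
```

Keeping the directories in their original order matters because the loader resolves a library name from the first directory that provides it.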
10 Implementation - Executables
- By default DAR walks through the directories in the PATH environment variable (only the portion added for this particular application) and includes the contents of those directories in the DARball.
- This behavior can be overridden by setting the DAR_runtime_PATH environment variable, in which case the associated files and directories are included in the distribution and added to the PATH in the DAR runtime environment scripts.
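One way to picture the selection of PATH entries is the sketch below. It assumes that "the portion added for this application" can be found by comparing the application environment with the environment before setup, and that DAR_runtime_PATH is a colon-separated list; both are assumptions, and `application_path_entries` is a hypothetical name.

```python
import os


def application_path_entries(app_env, base_env):
    """Return the PATH directories to package for the application.

    app_env and base_env are dict-like environments (for example copies
    of os.environ taken after and before the application setup).
    """
    override = app_env.get("DAR_runtime_PATH")
    if override:
        # The override replaces the default selection entirely.
        return [entry for entry in override.split(os.pathsep) if entry]
    base = set(base_env.get("PATH", "").split(os.pathsep))
    entries = []
    for entry in app_env.get("PATH", "").split(os.pathsep):
        if entry and entry not in base and entry not in entries:
            entries.append(entry)  # keep only entries added by the setup
    return entries
```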
11 Implementation - Other Variables
- DAR distinguishes between three types of runtime environment variables:
  - Simple values (flags)
  - Variables associated with a path to an existing file or directory in the local file system
  - Variables associated with several paths in the local file system (PATH-like variables, where entries are separated by the colon delimiter)
- All physical files and directories found in the specified paths are included, preserving the underlying directory structure.
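The three variable types can be told apart by inspecting the value alone. The sketch below is an assumption about how such a classification could work on a UNIX system, not DAR's actual logic; the function name and the returned labels are hypothetical.

```python
import os


def classify_variable(value):
    """Classify an environment variable value as 'flag', 'path' or 'path_list'.

    Assumption for this sketch: a value is PATH-like when it contains a
    colon and at least one of its entries exists in the file system.
    """
    if ":" in value:
        parts = [part for part in value.split(":") if part]
        if any(os.path.exists(part) for part in parts):
            return "path_list"
    if os.path.exists(value):
        return "path"
    return "flag"
```

Values classified as `path` or `path_list` are the ones whose files and directories would need to be copied into the distribution.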
12 General Practices, Tests
- All the sophisticated work is done by DAR while creating the distribution.
- The installation procedure is extremely simple.
- Friendly user interface: simple commands, built-in help, backward compatibility.
13 Tests
- Run the same application in the native environment.
- Install the DARball and run it on the same node.
- Install and run the application on a remote host without a pre-installed CMS environment.
- The same output in all three cases means success.
- The second type of test is optional and can be used to identify any discrepancies in the operating system configuration.
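The "same output in all cases" criterion can be automated by comparing checksums of the output directories. This is a minimal sketch under the assumption that each run writes its results into its own directory; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def output_digest(output_dir):
    """Hash the relative names and contents of all files under output_dir."""
    digest = hashlib.sha256()
    root = Path(output_dir)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def outputs_agree(output_dirs):
    """Return True when every directory produces the same digest."""
    return len({output_digest(d) for d in output_dirs}) == 1
```

A byte-for-byte comparison like this only works when the application output contains no timestamps or host names; otherwise those fields would have to be filtered first.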
14 Using DAR in Production
- DAR-created distributions have been used as the mandatory way to install software for the official CMS Monte Carlo production.
- Using the same set of applications and consistent software distribution mechanisms ensured stable performance and trustworthy results.
- The RefDB2DAR interface has been developed to formalize the requests for applications and provide bookkeeping of the available distributions.
15 CMS production over GRID (fall 2002)
The CMS Integration GRID Testbed produced 1.2 million CMS Monte Carlo events, from generation with the PYTHIA physics generator through simulation with GEANT and digitization with Objectivity-based applications. All results shown here were run on Red Hat 6 systems, though some GEANT-only production was also run on newer Red Hat 7 systems.
16 Next steps - Bookkeeping
- The RefDB2DAR interface allows a request file to be downloaded from the RefDB.
- The refdbdar utility is then used to:
  - parse and validate the RefDB request file
  - call the packager: CMSIM_packager, CMKIN_packager, or DAR_packager for SCRAM-managed projects
- The packager builds the executables as requested and creates the distribution.
17 Next steps - Optimizations
- The runtime environment contains some superfluous directories and files. However, detecting files that can safely be excluded requires expert knowledge of the software application.
- A number of new expert options allow the contents to be filtered, but it may take several iterations to figure out what can be removed and whether doing so is efficient and safe.
18 Next steps - Optimizations
- Space optimizations
  - Avoid duplication (all duplicated files are replaced by symbolic links)
  - Expert options introduced
    - The runtime environment contains some superfluous directories and files.
    - However, detecting files that can safely be excluded requires expert knowledge of the software application.
- Time optimizations
  - Automating tests
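Replacing duplicated files by symbolic links can be sketched as below. This is an illustration under assumptions, not DAR's implementation: files are considered duplicates when their SHA-256 digests match, and links are made relative so that the tree stays relocatable. The function name is hypothetical.

```python
import hashlib
import os
from pathlib import Path


def replace_duplicates_with_symlinks(root):
    """Replace later copies of identical files with relative symlinks.

    Returns the list of paths that were turned into links.
    """
    root = Path(root)
    first_seen = {}
    replaced = []
    files = sorted(
        p for p in root.rglob("*") if p.is_file() and not p.is_symlink()
    )
    for path in files:
        key = hashlib.sha256(path.read_bytes()).hexdigest()
        original = first_seen.setdefault(key, path)
        if original is path:
            continue  # first occurrence stays a regular file
        target = os.path.relpath(original, start=path.parent)
        path.unlink()
        path.symlink_to(target)
        replaced.append(path)
    return replaced
```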
19 Distribution process
- The Production Coordinator fills in a web form to create a DARball request. The generated request is stored in the RefDB, and a notification is sent by e-mail.
- The DARball is then created using refdbdar and the request file, based on the software release installation at CERN.
- The application is installed and tested in the DAR runtime environment.
20 Distribution process
- The DARball is put into SRB for distribution and is ready for production assignments.
- Production sites receive the assignments with an indication of the DARball (by name). The DARball is then downloaded from the SRB and installed, using DAR, on the worker nodes.
- The McRunJob tool creates a job based on the application and submits it to the production GRID.
21 Using DAR in MOP
- MOP is a system for distributing CMS Monte Carlo production jobs over the GRID.
- MOP is capable of running any type of script (job) at remote GRID sites, called Worker Sites.
- MOP runs jobs as DAGs (Directed Acyclic Graphs), which can be combined to create complex workflows.
22 Using DAR in MOP
- In general, every DAG contains four stages:
  - Stage-in: bring the required input files (from several sources) to the worker site.
  - Run: execute the job itself, producing results, logs, and data.
  - Stage-out: send out the produced results, data, and logs.
  - Clean-up: remove leftover files and directories at the worker site.
23 Using DAR with MOP
- DAR installation at a worker site is achieved by creating a special MOP job that:
  - first pulls the DAR tool and the application DAR distribution in stage-in,
  - runs the installation by invoking DAR in the run stage,
  - brings the results of the installation back to the submission site in stage-out,
  - and then performs clean-up operations at the worker site.
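The installation job above is a simple chain of four stages, which can be modelled as a small DAG. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) purely for illustration; it does not reproduce MOP's own job description format, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter


def dar_install_dag():
    """Map each stage to the set of stages it depends on."""
    return {
        "stage-in": set(),          # pull the DAR tool and the DARball
        "run": {"stage-in"},        # invoke DAR to install the application
        "stage-out": {"run"},       # return installation results
        "clean-up": {"stage-out"},  # remove leftovers at the worker site
    }


def execution_order(dag):
    """Return the stages in an order that respects all dependencies."""
    return list(TopologicalSorter(dag).static_order())
```

Because each stage depends only on the previous one, there is exactly one valid order; more complex workflows would add further nodes and dependencies to the same mapping.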
24 Summary
- The DAR-based distribution scheme has been used successfully in CMS event production for an extended period of time.
- It makes it possible to keep pace with software developments and to deliver software applications to the production sites with ease and in a timely fashion.
- Repackaged into RPM files, applications can be reused within different distribution approaches (e.g. LCFG).
25 Acknowledgements
- The main credit for this work goes to the core CMS software developers, architects and release managers for their constant care about software quality.
- We would like to thank the CMS and USCMS software and computing managers for their attention to this project, the CMS Production Team for providing an excellent working environment, and all CMS colleagues from many countries and institutions for their useful feedback.
- My special thanks to Dr. Yujun Wu for presenting this talk to you, and for numerous fruitful discussions.
THANK YOU