Title: Third IT Department Meeting Wolfgang von Rüden
1Third IT Department Meeting Wolfgang von Rüden
2Overview
- General topics
- Infrastructure & General Services
- Physics Services & LCG
- EGEE related projects
- Plans for 2006 - Priorities
- Questions & Answers
3General Topics
- 2005 was yet another busy year with many changes, many successes and, of course, a few problems
4Key objectives for 2005
- Basic Services
- maintain/improve quality of all agreed services
- streamline services to free resources for high-priority tasks
- Ensure successful continuation of EGEE
- review in February, EGEE phase 2 proposal submission
- test and deployment of new EGEE gLite middleware
- Major LCG Goals
- Stabilize Grid Operations
- Succeed with first two Service Challenges, inter-linking CERN and the Tier-1 centres
- Technical Design Report by July
- Ramp up security efforts
- redeploy key staff (mostly done), launch openlab project
- coordinate with CNIC team
Done!
5Organization
- Reorganisation(s)
- Changes in UDS (new leadership, new structure, review of services, still on-going)
- Transfer of PC shop to stores, support to IS group (also for MACs)
- Major changes on 1 November (see next slide). Date chosen to be before PoW and before the MAPS exercise
- Most changes went smoothly
- We have learnt to adapt our structure to the changing environment rather than creating complex matrix management structures
- Thank you for making these changes easy
6IT Organigram as of 1 November
7Other IT functions
8Personnel
- Arrivals
- Departures
- Internal Mobility: 22 cases (3 between Departments)
- LD review
- 11 candidates reviewed
- Very comprehensive report (40 pages), very good results
- Process used in IT already in line with new contract policy
- Number of FT positions depends on MTP, about 50 with present plan
9Miscellaneous
- Secretarial Support & Administration: a few numbers
- Space & Buildings
- Finance & Purchasing Officer, Fatima Najeh, presence in Building 31 on Mondays & Tuesdays
- Risk Analysis / Disaster Recovery Plan: information gathering started, plan under preparation (Pål Anderssen)
10CERN School, CERN openlab
- CERN School of Computing
- Took place in St. Malo, supported by Saclay, Director F. Flückiger
- Very successful, 79 students, most passed exam
- Jackie's last CSC, replaced by Fabienne Baud-Lavigne
- F. Ruggieri replaced by R. Frühwirth as chair of Advisory Board
- iCSC to be held 6-8 March at CERN
- CSC 2006 in Helsinki (September)
- CERN openlab
- openlab I terminates successfully this year
- Preparation of openlab II very advanced
- Partners: Intel, HP, Oracle; two being negotiated
- Contributors: 2 companies accepted; one more being negotiated
- Topics: Platform Competence Centre, Grid Interoperability Centre
11IT Communications, Dissemination & Outreach
- New this year
- New LCG website, Launch of LCG News
- CERN to coordinate communications in EGEE-II (NA2)
- EGEE VIP brochure, new multimedia video material
- First openlab/EGEE workshop, Intel signing event
- EGO position paper, fund-raising campaign
- Service Challenge demo for SC05 (with LCG, MonALISA)
- Africa@home with Swiss Tropical Institute
- Other activities
- Newsletters: CERN Courier, CNL, EGEE Newsletter
- Press Releases: EGEE conference, Service Challenge 2, LCG reaches 100 sites
- Events: First Tuesday, VIP visits, WWWcast panel on Grids (with Imperial College)
- openlab student programme: 16 students working for openlab, LCG & EGEE
- LHC@home public outreach: >8000 volunteers
- Dissemination Repository for IT Department: http://it-comm-team.web.cern.ch/it-comm-team/repository
- ... and an award
- CERN receives HPCwire High Performance Computing Public Awareness Award at SC05
- NB: After review of safety, Visits Service to make Computer Centre official point on visit itineraries. Plans for refurbishing entrance to 513 to be unveiled next week
12Infrastructure & General Services
- Frédéric Hemmer, Deputy Department Head
13Security
- Much better than in 2004, but recently new incidents
- Measures taken have helped, but also better awareness of users
- High-numbered ports to be closed by default
- Remove default admin privilege from NICE PCs
- Hype about Skype: phone cost at CERN is not a reason, but admittedly the tool is very convenient
- CNIC
- Phase 1 completed, report available, phase 2 started
- Policies agreed on how to split the general purpose, the technical and the experiments networks
- Significant progress with tools to manage production computers
- A major step forward to achieve the required site security
14Security
15Admin. Information Services (AIS)
Examples of new services or upgrades:
- APT
- E-MAPS
- CRA
- AVCL (incl. Summer students claim)
- PAFS (RAE, mid/end probation)
- Accident declaration
- Mission Order
- E-Payslip
- E-Tax certificate
- Visits self-help
- PIE (Person management in experiments and users office)
- IBAN conversion
- Request For Funds report
- Invoice Pipeline
- e-MAPS support in HRT ("training" and "safety")
- LHC reports in CET
- Standardized Finance Reporting in CET
- Information Centre (HRT only)
- Pension fund integration
- SIR (safety training self service)
- PRT (self registration of person info on industrial support)
- CDFS (Confidential Declaration of Family Situation)
- EGEE cost claim support
- Summary 2005
- Overall the services ran very smoothly, but reliability of EDH, CET, HRT could be further improved
- 80% of the effort goes into maintenance, but still a lot of new services being introduced
- High turnover of very short-term staff is inefficient
- New staff plan till 2010 accepted, but not yet funded
- Challenges 2006
- Long list of pending requests, more to come
- Maintenance will again take 80% of the effort
- CRA is a priority for IT as many services depend upon it
- Focus in 2006 is on quality and service
- Long-term review: overall approach to AIS solution
Big projects for 2006:
- CFU reimplementation
- Pension fund integration
- APT consolidation
- Move to Linux / Commodity hardware
- Implementation of 5-yearly review decisions
- CRA after version 1, Roles, Exceptions
- ORACLE OnDemand experiment
16Control Systems (CO)
- Summary 2005
- Sustained support for PVSS, LabVIEW, front-end systems and the Controls Framework
- Model-driven Gas Control System for ALICE TPC delivered
- Five Detector Safety Systems delivered on time
- Substantial contributions to computer security (CNIC)
- Challenges 2006
- Build up support towards LHC commissioning
- Commission and test all (5) Detector Safety Systems
- Complete PVSS Framework software, support to clients, etc.
- Take responsibility for external developments
- Commission 23 Gas Control Systems
- Coordinate/Contribute to CNIC phase 2
17Communication Systems (CS)
- Summary 2005
- Large range of activities (PABX, GSM, IP telephony, radio equipment, CC upgrade, star point upgrade, campus network, LHC experiments installation, external networks, security)
- Major software effort to manage the infrastructure
- Problems due to departing staff in 2006
- Challenges 2006
- Phone, PBX, GSM, Radio: maintenance, operation & development
- Major network installations for LHC, LCG, experiments and starpoints
- Commission 10 Gb/s links to Tier 1 centres
- Enhance software for network security (CNIC, Grid, campus)
- Manage personnel succession plan
- A few pictures
18Database and Engineering (DES)
- Summary 2005
- Emphasis on support for Databases, Engineering Software Development Tools, Systems Support
- Important work on security in all areas, many application upgrades
- Preparation for the migration to new architecture (replace Solaris by Linux, Sun hardware by PC based servers)
- New services: J2EE, Twiki, Castor2, CS, LHC logging & TCR databases
- Winding down Solaris support
- Challenges 2006
- Continue & improve agreed services in the 3 main areas
- New architecture: still a lot to do on Automation, Standardisation, Monitoring & Integration with central IT services
- Solution for off-site backup and automatic recovery validation
- Consolidation of Castor2 database service, new LHC accelerator database
- Provide CATIA production environment, provide Windows infrastructure for electronic CAE design tools
- Support of BOINC infrastructure
- openlab-Oracle new projects
19Internet Services (IS)
- 6000 client PCs, 1500 new PCs installed/year
- > 100,000 patches / year
- > 1,000 antivirus pattern updates / year
- 25 supported applications
- 5 Terminal Services: > 1,000 users, > 100 connections/day
- Summary 2005
- Complete services for Web, Mail, PC Windows, MAC (limited)
- Hardware selection and procurement preparation for stores
- Authentication, security, NICEFC
- Continuous improvement of services to cope with increasing complexity and user numbers
- Challenges 2006
- Revamping of printer service (review of acquisition, automated installation & print queue creation)
- Improvements in mail, web & Windows services
- New global search service
- Complete revision of MAC support
- CERN Certification Authority pilot
- More general deployment of NICEFC
Date        Sites   Hits / Day
May 2003    5500    1.000.000
Nov 2003    6000    1.270.000
Oct 2004    6600    1.650.000
Sep 2005    7200    2.000.000
- 15000 mailboxes on 40 servers
- 40000 external e-mail addresses
- 4000 Mailing lists, 20GB in archives
- 200 Faxes/day, 600 users registered
- 1000000 Messages received/day, 80% SPAM rejected
- 10,000 viruses blocked/day in mail gateways
- Unscheduled service interruptions below 0.01%
20User Support and Documentation (UDS)
- Summary 2005
- New group structure in place, several activities streamlined
- Support/Service: Audio Visual, Video Conferencing, Printing
- Helpdesk consolidated (1000 calls/week, 75% resolved within contract)
- CDS new release; migration Agenda → Indico started
- Book shop moved to library (sold typically 3000/year)
- Apple printers phased out, PC rental being phased out
- Challenges 2006
- Phase out customised and duplicate services, aged applications
- Move services to IT standards, automate services
- Auditoria & video conference rooms need maintenance badly
- Collaborative tools, no funding so far for requests by experiments
- Publications support (DTP), open access archiving
- Develop better model for User Support
- Review printing/copying strategy
- Further streamlining of services
- Filming of conf / colloq / sem: 150 /yr
- Live Webcasts: 25 /yr (of 350 in total)
- Audio recording: 400 hrs/yr
- 150 visits/yr
- Film/Edit VIP visits and Experiment progress
- PrintShop: 20 M pages/yr
- Site-wide copiers: 10 M pages/yr
- Site-wide printers: 15 M pages/yr
- 800k records (including 420k full text), 7k searches / day, 20k unique users / month
- Agenda: 101k talks (including 150k files attached), 22k pages/day, 17k unique users / month
- Indico: 52 conferences, 6k pages/day
- VRVS: 1100 meetings / month (worldwide), 17,600 registered users
21Physics Services & LCG
- Les Robertson, Project Leader
22LCG Highlights
- LCG-Phase 1 comes to its end
- TDR completed, several Data Challenges done
- All Tier1s and >20 Tier2s taking part
- Baseline services defined, initial versions deployed
- Major successes in 2005
- Significant improvements in stability of the Grid
- Built effective Grid operations from scratch
- The LCG/EGEE Grid is now the world's largest production Grid
- LCG-Phase 2 ready to start
- MoU for LCG-2 completed, signing by partners in progress
- 2006 budget secured, 2007/08 still needs more funds
- SC4 needs to reach full LHC rates
- Test full chain from DAQ → Tier0 → Tier1s
23Tier0 status & Plans
- CC upgrade ongoing, major changes this winter
- Large disk acquisition completed, now at 1 Petabyte
- 2000 dual processors, another 1200 to come soon
- Second year of ELFms operation - mature automated management and monitoring system
- Tape acquisition process agreed - first installation now
- Castor 2
- Good progress since first availability in January
- Rapid reaction to problems
- Still to do: performance & stress testing, complete migration
24Middleware & Grid Deployment
- Results of 2005
- Growth of EGEE/LCG infrastructure to 179 sites, 17k CPU
- Stabilisation of grid operations: operator on duty, strong monitoring processes → much improved site stability
- Support for continued experiment data challenges, production work, service challenges (SC2, SC3)
- Definition & common agreement on Baseline Services for LCG
- Continuous series of gLite releases → several components now in production, others available on pre-production service
- Very good experience with FTS in use in SC3, rapid developer response to problems
- Full certification of gLite now in progress
- Plans for 2006
- LCG Service: set up, ramp up, and support → Service Challenges
- Migrate production services to FIO as far as possible
- Integrate the JRA1(GM) and SA1(GD) integration, testing, certification, release process
- Set up the ETICS project infrastructure
- Build consolidated middleware development & support teams across the JRA1 & GD teams
- Converge the gLite and LCG-2 middleware to a single distribution called gLite
- Ensure Baseline Services for LCG, essential gLite components for other EGEE apps are available & deployed
- EGEE-II, ETICS, support other related grid infrastructure projects
25Service Challenges
- Results of 2005
- 3 Challenges undertaken with varying degrees of success
- Major issue is failure to meet SC3 T0-T1 throughput targets
- Re-run disk-disk tests in January 2006 (stability & rate x 2)
- All T1s, >20 T2s in all regions and all experiments involved
- Grid Services & VO variations now well understood & deployed
- Plans for 2006
- Ramp up T0-T1 transfer rates to full nominal rates (to tape)
- Identify and validate all other production data flows (Tx - Ty)
- Increase T2 participation from 20 (April) to 40 (September)
- Broaden focus from production to analysis (many more users)
- Streamline Operations & User Support building on existing efforts
- FULL production services! FULL functionality!
- Quantitative monitoring: service level vs MoU requirements
- Significant progress acknowledged by LHCC referees!
26LCG Networking
- GÉANT 2 research network backbone
- Strong correlation with major European LHC centres
- Excellent collaboration
- Swiss PoP now at CERN
27Application Area
- Results of 2005
- Work done jointly with PH-SFT and experiments
- Merge of SEAL and ROOT projects agreed and started
- POOL framework well consolidated in experiments
- Common RDBMS Access Layer (CORAL) implemented
- Conditions DB being validated by ATLAS and LHCb
- Fluka (hadronic) and Garfield (gaseous detector) simulations brought into project
- Physics validation and Geant4 made good progress
- MC generator library in production
- Plans for 2006: Consolidation of the above
28Major LCG issues for 2006
- Castor 2
- Complete testing and data migration
- Distributed database services
- Architecture and plan agreed now, but still to deploy pilot services
- End-to-end testing of the DAQ-T0-T1 chain
- recording, calibration and alignment, reconstruction, distribution
- Full Tier-1 work load testing
- Recording, reprocessing, ESD distribution, analysis, Tier-2 support
- Understanding the CERN Analysis Facility
- batch analysis and interactive analysis
- Startup scenarios
- Materials budget for 2007 and 2008
29EGEE and Related Projects
- Bob Jones, Project Director
30EGEE Highlights - Infrastructure
Dec 6/7 @ CERN: 2nd project review successfully passed
- Scale
- >170 sites in 39 countries
- >17 000 CPUs
- >5 PB storage
- >10 000 concurrent jobs per day
- >60 Virtual Organisations
31EGEE Highlights - Applications
- >20 applications from 7 domains
- High Energy Physics
- Biomedicine
- Earth Sciences
- Computational Chemistry
- Astronomy
- Geo-Physics
- Financial Simulation
Another 8 applications from 4 domains are in
evaluation stage
32EGEE Highlights - Middleware
- Now at gLite release 1.4
- Focus on basic services, easy installation and management
- Industry friendly open source license
- Several gLite components already in production usage
- Roadmap established for full deployment and usage
33 Towards EGEE-II
- EGEE-II proposal
- Submitted to the EU on 8 September 2005
- Very good feedback, expect confirmation before Christmas
- Proposed start is 1 April 2006
- Expanded consortium
- > 90 partners in 32 countries (also non-European partners)
- 27 countries through related projects
- Natural continuation of EGEE building on our expertise and experience
- 1st conference in September in Geneva
34Related Projects
- EGEE as incubator & supporter of related projects
35Sustainability - beyond EGEE-II
- Need to prepare for permanent Grid infrastructure
- Maintain Europe's leading position in global science Grids
- Ensure a reliable and adaptive support for all sciences
- Independent of project funding cycles
- Modelled on success of GÉANT
- Infrastructure managed centrally in collaboration with national bodies
- Proposal: European Grid Organisation (EGO)
36Establishing EGO
- Objectives
- Operate production Grid infrastructures for all sciences
- Integrate, test, validate and package Grid middleware
- Provide advice, training and support to new user communities
37Plans for 2006 - Priorities
38Top level Priorities
- LCG
- Remains highest priority
- Last year before beam !!
- Maintain a sound computing infrastructure
- Secure personnel resources
- EGEE
- complete phase 1
- smooth transition to phase 2
- EGO
- Vital to assure long-term sustainability for the Grid
- Have agreed plan ready by June 2006
39Infrastructure
- Maintain the high level of service quality
- Security
- Continued vigilance on all services
- Close high numbered ports
- CNIC implementation and deployment
- CRA
- Deployment, additions, training
- Account cleanup
- ONE solution for certificates
- Global Computer Centre monitoring
- Off-site backup and archiving
- Streamline services, incl. turning some off
40Others
- IT staff plan
- Top priority for Department Head
- Decrease of personnel funds is IT's biggest problem
- Review with HR Christmas working rules (once more)
- Update material MTP
- Launch openlab II
- Improve IT communication and Web pages
- Office space in bldg 28
41Thank you very much for another very good year
Best wishes to you and your families for 2006