Cloud Computing Application in High Energy Physics

Slides: 34
Provided by: infn1155

1
Cloud Computing Application in High Energy Physics
  • Yaodong Cheng
  • IHEP, CAS
  • 2012-4-23

2
Outline
  • From Grid to Cloud
  • Some cloud projects in HEP
  • Cloud activities at IHEP, CAS

3
Outline
  • From Grid to Cloud
  • Some cloud projects in HEP
  • Activities at IHEP, CAS

4
Terminology
  • What is a grid?
  • A platform for scientific collaboration
  • A scientific tool to help with data manipulation
    and processing
  • A computational platform
  • And how does it compare with a cloud?
  • A source for computing and storage capacity
  • Flexible, easy to access resource
  • And a cluster?
  • A building block for both grids and clouds

5
Grid as a collaborative platform
  • State before the Grid
  • Scientists/teams have resources or access to them
  • Teams are working independently, they do not
    share their resources (no technology support)
  • Data sharing is very simple, e.g. secure copy (scp)
    between teams
  • State with a Grid
  • Teams' resources are connected
  • Sharing is easy
  • Scientists could focus on their science and not
    on the technology behind it

6
Grid View
7
Clouds
  • A rather recent commercial platform
  • (Large) Pool of virtualized servers
  • Users submit not jobs but full virtual machines
    with jobs inside them
  • Targets real-time requirements
  • Fast deployment of new virtual servers
  • Can react quickly to users' changing requirements
  • Standard clouds are rather simple
  • Easy-to-use web interface
  • No collaboration support (standard consumer-provider
    model, ideal for commercial use)

8
What is a cloud?
  • The NIST definition lists five essential
    characteristics of cloud computing
  • on-demand self-service
  • broad network access
  • resource pooling
  • rapid elasticity or expansion
  • measured service
  • Three "service models"
  • software, platform and infrastructure
  • Four "deployment models"
  • private, community, public and hybrid

9
Cloud and Grid: A Comparison
[Diagram labels: Grid Middleware; Cloud Middleware; Computing/Data Center (four instances)]
10
From Grid to Cloud
  • The Grid has been the necessary infrastructure for
    many fields of scientific research, e.g. HEP
  • But there are still some disadvantages
  • How to schedule jobs efficiently to improve
    resource utilization (vs. static policy)
  • Diversified service models on demand (vs. job
    submission)
  • Compatibility with legacy programs (vs. unified
    system environment)
  • Virtualization/cloud is a feasible solution

11
Outline
  • From Grid to Cloud
  • Some cloud projects in HEP
  • Activities at IHEP, CAS

12
CernVM
  • CernVM is a baseline Virtual Software Appliance
    for the participants of CERN LHC experiments
  • Motivation
  • Software @ LHC: large and complicated to
    install/update/configure
  • Multi-core with hardware support for
    virtualization
  • Using virtualization and extra cores to get
    extra comfort
  • zero configuration, reduce compiler-platform
    combinations
  • CernVM: build a thin Virtual Software
    Appliance for use by the LHC experiments
  • provide a complete, portable and easy to
    configure user environment
  • independent of physical software and hardware
    platforms
  • http://cernvm.cern.ch/

13
Thin Software Appliance
[Diagram labels: HTTPD; LAN/WAN (HTTP); Software Repository; Cache; sizes 10 GB, 1 GB, 0.1 GB]
14
CVMFS: CernVM File System
[Diagram labels: App; CernVM Fuse; open(/opt/lcg); !Cache; Cache; Kernel; NFS; LFS; FUSE]
  • On same host: /opt/lcg -> /chirp/localhost/opt/lcg
  • On file server: /opt/lcg -> /grow/host/opt/lcg
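The diagram labels above suggest an on-demand scheme: an application opens a path, a local cache is consulted, and only on a miss is the file fetched from the remote repository. A minimal sketch of that idea follows. It is not CVMFS code (the real system is a FUSE file system); the class `OnDemandCache`, the `fetch` callable and the in-memory `repository` are hypothetical stand-ins.

```python
from typing import Callable, Dict


class OnDemandCache:
    """Serve files from a local cache, fetching from the repository on a miss."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch                 # stand-in for an HTTP download
        self._cache: Dict[str, bytes] = {}  # local copies, keyed by path
        self.misses = 0

    def open(self, path: str) -> bytes:
        if path not in self._cache:         # miss: download once, keep a copy
            self.misses += 1
            self._cache[path] = self._fetch(path)
        return self._cache[path]


# Stand-in for the remote software repository (contents are made up).
repository = {"/opt/lcg/setup.sh": b"export LCG=1\n"}
cache = OnDemandCache(repository.__getitem__)
cache.open("/opt/lcg/setup.sh")   # first access goes to the repository
cache.open("/opt/lcg/setup.sh")   # second access is served locally
print(cache.misses)
```

Only files that are actually opened are ever transferred, which is one way a small local cache could front a much larger repository.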
15
Bridging Grids & Clouds
  • Volunteer Computing
  • uses computers belonging to ordinary people
  • BOINC
  • Open-source software for Volunteer Computing and
    Grid computing
  • CernVM is extended to support BOINC client
  • CernVM CoPilot development
  • Based on BOINC, LHC@home experience and CernVM
    image
  • Image size is of utmost importance to motivate
    volunteers
  • Can be easily adapted to Pilot Job frameworks
    (AliEn, DIRAC, PanDA)

16
CernVM CoPilot Architecture
17
lxcloud
  • CERN Internal Cloud
  • Highly scalable, Linux (KVM) based cloud-like
    infrastructure
  • Optimized for efficiency/speed

18
Resource Pool details
  • Quattor managed pool of resources
  • Hardware: (cheap) CPU server type, local disks
  • LANDB integration
  • Pre-allocation of VM slots in LANDB
  • Hypervisor knows the names of guests
  • Disk management
  • Use of LVM snapshots
  • All free disk space in one big LV
  • Pre-stage raw images on an LV on the hypervisors
  • Fast installation of VMs using LV snapshots
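
The disk-management steps above (a raw image pre-staged on a logical volume, then one snapshot per VM) could be scripted roughly as follows. This is a sketch under assumptions, not lxcloud's actual tooling: the volume group `vg_vm`, the image name `slc5-image` and the helper `snapshot_command` are hypothetical. Only the standard `lvcreate` options `--snapshot`, `--name` and `--size` are relied on, and the function builds the command without running it.

```python
from typing import List


def snapshot_command(vg: str, image_lv: str, vm_name: str, size_gb: int) -> List[str]:
    """Build the lvcreate call that snapshots a pre-staged image LV for one VM.

    All names are illustrative; nothing is executed here.
    """
    if size_gb <= 0:
        raise ValueError("snapshot size must be positive")
    return [
        "lvcreate",
        "--snapshot",                 # copy-on-write snapshot, not a full copy
        "--name", f"{vm_name}-disk",
        "--size", f"{size_gb}G",      # space reserved for blocks the VM changes
        f"/dev/{vg}/{image_lv}",      # origin: the image pre-staged on the hypervisor
    ]


cmd = snapshot_command("vg_vm", "slc5-image", "vm042", 10)
print(" ".join(cmd))
```

Because a snapshot stores only changed blocks, creating the VM disk does not require copying the whole image, which is consistent with the "fast installation" point on the slide.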

19
Image management
  • Central image catalogue (VMIC)
  • Close collaboration with HEPiX
  • No direct user access/user images
  • Images require endorsement by IT
  • Image distribution system
  • Image distribution repository of trusted images
  • Fast distribution using BitTorrent (rtorrent)
  • Pull model: hypervisors ask if there are updates
  • Transparent update of images using LV tools
  • Hypervisors advertise existing images
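
The pull model described above can be illustrated with a small sketch: each hypervisor compares the images it holds against the central catalogue and works out what to download. The version-number scheme, the image names and the function `images_to_pull` are assumptions made for illustration; the actual VMIC and rtorrent interfaces are not modelled.

```python
from typing import Dict, List


def images_to_pull(catalogue: Dict[str, int], local: Dict[str, int]) -> List[str]:
    """Return names of endorsed images that are missing or outdated locally.

    Both arguments map image name -> version number (an assumed scheme).
    """
    return sorted(
        name for name, version in catalogue.items()
        if local.get(name, -1) < version
    )


catalogue = {"slc5-batch": 3, "slc6-batch": 1}   # endorsed images (made-up names)
local = {"slc5-batch": 2}                         # what this hypervisor holds
print(images_to_pull(catalogue, local))           # ['slc5-batch', 'slc6-batch']
```

The hypervisor initiates every transfer, so the catalogue never needs to track which hosts exist.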

20
Virtual Machine Management
  • OpenNebula
  • an open source Cloud Data Center Management
    Solution
  • provides a powerful, scalable and secure
    multi-tenant cloud platform for fast delivery
    and elasticity of virtual resources
  • OpenStack
  • The Open Source Cloud Operating System
  • The Main components
  • Compute, Object Storage, Image service
  • An interesting product worth checking out

21
Lxcloud ecosystem
[Diagram labels: Quattor; Image creation and endorsement; Enduser VO; Golden Nodes; OpenNebula; ONE EC2 Interface; CernVM; ONE 3.0 Master; Image repository (VMIC); Application manager; VM Provisioning; Image creation; lxcloud; Physical Resource]
22
CLEVER: A New VIM
  • CLEVER: A CLoud-Enabled Virtual EnviRonment
  • To simplify the access management of
    private/hybrid clouds
  • To provide simple and easily accessible
    interfaces to interact with different
    interconnected clouds, deploy Virtual Machines
    and perform load balancing through migration

23
Clever on Grid
[Diagram labels: Administration tool; User Interface; Job submission; XMPP; Ejabberd XMPP Server; CLEVER.jar and X.509 Certificate; Resource Broker; Matchmaking and Jobs Scheduling; Host1, Host2, HostN-1, HostN; Host Manager (HM); Cluster Manager (CM); XQuery/XPath; Sedna Distributed Databases; CE (Computing Element); Worker Nodes; Jobs running; tiny.vdi; Storage Element]
24
Outline
  • From Grid to Cloud
  • Some cloud projects in HEP
  • Activities at IHEP, CAS

25
Virtual Cluster
  • Motivation
  • Build a virtual machine pool on physical machines,
    elastic to expand or shrink on demand
  • Flexible to support more kinds of applications
  • Compatible with legacy programs
  • R&D cloud for users
  • Key technologies
  • Evaluation of hypervisors (KVM, Xen, ...) suitable
    for HEP
  • VIM management (OpenNebula, OpenStack, ...)
  • Monitoring and accounting
  • Interface to PBS, WLCG, and other services
  • Dynamic scheduling
  • Live migration
  • VM resource adjustment (CPU, memory, network, ...)

26
Architecture of Virtual Server
[Diagram labels: WLCG; Grid Job; Scheduling policy; Scheduler; PBS Client; Query and Modify Queue; Submit Job; PBS Server; VIM (VM create, start, pause, destroy, migration); Power Management; VM (four instances); Physical Machine (two instances)]
27
PBS/Torque integration
  • Each batch queue has basic resources (physical
    nodes or virtual machines)
  • If too many jobs are waiting in one queue, the
    scheduler creates extra virtual machines
    according to the scheduling policy and
    requirements, then adds the new resources to
    the queue
  • If a queue with higher priority needs more
    resources, VMs in queues with lower
    priority are paused or even destroyed
  • Fair scheduling is very important here!
  • The WLCG interface is simply via PBS/Torque
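
The queue policy above can be sketched as a single scheduling pass. This is an illustration under stated assumptions, not the IHEP scheduler: the `Queue` fields, the one-VM-per-waiting-job rule and the function `rebalance` are hypothetical, and no real PBS/Torque or VIM calls are made. Free slots are used first; after that, extra VMs are reclaimed from lower-priority queues, lowest priority first.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Queue:
    name: str
    priority: int        # larger value = higher priority
    pending_jobs: int    # jobs waiting for a slot
    extra_vms: int = 0   # VMs added on top of the queue's basic resources


def rebalance(queues: List[Queue], free_slots: int) -> List[Tuple[str, str, int]]:
    """One scheduling pass; returns (action, queue name, VM count) decisions.

    Assumes one VM per waiting job, which is a simplification.
    """
    actions: List[Tuple[str, str, int]] = []
    for q in sorted(queues, key=lambda x: x.priority, reverse=True):
        wanted = q.pending_jobs
        # Step 1: cover the backlog from unused VM slots.
        grant = min(wanted, free_slots)
        if grant:
            actions.append(("create", q.name, grant))
            q.extra_vms += grant
            free_slots -= grant
            wanted -= grant
        # Step 2: reclaim extra VMs from lower-priority queues, lowest first.
        for victim in sorted(queues, key=lambda x: x.priority):
            if wanted == 0 or victim.priority >= q.priority:
                break
            take = min(wanted, victim.extra_vms)
            if take:
                actions.append(("pause", victim.name, take))
                actions.append(("create", q.name, take))
                victim.extra_vms -= take
                q.extra_vms += take
                wanted -= take
    return actions


queues = [Queue("besiii", priority=2, pending_jobs=3),
          Queue("test", priority=1, pending_jobs=0, extra_vms=2)]
for action in rebalance(queues, free_slots=1):
    print(action)
```

The sketch never touches a queue's basic resources and does nothing about fairness between queues of equal priority, which the slide flags as very important; a real policy would need both.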

28
GUI
29
BESIII Cloud
  • Integrated with Grid, volunteer computing, and
    virtualization
  • Users submit jobs to the BESIII portal, which
    dispatches them to different computing
    resources
  • Volunteer computing (small sites and personal
    computers)
  • Local cluster (managed by LRMS)
  • WLCG
  • CNGrid
  • plugin framework
  • gLite, PBS, GOS plugins already completed!
  • Recently, the BESIII Offline Software System (BOSS)
    ran successfully on CernVM-based CAS@home
  • BOINC plugin is ready!
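
A plugin framework of the kind listed above, where a portal hands each job to one of several backends, could be structured as a simple registry. The sketch below is an assumption about the general shape, not the BESIII portal's code: the `plugin` decorator, the `dispatch` function, the job name and the placeholder return strings are all hypothetical, and no real batch system or BOINC server is contacted.

```python
from typing import Callable, Dict

# Registry mapping a backend name to a function that submits one job to it.
PLUGINS: Dict[str, Callable[[str], str]] = {}


def plugin(name: str):
    """Decorator that registers a submission function under a backend name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = func
        return func
    return register


@plugin("pbs")
def submit_pbs(job: str) -> str:
    return f"pbs:{job}"      # placeholder; a real plugin would call the batch system


@plugin("boinc")
def submit_boinc(job: str) -> str:
    return f"boinc:{job}"    # placeholder for handing the job to a BOINC server


def dispatch(job: str, backend: str) -> str:
    """Send a job to the named backend, failing loudly if no plugin exists."""
    if backend not in PLUGINS:
        raise ValueError(f"no plugin registered for backend {backend!r}")
    return PLUGINS[backend](job)


print(dispatch("boss-sim-001", "boinc"))   # boinc:boss-sim-001
```

With this layout, supporting another resource type means registering one more function, without changing the dispatch code.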

30
CAS@home
  • CAS@home is the first volunteer computing
    platform in China
  • Uses BOINC as its middleware
  • Launched by IHEP in January 2010
  • To help scientists from CAS or other research
    organizations in China run their scientific
    research on volunteer computing resources
  • More than 9,000 users and 16,000 computers have
    joined CAS@home

31
Architecture of BESIII Cloud
[Diagram labels: BESIII portal; Plugins (gLite, GOS, PBS, BOINC, ...); BOINC Server; PBS Server; gLite WMS; GOS; Small sites and Personal Computer; CNGrid; Local Cluster; WLCG]
32
Future Cloud-Grid Integration
[Diagram labels: Web Application Service; Collaboration Services; Datacenter Infrastructure; Compute Service; Database service; Cloud Grid Computing; Service Catalog; Job Scheduling Service; Storage service; Computing center Infrastructure; Storage backup, archive service; Virtual Client service; Content Classification]
33
  • Thanks!
  • Questions?