Add title here - PowerPoint PPT Presentation

About This Presentation
Title:

Add title here

Description:

Rob Fowler (Joint work with Allan Porterfield and Todd Gamblin) ... SWAN. ADCIRC. Floodplain Maps. Storm Surge Forecasts. Environmental Modeling Workflows ... – PowerPoint PPT presentation


Transcript and Presenter's Notes

1
Student opportunities at RENCI
Rob Fowler (joint work with Allan Porterfield and Todd Gamblin)
Chief Domain Scientist, HPC
Renaissance Computing Institute
Aug 18, 2008
2
What we are
  • Renaissance Computing Institute
  • Founded 2004
  • Stakeholders
  • Triangle.edu: UNC-CH, Duke, NCSU
  • Statewide.edu: ECU, UNC-Charlotte, UNC-Asheville,
    ASU,
  • NC.gov, counties,
  • Federal agencies: NSF, DOE, DoD, NOAA, FEMA,
  • Mission
  • Enhance the capabilities of our stakeholders.
  • Solve important problems.
  • Strategies
  • direct effort, technology transfer, collaborative
    engagement.

3
Where we are.
Engagement sites: UNC-CH, UNC Med. Lib, Duke, NCSU, ECU,
UNC Charlotte, UNC Asheville
RENCI Anchor
RENCI-UNC (ITS Manning)
4
Current Research Opportunities
  • Application Areas
  • Biomedical Visual analytics.
  • Health delivery.
  • Climate and Environmental Modeling.
  • Emergency Response
  • Core Computer Science
  • Performance monitoring and analysis on
    million-thread systems. Application of data mining
    methods.
  • Resource-centric monitoring and analysis for
    multi- and many-core systems.

5
RENCI's Disaster Studies Group
  • Use technology to solve problems in North
    Carolina
  • Environmental modeling
  • Collaborative workspaces for emergency managers
  • Environmental sensing

6
Opportunities
  • Model coupling
  • Linking weather, hydrology, and storm surge
    models together. Data assimilation (from
    sensors)
  • Workflow
  • Management of processes, recovery from failure of
    one element
  • Mashups
  • Support situational awareness during disasters
  • Asset and people tracking
  • Information flow control
  • Weather and communication tools for emergency
    management community

7
Environmental Model Coupling
Floodplain Maps
Storm Surge Forecasts
8
Environmental Modeling Workflows
WRF Preprocessing System (WPS)
Var3D
Graphics Post-processing
NC EcoNet
RADAR
Brunswick Sensors
MRR
Consumers
9
Mashups for NC Emergency Managers
10
Performance Monitoring and Analysis
  • Emerging technologies → Challenges
  • On-chip parallelism
  • Prodigious concurrent computation (cores)
  • Limited shared resources (L3, memory, I/O)
  • High node counts (100K to millions)
  • Very, very high degree of parallelism.
  • Limited budget to spend on I/O, interconnect
  • New system balance issues at all levels
  • Dealing with Amdahl's Law writ large.
  • Conserving scarce shared resources.
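
To make "Amdahl's Law writ large" concrete, here is a minimal sketch of the standard formula, speedup = 1 / (s + (1 - s) / N), where s is the serial fraction and N is the number of workers. The function names and the sample serial fraction are illustrative assumptions, not values from the talk.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's Law for a fixed problem size."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def speedup_limit(serial_fraction: float) -> float:
    """Upper bound on speedup as the worker count grows without bound."""
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    # Assumed serial fraction of 0.1%, chosen only for illustration.
    for workers in (1_000, 100_000, 1_000_000):
        print(workers, round(amdahl_speedup(0.001, workers), 1))
```

With a serial fraction of 0.001 the speedup can never exceed 1000, however many cores are added, which is why serialization matters so much at the node counts listed above.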

11
RENCI activities.
  • Resource-centric, on-node measurement
  • Interaction of threads at shared resources
  • Limited budget for monitoring and analysis
  • On-chip filtering/introspection/feedback
  • Hardware bottlenecks first, software later
  • Adaptive application runtime
  • Bottlenecks → Power and Perf. Adaptation
  • Tools at full scale.
  • Limited communication/IO budget
  • In situ measurement/analysis/diagnosis
  • Focus on scalability issues: balance,
    serialization
  • Very large, long-running, adaptive apps.

12
Why is performance not obvious?
  • Hardware complexity
  • Keeping up with Moore's Law with one thread.
  • Instruction-level parallelism.
  • Deeply pipelined, out-of-order, superscalar,
    threaded.
  • Memory-system parallelism
  • Parallel processor-cache interface, limited
    resources.
  • Need at least k concurrent memory accesses in
    flight.
  • Software complexity
  • Program size, languages, styles
  • Competition/cooperation with other threads
  • Dependence on (dynamic) libraries.
  • Compilers

13
→ Each core needs 2 to 6 ops in flight to hide
latencies and get decent bandwidth.
Implications for DDRn memory architecture?
(John McCalpin, AMD, July 2007)
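
The "ops in flight" figure can be read as an instance of Little's Law: required concurrency = latency × throughput. The sketch below assumes a 64-byte cache line, and the latency and per-core bandwidth in the example are made-up inputs, not numbers from the slide.

```python
import math


def requests_in_flight(latency_ns: float,
                       bandwidth_gb_per_s: float,
                       line_bytes: int = 64) -> float:
    """Little's Law: concurrency = latency x throughput.

    1 GB/s equals 1 byte/ns, so latency_ns * bandwidth_gb_per_s is the
    number of bytes that must be in flight to sustain that bandwidth.
    """
    bytes_in_flight = latency_ns * bandwidth_gb_per_s
    return bytes_in_flight / line_bytes


if __name__ == "__main__":
    # Hypothetical inputs: 100 ns memory latency, 2 GB/s per core.
    needed = requests_in_flight(latency_ns=100.0, bandwidth_gb_per_s=2.0)
    print(f"{needed:.3f} lines in flight, i.e. at least {math.ceil(needed)}")
```

Under these assumed inputs a core needs a little over three outstanding cache-line requests, which falls in the range the slide quotes.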
14
System Balance: Multicore Economics 101
8 cores/chip, 8 threads per core. 8 FBDIMM chains
per system × 4 sticks per chain.
Announcement: Niagara 2 chip will be available for
< $1000. 32 DIMMs @ $100 = $3200.
The Niagara 2 chip is nominally a 95 W part.
Micron dual-rank FBDIMM: 15 W; single-rank: 10.5 W,
vs. 5.5 W/rank for DDR2.
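
The arithmetic behind this slide can be checked with a short sketch. The constants are the figures quoted on the slide; the names are illustrative, and the chip price is treated as an upper bound because the slide only says the chip costs less than 1000.

```python
# Figures as quoted on the slide (prices in the slide's currency).
CORES_PER_CHIP = 8
THREADS_PER_CORE = 8
FBDIMM_CHAINS = 8
STICKS_PER_CHAIN = 4
CHIP_PRICE_CEILING = 1000        # chip costs less than this
DIMM_PRICE = 100
CHIP_WATTS = 95
DUAL_RANK_FBDIMM_WATTS = 15
SINGLE_RANK_FBDIMM_WATTS = 10.5


def system_balance() -> dict:
    """Compare processor and fully populated memory in cost and power."""
    dimms = FBDIMM_CHAINS * STICKS_PER_CHAIN
    memory_cost = dimms * DIMM_PRICE
    return {
        "hardware_threads": CORES_PER_CHIP * THREADS_PER_CORE,
        "dimms": dimms,
        "memory_cost": memory_cost,
        # Lower bound, because the chip price is only an upper bound.
        "min_memory_to_chip_cost_ratio": memory_cost / CHIP_PRICE_CEILING,
        "memory_watts_dual_rank": dimms * DUAL_RANK_FBDIMM_WATTS,
        "memory_watts_single_rank": dimms * SINGLE_RANK_FBDIMM_WATTS,
        "chip_watts": CHIP_WATTS,
    }


if __name__ == "__main__":
    for key, value in system_balance().items():
        print(f"{key}: {value}")
```

With every slot filled, memory costs more than three times as much as the processor and, with dual-rank parts, draws 480 W against the chip's 95 W, roughly five times as much.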
15
Resource-Centric Tools
  • Utilization and serialization at shared resources
    will dominate performance.
  • Hardware: memory hierarchy, channels.
  • Software: synchronization, scheduling.
  • Tools need to focus on these issues.
  • It will be (too) easy to over-provision a chip
    with cores relative to all else.
  • Memory effects are the obvious first target
  • Contention for shared cache: big footprints
  • Bus/memory utilization.
  • DRAM page contention: too many streams
  • Reflection: make data available for introspective
    adaptation.

16
Cores vs. Nest: Issues for HPM Software
  • Performance sensors in every block.
  • Nest counters extend the processor model.
  • Current models
  • Process/thread centric monitoring
  • Virtual counters follow threads. Expensive,
    tricky.
  • Node wide, but now (core+PID centric)
  • Inadequate monitoring of core-nest-core
    interaction.
  • No monitoring of fine grain thread-thread
    interactions (on-core resource contention).
  • No monitoring of concurrency resources.

[Figure labels: Nest, Cores]
17
HPM on a Multicore Chip: Who can measure what.
Counters within a core can measure events in that
core, or in the nest.
[Figure: four cores (Core 0 to Core 3), each with counters (CTRS), an FPU, L1, and L2; a shared nest containing L3, Mem-CTL, NIC, DDR-A, DDR-B, DDR-C, and three links labeled HT-1. Legend: Sensor, Counter.]
18
RCRTool Strategy
One core (0) measures nest events. The others
monitor core events. Core 0 processes the event
logs of all cores and runs on-node analysis.
MAESTRO: all other jitter-producing OS activity is
confined to core 0.
[Figure: the four-core chip diagram from slide 17, with per-core CTRS, FPU, L1, and L2, and a nest containing L3, Mem-CTL, NIC, DDR-A, DDR-B, DDR-C, and three links labeled HT-1. Legend: Sensor, Counter.]
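
The division of labor described on this slide can be sketched as a toy model. This is not RCRTool code: the function names, the event names, and the log format are all hypothetical, and the tally stands in for whatever on-node analysis core 0 actually runs.

```python
from collections import Counter


def assign_roles(num_cores: int) -> dict[int, str]:
    """Core 0 watches nest (shared) events; every other core watches its own."""
    return {core: ("nest" if core == 0 else "core") for core in range(num_cores)}


def merge_logs(per_core_logs: dict[int, list[str]]) -> Counter:
    """Stand-in for the analysis step on core 0: tally events from all logs."""
    totals: Counter = Counter()
    for core, events in per_core_logs.items():
        for event in events:
            totals[(core, event)] += 1
    return totals


if __name__ == "__main__":
    print(assign_roles(4))
    # Hypothetical event names, used only to exercise the tally.
    logs = {
        0: ["l3_miss", "l3_miss", "dram_page_conflict"],
        1: ["l1_miss"],
        2: ["l1_miss", "l2_miss"],
        3: [],
    }
    for (core, event), count in sorted(merge_logs(logs).items()):
        print(core, event, count)
```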
19
RCRTool Architecture
Similar to, but extends, DCPI, oprofile, pfmon,
Histograms, conflict graphs, on-line summaries
and off-line reports.
[Figure labels: Analysis Demon; HW Events; SW Events; Context switches; four HPM drivers (one marked Core 0), each with EBS events and IBS events; Locks, queues, etc.; Power monitors; Kernel space log]
20
The need for scalable tools. (Todd Gamblin's
dissertation work.)
  • Fastest machines are increasingly concurrent
  • Exponential growth in concurrency
  • BlueGene/L now has 212,992 processors
  • Million core systems soon.
  • Need tools that can collect and analyze data from
    this many cores

[Figure: Concurrency levels in the Top 100. Source: http://www.top500.org]
21
Challenges for Performance Tools
  • Problem 2: Analysis
  • Even if data could be stored to disks offline,
    this would only delay the problem
  • Performance characterization must be manageable
    by humans
  • Implies a concise description
  • Eliminate redundancy
  • e.g. in SPMD codes, most processes are similar
  • Traditional scripts and data mining techniques
    won't cut it
  • Need processing power in proportion to the data
  • Need to perform the analysis online, in situ
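
The point about eliminating redundancy in SPMD codes can be illustrated with a small sketch that groups ranks whose profiles look alike and reports one representative per group. This is a simple bucketing stand-in, not the technique from the dissertation work mentioned earlier; the metric names, the tolerance, and the sample data are assumptions.

```python
from collections import defaultdict


def summarize_ranks(profiles: dict[int, dict[str, float]],
                    tolerance: float = 0.1) -> list[dict]:
    """Group ranks whose metrics fall into the same tolerance-sized buckets."""
    classes: dict[tuple, list[int]] = defaultdict(list)
    for rank, metrics in sorted(profiles.items()):
        # Quantize every metric so near-identical profiles share a signature.
        signature = tuple(
            (name, round(value / tolerance))
            for name, value in sorted(metrics.items())
        )
        classes[signature].append(rank)
    return [
        {"representative": ranks[0], "count": len(ranks), "ranks": ranks}
        for ranks in classes.values()
    ]


if __name__ == "__main__":
    # Hypothetical fractions of run time spent computing and waiting.
    profiles = {
        0: {"compute": 0.81, "wait": 0.19},
        1: {"compute": 0.82, "wait": 0.18},
        2: {"compute": 0.79, "wait": 0.21},
        3: {"compute": 0.41, "wait": 0.59},
    }
    for group in summarize_ranks(profiles):
        print(group)
```

The summary has one entry per behavior class rather than one per process, which keeps the description concise enough for a person to read. A fixed bucket grid can split two nearly equal values that straddle a boundary, so a real tool would need a more robust similarity measure.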

22
Motherboard Power Monitor
Prototype is slightly wider than a disk bay.
Estimated materials cost for a build of 100: $45;
smaller redesign: $35.
23
Contact Information
Rob Fowler
rjf@renci.org, rjf@unc.edu
919 445 9670
RENCI: http://www.renci.org/