Memory Technology: Present and Future - PowerPoint PPT Presentation

Slides: 31
Provided by: mic498
Transcript and Presenter's Notes

1
Memory Technology: Present and Future
  • Dean Klein
  • V.P. Market Development
  • Micron Technology, Inc.

2
Agenda
  • The importance of memory
  • Memory issues
  • Active memory

3
802.11 Switch/Router
8Mb NOR Flash
AT75C510 Network Controller
LEDs
16Mb EDO DRAM
RF Module
KS8995
10/100
HFA 3683
HFA 3861
10/100
10/100
HFA 3783
10/100
4
802.11 Switch Transistors
5
ADSL Modem
LEDs
25MHz
Intel i960 CPU
32Mb EDO DRAM
16Mb NOR Flash
Globespan DSP/Framer
LSI L5B9236
10Mb Only Port
Globespan AFE
ADSL
AFE = Analog Front End
6
ADSL Modem Transistors
7
Cable Modem
Ethernet
USB
66MHz
TNETC4320
64Mb-128Mb SDRAM
64Mb Flash (opt.)
TNETC4042 Cable MAC/Phy
Toshiba Tuner
Cable
8
Cable Modem Transistors
9
P4 Personal Computer
Intel P4 CPU
256MB DDR DRAM
Nvidia GeForce
I845 MCH
AGP
Graphics Memory
I845 ICH
10
Computer Transistors
11
Computer Silicon Area
12
Memory is Centric: How Computers Work
CPU
256MB DDR DRAM
Graphics Controller
MCH
AGP
Graphics Memory
ICH
Disk
13
Even in Processors, Memory is Centric
  • Intel's Itanium 2 CPU
  • Over half of the die is memory, primarily as L1,
    L2 and L3 caches.

14
Memory Growth in Processors
15
Where Memory Isn't Centric
16
Is Memory Keeping Up?
  • 24GB/sec
  • 4.2GB/sec

Source: MDR, Micron, Intel
17
CPU to Memory Latency
Source: Micro Design Resources
18
Is Memory the Problem???
  • Memory processes have focused on bits vs.
    bandwidth.
  • Physics limits the performance of the memory core,
    but...
  • There is a tremendous untapped bandwidth to
    memory on the memory device.
  • The bus is the bottleneck.
  • The data resides in memory. Can we take the CPU
    to the memory???

19
Exploiting the Bandwidth
  • Single SDRAM chip has internal data bandwidth of
    >200 Gbits/s
  • Very wide data bus (thousands of bits)
  • Vast data bandwidth is an opportunity for high
    performance
  • OPS/mm²: Build efficient processing structures
  • OPS/W: Don't drive long distances
  • How to construct processing resource to exploit
    very wide data?
  • Conventional processor has relatively narrow bus
    and relies on high clock speed
  • Data is Parallel, so Process in Parallel
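
The wide-versus-fast trade-off on this slide can be checked with simple arithmetic: raw bandwidth is bus width times transfer rate. The sketch below is illustrative only. The 64-bit bus at 266 million transfers/s and the 2048-bit bus at 100 million transfers/s are assumed example figures, not values stated on this slide; the second pair was chosen because it lands just above the 200 Gbits/s quoted here.

```python
def bandwidth_gbits_per_s(width_bits: int, transfers_per_s: float) -> float:
    """Raw bus bandwidth in Gbit/s: width (bits) times transfer rate."""
    return width_bits * transfers_per_s / 1e9


# Assumed example: a narrow external bus that relies on a high transfer rate.
narrow_fast = bandwidth_gbits_per_s(64, 266e6)

# Assumed example: a very wide on-chip bus running at a modest rate.
wide_slow = bandwidth_gbits_per_s(2048, 100e6)

print(f"narrow, fast bus: {narrow_fast:.1f} Gbit/s")
print(f"wide, slow bus:   {wide_slow:.1f} Gbit/s")
print(f"ratio:            {wide_slow / narrow_fast:.1f}x")
```

Under these assumptions the wide bus delivers roughly twelve times the raw bandwidth despite the lower transfer rate, which is the opportunity the slide points at.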

20
Another Good Idea
21
Yukon
  • Micron project codename for processing-in-memory
    project
  • Aim is to enhance processing resources of a
    system within the memory, not trying to replace
    the CPU
  • Architecture chosen: SIMD Massively Parallel
    Processor
  • Hundreds of processors to access thousands of
    data bits
  • Low control overheads
  • Uses bandwidth effectively: no sharing
  • Successful with data parallel applications
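
The SIMD idea above (one instruction stream, many processing elements, each with its own data) can be sketched in a few lines. This is a software illustration, not Yukon code: the lane count of 256 is borrowed from the Yukon-256 pilot described later in the deck, and the 8-bit wraparound on overflow is an assumption, since the slides do not say how overflow is handled.

```python
import operator

NUM_PES = 256  # lane count taken from the Yukon-256 pilot; hypothetical here


def simd_step(op, a, b):
    """Apply one broadcast instruction to every lane.

    Each lane holds its own 8-bit operands, so no lane shares data
    with another. Results wrap to 8 bits (assumed behaviour).
    """
    assert len(a) == len(b) == NUM_PES
    return [op(x, y) & 0xFF for x, y in zip(a, b)]


a = list(range(NUM_PES))   # lane i holds the value i
b = [1] * NUM_PES          # every lane holds 1
total = simd_step(operator.add, a, b)

print(total[:4], "...", total[-2:])
```

A single call stands for one instruction issued once by the controller and executed by all lanes, which is why control overhead stays low as the lane count grows.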

22
Yukon Hardware
Yukon-256 Pilot
  • Pilot Chip under construction
  • Micron 0.15µm eDRAM / 0.18µm logic process
  • 128Mbits DRAM
  • 2048 data bits
  • 256 8-bit integer processors
  • Features for FP support
  • Control system optimised for control by HLL
  • Memory mapped onto microprocessor bus
  • Fast I/O
  • Operates like an SDRAM

23
Yukon Processing Element
  • 8 bit integer ALU
  • Add/multiply
  • IEEE754 FP Support
  • 2P Reg File decouples PE operation from DRAM
  • Neighbour interconnect
  • DRAM I/O through dedicated port

24
Yukon Array
  • PEs are connected in flexible topologies
  • 256 Element Vector
  • Linear interconnect
  • 16x16 2D Mesh
  • Orthogonal interconnect
  • Broadcast Operations
  • Data Controlled Local Autonomy
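
The two topologies listed above can be illustrated by computing which neighbours a given PE talks to in each view. The sketch assumes row-major numbering of the 16x16 mesh and no wraparound at the edges; neither detail is stated on the slide, and the function names are hypothetical.

```python
NUM_PES = 256
MESH_SIDE = 16  # 16 x 16 = 256 PEs


def linear_neighbours(pe: int) -> list[int]:
    """Neighbours of a PE in the 256-element vector view (left, right)."""
    return [n for n in (pe - 1, pe + 1) if 0 <= n < NUM_PES]


def mesh_neighbours(pe: int) -> list[int]:
    """Neighbours of a PE in the 16x16 mesh view (north, south, west, east).

    Assumes row-major numbering and no wraparound at the mesh edges.
    """
    row, col = divmod(pe, MESH_SIDE)
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [
        r * MESH_SIDE + c
        for r, c in candidates
        if 0 <= r < MESH_SIDE and 0 <= c < MESH_SIDE
    ]


print(linear_neighbours(17))  # vector view of PE 17
print(mesh_neighbours(17))    # mesh view of the same PE
```

The same 256 PEs serve both layouts; only the interpretation of the PE index changes, which is one way to read "flexible topologies".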

25
Yukon Software
  • Low Level Programming Tools: Assembler, C
    Compiler, Simulator
  • C++ Array Processing Class Library
  • Data Parallel Vector and Matrix Classes for char,
    int, float, etc.
  • Support for multiple Yukon devices
  • Designed in from the start!
  • 3rd party independent developer
  • Mature: based on a system with 20 years' history
  • Good foundation in applications
  • Software example

26
Yukon Performance
  • 200 MHz
  • Processing: 51.2 billion 8-bit operations/sec
    peak
  • eDRAM I/O: 25.6 GBytes/sec
  • Floating Point Maximum Sustained (DRAM-DRAM):
  • 629 MFLOPS-32
  • 210 MFLOPS-64
  • 6.4 million DCTs per second (8x8, 16-bit)
  • 190 MByte/sec (sustained) input/output
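
The headline figures on this slide are consistent with the pilot chip parameters given earlier (256 PEs, 2048 data bits, 200 MHz), as the arithmetic below shows. Two assumptions are mine rather than the slides': each PE completes one 8-bit operation per clock at peak, and the eDRAM moves all 2048 data bits per transfer.

```python
CLOCK_HZ = 200_000_000                  # 200 MHz, from this slide
NUM_PES = 256                           # from the Yukon-256 pilot slide
DATA_BITS = 2048                        # from the Yukon-256 pilot slide
EDRAM_IO_BYTES_PER_S = 25_600_000_000   # 25.6 GBytes/sec, from this slide
FP32_SUSTAINED_FLOPS = 629_000_000      # 629 MFLOPS-32, from this slide

# Assumption: one 8-bit operation per PE per clock at peak.
peak_ops_per_s = NUM_PES * CLOCK_HZ

# Assumption: every transfer moves all 2048 data bits at once.
io_bits_per_s = EDRAM_IO_BYTES_PER_S * 8
transfers_per_s = io_bits_per_s // DATA_BITS

# Peak 8-bit operation slots available per sustained 32-bit flop.
ops_per_fp32 = peak_ops_per_s / FP32_SUSTAINED_FLOPS

print(f"peak 8-bit ops/s:   {peak_ops_per_s:,}")
print(f"eDRAM I/O (bit/s):  {io_bits_per_s:,}")
print(f"implied transfers/s: {transfers_per_s:,}")
print(f"8-bit op slots per FP32 op: {ops_per_fp32:.1f}")
```

The first result reproduces the quoted 51.2 billion operations/sec, and the I/O figure works out to 204.8 Gbit/s, in line with the ">200 Gbits/s" internal bandwidth claimed earlier in the deck.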

27
Yukon Applications
  • Look for:
  • Limited by memory data bottleneck
  • Parallel
  • Typical application areas:
  • Image and video processing
  • Speech
  • Data mining
  • Example: VoIP
  • >500 full duplex channels/Yukon Pilot
  • Source: detailed study by ISV

28
Project Status
  • Chip in factory
  • Dev. System built
  • Ready for silicon
  • C++ done
  • Additional libraries in development

29
Future
  • 0.15/0.18µm now
  • 0.11µm end 2003
  • Super-core design
  • Later
  • Smaller geometries, larger arrays
  • Higher performance
  • Smaller arrays for smaller scale embedded
    applications
  • Performance scaled to application

30
Conclusion
  • Memory centred approach to integrating logic and
    memory
  • Yukon
  • Massive parallelism exploits on-chip data
    bandwidth
  • Augment and complement existing technology
  • Mature programming environment