Inter-Processor Communication for Heterogeneous Dual Core Systems - PowerPoint PPT Presentation

1
Inter-Processor Communication for Heterogeneous
Dual Core Systems
2006/09/27
  • Chun-Ming Huang, Ph.D.
  • National Chip Implementation Center (CIC)
  • cmhuang_at_cic.org.tw

2
Agenda
  • IPC Overview
  • IPC Schemes
  • Nokia DSP Gateway
  • TI DSP/BIOS Link
  • IPC Hardware Architecture
  • Conclusions

3
IPC Overview
4
What is IPC?
  • Inter-Process Communication
  • Inter-Processor Communication

[Figure: single-core and multi-core systems in single-chip and multi-chip configurations]
How to provide inter-process communication
services for multi-core systems?
5
Independent and Cooperating Processes
  • Processes executing concurrently in the
    multitasking environment may be either
    independent processes or cooperating processes
  • A process is independent if it cannot affect or
    be affected by the other processes executing in
    the system; any process that does not share data
    with any other process is independent
  • A process is cooperating if it can affect or be
    affected by the other processes executing in the
    system; any process that shares data with other
    processes is a cooperating process

Silberschatz, et al., Operating System
Principles, Seventh Edition
6
Why Allow Process Cooperation?
  • Information sharing
  • Computation speedup
  • Modularity
  • Convenience
  • Cooperating processes require an inter-process
    communication (IPC) mechanism that will allow
    them to exchange data and information

Silberschatz, et al., Operating System
Principles, Seventh Edition
7
IPC Example
  • Unix pipe
  • ls -l / | grep 2005 | wc
  • Output: 2 19 98 (lines, words, characters)
  • The grep utility searches text files for a
    pattern and prints all lines that contain that
    pattern.
  • The wc utility displays a count of lines, words
    and characters in a text file.
  • Data exchange
  • Synchronization
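
The same kind of pipeline can be built from a program. The following is a minimal sketch, assuming the `grep` and `wc` utilities are available on the PATH; the function name and the sample listing text are made up for illustration. The kernel pipe carries the data exchange, and the synchronization is implicit: `wc` blocks until `grep` closes its end of the pipe.

```python
import subprocess

def count_matching_lines(text: str, pattern: str) -> int:
    """Run the equivalent of `grep pattern | wc -l` on the given text."""
    grep = subprocess.Popen(["grep", pattern],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    wc = subprocess.Popen(["wc", "-l"],
                          stdin=grep.stdout, stdout=subprocess.PIPE)
    grep.stdout.close()              # wc now holds the only read end of the pipe
    grep.stdin.write(text.encode())  # small input, so this write cannot block
    grep.stdin.close()               # end-of-file lets grep, then wc, finish
    output = wc.communicate()[0]
    grep.wait()
    return int(output.decode().split()[0])

# Hypothetical directory listing used as input.
LISTING = (
    "-rw-r--r-- 1 user group 120 Mar  3  2005 notes.txt\n"
    "-rw-r--r-- 1 user group 512 Jul  9  2006 slides.ppt\n"
    "-rw-r--r-- 1 user group  64 Dec 24  2005 todo.txt\n"
)

if __name__ == "__main__":
    print(count_matching_lines(LISTING, "2005"))
```

Closing the parent's copy of the intermediate descriptor matters: if the parent kept it open, `wc` would never see end-of-file.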

8
Operating System Kernel Components
  • Process scheduler
  • determines when and for how long a process
    executes on a processor
  • Memory manager
  • determines when and how memory is allocated to
    processes and what to do when memory becomes full
  • I/O manager
  • services input and output requests from and to
    hardware devices
  • Inter-process communication (IPC) manager
  • allows processes to communicate with one another
  • File system manager
  • organizes named collections of data on storage
    devices and provides an interface for accessing
    data on those devices

Deitel, et al., Operating Systems, Third Edition
9
Linux Kernel 2.6.17.11
Top-level source directories:
drwxr-xr-x arch
drwxr-xr-x block
drwxr-xr-x crypto
drwxr-xr-x drivers
drwxr-xr-x fs
drwxr-xr-x include
drwxr-xr-x init
drwxr-xr-x ipc
drwxr-xr-x kernel
drwxr-xr-x lib
drwxr-xr-x mm
drwxr-xr-x net
drwxr-xr-x scripts
drwxr-xr-x security
drwxr-xr-x sound
drwxr-xr-x usr

Contents of the ipc directory:
-rw-r--r-- Makefile
-rw-r--r-- compat.c
-rw-r--r-- compat_mq.c
-rw-r--r-- mqueue.c
-rw-r--r-- msg.c
-rw-r--r-- msgutil.c
-rw-r--r-- sem.c
-rw-r--r-- shm.c
-rw-r--r-- util.c
-rw-r--r-- util.h

http://www.kernel.org
10
Machine-Independent SW in the FreeBSD Kernel
Category                          Lines of Code   Percentage of Kernel (%)
Headers                           38,158          4.8
initialization                    1,663           0.2
kernel facilities                 53,805          6.7
generic interfaces                22,191          2.8
interprocess communication        10,019          1.3
terminal handling                 5,798           0.7
virtual memory                    24,714          3.1
vnode memory                      22,764          2.9
local filesystem                  28,067          3.5
miscellaneous filesystems (19)    58,753          7.4
network filesystem                22,436          2.8
network communication             46,570          5.8
Internet V4 protocols             41,220          5.2
Internet V6 protocols             45,527          5.7
IPsec                             17,956          2.2
netgraph                          74,338          9.3
cryptographic support             7,515           0.9
GEOM layer                        11,563          1.4
CAM layer                         41,805          5.2
ATA layer                         14,192          1.8
ISA bus                           10,984          1.4
PCI bus                           72,366          9.1
pccard bus                        6,916           0.9
Linux compatibility               10,474          1.3
Total Machine Independent         689,794         86.4
McKusick and Neville-Neil, The Design and
Implementation of the FreeBSD Operating System
11
Homogeneous vs. Heterogeneous
Sun
TI OMAP 5910
12
Multiprocessor OS Organizations
  • Can classify systems based on how processors
    share operating system responsibilities
  • Three types
  • Master/slave
  • Separate kernels
  • Symmetrical organization

Deitel, et al., Operating Systems, Third Edition
13
Master/Slave
  • Master/Slave organization
  • Master processor executes the operating system
  • Slaves execute only user processes
  • Hardware asymmetry
  • Low fault tolerance
  • Good for computationally intensive jobs
  • Example: nCUBE system

Deitel, et al., Operating Systems, Third Edition
14
Separate Kernels
  • Separate kernels organization
  • Each processor executes its own operating system
  • Some globally shared operating system data
  • Loosely coupled
  • Catastrophic failure unlikely, but failure of one
    processor results in termination of processes on
    that processor
  • Little contention over resources
  • Example: Tandem system

Deitel, et al., Operating Systems, Third Edition
15
Symmetrical Organization
  • Symmetrical organization
  • Operating system manages a pool of identical
    processors
  • High amount of resource sharing
  • Need for mutual exclusion
  • Highest degree of fault tolerance of any
    organization
  • Some contention for resources
  • Example: BBN Butterfly

Deitel, et al., Operating Systems, Third Edition
16
Memory Access Architectures
  • Memory access
  • Can classify multiprocessors based on how
    processors share memory
  • Goal: fast memory access from all processors to
    all memory
  • Contention in large systems makes this impractical

Deitel, et al., Operating Systems, Third Edition
17
Uniform Memory Access
  • Uniform memory access (UMA) multiprocessor
  • All processors share all memory
  • Access time to any memory page is nearly the same
    for all processors and all memory modules
    (disregarding cache hits)
  • Typically uses shared bus or crossbar-switch
    matrix
  • Also called symmetric multiprocessing (SMP)
  • Small multiprocessors (typically two to eight
    processors)

Deitel, et al., Operating Systems, Third Edition
18
Uniform Memory Access
Deitel, et al., Operating Systems, Third Edition
19
Non-Uniform Memory Access
  • Non-uniform memory access (NUMA) multiprocessor
  • Each node contains a few processors and a portion
    of system memory, which is local to that node
  • Access to local memory faster than access to
    global memory (rest of memory)
  • More scalable than UMA (fewer bus collisions)

Deitel, et al., Operating Systems, Third Edition
20
Non-Uniform Memory Access
Deitel, et al., Operating Systems, Third Edition
21
Cache-Only Memory Architecture
  • Cache-only memory architecture (COMA)
    multiprocessor
  • Physically interconnected as a NUMA is
  • Local memory vs. global memory
  • Main memory is viewed as a cache and called an
    attraction memory (AM)
  • Allows system to migrate data to node that most
    often accesses it at granularity of a memory line
    (more efficient than a memory page)
  • Reduces the number of cache misses serviced
    remotely
  • Overhead
  • Duplicated data items
  • Complex protocol to ensure all updates are
    received at all processors

Deitel, et al., Operating Systems, Third Edition
22
Cache-Only Memory Architecture
Deitel, et al., Operating Systems, Third Edition
23
No Remote Memory Access
  • No-remote-memory-access (NORMA) multiprocessor
  • Does not share physical memory
  • Some implement the illusion of shared physical
    memory: shared virtual memory (SVM)
  • Loosely coupled
  • Communication through explicit messages
  • Distributed systems
  • Not a networked system

Deitel, et al., Operating Systems, Third Edition
24
No Remote Memory Access
Deitel, et al., Operating Systems, Third Edition
25
Four Possible Cases
                      Symmetrical OSs               Asymmetrical OSs
Homogeneous Cores     CPU_A(OS_X) + CPU_A(OS_X)     CPU_A(OS_X) + CPU_A(OS_Y)
Heterogeneous Cores   CPU_A(OS_X) + CPU_B(OS_X)     CPU_A(OS_X) + CPU_B(OS_Y)
26
IPC Schemes
27
Communication via Files
  • Communication via files is in fact the oldest way
    of exchanging data between programs. Program A
    writes data to a file and Program B reads it. In
    a system in which only one program can be run at
    any given time, this does not present any
    problem.
  • In a multitasking system, however, both programs
    could be run as processes at least quasi-parallel
    to each other. Race conditions then usually
    produce inconsistencies in the file data, which
    result from one program reading a data area
    before the other has finished modifying it, or
    from both processes modifying the same area
    at the same time.

28
Communication via Files
  • Locking entire files
  • lock file
  • fcntl( ) (POSIX), flock( ) (BSD 4.3)
  • Locking file areas (record locking)
  • Deadlock
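
Whole-file locking can be tried out with Python's `fcntl` module. The following is a minimal sketch, assuming a Linux or BSD system with a local temporary directory; the helper names are made up for illustration, and two independent opens of the same file stand in for Program A and Program B.

```python
import fcntl
import os
import tempfile

def try_lock(fd: int) -> bool:
    """Try to take an exclusive whole-file lock without blocking."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False

def demo() -> tuple:
    fd_a, path = tempfile.mkstemp()      # "Program A" opens the file
    fd_b = os.open(path, os.O_RDWR)      # "Program B" opens it independently
    try:
        a_locked = try_lock(fd_a)            # A takes the lock
        b_refused = not try_lock(fd_b)       # B is refused while A holds it
        fcntl.flock(fd_a, fcntl.LOCK_UN)     # A releases the lock
        b_locked = try_lock(fd_b)            # now B succeeds
        return a_locked, b_refused, b_locked
    finally:
        os.close(fd_a)
        os.close(fd_b)
        os.unlink(path)

if __name__ == "__main__":
    print(demo())
```

The non-blocking flag is used here so that a refused lock shows up as a return value; without it, the second caller would simply wait, which is where deadlock becomes possible if two programs wait on each other's locks.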

29
Process Communication Models
  • Message passing
  • Shared memory

Silberschatz, et al., Operating System
Principles, Seventh Edition
30
IPC for Linux
  • Linux IPC
  • Many IPC mechanisms derived from traditional UNIX
    IPC
  • Allow processes to exchange information
  • Some are better suited for particular
    applications
  • For example, those that communicate over a
    network or exchange short messages with other
    local applications

Deitel, et al., Operating Systems, Third Edition
31
IPC for Linux
  • Signal
  • Pipe
  • Message queue
  • Shared memory
  • System V Semaphores
  • Sockets

32
Signals
  • Signals
  • One of the first interprocess communication
    mechanisms available in UNIX systems
  • Kernel uses them to notify processes when certain
    events occur
  • Do not allow processes to specify more than a
    word of data to exchange with other processes
  • Created by the kernel in response to interrupts
    and exceptions, and sent to a process or thread
  • as a result of executing an instruction (such as
    a segmentation fault)
  • from another process (such as when one process
    terminates another)
  • from an asynchronous event

Deitel, et al., Operating Systems, Third Edition
33
POSIX Signals
Deitel, et al., Operating Systems, Third Edition
34
Signals
  • A process/thread can handle a signal by
  • Ignoring the signal: processes can ignore all but
    the SIGSTOP and SIGKILL signals.
  • Catching the signal: when a process catches a
    signal, it invokes its signal handler to respond
    to the signal.
  • Executing the default action that the kernel
    defines for that signal
  • Default actions
  • Abort: terminate immediately
  • Memory dump: copy the execution context before
    exiting
  • Ignore
  • Stop (i.e., suspend)
  • Continue (i.e., resume)

Deitel, et al., Operating Systems, Third Edition
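
Catching a signal, and the fact that SIGKILL cannot be ignored, can be sketched with Python's `signal` module. This is an illustration only, assuming a POSIX system and code running in the main thread of the process; the function and variable names are made up.

```python
import os
import signal
import time

def demo() -> tuple:
    caught = []

    def handler(signum, frame):
        # Signal handler: record which signal arrived.
        caught.append(signal.Signals(signum).name)

    previous = signal.signal(signal.SIGUSR1, handler)   # catch SIGUSR1
    try:
        os.kill(os.getpid(), signal.SIGUSR1)            # send it to ourselves
        deadline = time.monotonic() + 1.0
        while not caught and time.monotonic() < deadline:
            time.sleep(0.01)                            # give the handler time to run
    finally:
        signal.signal(signal.SIGUSR1,
                      previous if previous is not None else signal.SIG_DFL)

    try:
        signal.signal(signal.SIGKILL, signal.SIG_IGN)   # the kernel refuses this
        sigkill_ignorable = True
    except OSError:
        sigkill_ignorable = False
    return caught, sigkill_ignorable

if __name__ == "__main__":
    print(demo())
```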
35
Signals
  • Signal blocking
  • A process or thread can block a signal
  • Signal is not delivered until process/thread
    stops blocking it
  • While a signal handler is running, signals of
    that type are blocked by default
  • Still possible to receive signals of a different
    type
  • Common signals are not queued
  • Real-time signals provide signal queuing

Deitel, et al., Operating Systems, Third Edition
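
Signal blocking and the lack of queuing for standard signals can be observed in a short experiment. The sketch below assumes a POSIX system and a single-threaded process (with other threads present, the kernel could deliver the signal elsewhere); the names are made up for illustration. SIGUSR1 is sent twice while blocked, yet the handler runs only once after unblocking.

```python
import os
import signal

def demo() -> list:
    log = []
    previous = signal.signal(signal.SIGUSR1,
                             lambda signum, frame: log.append("handled"))
    # Block SIGUSR1 in this thread; remember the old mask to restore it later.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        os.kill(os.getpid(), signal.SIGUSR1)
        os.kill(os.getpid(), signal.SIGUSR1)   # merged with the first: not queued
        is_pending = signal.SIGUSR1 in signal.sigpending()
        log.append("pending" if is_pending else "not pending")
    finally:
        # Restoring the mask unblocks the signal; the handler runs at this point.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(signal.SIGUSR1,
                      previous if previous is not None else signal.SIG_DFL)
    return log

if __name__ == "__main__":
    print(demo())
```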
36
Pipes
  • Pipes
  • Producer process writes data to the pipe, after
    which the consumer process reads data from the
    pipe in first-in-first-out order
  • When pipe is created, an inode that points to
    pipe buffer (page of data) is created
  • Access to pipes is controlled by file descriptors
  • Can be passed between related processes (e.g.,
    parent and child)
  • Named pipes (FIFOs)
  • Can be accessed via the directory tree
  • Limitation: fixed-size buffer

Deitel, et al., Operating Systems, Third Edition
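
The producer/consumer behaviour of an unnamed pipe between related processes can be sketched as follows. This assumes a POSIX system where `os.fork()` is available; the function name and the messages are made up for illustration. The child inherits the pipe's file descriptors from the parent, which is how the two processes come to share it.

```python
import os

def demo() -> list:
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the producer. It inherited both descriptors from the parent.
        os.close(read_fd)
        for word in (b"first\n", b"second\n", b"third\n"):
            os.write(write_fd, word)
        os.close(write_fd)
        os._exit(0)
    # Parent: the consumer. Close the unused write end so EOF can be seen.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        lines = [line.decode().strip() for line in reader]
    os.waitpid(pid, 0)
    return lines

if __name__ == "__main__":
    print(demo())
```

The data comes out in the order it went in, and the read loop ends only when every write end of the pipe has been closed.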
37
Message Queues
  • Message queues
  • Allow processes to transmit information that is
    composed of a message type and a variable-length
    data area
  • Stored in message queues, remain until a process
    is ready to receive them
  • Related processes can search for a message queue
    identifier in a global array of message queue
    descriptors
  • Message queue descriptor contains
  • Queue of pending messages
  • Queue of processes waiting for messages
  • Queue of processes waiting to send messages
  • Data describing the size and contents of the
    message queue

Deitel, et al., Operating Systems, Third Edition
38
Shared Memory
  • Shared memory protection schemes
  • Advantages
  • Improves performance for processes that
    frequently access shared data
  • Processes can share as much data as they can
    address
  • Standard interfaces
  • System V shared memory
  • POSIX shared memory
  • Does not allow processes to change privileges for
    a segment of shared memory

Deitel, et al., Operating Systems, Third Edition
39
System V Shared Memory System Calls
Deitel, et al., Operating Systems, Third Edition
40
Shared Memory
  • Shared memory implementation
  • Treats region of shared memory as a file
  • Shared memory page frames are freed when file is
    deleted
  • Tmpfs (temporary file system) stores such files
  • Tmpfs pages are swappable
  • Permissions can be set
  • File system does not require formatting

Deitel, et al., Operating Systems, Third Edition
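
A named shared memory segment can be created and attached from Python 3.8 or later with `multiprocessing.shared_memory`, which wraps POSIX shared memory. The following is a minimal sketch; for brevity both handles live in one process, whereas in practice the second handle would be opened by another process that was given the segment name. On Linux such segments typically appear as files under /dev/shm, a tmpfs mount, which matches the file-based implementation described above.

```python
from multiprocessing import shared_memory

def demo() -> bytes:
    # The "writer" creates a 64-byte segment with a system-chosen name.
    writer = shared_memory.SharedMemory(create=True, size=64)
    try:
        writer.buf[:5] = b"hello"
        # The "reader" attaches to the same segment by name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            data = bytes(reader.buf[:5])
        finally:
            reader.close()
        return data
    finally:
        writer.close()
        writer.unlink()   # delete the backing object, freeing its pages

if __name__ == "__main__":
    print(demo())
```

Nothing in this sketch synchronizes the two sides; real users of shared memory pair it with a semaphore or similar mechanism.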
41
System V Semaphores
  • System V semaphores
  • Designed for user processes to access via the
    system call interface
  • Semaphore arrays
  • Protect a group of related resources
  • Before a process can access resources protected
    by a semaphore array, the kernel requires that
    there be sufficient available resources to
    satisfy the process's request
  • Otherwise, kernel blocks requesting process until
    resources become available
  • Preventing deadlock
  • When a process exits, the kernel reverses all the
    semaphore operations it performed to allocate its
    resources

Deitel, et al., Operating Systems, Third Edition
42
Sockets
  • Sockets
  • Allows pairs of processes to exchange data by
    establishing direct bidirectional communication
    channels
  • Primarily used for bidirectional communication
    between multiple processes on different systems,
    but can be used for processes on the same system
  • Stored internally as files
  • File name used as the socket's address, accessed
    via the VFS

Deitel, et al., Operating Systems, Third Edition
43
Sockets
  • Stream sockets
  • Implement the traditional client/server model
  • Data is transferred as a stream of bytes
  • Use TCP to communicate, so they are more
    appropriate for reliable communication
  • Datagram sockets
  • Faster, but less reliable communication
  • Data is transferred using datagram packets
  • Socketpairs
  • Pair of connected, unnamed sockets
  • Limited to use by processes that share file
    descriptors

Deitel, et al., Operating Systems, Third Edition
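
The difference between datagram and stream delivery can be seen with socketpairs. The sketch below assumes a Unix-like system; it uses Unix-domain socketpairs rather than TCP or UDP, so it illustrates the message-boundary behaviour only, not network reliability. The function name is made up for illustration.

```python
import socket

def demo() -> tuple:
    # Datagram socketpair: every send arrives as one separate message.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with a, b:
        a.send(b"ab")
        a.send(b"cd")
        datagrams = [b.recv(1024), b.recv(1024)]

    # Stream socketpair: a byte stream with no message boundaries.
    c, d = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with c, d:
        c.sendall(b"ab")
        c.sendall(b"cd")
        c.shutdown(socket.SHUT_WR)       # signal end of stream to the peer
        chunks = []
        while True:
            chunk = d.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
        stream = b"".join(chunks)
        d.sendall(stream.upper())        # the channel is bidirectional
        reply = c.recv(1024)
    return datagrams, stream, reply

if __name__ == "__main__":
    print(demo())
```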
44
sf01acmhuang/ ipcs
IPC status from <running system> as of Thu Sep 21 14:35:30 CST 2006
T         ID      KEY          MODE          OWNER     GROUP
Message Queues:
Shared Memory:
m          1      0x50000d1d   --rw-r--r--   root      root
m          2      0xabbaca01   --rw-rw-rw-   pc62      TR
m       3103      0            --rw-rw-rw-   cmhuang   DSD
m       1404      0            --rw-rw-rw-   root      root
Semaphores:
s          0      0x1          --ra-ra-ra-   root      root
s    2031617      0            --ra-ra-ra-   cmhuang   DSD
s     917506      0            --ra-ra-ra-   cmhuang   DSD
45
IPC for WinXP
  • Data oriented
  • Pipes
  • Mailslots (message queues)
  • Shared memory
  • Procedure oriented / object oriented
  • Remote procedure calls
  • Microsoft COM objects
  • Clipboard
  • GUI drag-and-drop capability

Deitel, et al., Operating Systems, Third Edition
46
Pipes
  • Manipulated with file system calls
  • Read
  • Write
  • Open
  • Pipe server
  • Process that creates pipe
  • Pipe clients
  • Processes that connect to pipe
  • Modes
  • Read: pipe server receives data from pipe clients
  • Write: pipe server sends data to pipe clients
  • Duplex: pipe server sends and receives data

Deitel, et al., Operating Systems, Third Edition
47
Pipes
  • Anonymous Pipes
  • Unidirectional
  • Between local processes
  • Synchronous
  • Pipe handles, usually passed through inheritance
  • Named Pipes
  • Unidirectional or bidirectional
  • Between local or remote processes
  • Synchronous or asynchronous
  • Opened by name
  • Byte stream vs. message stream
  • Default mode vs. write-through mode

Deitel, et al., Operating Systems, Third Edition
48
Mailslots
  • Mailslot server creates mailslot
  • Mailslot clients send messages to mailslot
  • Communication
  • Unidirectional
  • No acknowledgement of receipt
  • Local or remote communication
  • Implemented as files
  • Two modes
  • Datagram for small messages
  • Server Message Block (SMB) for large messages

Deitel, et al., Operating Systems, Third Edition
49
Shared Memory
  • File mapping
  • Processes map their virtual memory to same page
    frames in physical memory
  • Multiple processes access same file
  • No synchronization guaranteed
  • File mapping object
  • Maps file to main memory
  • File view
  • Maps a process's virtual memory to main memory
    mapped by file mapping object

Deitel, et al., Operating Systems, Third Edition
50
Nokia DSP Gateway
51
Nokia DSP Gateway Overview
  • Supports TI OMAP1510, 1610, 5910, 5912, 2410, and
    2412.
  • GPP side
  • Linux kernel 2.6.6?
  • Linux device driver
  • Access DSP through normal system calls such as
    read() and write()
  • DSP side
  • TI DSP/BIOS
  • DSP kernel library (tokliBIOS) and API

http://dspgateway.sourceforge.net/pub/index.php
52
Nokia DSP Gateway Overview
  • Current version 3.3.1 (2006-09-13)
  • Open source software
  • Current license state

Release   License
1.0       GPL
2.X       GPL
3.X       ARM pack: GPL; DSP pack: BSD
53
TI OMAP 1610
54
Summary of changes from v2.6.5 to v2.6.6

<tony_at_com.rmk.(none)>
  [ARM PATCH] 1777/1: Add TI OMAP support to ARM core files

  Patch from Tony Lindgren

  This patch updates the ARM Linux core files to add support for
  Texas Instruments OMAP-1510, 1610, and 730 processors. OMAP is an
  embedded ARM processor with integrated DSP.

  OMAP-1610 has hardware support for USB OTG, which might be of
  interest to Linux developers. OMAP-1610 could be easily be used as
  development platform to add USB OTG support to Linux.

  This patch is an updated version of an earlier patch 1767/1 with
  the dummy Kconfig added for OMAP as suggested by Russell King here:
  http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=1767/1

  This patch is brought to you by various linux-omap developers.

http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.6
55
TI DSP/BIOS
  • Scalable real-time kernel
  • Real-time scheduling and synchronization
  • Host-to-target communication
  • Real-time instrumentation
  • Preemptive multi-threading
  • Hardware abstraction
  • Real-time analysis and configuration tools
  • Application programs use DSP/BIOS by making calls
    to the API
  • All DSP/BIOS modules provide C-callable interfaces

56
DSP Gateway System Architecture
57
Mailbox in OMAP1
  • Each set of mailbox registers consists of two
    16-bit registers and a 1-bit flag register.
  • The interrupting processor can use one 16-bit
    register to pass a data word to the interrupted
    processor and the other 16-bit register to pass a
    command word.
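
The register set described above can be modelled in a few lines to make the hand-off concrete. This is a software sketch only, not driver code: the class and method names are hypothetical, and the meaning given to the flag (set while a message is waiting, cleared when it is read) is an assumption, since the slide states only that a 1-bit flag register exists.

```python
class Mailbox:
    """Model of one mailbox register set: two 16-bit registers and a 1-bit flag."""

    def __init__(self):
        self.command = 0   # 16-bit command word
        self.data = 0      # 16-bit data word
        self.flag = 0      # assumed: 1 while a message is waiting to be read

    def post(self, command: int, data: int) -> bool:
        """Interrupting processor: write both words and raise the flag."""
        for word in (command, data):
            if not 0 <= word <= 0xFFFF:
                raise ValueError("mailbox words are 16 bits wide")
        if self.flag:
            return False           # previous message has not been consumed yet
        self.data = data
        self.command = command
        self.flag = 1              # in hardware this would raise the interrupt
        return True

    def take(self):
        """Interrupted processor: read both words and clear the flag."""
        if not self.flag:
            return None
        message = (self.command, self.data)
        self.flag = 0
        return message

if __name__ == "__main__":
    mailbox = Mailbox()
    mailbox.post(0x1234, 0xBEEF)
    print(mailbox.take())
```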

58
Mailbox in OMAP2
  • Six sets of mailbox registers; each message
    register can carry a 32-bit data word
  • Two mailbox queues are reserved: MAILBOX_0 for
    the ARM-to-DSP direction and MAILBOX_1 for the
    DSP-to-ARM direction

59
Mailbox Command and Data Register
  • Command register bit definitions
  • Data register bit definitions

60
Mailbox Command Definition
61
Mailbox Command Sequence
  • Configuration sequence
  • System configuration
  • Task configuration
  • Task add/delete
  • Data transfer sequence
  • ARM to DSP transfer
  • DSP to ARM transfer
  • Task control
  • Read/write DSP register
  • Read/write DSP system parameters

62
System Configuration Sequence
63
DSPCFG Command
64
ARM to DSP Passive Word Receiving
65
ARM to DSP Active Word Receiving
66
ARM to DSP Passive Block Receiving
67
IPC Buffer
  • It is unrealistic to transfer a large amount of
    data between two processors with only mailbox
    registers. Therefore, IPBUF (Inter-Processor
    Buffer) is introduced for the large block data
    transfer.
  • There are three types of IPBUFs
  • Global IPBUF
  • Private IPBUF
  • System IPBUF

68
Global IPBUF
  • The Global IPBUFs are defined for the block data
    transfer between ARM and DSP.
  • The Global IPBUF lines are identified with BID
    (Buffer ID), and all tasks can use them commonly.
  • The maximum line size is 64k words (128k bytes).
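
The idea of passing a buffer ID through the mailbox instead of the data itself can be sketched as follows. This is a hypothetical model, not the DSP Gateway implementation: the class, its methods, and the free-line bookkeeping are assumptions made for illustration. Only the BID concept and the 64k-word line limit come from the slides.

```python
MAX_LINE_WORDS = 64 * 1024   # from the slide: at most 64k words per line

class GlobalIpbuf:
    """Model of a pool of Global IPBUF lines, each identified by a BID."""

    def __init__(self, lines: int, line_words: int):
        if line_words > MAX_LINE_WORDS:
            raise ValueError("line size exceeds 64k words")
        self.line_words = line_words
        self.lines = [None] * lines      # None marks a free line (assumption)

    def write_block(self, words):
        """Sender: copy a block into a free line and return its BID."""
        if len(words) > self.line_words:
            raise ValueError("block does not fit in one line")
        for bid, line in enumerate(self.lines):
            if line is None:
                self.lines[bid] = list(words)
                return bid
        return None                      # no free line available

    def read_block(self, bid: int):
        """Receiver: fetch the block named by the BID and free the line."""
        words = self.lines[bid]
        if words is None:
            raise KeyError("line %d is empty" % bid)
        self.lines[bid] = None
        return words

if __name__ == "__main__":
    ipbuf = GlobalIpbuf(lines=2, line_words=1024)
    bid = ipbuf.write_block(range(256))  # large payload goes into the buffer
    mailbox_data_word = bid              # only the small BID crosses the mailbox
    print(len(ipbuf.read_block(mailbox_data_word)))
```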

69
Global IPBUF
70
DSP Gateway Linux Device Interfaces
71
DSP Gateway Linux APIs
72
Passive Receiving Task
73
Active Receiving Task
74
TI DSP/BIOS Link
75
TI DSP/BIOS Link
  • For TI OMAP5910/5912, DaVinci, and DM642 devices.
  • DSP/BIOS Link is a no-charge, royalty-free
    product and is provided in C source code form.
  • Current version 1.30.06 (Nov. 22, 2005)
  • Portable across different operating systems.
  • OS (GPP) DSP/BIOS (DSP)

http://focus.ti.com/dsp/docs/dspsupportatn.tsp?sectionId=3&tabId=477&familyId=44&toolTypeId=5
76
DSP/BIOS Link Supported Platforms
  • DaVinci running Montavista Linux Pro 4.0 or
    PrKernel v4.1 on ARM
  • OMAP5912 running Montavista Linux Pro 3.1 on ARM
  • DA300 running PrKernel v4.1 on ARM
  • DM642 connected to a PC running Red Hat Linux 9.0
    or Red Hat Enterprise Linux 4.0

77
Software Architecture of DSP/BIOS Link
78
On the GPP Side
  • The OS ADAPTATION LAYER encapsulates the generic
    OS services that are required by the other
    components of DSP/BIOS LINK. This component
    exports a generic API that insulates the other
    components from the specifics of an OS. All other
    components use this API instead of direct OS
    calls. This makes DSP/BIOS LINK portable across
    different operating systems.
  • The LINK DRIVER encapsulates the low-level
    control operations on the physical link between
    the GPP and DSP. This module is responsible for
    controlling the execution of the DSP and data
    transfer using defined protocol across the
    GPP-DSP boundary.

79
On the GPP Side
  • The PROCESSOR MANAGER maintains book-keeping
    information for all components. It also allows
    different boot-loaders to be plugged into the
    system. It exposes the control operations
    provided by the LINK DRIVER to the user through
    the API layer.
  • The DSP/BIOS LINK API is the interface for all
    clients on the GPP side. This is a very thin
    component and usually doesn't do any more
    processing than parameter validation. The API
    layer can be considered the skin on the muscle
    mass contained in the PROCESSOR MANAGER and LINK
    DRIVER.

80
On the DSP Side
  • The LINK DRIVER is one of the drivers in
    DSP/BIOS. This driver specializes in
    communicating with the GPP over the physical
    link.
  • There is no specific DSP/BIOS LINK API on the
    DSP. The communication (data/message transfer) is
    done using the DSP/BIOS modules - SIO/GIO/MSGQ.

81
DSP/BIOS Link Key Components
  • PROC
  • This component represents the DSP processor in
    the application space.
  • This component provides services to
  • Initialize the DSP and make it available for
    access from the GPP.
  • Load code on the DSP.
  • Start execution from the run address specified in
    the executable.
  • Read from or write to DSP memory.
  • Stop execution.
  • Additional platform-specific control actions.
  • In the current version, only one processor is
    supported. However, the APIs are designed to
    support multiple DSPs and hence they accept a
    processorID argument to support this future
    enhancement.

82
DSP/BIOS Link Key Components
  • CHNL
  • This component represents a logical data transfer
    channel in the application space.
  • CHNL is responsible for the data transfer across
    the GPP and DSP.
  • CHNL is an acronym for channel.
  • A channel (when referred to in the context of
    DSP/BIOS LINK) is
  • A means of transferring data across GPP and DSP.
  • A logical entity mapped over a physical
    connectivity between the GPP and DSP.
  • Uniquely identified by a number within the range
    of channels for a specific physical link towards
    a DSP.
  • Unidirectional. The direction of a channel is
    decided at run time based on the attributes
    passed to the corresponding API.

83
DSP/BIOS Link Key Components
  • MSGQ
  • This component represents queue based messaging
  • This component is responsible for exchanging
    short messages of variable length between the GPP
    and DSP clients. It is based on the MSGQ module
    in DSP/BIOS.
  • The messages are sent and received through
    message queues.
  • A reader gets the message from the queue and a
    writer puts the message on a queue. A message
    queue can have only one reader and many writers.
    A task may read from and write to multiple
    message queues.
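
The one-reader, many-writers rule can be illustrated with ordinary threads. The sketch below uses Python's standard `queue.Queue`, not the DSP/BIOS Link MSGQ API, and the function and parameter names are made up; it shows only the queueing pattern.

```python
import queue
import threading

def demo(writers: int = 3, per_writer: int = 4) -> list:
    msgq = queue.Queue()                 # one queue, owned by a single reader

    def writer(writer_id: int) -> None:
        # Each writer puts its own numbered messages on the shared queue.
        for seq in range(per_writer):
            msgq.put((writer_id, seq))

    threads = [threading.Thread(target=writer, args=(i,))
               for i in range(writers)]
    for thread in threads:
        thread.start()
    # The single reader gets every message, in arrival order.
    received = [msgq.get() for _ in range(writers * per_writer)]
    for thread in threads:
        thread.join()
    return received

if __name__ == "__main__":
    print(demo())
```

Messages from different writers may interleave, but the messages of any one writer arrive in the order they were put.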

84
DSP/BIOS Link Key Components
  • POOL
  • This component provides APIs to open and close
    memory pools, which are used by the CHNL and MSGQ
    component for allocating the buffers used in data
    transfer and messaging respectively.
  • This component is responsible for providing a
    uniform view of different memory pool
    implementations, which may be specific to the
    hardware architecture or OS on which DSP/BIOS
    LINK is ported. This component is based on the
    POOL interface in DSP/BIOS.

85
Initialization Phase API
  • PROC
  • PROC_Setup()
  • PROC_Attach()
  • PROC_Load()
  • CHNL
  • CHNL_Create()
  • CHNL_AllocateBuffer()
  • MSGQ
  • MSGQ_TransportOpen()
  • MSGQ_Open()
  • MSGQ_SetErrorHandler()
  • MSGQ_Locate()
  • POOL
  • POOL_Open()

86
Execution Phase API
  • PROC
  • PROC_Start()
  • PROC_Read()
  • PROC_Write()
  • PROC_Stop()
  • CHNL
  • CHNL_Issue()
  • CHNL_Reclaim()
  • MSGQ
  • MSGQ_Alloc()
  • MSGQ_Put()
  • MSGQ_Get()
  • MSGQ_GetSrcQueue()
  • MSGQ_Free()

87
Finalization Phase API
  • PROC
  • PROC_Detach()
  • PROC_Destroy()
  • CHNL
  • CHNL_FreeBuffer()
  • CHNL_Delete()
  • MSGQ
  • MSGQ_Release()
  • MSGQ_TransportClose()
  • MSGQ_Close()
  • POOL
  • POOL_Close()

88
IPC Hardware Architecture
89
Tightly Coupled vs. Loosely Coupled Systems
  • Tightly coupled systems
  • Processors share most resources including memory
  • Communicate over shared buses using shared
    physical memory
  • Loosely coupled systems
  • Processors do not share most resources
  • Most communication through explicit messages or
    shared virtual memory (although not shared
    physical memory)
  • Comparison
  • Loosely coupled systems more flexible, fault
    tolerant, scalable
  • Tightly coupled systems more efficient, less
    burden to operating system programmers

Deitel, et al., Operating Systems, Third Edition
90
Tightly Coupled Systems
Deitel, et al., Operating Systems, Third Edition
91
Loosely Coupled Systems
Deitel, et al., Operating Systems, Third Edition
92
Processor Interconnection Schemes
  • Interconnection scheme
  • Describes how the system's components, such as
    processors and memory modules, are connected
  • Consists of nodes (components or switches) and
    links (connections)
  • Parameters used to evaluate interconnection
    schemes
  • Node degree
  • Bisection width
  • Network diameter
  • Cost of the interconnection scheme

Deitel, et al., Operating Systems, Third Edition
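
Two of these parameters are easy to compute for small networks. The following sketch builds a hypercube and a 4-connected 2-D mesh as adjacency lists and measures node degree (taken here as the largest number of links at any node) and network diameter (the longest of all shortest paths) by breadth-first search. The function names are made up for illustration.

```python
from collections import deque

def hypercube(dim: int) -> dict:
    """Nodes are integers; neighbours differ in exactly one bit."""
    return {n: [n ^ (1 << b) for b in range(dim)] for n in range(1 << dim)}

def mesh(rows: int, cols: int) -> dict:
    """4-connected 2-D mesh without wrap-around links."""
    graph = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            graph[(r, c)] = [(i, j) for i, j in candidates
                             if 0 <= i < rows and 0 <= j < cols]
    return graph

def node_degree(graph: dict) -> int:
    return max(len(neighbours) for neighbours in graph.values())

def diameter(graph: dict) -> int:
    def farthest(start):
        # Breadth-first search gives shortest hop counts from start.
        dist = {start: 0}
        pending = deque([start])
        while pending:
            node = pending.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    pending.append(nxt)
        return max(dist.values())
    return max(farthest(node) for node in graph)

if __name__ == "__main__":
    for name, graph in (("3-D hypercube", hypercube(3)),
                        ("4-D hypercube", hypercube(4)),
                        ("4x4 mesh", mesh(4, 4))):
        print(name, node_degree(graph), diameter(graph))
```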
93
Processor Interconnection Schemes
Shared bus multiprocessor organization.
Deitel, et al., Operating Systems, Third Edition
94
Processor Interconnection Schemes
Crossbar-switch matrix multiprocessor
organization.
Deitel, et al., Operating Systems, Third Edition
95
Processor Interconnection Schemes
4-connected 2-D mesh network.
Deitel, et al., Operating Systems, Third Edition
96
Processor Interconnection Schemes
3- and 4-dimensional hypercubes.
Deitel, et al., Operating Systems, Third Edition
97
Processor Interconnection Schemes
Multistage baseline network.
Deitel, et al., Operating Systems, Third Edition
98
A Simple IPC Architecture
  • ARM writes command in shared memory
  • ARM interrupts DSP
  • DSP responds to interrupt and reads command in
    shared memory
  • DSP executes a task based on the command
  • DSP interrupts ARM upon completion of the task

TMS320DM644x DMSoC ARM Subsystem Reference Guide
(SPRUE14)
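
The five steps can be walked through in a small simulation. This is a sketch under loud assumptions: two Python threads stand in for the ARM and the DSP, a dictionary stands in for the shared memory region, and `threading.Event` objects stand in for the inter-processor interrupts. The "square" command is invented for the example.

```python
import threading

def demo(command: str, operand: int):
    shared_memory = {}                   # stands in for the shared RAM region
    irq_to_dsp = threading.Event()       # stands in for the ARM-to-DSP interrupt
    irq_to_arm = threading.Event()       # stands in for the DSP-to-ARM interrupt

    def dsp() -> None:
        irq_to_dsp.wait()                        # step 3: respond to interrupt
        cmd, arg = shared_memory["command"]      #         and read the command
        if cmd == "square":                      # step 4: execute the task
            shared_memory["result"] = arg * arg
        else:
            shared_memory["result"] = None
        irq_to_arm.set()                         # step 5: interrupt the ARM

    dsp_thread = threading.Thread(target=dsp)
    dsp_thread.start()
    shared_memory["command"] = (command, operand)   # step 1: write the command
    irq_to_dsp.set()                                # step 2: interrupt the DSP
    irq_to_arm.wait()                               # wait for completion
    dsp_thread.join()
    return shared_memory["result"]

if __name__ == "__main__":
    print(demo("square", 7))
```

The ordering matters: the command is written before the interrupt is raised, so the receiver never reads a half-written command.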
99
TI OMAP5910
100
OMAP5910 IPC Architecture
  • Mailbox registers
  • Each direction 32bit x 2
  • Interrupt occurrence
  • MPU interface (MPUI)
  • MPU accesses DSP memory space directly
  • Shared memory
  • Arrangement with the Traffic Controller
  • Three types of memory
  • Best suited to sharing large amounts of data

101
Traffic Controller (TC)
  • The IMIF allows access to the 192K bytes of
    on-chip SRAM.
  • The EMIFS interface provides 16-bit-wide access
    to asynchronous or synchronous memories.
  • The EMIFF interface provides 16-bit-wide access
    to standard SDRAM memories.
  • The TC provides the functions of
  • arbitrating contending accesses to the same
    memory interface from different initiators (MPU,
    DSP, System DMA, Local Bus),
  • synchronization of accesses due to the initiators
    and the memory interfaces running at different
    clock rates,
  • and the buffering of data allowing burst access
    for more efficient multiplexing of transfers from
    multiple initiators to the memory interfaces.
  • The TC's architecture allows simultaneous
    transfers between initiators and different memory
    interfaces without penalty. For instance, if the
    MPU is accessing the EMIFF at the same time that
    the DSP is accessing the IMIF, both transfers can
    proceed simultaneously, since there is no
    contention for resources.

102
ARM IPCM Module
  • The IPCM provides up to 32 mailboxes with control
    logic and interrupt generation to support
    inter-processor communication.
  • An AHB interface enables access from source and
    destination cores.
  • The IPCM
  • sends interrupts to other cores
  • passes small amounts of data to other cores.
  • A source core can have multiple mailboxes and
    send messages in parallel (multitasking).

PrimeCell Inter-Processor Communications Module
Technical Reference Manual
103
IPCM Components
  • 1-32 programmable mailboxes, each comprising
  • a single 1-32-bit Mailbox Source Register
  • a single 1-32-bit Mailbox Destination Register
  • a single 2-bit Mailbox Mode Register
  • a single 1-32-bit Mailbox Mask Register
  • a single 2-bit Mailbox Send Register
  • 0-7 32-bit data registers to store the message.
  • 1-32 sets of read-only interrupt status
    registers, one for each interrupt, each
    comprising
  • 1-32-bit Raw Interrupt Status Register (each bit
    corresponds to each mailbox)
  • 1-32-bit Masked Interrupt Status Register (each
    bit corresponds to each mailbox).
  • A 32-bit Configuration Status Register

104
IPCM Functional Block
PrimeCell Inter-Processor Communications Module
Technical Reference Manual
105
IPCM Example
106
IPCM Example
  • Core0 has a message to send to Core1. Core0
    claims the mailbox by setting bit 0 in the
    Mailbox Source Register. Core0 then sets bit 1 in
    the Mailbox Destination Register, enables the
    interrupts and programs the message into the
    Mailbox Data Registers. Finally, Core0 sends the
    message by writing 01 to the Mailbox Send
    Register. This asserts the interrupt to Core1.
  • When Core1 is interrupted, it reads the Masked
    Interrupt Status Register for IPCMINT1 to
    determine which mailbox contains the message.
    Core1 reads the message in that mailbox, then
    clears the interrupt and asserts the acknowledge
    interrupt by writing 10 to the Mailbox Send
    Register.
  • Core0 is interrupted with the acknowledge
    message, completing the operation. Core0 then
    decides whether to retain the mailbox to send
    another message or release the mailbox, freeing
    it up for other cores in the system to use it.
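
The exchange above can be traced with a simplified software model of one mailbox. This is a sketch, not the IPCM register map: the class and its interrupt rule are assumptions made for illustration (an interrupt is modelled as asserted to a core when its mask bit is set and it is either a destination while Send holds 01 or the source while Send holds 10).

```python
class IpcmMailbox:
    """Simplified model of one IPCM mailbox."""

    def __init__(self):
        self.source = 0          # one bit per core: who owns the mailbox
        self.destination = 0     # one bit per core: who receives the message
        self.mask = 0            # one bit per core: interrupt enable
        self.send = 0b00         # 01 = message sent, 10 = acknowledge
        self.data = [0] * 7      # up to seven 32-bit data registers

    def interrupt_asserted(self, core: int) -> bool:
        bit = 1 << core
        if not self.mask & bit:
            return False
        if self.send == 0b01:
            return bool(self.destination & bit)
        if self.send == 0b10:
            return bool(self.source & bit)
        return False

def demo() -> dict:
    mailbox = IpcmMailbox()
    trace = {}
    # Core0 claims the mailbox, targets Core1, enables interrupts, and sends.
    mailbox.source = 1 << 0
    mailbox.destination = 1 << 1
    mailbox.mask = (1 << 0) | (1 << 1)
    mailbox.data[0] = 0xCAFE
    mailbox.send = 0b01
    trace["core1_irq_after_send"] = mailbox.interrupt_asserted(1)
    # Core1 reads the message and acknowledges by writing 10.
    trace["message"] = mailbox.data[0]
    mailbox.send = 0b10
    trace["core1_irq_after_ack"] = mailbox.interrupt_asserted(1)
    trace["core0_irq_after_ack"] = mailbox.interrupt_asserted(0)
    # Core0 releases the mailbox so other cores can claim it.
    mailbox.send = 0b00
    mailbox.source = 0
    trace["mailbox_free"] = mailbox.source == 0
    return trace

if __name__ == "__main__":
    print(demo())
```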

107
Conclusions
108
Conclusions
  • IPC schemes for supporting many cores
  • Performance and power consumption analysis for
    different IPC schemes
  • IPC API schemes

109
Thanks for Your Attention!