WICS TP Chapter 1 - PowerPoint PPT Presentation

Slides: 85
Provided by: andreas62
Title: WICS TP Chapter 1


1
The Whirlwind Tour
Chapter 1a
2
Transactions: Where It All Started
Cuneiform documents now number about half a
million, three-quarters of them more or less
directly related to the history of law - dealing,
as they do, with contracts, acknowledgment of
debts, receipts, inventories, and accounts, as
well as containing records and minutes of
judgments rendered in courts, business letters,
administrative and diplomatic correspondence,
laws, international treaties, and other official
transactions. The total evidence enables the
historian to reach back as far as the beginnings
of writing, to the dawn of history. ...
Moreover, because of the inconvenience of
writing in stone or clay, Mesopotamians wrote
only when economic or political necessity
demanded it. (Encyclopaedia Britannica, 1974
edition)
3
From Transactions to Transaction Processing
Systems - I
The Sumerian way of doing business involved two
components:
  • Database. An abstract system state, represented
    as marks on clay tablets, was maintained. Today,
    we would call this the database.
  • Transactions. Scribes recorded state changes with
    new records (clay tablets) in the database.
    Today, we would call these state changes
    transactions.

4
From Transactions to Transaction Processing
Systems - II
The real state is represented by an abstraction,
called the database, and the transformation of
the real state is mirrored by the execution of a
program, called a transaction, that transforms
the database.
5
Transactions Are In ...
Communications
  • Each time you make a phone call, there is a
    call setup transaction that allocates some
    resources to your conversation; the call teardown
    is a second transaction, freeing those resources.
    The call setup increasingly involves complex
    algorithms to find the callee (800 numbers could
    be anywhere in the world) and to decide who is to
    be billed (800 and 900 numbers have complex
    billing). The system must deal with features like
    call forwarding, call waiting, and voice mail.
    After the call teardown, billing may involve many
    phone companies.

6
Transactions Are In ...
Finance
Each time you purchase gas using a credit card,
the point-of-sale terminal connects to the credit
card company's computer. In case that fails, it
may alternatively try to debit the amount to your
account by connecting to your bank. This
generalizes to all kinds of point-of-sale
terminals such as cash registers, ATMs,
etc. When banks balance their accounts with each
other (electronic fund transfer), they use
transactions for reliability and recoverability.
7
Transactions Are In ...
Travel
Making reservations for a trip requires many
related bookings and ticket purchases from
airlines, hotels, rental car companies, and so
on. From the perspective of the customer, the
whole trip package is one purchase. From the
perspective of the multiple systems involved,
many transactions are executed: one per airline
reservation (at least), one for each hotel
reservation, one for each car rental, one for
each ticket to be printed, one for setting up the
bill, etc. Along the way, each inquiry that may
not have resulted in a reservation is a
transaction, too.
8
Transactions Are In ...
Manufacturing
Order entry, job and inventory planning and
scheduling, accounting, and so on are classical
application areas of transaction processing.
Computer integrated manufacturing (CIM) is a key
technique for improving industrial productivity
and efficiency. Just-in-time inventory control,
automated warehouses, and robotic assembly lines
each require a reliable data storage system to
represent the factory state.
9
Transactions Are In ...
Real-Time Systems
This application area includes all kinds of
physical machinery that needs to interact with
the real world, either as a sensor, or as an
actor. Traditionally, such systems were custom
made for each individual plant, starting from the
hardware. The usual reason for that was that 20
years ago off-the-shelf systems could not
guarantee real-time behavior that is critical in
these applications. This has changed, and so has
the feasibility of building entire systems from
scratch. Standard software is now used to ensure
that the application will be portable.
10
A Transaction Processing System
A transaction processing system (TP-system)
provides tools to ease or automate application
programming, execution, and administration of
complex, distributed applications. Transaction
processing applications typically support a
network of devices that submit queries and
updates to the application. Based on these
inputs, the application maintains a database
representing some real-world state. Application
responses and outputs typically drive real-world
actuators and transducers that alter or control
the state. The applications, database, and
network tend to evolve over several decades.
Increasingly, the systems are geographically
distributed, heterogeneous (they involve
equipment and software from many different
vendors), continuously available (there is no
scheduled downtime), and have stringent response
time requirements.
11
ACID Properties: First Definition
  • Atomicity: A transaction's changes to the state
    are atomic: either all happen or none happen.
    These changes include database changes, messages,
    and actions on transducers.
  • Consistency: A transaction is a correct
    transformation of the state. The actions taken as
    a group do not violate any of the integrity
    constraints associated with the state. This
    requires that the transaction be a correct
    program.
  • Isolation: Even though transactions execute
    concurrently, it appears to each transaction T
    that others executed either before T or after T,
    but not both.
  • Durability: Once a transaction completes
    successfully (commits), its changes to the state
    survive failures.

12
Structure of a Transaction Program
  • The application program declares the start of a
    new transaction by invoking BEGIN_WORK().
  • All subsequent operations will be covered by the
    transaction. Eventually, the application program
    will call COMMIT_WORK(), if a new consistent
    state has been reached. This makes sure the new
    state becomes durable.
  • If the application program cannot complete
    properly (violation of consistency constraints),
    it will invoke ROLLBACK_WORK(), which appeals to
    the atomicity of the transaction, thus removing
    all effects the program might have had so far.
  • If for some reason the application fails to call
    either commit or rollback (there could be an
    endless loop, a crash, a forced process
    termination), the transaction system will
    automatically invoke ROLLBACK_WORK() for that
    transaction.
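The bracketing described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module as the resource manager, and the `account` table, the account names, and the helper functions `make_bank`, `transfer`, and `balances` are hypothetical. The SQL statements `BEGIN`, `COMMIT`, and `ROLLBACK` stand in for BEGIN_WORK(), COMMIT_WORK(), and ROLLBACK_WORK().

```python
import sqlite3


def make_bank():
    """Create a tiny in-memory database (hypothetical example data)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # transaction brackets below are issued explicitly by the program.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    return conn


def transfer(conn, source, target, amount):
    """Move `amount` from source to target as one transaction."""
    conn.execute("BEGIN")  # plays the role of BEGIN_WORK()
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                     (amount, source))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (amount, target))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (source,)).fetchone()
        if balance < 0:
            # Consistency constraint violated: give up on this transaction.
            raise ValueError("insufficient funds")
        conn.execute("COMMIT")  # plays the role of COMMIT_WORK()
        return True
    except Exception:
        conn.execute("ROLLBACK")  # plays the role of ROLLBACK_WORK()
        return False


def balances(conn):
    """Return the current state as a dict of id -> balance."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

A successful call leaves both updates in place; a call that would overdraw the source account leaves the state exactly as it was before the transaction began.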

13
The End User's View of a Transaction Processing
System
14
The Administrator's/Operator's View of a TP System
15
Performance Measures of Interactive Transactions
  Performance/Transaction   Small/Simple   Medium        Complex
  ---------------------------------------------------------------
  Instr./transaction        100k           1M            100M
  Disk I/O / TA             1              10            1000
  Local msgs. (B)           10 (5KB)       100 (50KB)    1000 (1MB)
  Remote msgs. (B)          2 (300B)       2 (4KB)       100 (1MB)
  Cost/TA/second            10k/tps        100k/tps      1M/tps
  Peak tps/site             1000           100           1

16
Client-Server Computing: The Classical Idea
17
Client-Server Computing: The CORBA Idea
[Figure labels: client on workstation (presentation services, etc.); IDL stub; request (Delete); Object Request Broker; IDL skeleton; object implementation (Jim's mailbox).]
18
Client-Server Computing: The WWW Idea
[Figure labels: WWW browser; Java applet; HTTP; server; Java Database Connection (JDBC) driver code; proprietary protocol; JDBC-ODBC bridge; ODBC driver; JDBC network driver; public protocol (e.g. TCP/IP); JDBC driver; database server.]
19
Using Transactional Remote Procedure Calls (TRPCs)
20
Terms We Have Introduced So Far
  • Resource manager: The system comes with an array
    of transactional resource managers that provide
    ACID operations on the objects they implement.
    Database systems, persistent programming
    languages, and queue managers are typical
    examples.
  • Durable state: Application state represented as
    durable data stored by the resource managers.
  • TRPC: Transactional remote procedure calls allow
    the application to invoke local and remote
    resource managers as though they were local. They
    also allow the application designer to decompose
    the application into client and server processes
    on different computers.
  • Transaction program: Inquiries and state
    transformations are written as programs in
    conventional or specialized programming
    languages. The programmer brackets the successful
    execution of the program with a Begin-Commit pair
    and brackets a failed execution with a
    Begin-Rollback pair.

21
Terms We Have Introduced So Far
  • Atomicity: At any point before the commit, the
    application or the system may abort the
    transaction, invoking rollback. If the
    transaction is aborted, all of its changes to
    durable objects will be undone (reversed), and it
    will be as though the transaction never ran.
  • Consistency: The work within a Begin-Commit pair
    must be a correct transformation.
  • Isolation: While the transaction is executing,
    the resource managers ensure that all objects the
    transaction reads are isolated from the updates
    of concurrent transactions.
  • Durability: Once the commit has been successfully
    executed, all the state transformations of that
    transaction are made durable and public.

22
The World According to the Resource Manager
23
Where To Split Client/Server?
[Figure labels: Presentation; Flow Control; Application Logic (business objects); Data Access; the client and the server are each marked as ranging from thin to fat.]
24
Client/Server Infrastructure
[Figure labels: Client; Server; Middleware; Objects, Groupware, TP-Mon., DBMS, OS, Files; GUI, OOUI, System Mgmt., OS; SQL, ORB, TRPC, Mail, Security, WWW, Transport, etc.]
25
Transactional Core Services
26
The X/Open TP-Model
27
The X/Open Distributed Transaction Processing
Model
28
The OTS Model
[Figure labels: transaction originator; recoverable server; transaction service; creation and termination; invocation; commit coordination; transmitted with request.]
29
Transaction Processing System Feature List
  • Application development features:
  • Application generators; graphical programming
    interfaces; screen painters; compilers; CASE
    tools; test data generators; starter system with
    a complete set of administrative and operations
    functions, security, and accounting.
  • Repository features:
  • Description of all components of the system,
    both hardware and software. Description of the
    dependencies among components (bill-of-material).
    Description of all changes to all components to
    keep track of different versions. The repository
    is a database. Its role in the system must be
    complete, extensible, active, and allow for local
    autonomy.
  • TP-Monitor features:
  • Process management; server classes;
    transactional remote procedure calls;
    request-based authentication and authorization;
    support for applications and resource managers in
    implementing ACID operations on durable objects.

30
Transaction Processing System Feature List
  • Data communications features:
  • Uniform I/O interfaces; device independence;
    virtual terminal; screen painter support; support
    for RPC and TRPC; support for context-oriented
    communication (peer-to-peer).
  • Database features:
  • Data independence; data definition; data
    manipulation; data control; data display;
    database operations.
  • Operations features:
  • Archiving; reorganization; diagnosis; recovery;
    disaster recovery; change control; security;
    system extension.
  • Education and testing features:
  • Imbedded education; online documentation;
    training systems; national language features;
    test database generators; test drivers.

31
Data Communications Protocols
32
Presentation Management
33
SQL Data Definition
34
SQL Data Manipulation
35
Summary of Chapter 1
  • A transaction processing system is a large web of
    application generators, system design and
    operation tools, and the more mundane language,
    database, network, and operations software.
  • The repository and the applications that maintain
    it are the mechanisms needed to manage the TP
    system. The repository is a transaction
    processing application.
  • It represents the system configuration as a
    database and supplies change control by
    transactions that manipulate the configuration
    and the repository.
  • The transaction concept, like contract law, is
    intended to resolve the situation when exceptions
    arise. The first order of business in designing a
    system is, therefore, to have a clear model of
    system failure modes. What breaks? How often do
    things break?

36
Basic Terminology
Chapter 1b
37
A Word About Words (Chapter 2)
Humpty Dumpty: "When I use a word, it means
exactly what I chose it to mean, nothing more
nor less."
Alice: "The question is, whether you
can make words mean so many different
things."
Humpty Dumpty: "The question is, which
is to be master, that's all."
Lewis Carroll
38
Basic Computer Terms
To get any confusion that might be caused by the
many synonyms in our field out of the way, let us
adopt the following conventions for the rest of
this class:
  • domain = data type = ...
  • field = column = attribute = ...
  • record = tuple = object = entity = ...
  • block = page = frame = slot = ...
  • file = data set = table = ...
  • process = task = thread = actor = ...
  • function = request = method = ...
All the other terms and definitions we need
will be briefly introduced and explained during
the session.
39
Basic Hardware Architecture I
In Bell and Newell's classic taxonomy, hardware
consists of three types of modules:
processors, memory, and communications (switches
or wires). Processors execute instructions from
a program, read and write memory, and send data
via communication lines. Computers are generally
classified as supercomputers, mainframes,
minicomputers, workstations, and personal
computers. However, these distinctions are
becoming fuzzy with current shifts in
technology.
40
Basic Hardware Architecture II
Today's workstation has the power of yesterday's
mainframe. Similarly, today's WAN (wide area
network) has the communications bandwidth of
yesterday's LAN (local area network). In
addition, electronic memories are growing in size
to include much of the data formerly stored on
magnetic disk. These technology trends have
deep implications for transaction processing.
41
Basic Hardware Architecture III
  • Distributed processing: Processing is moving
    closer to the producers and consumers of the data
    (workstations, intelligent sensors, robots, and
    so on).
  • Client-server: These computers interact with each
    other via request-reply protocols. One machine,
    called the client, makes requests to another,
    called the server. Of course, the server may in
    turn be a client to other machines.
  • Clusters: Powerful servers consist of clusters of
    many processors and memories, cooperating in
    parallel to perform common tasks.

42
Basic Hardware Architecture IV
43
Memories - The Economic Perspective I
  • The processor executes instructions from virtual
    memory, and it reads and alters bytes from the
    virtual memory. The mapping between virtual
    memory and real memory includes electronic
    memory, which is close to the processor,
    volatile, fast, and expensive, and magnetic
    memory, which is "far away" from the processor,
    non-volatile, slow, and cheap. The mapping
    process is handled by the operating system with
    some hardware assistance.
  • Memory performance is measured by its access
    time:
  • Given an address, the memory presents the data
    at some later time. The delay is called the
    memory access time. Access time is a combination
    of latency (the time to deliver the first byte),
    and transfer time (the time to move the data).
    Transfer time, in turn, is determined by the
    transfer size and the transfer rate. This
    produces the following overall equation:
  • memory access time = latency + (transfer size /
    transfer rate)

44
Memories - The Economic Perspective II
  • Memory price-performance is measured in one of
    two ways:
  • Cost/byte. The cost of storing a byte of data in
    that medium.
  • Cost/access. The cost of reading a block of data
    from that medium.
  • This is computed by dividing the device cost by
    the number of accesses per second that the
    device can perform.
  • The actual units are cost/access/second, but the
    time unit is implicit in the metric's name.
  • These two cost measures reflect the two different
    views of a memory's purpose:
  • it stores data, and
  • it receives and retrieves data.

45
Memories - The Economic Perspective III
Typical large system capacity
46
Memories - The Economic Perspective IV
[Figure: chart with axis label "/ MB"; remaining content not recoverable.]
47
Magnetic Memory
  • There are two types of magnetic storage media:
    disk and tape. Disks rotate, passing the data in
    the cylinder by the electronic read-write heads
    every few milliseconds. This gives low access
    latency. The disk arm can move among cylinders in
    tens of milliseconds. Tapes have approximately
    the same storage density and transfer rate, but
    they must move long distances if random access is
    desired. Consequently, tapes have large random
    access latencies, on the order of seconds.
  • Disk Access Time = Seek_Time
    + Rotational_Latency
    + (Transfer_Size / Transfer_Rate)

48
Magnetic Memory
  • Compare the times required for two access
    patterns to 1 MB stored in 1000 blocks on disk:
  • Sequential access: Read or write sectors x, x +
    1, ..., x + 999 in ascending order. This
    requires one seek (10 ms) and half a rotation (5
    ms) before the data in the cylinder begins
    transferring the megabyte at 10 MBps (the
    transfer takes 100 ms, ignoring one-cylinder
    seeks).
  • The total access time is 115 ms.
  • Random access: Read the 1000 sectors x, ..., x +
    999 in random order. In this case, each read
    requires a seek (10 ms), half a rotation (5 ms),
    and then the 1 KB transfer (.1 ms). Since there
    are 1000 of these events, the total access time
    is 15.1 seconds.
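The arithmetic on this slide can be checked with a short Python sketch. It applies the disk access time formula (seek time plus rotational latency plus transfer size over transfer rate) to the slide's figures; the function name `disk_access_ms` and the use of decimal units (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes) are assumptions made for the illustration.

```python
def disk_access_ms(seek_ms, rotational_latency_ms, transfer_bytes, rate_bytes_per_s):
    """Seek time + rotational latency + transfer time, all in milliseconds."""
    transfer_ms = 1000.0 * transfer_bytes / rate_bytes_per_s
    return seek_ms + rotational_latency_ms + transfer_ms


MB = 1_000_000  # decimal megabyte, assumed for round numbers
KB = 1_000

# Sequential: one seek, half a rotation, then the whole megabyte at 10 MBps.
sequential_ms = disk_access_ms(10, 5, 1 * MB, 10 * MB)

# Random: every one of the 1000 blocks pays its own seek and rotation.
random_ms = 1000 * disk_access_ms(10, 5, 1 * KB, 10 * MB)

print(f"sequential: {sequential_ms:.0f} ms, random: {random_ms / 1000:.1f} s")
```

With these inputs the random pattern is slower by a factor of roughly 130, which is the point of the comparison.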

49
Memory Hierarchies
50
Memory Hierarchies
  • The hierarchy uses small, fast, expensive cache
    memories to cache some data present in larger,
    slower, cheaper memories.
  • If hit ratios are good, the overall memory speed
    approximates the speed of the cache.
  • At any level of the memory hierarchy, the hit
    ratio is defined as
  • hit ratio = references satisfied by cache / all
    references to cache
  • Suppose a cache memory with access time C has hit
    rate H, and suppose that on a miss the secondary
    memory access time is S. Further, suppose that C
    = .01 S. The effective access time of the cache
    will be as follows:
  • Effective memory access time = H C + (1 - H) S
  • = H (.01 S) + (1 - H) S
  • = (1 - .99 H) S
  • ≈ (1 - H) S
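As a small numeric check of the effective access time expression, the sketch below evaluates H C + (1 - H) S for a few hit ratios, with C set to .01 S as on the slide. The function name and the sample hit ratios are chosen only for illustration.

```python
def effective_access_time(hit_ratio, cache_time, secondary_time):
    """Weighted average of cache and secondary access times."""
    return hit_ratio * cache_time + (1.0 - hit_ratio) * secondary_time


S = 1.0        # secondary memory access time (arbitrary unit)
C = 0.01 * S   # cache access time, as assumed on the slide

for H in (0.5, 0.9, 0.99):
    exact = effective_access_time(H, C, S)
    approx = (1.0 - H) * S  # the slide's final approximation
    print(f"H={H}: exact={exact:.4f} S, approx={approx:.4f} S")
```

The approximation is close for moderate hit ratios and drifts as H approaches 1, where the cache time C itself starts to dominate.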

51
The Five Minute Rule
  • Assume there are no special response time
    (real-time) requirements; the decision to keep
    something in cache is, therefore, purely
    economic.
  • To make things simple, suppose that data blocks
    are 10 KB.
  • At 1995 prices, 10 KB of main memory cost about
    $1. Thus, we could keep the data in main memory
    forever if we were willing to spend a dollar.
  • With 10 KB of disk costing only $.10, we could
    save $.90 if we kept the 10 KB on disk.
  • In reality, the savings are not so great: if the
    disk data is accessed, it must be moved to main
    memory, and that costs something. How much, then,
    does a disk access cost?
  • A disk, along with all its supporting hardware,
    costs about $3,000 (in 1995) and delivers about
    30 accesses per second; the cost, therefore, is
    about $100 per access per second.
    At this rate, if the data is accessed once a
    second, it costs $100.10 to store it on disk
    (disk storage and disk access costs). That is
    considerably more than the $1 to store it in main
    memory.
  • The break-even point is about one access per 100
    seconds. At that rate, the main memory cost is
    about the same as the disk storage cost plus the
    disk access costs. At a more frequent access
    rate, disk storage is more expensive. At a less
    frequent rate, disk storage is cheaper.
    Anticipating the cheaper main memory that will
    result from technology changes, this observation
    is called the five-minute rule rather than the
    two-minute rule.

52
The Five Minute Rule
Keep a data item in electronic memory if it is
accessed at least once every five minutes;
otherwise keep it in magnetic memory. Similar
arguments apply to objects stored on tape and
cached on disk. Given the object size, the cost
of cache, the cost of secondary memory, and the
cost of accessing the object in secondary memory
once per second, the frequency at the break-even
point in units of accesses per second (a/s) is
given by the following formula:
Frequency = ((Cache_Cost/Byte - Secondary_Cost/Byte) x
Object_Bytes) / (Object_Access_Per_Second_Cost)
a/s
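The break-even formula can be tried out with the 1995 figures quoted on the preceding slide: about 1 dollar per 10 KB of main memory, about 0.10 dollars per 10 KB of disk, and about 100 dollars per disk access per second. The Python sketch below is only an illustration; the function name is hypothetical and 10 KB is taken as 10,000 bytes for round numbers.

```python
def break_even_frequency(cache_cost_per_byte, secondary_cost_per_byte,
                         object_bytes, access_per_second_cost):
    """Accesses per second at which caching and not caching cost the same."""
    saving = (cache_cost_per_byte - secondary_cost_per_byte) * object_bytes
    return saving / access_per_second_cost


OBJECT_BYTES = 10_000  # a 10 KB block, decimal units assumed

freq = break_even_frequency(
    cache_cost_per_byte=1.00 / OBJECT_BYTES,      # main memory, 1995 figure
    secondary_cost_per_byte=0.10 / OBJECT_BYTES,  # disk, 1995 figure
    object_bytes=OBJECT_BYTES,
    access_per_second_cost=100.0,                 # one disk access per second
)
seconds_between_accesses = 1.0 / freq

print(f"{freq:.4f} a/s, i.e. one access every {seconds_between_accesses:.0f} s")
```

The result is a little under 0.01 accesses per second, which matches the slide's rounded break-even point of about one access per 100 seconds.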
53
The Rules of Exponential Growth
Electronic memory (Moore's Law):
  MemoryChipCapacity(year) = 4^((year-1970)/3) Kb/chip, for year in 1970...2000
Magnetic memory (Hoagland's Law):
  MagneticAreaDensity(year) = 10^((year-1970)/10) Mb/inch^2, for year in 1970...2000
Processors (Joy's Law):
  SunMips(year) = 2^(year-1984) MIPS, for year in 1984...2000
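The three growth rules on this slide can be evaluated directly. The Python sketch below reads the three expressions listed on the slide, (year-1970)/3, (year-1970)/10, and (year-1984), as the exponents of 4, 10, and 2 respectively; that reading is an interpretation of the slide text, and the function names are chosen only for the illustration.

```python
def memory_chip_capacity_kb(year):
    """Moore's Law as read from the slide: capacity quadruples every 3 years."""
    return 4.0 ** ((year - 1970) / 3)


def magnetic_area_density_mb_per_in2(year):
    """Hoagland's Law as read from the slide: density grows 10x per decade."""
    return 10.0 ** ((year - 1970) / 10)


def sun_mips(year):
    """Joy's Law as read from the slide: processor speed doubles every year."""
    return 2.0 ** (year - 1984)


for year in (1984, 1990, 2000):
    print(year,
          f"{memory_chip_capacity_kb(year):,.0f} Kb/chip",
          f"{magnetic_area_density_mb_per_in2(year):,.0f} Mb/inch^2",
          f"{sun_mips(year):,.0f} MIPS")
```

Each rule is valid only for the year range given on the slide; the functions do not enforce those ranges.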
54
Communication Hardware
The early 90s
The definition of the four kinds of networks by
their diameters. These diameters imply certain
latencies (based on the speed of light). In 1990,
Ethernet (at 10 Mbps) was the dominant LAN.
Metropolitan networks typically are based on 1
Mbps public lines. Such lines are too expensive
for transcontinental links at present; most
long-distance lines are therefore 50 Kbps or
less. As you will gather from the news, these things
are changing fast.
55
Communication Hardware
Scenario 2000
Point-to-point bandwidth likely to be common
among computers by the year 2000.
56
Processor Architectures
57
Processor Architectures
  • Shared nothing: In a shared-nothing design, each
    memory is dedicated to a single processor. All
    accesses to that data must pass through that
    processor. Processors communicate by sending
    messages to each other via the communications
    network.
  • Shared global: In a shared-global design, each
    processor has some private memory not accessible
    to other processors. There is, however, a pool of
    global memory shared by the collection of
    processors. This global memory is usually
    addressed in blocks (units of a few kilobytes or
    more) and is RAM disk or disk.
  • Shared memory: In a shared-memory design, each
    processor has transparent access to all memory.
    If multiple processors access the data
    concurrently, the underlying hardware regulates
    the access to the shared data and provides each
    processor a current view of the data.

58
Address Spaces
59
Address Spaces
  • Memory segmentation and sharing: A process
    executes in an address space, a paged, segmented
    array of bytes. Some segments may be shared with
    other address spaces. The sharing may be
    execute-only, read-only, or read-write. Most of
    the segment slots are empty (lightly shaded
    boxes), and most of the occupied segments are
    only partially full of programs or data.
  • To simplify memory addressing, the virtual
    address space is divided into fixed-size segment
    slots, and each segment partially fills a slot.
  • Typical slot sizes range from 2^24 to 2^32
    bytes. This gives a two-dimensional address
    space, where addresses are (segment_number,
    byte). Again, segments are often partitioned into
    virtual memory pages, which are the unit of
    transfer between main and secondary memory. If an
    object is bigger than a segment, it can be mapped
    into consecutive segments of the address space.

60
Processes
  • A process is a virtual processor. It has an
    address space that contains the program the
    process is executing and the memory the process
    reads and writes. One can imagine a process
    executing Java programs statement by statement,
    with each statement reading and writing bytes in
    the address space or sending messages to other
    processes.
  • Processes provide an ability to execute programs
    in parallel; they provide a protection entity;
    and they provide a way of structuring
    computations into independent execution streams.
    So they provide a form of fault containment in
    case a program fails.
  • Processes are building blocks for transactions,
    but the two concepts are orthogonal. A process
    can execute many different transactions over
    time, and parts of a single transaction may be
    executed by many processes.
  • Each process executes on behalf of some user, or
    authority, and with some priority. The authority
    determines what the process can do: which other
    processes, devices, and files the process can
    address and communicate with. The process
    priority determines how quickly the process's
    demand for resources will be serviced if other
    processes make competing demands. Short tasks
    typically run with high priority, while large
    tasks are given lower priority.

61
Protection Domains
  • There are two ways to provide protection:
  • Process protection domain: Each subsystem
    executes as a separate process with its own
    private address space. Applications execute
    subsystem requests by switching processes, that
    is, by sending a message to a process.
  • Address space protection domain: A process has
    many address spaces, one for each protected
    subsystem and one for the application.
    Applications execute subsystem requests by
    switching address spaces. The address space
    protection domain of a subsystem is just an
    address space that contains some of the caller's
    segments; in addition, it contains program and
    data segments belonging to the called subsystem.
    A process connects to the domain by asking the
    subsystem or OS kernel to add the segment to the
    address space. Once connected, the domain is
    callable from other domains in the process by
    using a special instruction or kernel call.

62
Protection Domains
A process may have many protection domains.
63
Threads
  • There is a need for multiple processes per
    address space
  • For example, to scan through a data stream, one
    process is appointed the producer, which reads
    the data from an external source, while the
    second process processes the data. Further
    examples of cooperating processes are file
    read-ahead, asynchronous buffer flushing, and
    other housekeeping chores in the system.
  • Processes can share the same address space simply
    by having all their address spaces point to the
    same segments. Most operating systems do not make
    a clean distinction between address spaces and
    processes. Thus a new concept, called a thread or
    a task, is introduced.
  • But note: Several operating systems do not use
    the term process at all. For example, in the Mach
    operating system, thread means process, and task
    means address space; in MVS, task means process,
    and so on.
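The producer and consumer arrangement mentioned on this slide can be sketched with two Python threads that share one address space and hand data over through a bounded queue. This is only an illustration: the function `run_pipeline`, the buffer size, and the sample data are hypothetical, and Python threads stand in for the cooperating processes of the text.

```python
import queue
import threading


def run_pipeline(source, process):
    """Read items from `source` in one thread and process them in another."""
    buffer = queue.Queue(maxsize=4)  # bounded hand-off buffer between threads
    results = []
    done = object()                  # sentinel marking the end of the stream

    def producer():
        # Reads the data from the external source.
        for item in source:
            buffer.put(item)
        buffer.put(done)

    def consumer():
        # Processes the data as it arrives.
        while True:
            item = buffer.get()
            if item is done:
                break
            results.append(process(item))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline(range(5), lambda x: x * x))
```

Because both threads live in the same address space, `results` and `buffer` are shared directly; with separate processes the same pattern would need shared segments or messages.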

64
Threads
  • The term thread often implies a second property:
    inexpensive to create and dispatch. Threads are
    commonly provided by some software that found the
    operating system processes to be too expensive to
    create or dispatch. The thread software
    multiplexes one big operating system process
    among many threads, which can be created and
    dispatched hundreds of times faster than a
    process.
  • The term thread is used in the following to
    connote these light-weight processes. Unless this
    light-weight property is intended, process is
    used. Several threads usually share a common
    address space. Typically, all the threads have
    the same authorization identifier, since they are
    part of the same address space domain, but they
    may have different scheduling priorities.

65
Messages and Sessions
  • There are two styles of communication among
    processes:
  • Datagrams: The sender of a message determines the
    recipient's address (e.g. the process name) and
    constructs an envelope consisting of the sender's
    name and address, the recipient's name and
    address, and the message text. This envelope is
    delivered to the capable hands of the
    communication system. It is analogous to sending
    letters by mail.
  • Sessions: Before any messages are sent, a fixed
    connection is established between sender and
    receiver, a so-called session. Once it has been
    established, both parties can send and receive
    messages via this session. This symmetry is often
    referred to as "peer-to-peer". Establishing a
    session requires a datagram. A session must at
    some point be closed down explicitly. It is
    analogous to a phone conversation.

66
Advantages of Sessions
  • Shared state: A session represents shared state
    between the client and the server. A datagram
    might go to any process with the designated name,
    but a session goes to a particular instance of
    that name.
  • Authorization: Processes do not always trust each
    other. The server often checks the client's
    credentials to see that the client is authorized
    to perform the requested function. The
    authentication protocols require multi-message
    exchanges. Once the session key is established,
    it is shared state.
  • Error correction: Messages flowing in each
    session direction are numbered sequentially.
    These sequence numbers can detect lost messages
    and duplicate messages.
  • Performance: The operations described are fairly
    costly. Each of the steps often involves several
    messages. By establishing a session, this
    information is cached.
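The error-correction point can be made concrete with a small sketch of how a session endpoint might use sequence numbers to spot duplicates and gaps. It is a simplified illustration, not a description of any particular protocol: the class `SessionReceiver`, its return values, and the policy of rejecting out-of-order messages are all assumptions.

```python
class SessionReceiver:
    """Tracks the next expected sequence number for one session direction."""

    def __init__(self):
        self.next_expected = 0  # shared session state kept by the receiver

    def receive(self, seq, payload):
        """Classify an incoming message by its sequence number."""
        if seq < self.next_expected:
            return "duplicate"  # already seen: discard
        if seq > self.next_expected:
            return "gap"        # an earlier message was lost or delayed
        self.next_expected += 1
        return "accepted"


receiver = SessionReceiver()
for seq, payload in [(0, "a"), (1, "b"), (1, "b"), (3, "d"), (2, "c")]:
    print(seq, receiver.receive(seq, payload))
```

The counter is exactly the kind of shared state that a datagram exchange does not have, which is why this check needs a session.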

67
Clients and Servers
  • The question of how computations consisting of
    many interacting processes should be structured
    has no simple answer. Currently, two styles are
    particularly popular: peer-to-peer and
    client-server.
  • The debate about which style is "better" often
    creates the impression that they are radically
    different. But in reality, peer-to-peer is more
    general and more complex, and it subsumes
    client-server. Here is a brief characterization:
  • Peer-to-peer: The two processes are independent
    peers, each executing its computation and
    occasionally exchanging data with the other.
  • Client-server: The two processes interact via
    request-reply exchanges in which one process, the
    client, makes a request to a second process, the
    server, which performs this request and replies
    to the client.

68
Clients and Servers
  • The limitation of the client-server model lies in
    the fact that it implies a synchronous pattern of
    one request/one response.
  • There are, however, cases in which one request
    generates thousands of replies, or where
    thousands of requests generate one reply.
    Operations that have this property include
    transferring a file between the client and server
    or bulk reading and writing of databases. In
    other situations, a client request generates a
    request to a second server, which, in turn,
    replies to the client. Parallelism is a third
    area where simple RPC is inappropriate. Because
    the client-server model postulates synchronous
    remote procedure calls, the computation uses one
    processor at a time. However, there is growing
    interest in schemes that allow many processes to
    work on problems in parallel. The RPC model in
    its simplest form does not allow any parallelism.
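
To make the parallelism point concrete, here is a small sketch contrasting the two calling patterns. The function server_square is a hypothetical stand-in for a remote procedure; no real RPC library is involved, and a thread pool is only one of several ways a client could overlap requests.

```python
from concurrent.futures import ThreadPoolExecutor


def server_square(x):
    """Hypothetical stand-in for a procedure executed by a server."""
    return x * x


def sequential_calls(args):
    # Simple RPC style: the client waits for each reply before
    # issuing the next request, so only one call is active at a time.
    return [server_square(a) for a in args]


def parallel_calls(args, workers=4):
    # Alternative: several requests are outstanding at once and the
    # replies are collected in the original request order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(server_square, args))
```

Both functions return the same replies; they differ only in how many requests are in flight at a time.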

69
Remote Procedure Calls (RPCs)
70
Naming
  • Naming has to do with the problem of how a client
    denotes a server it wants to invoke. Typical
    naming schemes distinguish between an object's
    name, its address, and its location. The name is
    an abstract identifier for the object, the
    address is the path to the object, and the
    location is where the object is.
  • An object can have several names. Some of these
    names may be synonyms, called aliases. Let us say
    that Bruce and Lindsay are two aliases for Bruce
    Lindsay. For this to be explicit, all names,
    addresses, and locations must be interpreted in
    some context, called a directory. For example, in
    our RPC context, Bruce means Bruce Nelson, and in
    our publishing context, Bruce means Bruce Spatz.
    Within the 408 telephone area, Bruce Lindsay's
    address is 927-1747, and outside the United
    States it is 1-408-927-1747.
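
The role of the directory as an interpretation context can be illustrated with a nested lookup table. The context labels ("rpc", "publishing", "408 area", "outside US") and the helper resolve are names chosen for this sketch; the entries themselves are the examples from the slide.

```python
# Each directory (context) maps a name to what it denotes there.
DIRECTORIES = {
    "rpc": {"Bruce": "Bruce Nelson"},
    "publishing": {"Bruce": "Bruce Spatz"},
}

# The same object has different addresses in different contexts.
ADDRESSES = {
    "408 area": {"Bruce Lindsay": "927-1747"},
    "outside US": {"Bruce Lindsay": "1-408-927-1747"},
}


def resolve(table, context, name):
    """Interpret a name relative to one context of the given table."""
    return table[context][name]
```

For example, resolve(DIRECTORIES, "rpc", "Bruce") and resolve(DIRECTORIES, "publishing", "Bruce") denote different people even though the name is the same.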

71
Name Servers
  • Names are grouped into a hierarchy called the
    name space. An international commission has
    defined a universal name space standard, X.500,
    for computer systems. The commission administers
    the root of that name space. Each interior node
    of the hierarchy is a directory. A sequence of
    names delimited by a period (.) gives a path name
    from the directory to the object.
  • No one stores the entire name space; it is too
    big, and it is changing too rapidly. Certain
    processes, called name servers, store parts of
    the name space local to their neighborhood; in
    addition, they store a directory of more global
    name servers.
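
A name server that knows its own neighborhood and forwards everything else to a more global server might look roughly like this. The class NameServer, its single parent link, and the sample path names and addresses in the usage note are hypothetical; real directory services such as X.500 are considerably more elaborate.

```python
class NameServer:
    """Stores part of the name space plus a reference to a more
    global name server (hypothetical sketch)."""

    def __init__(self, local, parent=None):
        self.local = dict(local)  # path name -> address, local part only
        self.parent = parent      # more global server, or None at the top

    def lookup(self, path):
        if path in self.local:
            return self.local[path]
        if self.parent is not None:
            # Not in this neighborhood: ask the more global server.
            return self.parent.lookup(path)
        raise KeyError(path)
```

With root = NameServer({"com.example.db": "10.0.0.5"}) and lab = NameServer({"lab.printer": "10.1.1.9"}, parent=root), a call to lab.lookup("com.example.db") is answered by the root server, while lab.lookup("lab.printer") is answered locally.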

72
Authentication Techniques
  • Passwords are the simplest technique. The client
    has a secret password, a string of bytes known
    only to it and the server. The client sends its
    password to the server to prove the client's
    identity. A second password is then needed to
    authenticate the server to the client. Thus, two
    passwords are required, and they must be sent
    across the wire.
  • Challenge-response uses only one password or key.
    In this scheme, the client and the server share a
    secret encryption key. The server picks a random
    number, N, and encrypts it with the key as EN.
    The server sends EN to the client and challenges
    the client to decrypt it using the secret key. If
    the client responds with N, the server believes
    the client knows the secret encryption key. The
    client can also authenticate the server by
    challenging it to decrypt a second random number.
    The shared secret is stored at both ends, but
    random numbers are sent across the wire.
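
The challenge-response exchange can be sketched as below. Python's standard library has no symmetric cipher, so the example substitutes a toy XOR cipher keyed by a SHA-256 digest; this is an assumption made purely for illustration and is not secure. The function names are likewise hypothetical.

```python
import hashlib
import secrets


def toy_crypt(key, data):
    """Toy XOR cipher (NOT secure). The same call encrypts and decrypts."""
    stream = hashlib.sha256(key).digest()  # 32-byte keystream
    if len(data) > len(stream):
        raise ValueError("toy cipher handles at most 32 bytes")
    return bytes(d ^ s for d, s in zip(data, stream))


def make_challenge(key):
    """Server side: pick a random N and return (N, encrypted N)."""
    n = secrets.token_bytes(16)
    return n, toy_crypt(key, n)


def respond(key, challenge):
    """Client side: decrypt the challenge with the shared secret key."""
    return toy_crypt(key, challenge)


def verify(n, response):
    """Server side: accept only if the client recovered N."""
    return secrets.compare_digest(n, response)
```

The server keeps N, sends only the encrypted value, and checks the reply with verify. A client holding a different key produces a different byte string and is rejected.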

73
Authentication Techniques
  • Public key system: Each authid has a pair of
    keys: a public encryption key, EK, and a private
    decryption key, DK. The keys are chosen so that
    DK(EK(X)) = X, but knowing only EK and EK(X) it
    is hard to compute X. Thus, a process's ability
    to compute X from EK(X) is proof that the process
    knows the secret DK. Each authid publishes its
    public key to the world. Anyone wanting to
    authenticate the process as that authid goes
    through the challenge protocol: The challenger
    picks a random number X, encrypts it with the
    authid's public key EK, and challenges the
    process to compute X from EK(X). Secrets are
    stored in one place only, and they do not go
    across the wire.
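
The property DK(EK(X)) = X and the resulting challenge can be tried out with textbook-sized RSA numbers. RSA is used here only as one well-known public key scheme; the slide does not name a specific algorithm, and the parameters below (p = 61, q = 53) are far too small for real use. Function names are illustrative.

```python
import secrets

# Toy RSA parameters: modulus 61 * 53, with 17 * 2753 = 1 mod 3120.
MODULUS = 3233
EK = 17      # public encryption exponent
DK = 2753    # private decryption exponent, known only to the authid


def encrypt(x):
    """Anyone can compute EK(X) from the published key."""
    return pow(x, EK, MODULUS)


def decrypt(c):
    """Only the holder of DK can recover X from EK(X)."""
    return pow(c, DK, MODULUS)


def challenge():
    """Challenger: pick a random X and return (X, EK(X))."""
    x = secrets.randbelow(MODULUS)
    return x, encrypt(x)
```

The challenger sends only EK(X) and accepts the process as the authid if the reply equals X. The private key never leaves its owner.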

74
Scheduling
  • The purpose of scheduling is to make sure all
    requests get processed, i.e., are assigned to a
    specific server process. There are basically two
    additional constraints:
  • Short response times: The requests should not
    wait longer than necessary before they get
    serviced.
  • Economic usage of resources: The required
    throughput should be achieved with the minimum
    number of resources (processors, nodes, links,
    etc.).
  • Throughput and response time at resource
    utilization r are related by the following
    formula:
  • Average_Response_Time(r) = (1 / (1 - r)) ×
    Service_Time
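
A short worked example shows how quickly the response time grows as utilization approaches 1. The function name and the range check on r are additions for this sketch; the formula is the one on the slide.

```python
def average_response_time(utilization, service_time):
    """Average_Response_Time(r) = (1 / (1 - r)) * Service_Time."""
    if not 0 <= utilization < 1:
        # The formula is only meaningful below full utilization.
        raise ValueError("utilization must satisfy 0 <= r < 1")
    return service_time / (1 - utilization)
```

At r = 0.5 a request takes twice its service time on average; at r = 0.9 it takes roughly ten times the service time. This is why economic usage of resources and short response times pull in opposite directions.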

75
The Scheduling Problem
76
File Organizations
77
SQL in a Distributed Environment
78
Software Performance
79
Protocol Standards
80
Relevant FAP-Standards
  • CSMA/CD, Token Ring, etc.: Low-level protocols
    that specify how bits are physically transmitted
    across a shared medium.
  • IP/TCP, NetBIOS, HTTP: Transport level protocols.
  • LU6.2: SNA's peer-to-peer protocol that allows
    both session oriented and client-server-style
    communication under transaction protection.
  • OSI-TP: ISO's rendering of a protocol that
    provides a functionality very similar to LU6.2.
  • ASN.1: Protocol for exchanging data formatting
    and structuring information. Required for RPCs in
    a heterogeneous environment.
  • DRDA: Interoperability standard for IBM
    SQL-systems.
  • ODBC, JDBC: Interoperability standards for
    general SQL-systems.

81
Relevant API-Standards
  • SQL: Portability standard for accessing
    relational databases (lots of proprietary
    extensions).
  • APPC, CPI-C: Two of IBM's APIs for the LU6.2
    protocol.
  • X/Open-XA, X/Open-XA, etc.: APIs by the X/Open
    consortium on ISO's OSI-TP protocols.
  • IDL: OMG's interface definition language to let
    objects be integrated through an object request
    broker.
  • STDL: Language for programming TP-applications
    based on the ACMS TP-monitor.
  • Java: The web's favorite programming language
    comes with its own FAP-component.

82
OSI Standards and X/Open APIs
83
A Last Glance at TP-Standards
Each resource manager (RM) registers with its
local transaction manager (TM). Applications
start and commit transactions by calling their
local TM. At commit, the TM invokes every
participating RM. If the transaction is
distributed, the communications manager informs
the local and remote TM about the incoming or
outgoing transaction, so that the two TMs can use
the OSI-TP protocol to commit the transaction.
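
The local part of this interaction (registration, begin, commit) can be sketched as follows. All class and method names are hypothetical and do not reproduce the X/Open interfaces. The sketch also assumes a single-node transaction and omits the voting phase and the communications manager that a real distributed commit would involve.

```python
class ResourceManager:
    """Hypothetical RM that records which transactions it committed."""

    def __init__(self, name):
        self.name = name
        self.committed = []

    def commit(self, txid):
        self.committed.append(txid)


class TransactionManager:
    """Hypothetical local TM: RMs register, applications begin/commit."""

    def __init__(self):
        self.registered = []
        self.participants = {}  # txid -> RMs taking part in it
        self.next_txid = 1

    def register(self, rm):
        self.registered.append(rm)

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        self.participants[txid] = []
        return txid

    def join(self, txid, rm):
        # Only registered RMs may take part in a transaction.
        if rm not in self.registered:
            raise ValueError("resource manager is not registered")
        if rm not in self.participants[txid]:
            self.participants[txid].append(rm)

    def commit(self, txid):
        # At commit, the TM invokes every participating RM.
        for rm in self.participants.pop(txid):
            rm.commit(txid)
```

An RM that registered but did no work for a transaction is not invoked when that transaction commits.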
84
Summary
  • Transaction processing systems comprise all parts
    of a system, software and hardware.
  • Building such a system requires considering
    end-to-end arguments at all levels of
    abstraction.
  • The performance of distributed TP systems is
    influenced by the hardware architecture (what is
    shared), by software issues (which protocols are
    used), and by configuration aspects (what limits
    scalability).
  • The multitude of those influences gives rise to a
    constant dilemma: Should one restrict the variety
    to a few (proprietary) components for better
    tuning and performance, or should one embrace all
    the standards for openness, at the risk of poor
    scalability and performance?