Hardware And Software Solutions To Handling Massive Databases - PowerPoint PPT Presentation

About This Presentation
Slides: 16
Provided by: robta

Transcript and Presenter's Notes


1
Hardware And Software Solutions To Handling
Massive Databases
  • By
  • Robert Taylor
  • Steve Connell
  • Mutasim Mohammed

2
Introduction
  • We have been witnessing an explosion in the
    amount of digitally stored data, the rate at
    which data is being generated and the diversity
    of disciplines relying on the availability of
    stored data.
  • The importance of database tools and support for
    massive datasets, including the need for computer
    hardware and software solutions, is well
    recognised but often ignored.

3
Introduction
  • A database is a collection of logically related
    data designed to meet the information needs of
    one or more users.
  • A database is a collection of records stored in a
    computer in a systematic way, so that a computer
    program can consult it to answer questions.

4
The Data Explosion
  • The data explosion is the primary cause of
    massive databases being created.
  • The consequences of ignoring data explosion can
    be very costly and in most cases can result in
    project, software and hardware failure.
  • The reality of data explosion in
    multi-dimensional databases is a surprising and
    widely misunderstood phenomenon.
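One way to see why growth in a multi-dimensional database is easy to underestimate: the number of cells in a fully materialised cube is the product of the dimension sizes, so each added dimension multiplies the total rather than adding to it. The sketch below is only an illustration; the dimension names and cardinalities are hypothetical and do not come from the presentation, and real systems typically store sparse cubes rather than every cell.

```python
from math import prod

def dense_cell_count(dimension_sizes):
    """Number of cells in a fully materialised (dense) cube."""
    return prod(dimension_sizes)

# Hypothetical dimension cardinalities, chosen only for illustration.
base = {"product": 1_000, "store": 100, "day": 365}
with_customer = {**base, "customer": 10_000}

cells_base = dense_cell_count(base.values())
cells_extended = dense_cell_count(with_customer.values())

print(f"3 dimensions: {cells_base:,} cells")
print(f"4 dimensions: {cells_extended:,} cells")
print(f"growth factor: {cells_extended // cells_base:,}x")
```

Adding a single dimension with 10,000 members multiplies the potential cell count by 10,000, which is the kind of jump that turns load times of seconds into hours.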

5
The Data Explosion
  • What can happen if companies, businesses or
    individuals just ignore data explosion and the
    creation of massive databases?
  • Load or calculation times can take hours rather
    than seconds
  • Large costs to build and maintain monolithic
    models
  • Expensive hardware might be required to process
    and accommodate exploded data
  • The hidden cost of failing to provide timely and
    relevant enterprise business intelligence or the
    inability to make fast business decisions

6
The Software Challenge
  • Organisations need to be able to store, retrieve
    and more importantly manage massive amounts of
    digital information.
  • One vital piece of software for managing this
    information is an MDDS (Massive Digital Data
    System).
  • For an MDDS to work as efficiently as possible,
    certain issues need to be addressed:
  • Scalability
  • Architecture
  • Data Models
  • Database Management Functions
  • Enforcing Integrity Constraints

7
Scalability Issues
  • A particular data management approach can be
    scaled to manage larger and larger databases but
    has limits.
  • A database can often sustain a certain amount of
    growth before it becomes too large for a
    particular approach.
  • More memory, storage and processors could be
    added.
  • A new hardware platform or operating system could
    be adopted.
  • A different microprocessor could be used.
  • However, once the database has reached the size
    limit of a particular approach, a new approach
    is needed.
  • The new approach could come in the form of a new
    architecture, a new data model or new algorithms
    to implement one or more of the functions of the
    DBMS.

8
Architecture Issues
  • The type of architecture will impact the size and
    response time of the DBMS.
  • This is especially relevant for massive
    databases, and is one reason an MDDS might be
    developed, adopted or integrated: the present
    DBMS architecture might not be able to cope with
    the new demand placed on it.
  • The MDDS architecture then raises issues of its
    own that need addressing.
  • Centralised approaches are being migrated to
    distributed and parallel approaches to handle
    large databases.
  • Some architectures, such as shared-nothing
    parallel architectures, can scale to thousands
    of processors but run into interprocessor
    communication issues.
  • A major MDDS architectural issue is managing the
    data transfer between the main memory and
    secondary storage.
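As a rough illustration of the shared-nothing idea, each row can be assigned to exactly one node by hashing its key, so that every node owns a disjoint slice of the data. This is a minimal sketch under stated assumptions, not how any particular DBMS does it: the function names, the use of CRC32 as the hash and the four-node cluster are all hypothetical.

```python
import zlib
from collections import defaultdict

def node_for_key(key: str, node_count: int) -> int:
    """Map a row key to a node using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % node_count

def partition(rows, node_count):
    """Group (key, value) rows by the node that owns them."""
    nodes = defaultdict(list)
    for key, value in rows:
        nodes[node_for_key(key, node_count)].append((key, value))
    return nodes

# Hypothetical data set: 1,000 customer rows spread over 4 nodes.
rows = [(f"customer-{i}", i) for i in range(1_000)]
placement = partition(rows, node_count=4)

for node_id in sorted(placement):
    print(f"node {node_id}: {len(placement[node_id])} rows")
```

A lookup by key touches only one node, which is why the approach scales. A query that needs rows from several nodes has to move data between them, which is where the communication issues mentioned above come from.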

9
Different Database Systems Available
  • There are numerous database packages available
    on the market.
  • The three main packages that businesses tend to
    use are:
  • IBM DB2
  • Oracle 10g
  • Microsoft SQL Server
  • Each of these packages has different attributes;
    in the end it is up to the business to choose
    which package suits it best.

10
Software Challenge Solution
  • Use the right software-based backup and recovery
    tools for a massive database:
  • Take a clean database image copy
  • Recover a complex application
  • Manage the log environment
  • Define application recovery groups
  • Use intelligent storage devices for backup and
    recovery
  • Prepare for automated database recovery

11
Hardware in Massive Databases
  • A massive database needs a massive amount of
    storage space.
  • If a business does not have enough storage,
    money will be lost because, for example,
    transactions cannot be stored.
  • There are different types of storage you can
    use, e.g.
  • Tape
  • Redundant Array of Independent Disks (RAID)
  • The choice of storage type is up to the system
    designer, but it must meet the requirements of
    the system.
12
How Hardware Can Affect the Database
  • Speed is a major factor with massive databases,
    in particular:
  • Speed of retrieving data
  • Speed of storing data
  • The network that the massive database runs on
    needs various pieces of hardware, e.g. switches
    and cabling. Both need to be of an adequate
    standard for the business to use the database
    effectively.

13
Databases and Topologies
  • For the database to run well on the network,
    topology is another point to think about.
  • Accessing the data in the database is very
    important.
  • When designing a system with a massive database,
    the topology connecting the PCs to the storage
    needs to be considered.

14
Different Topologies
  • There are different storage topologies to choose
    from when storing a lot of data:
  • Direct Attached Storage (DAS)
  • Network Attached Storage (NAS)
  • Storage Area Network (SAN)
  • All of these need to be considered when
    implementing a database, as each has a different
    effect on system performance.

15
Brief Summary
  • Distributed and parallel architectures are being
    investigated by various parties for managing
    massive databases.
  • Federated architectures are needed to integrate
    the existing different and disparate databases.
  • The existing databases could be massive
    centralised or distributed databases, relational
    systems, object-orientated systems or legacy
    systems.
  • When heterogeneous database integration is an
    additional problem, developing a standard
    uniform interface must also be taken into
    account.
  • It is not simply a case of adopting a type of
    system and expecting it to work, but more a case
    of fundamental system design/redesign which is
    specific to the individual system in place or
    about to be put in place.