Bluffer's guide to data

1
Bluffer's guide to data
  • Patricia Sleeman
  • Assistant archivist, UKNDAD

2
  • Brief and hopefully uncomplicated overview of
    data and databases from an archivist's
    perspective.

3
  • What
  • Issues and problems

4
So what is data?
  • Data consists of basic units of information. In a
    document, many items of data are assembled to
    present an argument or describe an activity.
    Until recently most data were preserved and
    transmitted on paper (or, in more ancient times,
    on other media such as stone). I like this image
    taken from the National Archives of Australia's
    site on e-permanence.

5
(No Transcript)
6
  • Sometimes we find data in the form of lists, as
    in a telephone directory, where the aim is not to
    present an argument but to provide the raw
    material for a future action (e.g. finding
    someone's phone number).
  • In this case it is crucial that the data be
    classified in such a way as to make searching
    easy (the names are listed in alphabetical order
    for each town or area).
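As a rough illustration of why classification makes searching easy, here is a minimal sketch in Python. The directory entries and the function name find_number are invented for the example. Because the names are kept in alphabetical order, a binary search can find an entry without reading the whole list.

```python
from bisect import bisect_left

# Hypothetical directory entries (name, phone number), invented for illustration.
directory = [
    ("Smith, J", "020 7946 0003"),
    ("Adams, R", "020 7946 0001"),
    ("Jones, M", "020 7946 0002"),
]

# Classify the data: keep the entries in alphabetical order by name.
directory.sort(key=lambda entry: entry[0])
names = [name for name, _ in directory]


def find_number(name):
    """Return the phone number listed for name, or None if it is not listed."""
    i = bisect_left(names, name)  # binary search, valid only on sorted data
    if i < len(names) and names[i] == name:
        return directory[i][1]
    return None
```

Calling find_number("Jones, M") returns that entry's number, while a name that is not listed returns None.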

7
  • As we well know, today data is stored
    increasingly in electronic format.
  • A collection of electronic records is termed a
    file. In paper record keeping practice, a file is
    a group of records which are accumulated in the
    conduct of business within a paper cover (which
    may be composed of several parts).
  • Electronic files may be thought of as similarly
    constructed. However, records within electronic
    files do not necessarily need to be physically
    collected together, as long as they are all
    identified as belonging to the same file and can
    be displayed as belonging to the same file.

8
  • Computer files are customarily categorized as
    text, data, or image files. Text files, which may
    be produced by word processing programs and other
    software, contain character-coded letters,
    numeric digits, punctuation marks and other
    symbols encountered in typewritten documents.
  • Data files contain records that are subdivided
    into one or more data elements called fields
    which store particular categories of information.
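To make records and fields concrete, here is a small sketch in Python. The file contents and field names are invented, and a comma-separated layout is only one assumed way a data file might be laid out. Each line after the header is one record, and each comma-separated value is one field.

```python
import csv
import io

# Hypothetical data file contents: a header line naming the fields,
# then one record per line. All values are invented for illustration.
raw = "surname,forename,year_of_birth\nAdams,Ruth,1950\nJones,Mark,1962\n"

reader = csv.DictReader(io.StringIO(raw))
records = list(reader)      # each record maps field name -> value
fields = reader.fieldnames  # the categories of information in every record
```

Here records[0]["surname"] is "Adams". Note that every value is read as text; nothing in the file itself says that year_of_birth is a number.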

9
  • So to be able to find a specific item of
    information, it has to be structured. There are
    two ways of structuring data, depending on the
    information's purpose:
  • Documents: this structure is used when data are
    arranged in an ordered fashion to present an
    argument or describe an activity.
  • Databases: data are placed in a 'pool' of
    information from which they can be retrieved.

10
NDAD
  • NDAD preserves and provides access to databases.
    Again these can be thought of in the most general
    terms as information which can be represented as
    a set of tables of information, in which columns
    contain a particular data item, and rows identify
    subjects for which data items are recorded, very
    much like a register.
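The register idea can be sketched in a few lines of Python. The column names and rows below are invented, and this is only an illustration of the rows-and-columns view, not how NDAD holds its tables.

```python
# Hypothetical register: column names plus one row per subject.
columns = ("id", "name", "parish")
rows = [
    (1, "Adams", "Hackney"),
    (2, "Jones", "Camden"),
    (3, "Smith", "Hackney"),
]


def column(name):
    """Return every value recorded under one column (one data item)."""
    i = columns.index(name)
    return [row[i] for row in rows]


def row_for(subject_id):
    """Return the row for one subject as a column-name -> value mapping."""
    for row in rows:
        if row[0] == subject_id:
            return dict(zip(columns, row))
    return None
```

Reading down with column("parish") gives one data item for every subject; reading across with row_for(2) gives every data item for one subject.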

11
Databases
  • Although the elements in these tables are
    typically numbers or simple text, they may be
    pictures or even more complex multimedia items
    such as sound, documents, video or GIS data.
  • The key attribute which causes a particular set
    of computer records to be selected for
    preservation in NDAD rather than in the PRO's
    related EROS project is this tabular nature of
    the data, and its ability to be processed or
    analysed in some way by the computer system.

12
Relational database
  • A database consisting of multiple tables (rows
    representing records and columns representing
    attributes) whose relationship is explicit. For
    example, a personnel database might consist of
    separate tables for employees, salary,
    department, and family data with the employee
    number (a unique identifier) being the link
    between the various tables. (Dollar)
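The personnel example can be sketched with Python's built-in sqlite3 module. The table names, column names and values are invented for illustration, and only three of the four tables are shown. The employee number (emp_no) is the explicit link that lets the tables be joined.

```python
import sqlite3

# In-memory database; all names and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE salary (emp_no INTEGER REFERENCES employees(emp_no), amount INTEGER);
CREATE TABLE department (emp_no INTEGER REFERENCES employees(emp_no), dept TEXT);
INSERT INTO employees VALUES (1, 'Adams'), (2, 'Jones');
INSERT INTO salary VALUES (1, 30000), (2, 35000);
INSERT INTO department VALUES (1, 'Archives'), (2, 'Records');
""")

# Follow the explicit relationship: join the tables on the employee number.
result = conn.execute("""
SELECT e.name, d.dept, s.amount
FROM employees e
JOIN salary s ON s.emp_no = e.emp_no
JOIN department d ON d.emp_no = e.emp_no
ORDER BY e.emp_no
""").fetchall()
```

Each tuple in result brings together, for one employee, values that are stored in three separate tables.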

13
Metadata
  • So EROS deals with what are essentially digital
    forms of traditional paper documents, but NDAD
    handles this kind of structured data? Not so
    simple, as NDAD does actually contain some
    digital documents, and digital forms of paper
    documents. These are documents which are
    strongly associated with the datasets and are key
    to their full understanding, and they form part
    of the archive with the datasets proper. Such
    documents include

14
Data dictionary
  • At the beginning of this presentation I said that
    data is the same as information. Well, I'm going
    to contradict myself: data on its own is not
    information, as it depends on a data dictionary to
    make this statement true.
  • A data dictionary is integral to our
    understanding of data held within a database. It
    holds information about the definition, structure
    and usage of data, which may include the name of
    each data element, its definition (size and
    type), where and how it is used, and its
    relationship to other data.
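A minimal sketch of the idea in Python follows. The dictionary layout, the element names and the department codes are all assumptions made for the example, not the structure of any real data dictionary. The point is that the raw values only become information once the dictionary says what each one is.

```python
# Hypothetical data dictionary: for each data element, its type, size,
# definition and (where relevant) the meaning of its coded values.
data_dictionary = {
    "emp_no": {"type": int, "size": 6, "definition": "Unique employee number"},
    "name": {"type": str, "size": 30, "definition": "Employee surname"},
    "dept": {
        "type": str,
        "size": 3,
        "definition": "Department code",
        "codes": {"ARC": "Archives", "REC": "Records"},
    },
}

# A raw record as it might sit in the data file: values with no labels.
raw_record = ["000042", "Adams", "ARC"]


def interpret(raw):
    """Turn raw field values into labelled, typed information."""
    out = {}
    for (element, spec), value in zip(data_dictionary.items(), raw):
        typed = spec["type"](value)
        # Replace a coded value with its meaning when the dictionary has one.
        out[element] = spec.get("codes", {}).get(typed, typed)
    return out
```

Without the dictionary, "ARC" is just three letters; interpret(raw_record) reports it as the department "Archives".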

15
So what's a dataset?
  • A computer file or related set of computer
    files, forming part of NDAD, which is organised
    under a single descriptive title and is capable
    of being described as a unit in the finding
    aids. It may comprise one or more accessions.
    (NDAD glossary)

16
Types of datasets
  • Dynamic: where a database is 'open' within its
    active life and the data is overwritten. This
    data could be archived at regular intervals.
  • Closed/static: where data is continually added to
    the database and is not overwritten. Again, this
    data could have been archived at regular
    intervals.
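Why regular archiving matters for a dynamic dataset can be shown with a toy sketch in Python. The live "database" here is just a dictionary, and the labels and values are invented. Once a value is overwritten, only an earlier snapshot still holds the old state.

```python
import copy

# Hypothetical open (dynamic) database: values may be overwritten in place.
live = {"emp_1": "Archives"}
snapshots = []


def archive(label):
    """Store an independent copy of the current state under a label."""
    snapshots.append((label, copy.deepcopy(live)))


archive("snapshot-1")
live["emp_1"] = "Records"  # overwrite: the old value is gone from the live data
archive("snapshot-2")
```

After the overwrite, the first snapshot is the only place where the value "Archives" survives.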

17
Permanent preservation
  • At NDAD data is stored as a flat file. This is a
    file of records whose organisation is rectangular
    and does not support hierarchical relations. A
    record consists of a series of attributes called
    data elements. The relationship among and between
    data elements is implicit.
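As an illustration only (the exact flat-file format NDAD uses is not described here, so a comma-separated layout is assumed), this Python sketch turns invented hierarchical data into a rectangular flat file. Every record has the same data elements, and the parent-child relationship survives only implicitly, through repeated values.

```python
import csv
import io

# Hypothetical hierarchical data: each employee has a nested list of dependants.
employees = [
    {"emp_no": 1, "name": "Adams", "dependants": ["Ann", "Ben"]},
    {"emp_no": 2, "name": "Jones", "dependants": ["Cal"]},
]


def flatten(people):
    """Yield one rectangular row per dependant, repeating the parent fields."""
    for person in people:
        for dependant in person["dependants"]:
            yield [person["emp_no"], person["name"], dependant]


# Write the rows to an in-memory text buffer standing in for the flat file.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["emp_no", "name", "dependant"])
writer.writerows(flatten(employees))
flat_file = buffer.getvalue()
```

Nothing in the resulting file states that Ann and Ben belong to the same employee; a reader has to infer it from the repeated emp_no.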

18
We are now in the era of multi-dimensional data
created by GIS, object oriented databases,
hypermedia and multimedia, even virtual reality
research environments, that cannot be retained in
software-independent (or flat) formats and with
which archivists need to be involved from the
moment of creation if they are going to have a
hope of managing them. (A. Gilliland-Swetland,
'For the record')