Title: M150: Data, Computing and Information
Unit Five: Storing, Getting and Sending Your Data
1- Introduction
- The aims of this unit are to:
- Describe the notion of persistent data, how it is created, and how it is stored and accessed (logically and physically) on various types of storage device.
- Explain how the internet and the applications that use it work, and address some of the issues that arise from transmitting data between computer systems.
- Explain how databases facilitate the storage, access and protection of data, and how metadata is important in providing access to multimedia databases.
- Explore the issues of privacy and ownership of data and analyze some of the risks arising from storing data on computers and transmitting it across networks.
2- Storing and accessing data in documents
- Storing text-based data in documents
- Persistent data: for documents to persist (to exist after you close the application that created them or switch off your computer), they need to be saved.
- To facilitate subsequent retrieval, you store your documents in some logical arrangement on a storage medium suitable for holding persistent data, such as your computer's hard disk.
- Example of an organized arrangement (a strategy for retrieving your documents quickly): a filing cabinet has several drawers and each drawer may hold a large number of files. Each file contains a number of related documents arranged in alphabetical or date order, so they can be retrieved quickly.
- If you access your hard disk from the computer desktop, you will find a number of icons there. Many of these icons contain other objects (documents and folders). Inspecting the contents of the disk this way reveals what is termed a flat structure.
- A document is the lowest level of the hierarchy, at which there are no further folders to open.
- In a hierarchical or nested folder structure, each folder may contain other folders. At each stage you can see only those documents which are stored at that level.
- You can use Windows Explorer to inspect the contents of a disk (right-click on a disk, then choose Explore). An Explorer window has two panes. The right-hand pane is essentially the same as an ordinary folder window: its contents can be displayed as icons, as small icons or as a list.
- The left-hand pane does not show any documents, but it does show hard disks, folders and any icon which holds other items (such as a network share). Some folders have a small + or - next to them.
- You can think of the folder structure loosely as a tree lying on its side. The desktop is the root of the tree, and each folder is a branch. The leaves of the tree correspond to documents.
- Any similar hierarchical arrangement of objects is frequently called a tree structure or just a tree.
- The number of documents or folders that can be stored in a single folder is not limited, but it is best not to put too many items in one folder, since the human mind does not easily comprehend huge collections (preferably do not exceed 20 items).
- The path of a folder or document contains the names of all the folders (branches of the tree) that lead to it from the root.
- A path allows you to identify a folder or document unambiguously and is often referred to as its full name or full path name (documents in the same folder must have different names).
- The Search/Find function
- Operating systems come with a search function which allows you to find items you have lost.
- To find a document, click on Search in the Start menu, and a dialog box will appear.
- Consider the Windows path name
- C:\Projects\M150\Assignments\TMA02.doc
- C: is the root. Projects is the name of a folder at the top level of the hard disk, which contains a folder called M150, which in turn contains an Assignments folder. The document TMA02.doc is in the Assignments folder.
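As a rough illustration of how a full path name breaks down into root, folders and document, the path name above can be taken apart with Python's standard pathlib module. This is only a sketch: the variable names are chosen for illustration, and PureWindowsPath handles the path as text, so the document does not need to exist.

```python
from pathlib import PureWindowsPath

# The full path name from the slide; nothing is read from disk.
path = PureWindowsPath(r"C:\Projects\M150\Assignments\TMA02.doc")

root = path.anchor            # drive plus root separator
folders = path.parts[1:-1]    # the branches leading from the root to the document
document = path.name          # the leaf of the tree

print("Root:    ", root)
print("Folders: ", " > ".join(folders))
print("Document:", document)
print("Folder holding the document:", path.parent)
```

Two documents can share the name TMA02.doc as long as their folders differ, because the full path names are then different.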
- Directories
- Each folder has a list, or directory, of the folders and documents that it contains. Part of the directory for a given folder can be displayed in a number of ways to aid human identification of the contents:
- Alphabetically by name (an obvious ordering).
- In order of last modification date.
- By size (sometimes useful).
- By type.
- The directory of a folder also lists the address or physical location on the disk of each document and subfolder in that folder. This address is internal to the operating system and cannot be seen in a user window. In the case of a subfolder, the address given is the location of the directory of that subfolder.
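The four display orderings can be sketched with a small in-memory directory. The Entry record, its fields and the sample entries below are hypothetical and only stand in for what a real directory holds; the point is that the same list is sorted on a different key each time.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Entry:
    """One line of a hypothetical folder directory."""
    name: str
    modified: date
    size: int      # bytes
    kind: str      # document type

# Made-up sample entries, for illustration only.
entries = [
    Entry("TMA02.doc", date(2004, 3, 1), 48_000, "doc"),
    Entry("notes.txt", date(2004, 1, 15), 2_000, "txt"),
    Entry("Budget.xls", date(2004, 2, 10), 120_000, "xls"),
]

# The same directory, viewed in four different orders.
by_name = sorted(entries, key=lambda e: e.name.lower())
by_date = sorted(entries, key=lambda e: e.modified)
by_size = sorted(entries, key=lambda e: e.size)
by_type = sorted(entries, key=lambda e: e.kind)

for label, view in [("name", by_name), ("date", by_date),
                    ("size", by_size), ("type", by_type)]:
    print(label, [e.name for e in view])
```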
- Storage technologies
- There are various technologies for data storage.
- Data can be stored on the hard disk, or on removable storage media such as CDs, DVDs, Zip disks and high-capacity tape cartridges.
- Capacity: the number of bytes of data that can be held on a storage medium.
- There are various measures of storage size (capacity).
- Hard disk storage
- A hard disk is large enough to store several multi-volume encyclopedias. Hard disks with a capacity of up to a terabyte are available.
- Hard disks are coated with a magnetic material that can be magnetized into a pattern representing a sequence of bits. The surface consists of millions of tiny magnets, each of which can be magnetized in one of two possible directions, representing 1 or 0 (computerized data is held and transmitted as sequences of bits).
- A hard disk is 1 to 3 inches in diameter, and consists of one or more circular plates, each having two surfaces. The plates may be aluminum, ceramic or glass, and the surfaces are coated with a magnetizable ferrite coating. Data is recorded on each surface by magnetizing a series of concentric circles called tracks.
- The disk surface is divided into a number of equal-sized wedge-shaped regions called sectors. Within a sector each track holds the same amount of data, usually 512 bytes.
- A block is the basic unit of data handled by the disk control mechanism (each block of data is guaranteed to be the same size).
- A bucket is the unit of transfer: several blocks moved between disk storage and the computer's memory in a single operation.
- A sector holds the same amount of data on each track, even though the outer tracks are longer than the inner tracks. This is because the bytes are packed more closely on the small inner circles than on the larger outer circles.
- The actual reading from and writing to the disk surface is performed by a read/write head, which is attached to an arm that moves towards and away from the center. The disk is kept spinning continuously, so each sector is under the head at some time.
- The head hovers close to the spinning surface, and the mechanism needs to be engineered carefully to avoid physical contact between the head and the surface.
- Causes of a disk crash:
- 1- Physical contact between the head and the surface.
- 2- A dust particle getting into the tiny gap (5 microns or less) between the head and the surface.
- For each plate in a disk there are two read/write heads, one for each surface. In a read operation the head detects a magnetized pattern. In a write operation, the head magnetizes the relevant pattern of bits on to the surface.
- The heads associated with all surfaces move in and out together, so at any one time they can read from the corresponding tracks on both surfaces of every plate in the disk. This set of tracks is called a cylinder (a disk may have 16,000 cylinders, i.e. each surface would have 16,000 tracks on it).
- Typical rotation speeds for hard disks are 5,000 to 15,000 revolutions per minute. On a disk which rotates at 10,000 rev/min and has 60 sectors, a 512-byte block of data takes 0.1 millisecond (ms) to pass under the head.
- On average it takes half a rotation (3 ms) for the desired sector to reach the head; this is called the average latency. You also need to add to this the seek time, which is the time taken for the head to move to the relevant track (3 to 10 ms).
- What is the storage capacity in GB of a disk with 16 read/write heads, 40,000 cylinders, and each surface formatted into 50 sectors?
- The number of track sectors = 16 x 40,000 x 50 = 32,000,000
- Since each sector holds 512 bytes,
- capacity = 32,000,000 x 512 bytes = 15 GB (approx.)
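The timing and capacity figures above can be checked with a few lines of arithmetic. The sketch below assumes 512-byte sectors and takes 1 GB to mean 2^30 bytes, which is the convention that gives the slide's figure of roughly 15 GB; the function names are chosen for illustration only.

```python
BYTES_PER_SECTOR = 512
GB = 2 ** 30  # assumption: binary gigabyte, which matches the slide's ~15 GB

def rotation_time_ms(rev_per_min):
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rev_per_min

def sector_pass_time_ms(rev_per_min, sectors_per_track):
    """Time for one sector (one 512-byte block) to pass under the head."""
    return rotation_time_ms(rev_per_min) / sectors_per_track

def average_latency_ms(rev_per_min):
    """On average the desired sector is half a rotation away."""
    return rotation_time_ms(rev_per_min) / 2

def capacity_bytes(heads, cylinders, sectors_per_track):
    """One head per surface, one track per surface in each cylinder."""
    return heads * cylinders * sectors_per_track * BYTES_PER_SECTOR

print("Block pass time (ms):", sector_pass_time_ms(10_000, 60))
print("Average latency (ms):", average_latency_ms(10_000))
print("Capacity (GB):", round(capacity_bytes(16, 40_000, 50) / GB, 2))
```

With decimal gigabytes (10^9 bytes) the same disk would be quoted as about 16.4 GB, so the choice of unit is worth stating whenever a capacity is given.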
- Removable storage media devices
- Zip drive: accepts removable disks of 100 MB or 250 MB capacity. It works on exactly the same principle as a fixed hard disk, but you can change the disk in the drive (so the disks are portable).
- Floppy disks (which work in the same way as Zip disks): capacity limited to 1.4 MB.
- Memory card: a removable medium which is very popular.
- Optical discs, which may be CD (compact disc) or DVD (digital versatile disc), store documents using a different technology based on the optical properties of the surface.
- The capacity of a CD is 650 MB.
- Data is stored on only one side of a CD, in a single spiral groove which winds round the disc 22,188 times.
- Outer turns of the groove hold more data than inner ones (data is packed uniformly along the groove), so the disc spins more slowly when accessing data near the outer edge and faster when accessing data near the center.
- Conventional CDs are called CD-ROMs (Read-Only Memory), and have bits of data stored as pits in their groove. Beams of laser light are used to burn the pits on the disc. A CD drive works by shining a low-power laser beam on the disc and detecting the presence or absence of a pit (the pits do not reflect the light).
- DVDs (DVD-ROMs) pack the data more tightly, using:
- Smaller pits.
- A narrower groove.
- Less overhead for error correction.
- These factors increase the capacity of a simple DVD to 4.7 GB.
- DVDs can also be manufactured to use both sides of the disc, and each side can have one or two layers, yielding a theoretical maximum capacity of about 19 GB (2 sides x 2 layers x 4.7 GB).
- One important difference between CD/DVD-ROM discs and magnetic disks is the ability to write to them. Hard disks allow you to rewrite data, but standard optical discs do not (once a pit has been burned, it cannot be erased).
- There are two kinds of CD which computer users can write to:
- Recordable CDs (CD-R)
- Instead of burning pits on the CD, the writing process dyes the relevant parts of the groove.
- When read by a CD drive, these dye spots are indistinguishable from pits on a conventional CD. The process is not reversible.
- Rewritable CDs (CD-RW)
- These use a different technology altogether (for this reason, not all CD drives can read CD-RW discs).
- A CD-RW writer can heat a point on the disc to one of two temperatures, corresponding to different states of the material. This process is reversible, so you can write to a CD-RW many times, just as with hard and floppy disks.
- Labeling volumes
- Typically a Zip disk, a CD or a hard disk is called a volume.
- It is possible to partition a disk so that its contents appear to occupy more than one volume.
- Every CD or other removable medium needs to be identifiable, so it is important that each is given a label with a title. The volume can then be stored in a rack (holder) with many similar-looking ones and still be identified by its external label.
- Besides its physical label, a volume should also have an electronic label which, for consistency, should be the same as the physical label. This electronic label is the name of the volume, and it will be displayed when you search the contents of your computer.
- Sensible organization of storage
- Each volume contains a large number of documents, so there has to be a means of locating the one you want.
- On a magnetic disk, three numbers are required to identify a block of data: cylinder number, surface number and sector number. This set is called the address of the block.
- Each volume has a volume table of contents, or VTOC. The VTOC is a table with one line for each document.
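A block address and a table of contents can be pictured with a small data structure. The sketch below is illustrative only: the slide does not say what each VTOC line contains, so recording the address of a document's first block is an assumption, and the document names and numbers are made up.

```python
from typing import NamedTuple

class BlockAddress(NamedTuple):
    """The three numbers that identify a block on a magnetic disk."""
    cylinder: int
    surface: int
    sector: int

# Hypothetical VTOC: one line per document.
# Assumption: each line records where the document's first block is.
vtoc = {
    "TMA02.doc": BlockAddress(cylinder=120, surface=3, sector=17),
    "notes.txt": BlockAddress(cylinder=4, surface=0, sector=52),
}

def locate(document_name):
    """Look up a document's starting block address in the VTOC."""
    return vtoc[document_name]

address = locate("TMA02.doc")
print(f"cylinder {address.cylinder}, surface {address.surface}, sector {address.sector}")
```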
- A single document might occupy one or more blocks on the disk.
- At the end of each block there is a marker which either indicates that this is the final block for the document or gives the address of the block that holds the next portion of the document.
- Moving documents
- Moving a document between folders on a disk is really an illusion, because the document does not move at all!
- The document's physical location remains unchanged; only the directories change (as illustrated on page 21).
- The individual items are stored on the volume exactly where they were before the move.
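Both ideas on this slide, blocks chained together by end markers and a move that only touches directories, can be sketched with plain dictionaries. Everything here is a simplified model under stated assumptions: the addresses, folder names and contents are invented, and None stands for the "final block" marker.

```python
# Block address -> (data, marker). The marker is the address of the next
# block, or None to signal that this is the document's final block.
blocks = {
    (12, 0, 5): ("Dear ", (12, 1, 9)),
    (12, 1, 9): ("tutor", (40, 3, 2)),
    (40, 3, 2): (", hello", None),
}

def read_document(first_address):
    """Follow the chain of markers from the first block to the last."""
    pieces = []
    address = first_address
    while address is not None:
        data, address = blocks[address]
        pieces.append(data)
    return "".join(pieces)

# Hypothetical directories: folder -> {document name: address of first block}.
directories = {
    "Drafts": {"letter.doc": (12, 0, 5)},
    "Sent": {},
}

def move(name, source, target):
    """Moving changes two directories; no block on the disk is touched."""
    directories[target][name] = directories[source].pop(name)

move("letter.doc", "Drafts", "Sent")
print(directories)
print(read_document(directories["Sent"]["letter.doc"]))
```

After the move the document is listed under a different folder, yet it still starts at the same block address and reads back unchanged.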
- Deleting documents
- Modern operating systems usually have mechanisms to protect users against themselves.
- When you delete a document, the operating system does not obey your instruction literally but instead moves the document to a special folder called the Recycle Bin or Trash, from which it can be retrieved.
- When your hard disk is becoming too full, you can decide to empty the bin.
- When the documents were moved to the Recycle Bin, they did not go anywhere: they remained in the same physical position on the disk. It was the directory entry for the document that was removed, with a new directory entry being created in the Recycle Bin.
- What you perceive when you navigate through the folders on your computer is not where the documents are located physically, but where they are located logically. That is, you are given a logical view of your documents which shows their relationship to each other in a hierarchical (nested) structure.
- The operating system hides from you where items are located physically. When you empty the bin, the document does not need to be moved; it is only marked for deletion.
- So the document may remain on your disk for a long time without being overwritten. However, it is inaccessible since its directory entries have disappeared.
- It may be possible to recover the deleted document using a disk recovery utility (an application that can find documents without using directories).
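The life cycle of a deleted document can be modeled in a few lines. This is a toy sketch, not how any particular operating system implements it: the disk is a dictionary of block contents, the folder and bin are directories, and "marked for deletion" is modeled as adding the block to a set of reusable addresses.

```python
# Toy model: block address -> contents (what is physically on the disk).
disk = {101: "budget figures", 102: "holiday list"}

# Directories: document name -> block address.
folder = {"budget.xls": 101, "holiday.txt": 102}
recycle_bin = {}
free_blocks = set()   # blocks marked as reusable after the bin is emptied

def delete(name):
    """Deleting only moves the directory entry into the bin."""
    recycle_bin[name] = folder.pop(name)

def empty_bin():
    """Emptying the bin drops the entries and marks the blocks reusable."""
    free_blocks.update(recycle_bin.values())
    recycle_bin.clear()

def recovery_scan():
    """A recovery utility ignores directories and inspects the blocks directly."""
    return {address: disk[address] for address in sorted(free_blocks)}

delete("budget.xls")
empty_bin()
print("Folder now lists:", list(folder))
print("Still on disk:   ", recovery_scan())
```

The data can be found by the scan only until some new document is written over the freed block.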
- Other storage media
- You cannot afford to rely on the single copies held on your hard disk. Instead you need a strategy for backing up your work.
- Magnetic tape is a linear storage medium which is slow and difficult to access: there is no direct access as there is with disks.
- The main strengths of tape are its high capacity, its reusability and its cheapness.
- Magnetic tape is ideal for data backup and archiving.
- Holographic storage: a hologram is a three-dimensional image made with the aid of a laser, which allows much higher volumes of data to be stored.
- Biological storage media: the basic idea is to represent 0s and 1s using two color states of a suitable form of synthetic DNA.
- A number of such memory units would be attached to a support substrate to form a memory cell.
3- Transmitting data
- Computer networking
- The internet has become a part of society, like the telephone, radio and television.
- The web, which is based on the internet, has become the platform on which all kinds of information are disseminated. Examples include educational systems and e-commerce (buying and selling goods and services on the web, with its own computing practices and legal framework).
- Besides the internet, many other computer networks exist (organizational networks such as those of banks, the police, travel agents and airlines).
- The computers in a network are linked together by communication links. These links may be:
- Dedicated cable links.
- Public telephone networks.
- Radio or microwave links.
- Any organization using more than one computer is likely to have a local area network (LAN) to exploit the benefits of resource sharing. A LAN may be contained within one building, or it may span several buildings on the same site.
- Pocket-sized computers known as PDAs (personal digital assistants) can communicate with each other and with desktop computers using infra-red or Bluetooth signals. They form a small local network.
- Many resources can be shared across a LAN, such as data, a laser printer and a connection to the internet.
- The internet
- The internet comprises a huge collection of computers (called hosts) with telecommunications links between them.
- The internet has its roots in the American military-funded research community of the early 1970s. The first applications to use the internet were based purely on text.
- The internet then began to be used for email and for file transfer. Modern graphical tools for accessing the internet (like Netscape Navigator) are much more recent, dating back to the early 1990s.
- ARPANET was a network of just four computers at four universities linked together as a project of the Advanced Research Projects Agency (ARPA) in the United States.
- By 1996 the number of host computers had grown to 15 million, and by 2002 it had multiplied ten times to 150 million.
- The internet links together not just one type of computer but any type of computer running any operating system. By adopting the internet protocol, each of these computers can become an internet host.
- The telephone system (which uses analogue signals consisting of a continuously varying voltage) was designed for voice transmission.
- Because your computer communicates using digital signals (consisting of discrete bit patterns), a piece of equipment called a modem (modulator-demodulator) sits between your computer and the telephone socket.
- The modem converts the data signals from the computer into analogue signals, and a modem at the other end converts the signal back into digital form (a typical modem downloads data at 56 kbps).
- ADSL (Asymmetric Digital Subscriber Line) is a technology which allows data to be transmitted digitally at high speed (typically 400 kbps) over conventional copper telephone wires.
- Browsing the web
- The web is a collection of hypertext documents distributed worldwide and linked by the internet.
- The value of the web is that trillions of pages of web content are linked together via multiple hyperlinks.
- A web browser is the software you use to access and view documents on the web.
- The basic unit of web content is the web page, which is an HTML document.
- The browser accesses the page, held on a remote computer (the web server), and downloads it to your computer (the client).
- The speed of download depends on the amount of data, the speed of the modem, the quality of the phone line, the speed of your computer and the amount of traffic on the internet.
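To get a feel for the first two factors, amount of data and link speed, a best-case download time can be estimated as the number of bits divided by the line rate. The sketch below uses the 56 kbps modem and 400 kbps ADSL rates quoted earlier in this unit; the 100,000-byte page size is a made-up example, and line quality, computer speed and internet traffic are ignored, so real downloads would take longer.

```python
def download_seconds(size_bytes, link_kbps):
    """Best-case transfer time: bits to send divided by bits per second."""
    bits = size_bytes * 8
    return bits / (link_kbps * 1000)

page_size = 100_000  # hypothetical web page of 100,000 bytes

modem_time = download_seconds(page_size, 56)    # dial-up modem rate
adsl_time = download_seconds(page_size, 400)    # typical ADSL rate from the slide

print(f"56 kbps modem: {modem_time:.1f} s")
print(f"400 kbps ADSL: {adsl_time:.1f} s")
```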