Title: What are files and file systems
1 What are files and file systems
- Most files fall into one of three categories
- Executable files - applications and utilities like Netscape, Excel, Oracle
- Executable files are further divided into
  - applications - people use them directly for tasks
  - system software - OS, comms software, databases
  - utilities - conversion programs etc.
- Data files - Word documents, databases, spreadsheets, mpgs, jpgs, mp3, wav etc.
- Support files - configuration, initialisation, database indexes
- There are probably other types. These categories are not exact
2 Files - what and where are they
- Files are binary streams of data or instructions.
- They are stored on disc drives, CDROMs, DVDs, tapes and several other media.
- From these storage media they are usually transferred, in part, to a computer's memory.
- They can be sent along telecommunication channels as streams of digital data through LANs, WANs and the Internet.
3 Accessing files
- When we want to move a file from storage to memory, we open the file. The original file remains on the storage device.
- When we no longer want it in memory we close the file. This does not save changes that we may have made to the file.
- When making changes to data files, we must save the file before closing it. Saving updates the data in the original file in storage.
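
The open, change, save, close cycle above can be sketched in Python. This is only an illustration under stated assumptions: the file name `letter.txt` and the temporary directory standing in for "storage" are made up for the example, and real applications handle saving in their own ways.

```python
import os
import tempfile

# Create a small file in "storage" (a temporary directory, for illustration).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "letter.txt")  # hypothetical file name
with open(path, "w") as f:
    f.write("Dear Sir")

# Open: copy the contents from storage into memory, then close the file.
with open(path) as f:
    in_memory = f.read()

# Change the copy in memory. The file in storage is untouched.
in_memory += ", thank you."
with open(path) as f:
    unsaved_check = f.read()   # still the original text

# Save: write the copy in memory back over the file in storage.
with open(path, "w") as f:
    f.write(in_memory)
with open(path) as f:
    saved_check = f.read()     # now the changed text
```

Until the save step runs, `unsaved_check` shows that the stored file still holds the old contents.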
4 File types
- Although all files are simply binary streams of data they behave quite differently and are seen as being different.
- When we open a wav or mp3 file we expect to hear sounds. When we open a jpg or gif file we expect to see a visual image. These representations are not intrinsic to the stream of data; they are only apparent because some application or other has been designed to represent them in that way.
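
A small Python sketch can make this point concrete: the same four bytes give text, one number, or two sound samples depending only on how the reading program chooses to interpret them. The byte values are arbitrary and chosen just for the example.

```python
import struct

raw = bytes([72, 105, 33, 0])   # four arbitrary bytes

# Read the first three bytes as ASCII characters.
as_text = raw[:3].decode("ascii")

# Read all four bytes as one little-endian unsigned 32-bit integer.
as_number = struct.unpack("<I", raw)[0]

# Read the same four bytes as two little-endian signed 16-bit values,
# the way an uncompressed sound file might store two samples.
as_samples = struct.unpack("<2h", raw)
```

Nothing in `raw` says which reading is the right one; that knowledge lives in the application.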
5
- When we open (or run) an executable file, the OS loads it into memory and begins to carry out the instructions in the file. (If it can - the file may be for the wrong OS.)
- When we open a data file, the data is also loaded (partly) into memory, but the OS will present it to the user by first opening the application associated with that file.
- If a file contains text data but the OS does not know what application the file is associated with, the OS may open it with a text editor.
- Otherwise, the OS simply cannot open it.
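
The decision described above can be sketched as a lookup with a fallback. This is a simplified model, not how any particular OS does it: the `ASSOCIATIONS` table, the application names and both function names are hypothetical.

```python
import os

# Hypothetical association table: file extension -> application.
ASSOCIATIONS = {
    ".doc": "word processor",
    ".jpg": "image viewer",
    ".mp3": "audio player",
}

def looks_like_text(data: bytes) -> bool:
    """Rough guess: text has no zero bytes and decodes as UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def choose_application(filename: str, first_bytes: bytes):
    """Pick an application for a data file, or None if it cannot be opened."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in ASSOCIATIONS:
        return ASSOCIATIONS[extension]
    if looks_like_text(first_bytes):
        return "text editor"      # fallback for unassociated text files
    return None                   # the OS cannot open it
```

For example, `choose_application("notes.xyz", b"hello")` falls through to the text editor, while an unassociated file full of binary data returns `None`.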
6 OS, FAT and FMS
- Many operating system functions concern files.
- Some OS, such as MS-DOS, use a File Allocation Table (FAT) to organise the storage of files on disc etc. Other OS keep equivalent tables under different names.
- If an application has a large number of related files, it may use a File Management System (FMS) to organise these. The FMS still uses the OS to save and open files etc.
- Databases are a special class of FMS.
7 Naming conventions
- Some operating systems support file names in two parts - the real name and an extension.
- In MS-DOS based systems, files had names like
  - excel.exe - an exe file is an executable file
  - letter.doc - doc is an MS Word document (data)
  - word.cfg - cfg is a configuration file
- In some OS it is necessary to tell the OS which applications to associate with each file type so that the OS can open an appropriate application.
8 Naming conventions 2
- A file extension is usually separated from the filename by a punctuation mark, often a full stop (.).
- In some OS, the names of directories are separated from the names of files by punctuation, often a slash (/).
- Because of their special uses, these punctuation marks cannot be used freely in filenames in those OS.
- e.g. Analysis/essay1.2 is not a valid filename: the OS reads it as a file essay1 with extension 2 in a directory called Analysis.
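
How an OS with these conventions would take the name apart can be shown with Python's standard `pathlib` module. The sketch assumes the POSIX convention (`/` between directories, `.` before the extension).

```python
from pathlib import PurePosixPath

p = PurePosixPath("Analysis/essay1.2")

directory = str(p.parent)   # the part before the slash
name = p.stem               # the file name without its extension
extension = p.suffix        # the part from the last full stop onwards

q = PurePosixPath("excel.exe")
q_parts = (q.stem, q.suffix)
```

The slash and the full stop are consumed as separators, which is why they cannot also serve as ordinary characters in the name.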
9
- Another way to tell the OS what type a file is: files often contain metadata, i.e. data which is not part of the real data but tells us about the real data.
- A Word document file contains the following metadata
@ñÿ N o r m a l mH lt A@òÿ lt
D e f a u l t P a r a g r a p h F o n t
ÿÿÿÿ
ÿÿ P e
t e r H y l a n d C \ H y l a n d D o c u
m e n t s \ t e s t . d o c ÿ@ t
@ @ G
T i m e s N e
w R o m a n 5
S y m b o l 3
A r i a l " ñ Ð h ÛcfÛcf
þÿ
àòùOh '³Ù0 x
Ì Ø ä ô
4 @ L X
h p ä test f
est Peter Hyland o ete
ete Normal y Peter Hyland o
1 te Microsoft Word 8.0 @ @
4oáltÕÁ@ 4oáltÕÁ
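
One common kind of metadata is a short signature at the very start of a file. The sketch below checks the first bytes against a few widely published signatures; the table is deliberately tiny and the function name `sniff_type` is made up for the example, so treat it as an illustration rather than a complete detector.

```python
# A few well-known file signatures (the first bytes of the file).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    # Container format used by older Word .doc files, among others.
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE compound file"),
]

def sniff_type(first_bytes: bytes) -> str:
    """Guess a file's type from its leading bytes, ignoring its name."""
    for signature, description in SIGNATURES:
        if first_bytes.startswith(signature):
            return description
    return "unknown"
```

A program using this approach can recognise a GIF even if the file has been renamed with the wrong extension.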
10 File Compatibility and standard formats
- Not all applications can read the metadata in a file.
- If an application can read the metadata in a file it will still need to convert the data in the file into its own native format.
- Some file formats, e.g. text files, are almost universally accepted by applications.
- Two special types of text files are
  - space delimited - each word is separated by a space
  - comma delimited - words are separated by commas
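
Both delimited formats are easy to read with a few lines of Python, which is part of why they are so widely accepted. The sample rows below are invented for the illustration.

```python
import csv
import io

# The same made-up data in the two formats.
comma_text = "Ann,Lee,Perth\nBob,Ray,Hobart\n"
space_text = "Ann Lee Perth\nBob Ray Hobart\n"

# Comma delimited: the standard csv module splits each line on commas.
comma_rows = list(csv.reader(io.StringIO(comma_text)))

# Space delimited: split each line on single spaces.
space_rows = [line.split(" ") for line in space_text.splitlines()]
```

Both readings produce the same rows of words. The simple space-delimited reader breaks down as soon as a value itself contains a space, which is one reason comma delimited files are often preferred.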
11 More metadata
- Some files contain a lot of metadata or only metadata. These files are created in parallel with other data files about which they store metadata.
- e.g. a database application might use 2 types of files
  - a file containing real data about customers
  - a file containing metadata about the data in the customer file
- The metadata might include
  - other names (aliases) by which the data in the customer file is described
  - the number of characters in certain fields
  - access rights to some of the data fields
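
As a rough sketch, the metadata file for a customer file might hold something like the table below, and the database application would consult it before touching the real data. Every field name, alias, length and role here is hypothetical.

```python
# Hypothetical contents of a metadata file describing the customer data file.
CUSTOMER_METADATA = {
    "name": {"aliases": ["customer_name", "client"], "length": 20,
             "may_read": ["sales", "accounts"]},
    "phone": {"aliases": ["tel"], "length": 12,
              "may_read": ["sales"]},
}

def resolve_field(name):
    """Return the real field name for a name or alias, or None if unknown."""
    for field, info in CUSTOMER_METADATA.items():
        if name == field or name in info["aliases"]:
            return field
    return None

def fits(field, value):
    """True if the value is no longer than the field's declared length."""
    return len(value) <= CUSTOMER_METADATA[field]["length"]

def may_read(role, field):
    """True if the role has been granted read access to the field."""
    return role in CUSTOMER_METADATA[field]["may_read"]
```

None of this is customer data; it only describes the customer data.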
12 Buffering of files
- Often data files (and even some applications) are too large to fit into memory. When opened, only part of the file is transferred into memory.
- As the application needs more data it goes back to the data file, moving further parts of it into memory to be processed, and then disposing of them.
- Similarly, communications devices and printers cannot load all of a file at once and so they store parts of it in buffer memory.
- Some operating systems use virtual memory (parts of the hard drive) to extend main memory, so that programs and data larger than physical memory can still be worked on.
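
Processing a file one buffer-full at a time can be sketched as follows. The sketch keeps the file open between reads, and the buffer size of 4096 bytes and the function name are assumptions for the example.

```python
import os
import tempfile

def count_bytes_in_chunks(path, buffer_size=4096):
    """Process a file piece by piece, never holding more than one chunk."""
    total = 0
    chunks = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buffer_size)   # move the next part into memory
            if not chunk:                 # empty result means end of file
                break
            total += len(chunk)           # "process" the chunk
            chunks += 1                   # the chunk is then discarded
    return total, chunks

# Demonstration on a temporary 10,000-byte file.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * 10000)
total, chunks = count_bytes_in_chunks(demo_path, buffer_size=4096)
os.remove(demo_path)
```

Here the 10,000 bytes pass through memory in three pieces, so the program never needs more than 4096 bytes of buffer.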
13 Storage mechanisms
- When whole files are stored on tapes or discs, they are either stored sequentially or for random access.
- Just like a video, tapes store their files sequentially - to get to a TV show at the end of the tape, you have to fast forward through all the other TV shows first.
- Discs, like CDs, are usually random access devices - you can jump to any track on a CD without playing the others first. You can jump to the start of a file.
- This is about whole files and storage devices.
14 File structures
- The data in a file may be a series of records all having the same structure but different details.
- The data in the file can be arranged sequentially so that you must read all preceding records to get to a specific record, let's say it's record 42.
- A sequential file can be stored on a tape or a disc. If it was stored on a disc, the OS could find the start of the file using random access but the database application would still have to search for records sequentially inside the file.
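
A sequential search for record 42 might look like the sketch below. The record layout (one record per line, an id, a comma, then the details) is an assumption made for the example.

```python
import io

def find_sequential(f, wanted_id):
    """Read records from the start until the wanted one turns up."""
    records_read = 0
    for line in f:
        records_read += 1
        record_id, details = line.rstrip("\n").split(",", 1)
        if int(record_id) == wanted_id:
            return details, records_read
    return None, records_read

# A made-up sequential file of 100 variable-length records.
data = io.StringIO("".join(f"{i},customer {i}\n" for i in range(1, 101)))
details, records_read = find_sequential(data, 42)
```

Because the records have different lengths, there is no way to know where record 42 starts without reading the 41 records before it.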
15 File structures 2
- A random access file is structured in such a way that the application can locate any record and jump directly to that record, without reading preceding records.
- Random access files are most useful when stored on random access devices.
- If you were writing a database application, you would need to know which file structure you are going to use and how to search it.
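
One common way to get random access is to make every record the same length, so that the position of any record can be calculated. The sketch below assumes a made-up layout of a 4-byte id followed by a 20-byte name.

```python
import io
import struct

# Fixed-length record: 4-byte unsigned id + 20-byte name = 24 bytes.
RECORD = struct.Struct("<I20s")

def write_records(f, names):
    """Write one fixed-length record per name, numbered from 1."""
    for i, name in enumerate(names, start=1):
        f.write(RECORD.pack(i, name.encode("ascii")))

def read_record(f, number):
    """Jump straight to record `number` without reading earlier ones."""
    f.seek((number - 1) * RECORD.size)
    record_id, raw_name = RECORD.unpack(f.read(RECORD.size))
    return record_id, raw_name.rstrip(b"\x00").decode("ascii")

f = io.BytesIO()   # stands in for a file on a random access device
write_records(f, [f"customer {i}" for i in range(1, 101)])
result = read_record(f, 42)
```

The cost is wasted space: short names are padded out to 20 bytes so that every record starts at a predictable position.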
16 Directories / folders
- OS provide tools to organise files on storage devices.
- Files are not stored in a physical location corresponding to the directories. Files are just linked to certain directories.
17 Deleting files and things in files
- When you delete a file, the file is not immediately gone. Usually, deleting a file tells the OS that the space occupied by the file is no longer in use.
- Many OS show this by putting the file in the Trash.
- The OS will use the space that was occupied by the file (write over it) whenever the OS needs to.
- Once the space occupied by a file has been written over, the file cannot be recovered, i.e. once a file has been written over, you cannot take it out of the Trash.
18 Fragmentation
- When a small file is deleted from a storage device, it leaves an available space - a hole if you like.
- When the OS writes a new file to the device, the new file may not fit in the hole, so the OS puts part of the file in the hole and writes the rest of the new file in the next available space.
- If the next space is not big enough, the file may be split across several spaces. The OS needs to keep track of where all the bits of the file are kept.
- The new file is fragmented. Defragmenting a drive puts all file fragments together again, which makes file access faster.
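
Fragmentation can be imitated with a toy disc of 12 blocks. The allocation rule used here (fill the first free blocks found) is a simplifying assumption; real file systems use more careful strategies.

```python
def allocate(disc, name, size):
    """Put `size` blocks of file `name` into the first free blocks found.

    Returns the block numbers used, which is the bookkeeping the OS
    needs in order to find all the pieces again.
    """
    free = [i for i, owner in enumerate(disc) if owner is None]
    if len(free) < size:
        raise ValueError("disc full")
    used = free[:size]
    for i in used:
        disc[i] = name
    return used

def delete(disc, name):
    """Mark every block of the file as available again."""
    for i, owner in enumerate(disc):
        if owner == name:
            disc[i] = None

def count_fragments(blocks):
    """Number of contiguous runs in a sorted list of block numbers."""
    return sum(1 for k, b in enumerate(blocks)
               if k == 0 or b != blocks[k - 1] + 1)

disc = [None] * 12
allocate(disc, "A", 4)              # blocks 0-3
allocate(disc, "B", 2)              # blocks 4-5
allocate(disc, "C", 3)              # blocks 6-8
delete(disc, "B")                   # leaves a 2-block hole
d_blocks = allocate(disc, "D", 4)   # too big for the hole
d_fragments = count_fragments(d_blocks)
```

File D ends up with two blocks in the hole left by B and two more after C, so it is stored as two fragments.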
19 Deleting records
- If a file contains database records, the applications that use the database may delete individual records from the file.
- Removing a record means a lot of work, so the application/database usually just marks (flags) the record as being deleted.
- When the application searches the file later it will ignore any records marked as being deleted.
- Periodically database files need to be purged to remove all the deleted records.
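
Flagging and purging can be sketched with a list of records that each carry a `deleted` flag. The record layout and function names are invented for the illustration.

```python
records = [
    {"id": 1, "name": "Ann", "deleted": False},
    {"id": 2, "name": "Bob", "deleted": False},
    {"id": 3, "name": "Cho", "deleted": False},
]

def delete_record(records, record_id):
    """Flag the record as deleted; nothing is moved or removed."""
    for record in records:
        if record["id"] == record_id:
            record["deleted"] = True

def search(records, name):
    """Return live records with the given name, skipping flagged ones."""
    return [r for r in records if r["name"] == name and not r["deleted"]]

def purge(records):
    """Rebuild the record list without the flagged records."""
    return [r for r in records if not r["deleted"]]

delete_record(records, 2)
found = search(records, "Bob")   # flagged, so not found
size_before = len(records)       # the flagged record still takes up space
records = purge(records)
size_after = len(records)
```

Deleting is cheap because it changes one flag; the expensive rewrite happens only when the file is purged.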
20 Backup copies
- Because deleting a file is eventually permanent, most OS make users confirm the delete command.
- Sometimes a storage drive will fail or crash and a necessary file will be destroyed.
- Make a backup copy of any necessary files, to protect from crashes and accidental deletion.
- Keeping a sequence of backups is useful - if one fails you only lose the new data since the last backup, e.g. essay1_1, essay1_2, essay1_3 etc.
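
A numbered sequence of backups like essay1_1, essay1_2 can be produced with a short script. This is a minimal sketch: the helper names are made up, and it keeps the backups in the same folder as the original only to stay short, whereas a real backup belongs on a different drive.

```python
import os
import shutil
import tempfile

def next_backup_name(folder, stem, ext):
    """Find the first unused name of the form stem_N.ext in the folder."""
    n = 1
    while os.path.exists(os.path.join(folder, f"{stem}_{n}{ext}")):
        n += 1
    return os.path.join(folder, f"{stem}_{n}{ext}")

def back_up(path):
    """Copy the file to the next numbered backup and return the new path."""
    folder, filename = os.path.split(path)
    stem, ext = os.path.splitext(filename)
    target = next_backup_name(folder, stem, ext)
    shutil.copy2(path, target)   # copies contents and timestamps
    return target

# Demonstration in a temporary folder.
folder = tempfile.mkdtemp()
original = os.path.join(folder, "essay1.txt")
with open(original, "w") as f:
    f.write("first draft")
first = back_up(original)
with open(original, "a") as f:
    f.write(", second draft")
second = back_up(original)
```

If the second backup were damaged, the first would still hold everything up to the first draft.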
21 Copying, moving and shortcuts
- When you open a file with an application and use the application to do a Save as, the application creates a copy of the file, for example on another disc.
- If you want a copy on the same disc you must either
  - save the copy in a different directory OR
  - save the file with a different name
- In a drag and drop interface, dragging a file to a new directory usually moves the file, i.e. you don't create a new copy, you just move the original. Dragging to a different disc often makes a copy instead.
- If you want to be able to access a file quickly but don't want to move or copy it, use a shortcut.