System Software presentation

Transcript and Presenter's Notes

Title: System Software

1
System Software

System software refers to the files and programs
that make up your computer's operating system.
System files include libraries of functions,
system services, drivers for printers and other
hardware, system preferences, and other
configuration files.
The programs that are part of the system software
include assemblers, compilers, file management
tools, system utilities, and debuggers.

2
System Software

Unlike application programs, however, system
software is not meant to be run by the end user.
For example, while you might use your Web browser
every day, you probably don't have much use for
an assembler program (unless, of course, you are
a computer programmer).

3
System Software

Since system software runs at the most basic
level of your computer, it is called "low-level"
software.
It generates the user interface and allows the
operating system to interact with the hardware.
Fortunately, you don't have to worry about what
the system software is doing since it just runs
in the background. It's nice to think you are
working at a "high-level" anyway.

4
Debugger

Even the most experienced software programmers
usually don't get it right on their first try.
Certain errors, often called bugs, can occur in
programs, causing them to not function as the
programmer expected.
Sometimes these errors are easy to fix, while
some bugs are very difficult to trace. This is
especially true for large programs that consist
of several thousand lines of code.

5
Debugger

Fortunately, there are programs called debuggers
that help software developers find and eliminate
bugs while they are writing programs.
A debugger tells the programmer what types of
errors it finds and often marks the exact lines
of code where the bugs are found.
Debuggers also allow programmers to run a program
step by step so that they can determine exactly
when and why a program crashes.
Advanced debuggers provide detailed information
about threads and memory being used by the
program during each step of execution.

6
Operating System

Also known as an "OS," this is the software that
communicates with computer hardware on the most
basic level.
Without an operating system, no software programs
can run. The OS is what allocates memory,
processes tasks, accesses disks and peripherals,
and serves as the user interface.

7
Operating System

With an operating system, like Windows, the Mac
OS, or Linux, developers can write code using a
standard programming interface (known as an API).
Without an operating system, programmers would
have to write about ten times as much code to get
the same results.
Of course, some computer geniuses have to program
the operating system itself.

8
Application

An application, or application program, is a
software program that runs on your computer. Web
browsers, e-mail programs, word processors,
games, and utilities are all applications.
The word "application" is used because each
program has a specific application for the user.
For example, a word processor can help a student
create a research paper, while a video game can
prevent that same student from getting the paper
done.

9
Application

In contrast, system software consists of programs
that run in the background, enabling applications
to run.
These programs include assemblers, compilers,
file management tools, and the operating system
itself.
Applications are said to run on top of the system
software, since the system software is made up of
"low-level" programs.
While system software is automatically installed
with the operating system, you can choose which
applications you want to install and run on your
computer.

10
Compilers

These have two functions:
to check whether the submitted program is a legal
program, and
to translate it into a lower-level language, e.g.
machine code (which is usually called the object
program).
The object program produced is usually
interpreted by the hardware. Any process that
does this, hardware or software, is called an
interpreter. A software interpreter simulates the
execution of a program rather than translating the
user's program into machine instructions.

11
Compilers

An example of this is the UNIX system Pascal
compiler. pc translates the Pascal program into
intermediate code in a file called e.out. The
program em1 is the interpreter, reading from
e.out and interpreting it.
One advantage of interpretation is that the
author of an interpreter has more control over
the execution of the user's program than the
author of a compiler (e.g. checks for overflow).
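
As a rough illustration of that extra control, here is a minimal sketch of an interpreter that checks every arithmetic result for overflow. The stack-based instruction names and the 16-bit range are assumptions made for this example; they are not the actual e.out format read by em1.

```python
# Toy interpreter for a hypothetical stack-based intermediate code.
# The instruction names and the 16-bit limit are assumptions for illustration;
# they are not the format used by pc/em1.

INT_MIN, INT_MAX = -32768, 32767  # assumed 16-bit word size


def interpret(program):
    """Run a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode in ("ADD", "MUL"):
            right = stack.pop()
            left = stack.pop()
            result = left + right if opcode == "ADD" else left * right
            # The interpreter sees every operation, so it can check for overflow.
            if not INT_MIN <= result <= INT_MAX:
                raise OverflowError(f"{opcode} overflowed: {result}")
            stack.append(result)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack
```

Compiled machine code would typically carry out the addition without any such check unless the compiler emitted extra instructions for it.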

12
Compilers

One disadvantage is that the time taken to
interpret a program can be much greater than the
time taken to execute a compiled version of it.

14
Assemblers

These are programs which translate from assembly
language. Both assemblers and compilers are
translators.
Assembly language is the mnemonic representation
of binary machine code which allows you to use
names for locations and instructions.

15
Assemblers

An assembler translating a program to binary has
four tasks
1. Replace symbolic addresses, e.g. LOOP, by numeric
addresses.
2. Replace symbolic operation codes by machine
operation codes.
3. Reserve storage for the instructions and data.
Each instruction and each piece of data counts as
one word of memory and addresses are usually
assigned from 0.
4. Translate constants into their machine
representation.

16
Assemblers

A simple assembler must keep track of how many
words we have generated so far (kept in the
Instruction Location Counter). We need a table
listing instructions and their opcodes (the
opcode table).
The program is read twice by a two-pass
assembler.

17
Assemblers

On the first pass we read all the variables and
labels and put them into the Symbol Table (items
in the label column take up no memory but are
recorded in the symbol table using the address of
the command next to them).
On the second pass, label gaps are filled from
the table by replacing the label name with the
address.

18
Assemblers

Possible errors while assembling the program
include missing operand, illegal operation,
missing definition (e.g. JMP LOOP where LOOP was
never defined; this is detected at the end of the
first pass) or multiple definition (two locations
called LOOP).
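
The two passes and the error checks can be sketched roughly as follows. The opcode table, the DATA pseudo-operation and the (label, operation, operand) source format are all assumptions made for this example, not a real instruction set.

```python
# Toy two-pass assembler. The opcode table, the DATA pseudo-operation and the
# (label, operation, operand) source format are assumptions for this sketch.

OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "JMP": 4, "HALT": 5}


class AssemblyError(Exception):
    """Raised for missing operand, illegal operation, missing or multiple definition."""


def assemble(lines):
    """lines: list of (label or None, operation, operand or None)."""
    # Pass 1: fill the symbol table using the instruction location counter.
    symbols = {}
    referenced = set()
    ilc = 0  # every instruction or data item takes one word, starting at 0
    for label, op, operand in lines:
        if label is not None:
            if label in symbols:
                raise AssemblyError(f"multiple definition: {label}")
            symbols[label] = ilc  # a label records the address of its line
        if op != "DATA":
            if op not in OPCODES:
                raise AssemblyError(f"illegal operation: {op}")
            if operand is None and op != "HALT":
                raise AssemblyError(f"missing operand: {op}")
            if operand is not None:
                referenced.add(operand)
        ilc += 1
    # End of pass 1: every referenced label must now be in the table.
    undefined = sorted(referenced - set(symbols))
    if undefined:
        raise AssemblyError(f"missing definition: {undefined[0]}")

    # Pass 2: replace symbolic opcodes and label names with numbers.
    words = []
    for _label, op, operand in lines:
        if op == "DATA":
            words.append(int(operand))  # constant in its machine form
        else:
            address = symbols[operand] if operand is not None else 0
            words.append((OPCODES[op], address))
    return words, symbols


PROGRAM = [
    ("LOOP", "LOAD", "X"),
    (None, "ADD", "ONE"),
    (None, "STORE", "X"),
    (None, "JMP", "LOOP"),
    ("X", "DATA", "0"),
    ("ONE", "DATA", "1"),
]
words, symbols = assemble(PROGRAM)
```

Here the forward references to X and ONE cause no trouble, because their addresses are already in the symbol table by the time the second pass needs them.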

19
Assemblers

An assembler could be written in binary, which is
impractical.
Alternatively an assembler could be written to
run on machine B but generating code which runs
on machine A (this is referred to as cross
assembly).
To get a true assembler on machine A, you write
the assembler in the assembly language of A,
run it through the cross-assembler on B and
transfer it to A (called bootstrapping).

20
Assemblers

In constructing a one-pass assembler, we still
have the problem of forward references.
A forward reference is where a statement in the
program contains a jump to a statement later on
which the assembler hasn't met yet.

21
Assemblers

There are two common solutions
Put the object code directly in memory. If you
come across a forward reference, note the
location, leave it blank and come back and fill
it in when it is defined. This also saves loading
the program into memory at a later stage. This
technique, assembling the code and leaving it in
memory, is called Assemble and Go. There are
problems:
The assembler occupies address space. If the
program is large there may not be enough space
for both program and assembler.
It does not permit separate translation (where
the program is developed in sections, each being
translated and then joined using a link-loader
program).
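
Filling in forward references as their labels become defined can be sketched roughly like this. The opcode table and the source format are assumptions for illustration, and a Python list stands in for memory.

```python
# Toy one-pass "assemble and go" sketch: object code is built directly in a
# memory list, and forward references are left blank, then filled in later.
# The opcode table and source format are assumptions for illustration.

OPCODES = {"LOAD": 1, "ADD": 2, "JMP": 4, "HALT": 5}


def assemble_one_pass(lines):
    """lines: list of (label or None, operation, operand label or None)."""
    memory = []    # assembled words as [opcode, address] pairs
    symbols = {}   # label -> address, for labels already defined
    pending = {}   # label -> locations waiting for that label's address

    for label, op, operand in lines:
        location = len(memory)
        if label is not None:
            symbols[label] = location
            # Come back and fill in every blank that was waiting for this label.
            for waiting in pending.pop(label, []):
                memory[waiting][1] = location
        if operand is None:
            address = 0
        elif operand in symbols:
            address = symbols[operand]
        else:
            address = None  # forward reference: leave blank for now
            pending.setdefault(operand, []).append(location)
        memory.append([OPCODES[op], address])

    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return memory


image = assemble_one_pass([
    (None, "JMP", "END"),      # END is not defined yet
    ("LOOP", "ADD", "LOOP"),
    ("END", "HALT", None),     # defining END patches location 0
])
```

The `pending` table is what lets a single pass cope with the jump to END before END has been seen.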

22
Assemblers

There are two common solutions
Noting where forward references occur and where
they are defined and putting this in the object
code. The link-loader (usually written in
assembly language) picks this up and fills in the
locations (not all link loaders do this).
Two-pass assemblers are more common, although
slower. On the first pass, the assembler makes a
note of all the addresses referred to by labels.
On the second pass, the addresses are written
into the assembled code.

23
Link Loaders

The link loader derives standard routines (read,
write, sin etc.) from libraries. The user can
create his own personal libraries. The functions
of a link-loader are
It may have to tie up forward references.
It manages libraries (stored in a special form
to speed up the search).
It allocates space in memory for routines
(forming its own memory map).
It resolves links, calling coroutines,
procedures etc.

24
Link Loaders

It relocates code. Translators always allocate
addresses from 0. Routines are not always loaded
at address 0, so some changes are made to make
jumps and references to data correct. Usually
this means adding the starting address (allocated
by the link-loader) to the address given by the
translator. The translator marks all addresses
needing relocating.
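
A minimal sketch of relocation, assuming each word of a module is stored as a (value, needs_relocation) pair; the flag stands in for the translator's relocation marks, which real object formats record differently.

```python
# Sketch of relocation. Each word is (value, needs_relocation); the flag
# stands in for the translator's relocation marks (an assumed representation).


def relocate(module, load_address):
    """Return the module's words adjusted for its actual load address."""
    relocated = []
    for value, needs_relocation in module:
        if needs_relocation:
            # The address was assigned relative to 0, so add the start address.
            relocated.append(value + load_address)
        else:
            relocated.append(value)  # constants and opcodes stay unchanged
    return relocated
```

For a module loaded at address 100, a marked address 4 becomes 104, while an unmarked constant 7 stays 7.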

25
Link Loaders

A bootstrap loader activates when the computer is
switched on. Every morning when the computer is
switched on, the RAM is vacant.
The ROM contains a program which reads in from
disk a more sophisticated loader, which loads the
operating system which then executes.

26
Compilers

The different options are
"Compile and Go", which is similar to Assemble
and Go. The compiler reads the source file,
checks it, translates it into machine code and
puts it in memory.
The advantages of this are
It is relatively fast (there is no output to a
file which is relatively slow).
The program is executable directly after
translation.

27
Compilers

The disadvantages are
There is no link-loader (faster, but can't use
any precompiled subroutines).
It lacks portability as absolute binary is
being produced.
If the program is to be run again, it must be
translated again (we can overcome this by also
writing it out to a file which can be reloaded).
This is often used in educational establishments
where execution time is less important than
compilation time.

28
Compilers

The different options are
"Translate and Interpret" (for example, pc/em1).
In this case, the compiler reads the source file,
translates it to intermediate code (in the case
of pc this is called e.out), which is read by the
interpreter which simulates the execution of the
program.
The advantages of this are
It's fairly portable.
It's reasonably fast in translation.
The main disadvantage is that the program
executes relatively slowly.

29
Compilers

The different options are
Translate into assembly language. The source
file is read by the compiler and translated to
assembly language, read by an assembler and
passed to the link-loader which produces the
object code.
The advantages of this are
It produces absolute binary so execution is
fast.
It accesses a link-loader so we can use
pre-compiled routines.
It is relatively portable.
The main disadvantage is that it is slow to
compile.

30
Compilers

The different options are
Translate directly into link-loader format (used
by most compilers). The compiler reads the source
and gives out link-loader format, which is read
by the link-loader producing object code.
The advantages of this are
It can access libraries.
The execution is fast.
The translation is relatively fast.
The main disadvantage is that it isn't very
portable.

31
Compilers

The different options are
Interactive translation, such as BASIC. There are
three types of this
The more efficient type (usually done on larger
machines). Each line of the program is translated
separately into intermediate code.
The less efficient type (usual on micros). A line
is typed in, put straight in the file which is
interpreted. It is relatively slow.
The highly efficient type. The translator
translates each line to absolute binary. After
RUN is typed, we have direct execution.

32
Linking and Loading

Most programs are divided into separate parts,
each of which is translated separately and linked
by the link-loader to produce the load module.
When the translator runs, it assigns 'virtual'
addresses to the routines starting with 0.
Therefore each object module has internal
references starting from 0.
The linker's task is to combine these.

33
Linking and Loading

It constructs a table of all object modules and
their lengths.
Based on this table, it assigns a load address to
each object module.
It finds all the instructions that contain a
memory address and to each adds a relocation
constant equal to the starting address of the
module in which it is contained.
It finds all references to procedures and other
modules and inserts the appropriate address.
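
The linker steps above can be sketched as a toy linker. The module layout used here (a dict with "code", "relocate", "exports" and "externals" entries) is an assumption for this example and does not correspond to any real object file format.

```python
# Toy linker following the steps above. The module layout (a dict with
# "code", "relocate", "exports", "externals") is assumed for this sketch.


def link(modules):
    """modules: dict of name -> module description; returns (image, load map)."""
    # Table of module lengths, then a load address for each module.
    load_address = {}
    next_free = 0
    for name, module in modules.items():
        load_address[name] = next_free
        next_free += len(module["code"])

    # Final address of every exported procedure.
    entry_points = {}
    for name, module in modules.items():
        for symbol, offset in module["exports"].items():
            entry_points[symbol] = load_address[name] + offset

    image = []
    for name, module in modules.items():
        code = list(module["code"])
        # Add the relocation constant to each internal address.
        for index in module["relocate"]:
            code[index] += load_address[name]
        # Insert the address of each referenced external procedure.
        for index, symbol in module["externals"].items():
            code[index] = entry_points[symbol]
        image.extend(code)
    return image, load_address


MODULES = {
    "main": {"code": [10, 2, 0, 99], "relocate": [1],
             "exports": {}, "externals": {2: "sqrt"}},
    "lib": {"code": [20, 1], "relocate": [1],
            "exports": {"sqrt": 0}, "externals": {}},
}
image, load_map = link(MODULES)
```

In this example "lib" is placed after the four words of "main", so its internal address 1 becomes 5 and the call to sqrt in "main" is filled in with address 4.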

34
Dynamic Link-loading. Structure of an object
module

An object module usually contains six parts:
identification, entry point table, external
reference table, machine instructions and
constants, relocation directory and the end of
the module.
The machine instructions are the part of the
module which is the user's program.
The next part (the directory) is the table
showing the addresses of the routines to be
built-in (produced by the linker).

35
Dynamic Link-loading. Structure of an object
module

The compiled program with its virtual (from 0)
addresses may be mapped into memory at any of
several stages
1. On program execution.
2. When the program is written (some micros).
3. When the program is linked but before it is
loaded (slightly more flexible than the previous
option).
4. At load time.
5. When a base register is loaded (see below).
6. When the program is compiled.

36
Dynamic Link-loading. Structure of an object
module

Options 1 and 5 provide protection against
accidental jumps to other parts of memory.
A base register in the CPU is one loaded with the
offset of the program.
There is also a limit register holding the offset
of the end of the program. The PDP-11
implementation is similar to this, having eight
hardware memory management registers.
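
A minimal sketch of how a base and limit register pair can protect against stray jumps. It assumes the limit register holds the first physical address past the end of the program; real hardware does this check on every memory access, and details differ between machines.

```python
# Sketch of base/limit address translation. Register values are plain
# integers here, and the limit is assumed to be an exclusive end address.


class ProtectionFault(Exception):
    """Raised when an access falls outside the program's region."""


def translate(virtual_address, base, limit):
    """Map a program-relative address to a physical one, or fault."""
    physical = base + virtual_address
    # The limit register marks the end of the program's region.
    if virtual_address < 0 or physical >= limit:
        raise ProtectionFault(f"address {virtual_address} is outside the program")
    return physical
```

With base 1000 and limit 1500, virtual address 10 maps to physical address 1010, while virtual address 600 raises a fault instead of reaching another program's memory.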

37
Compilers

A compiler is a program that translates a source
program written in some high-level programming
language (such as Java) into machine code for
some computer architecture (such as the Intel
Pentium architecture).
The generated machine code can be later executed
many times against different data each time.

38
Compilers

An interpreter reads an executable source program
written in a high-level programming language as
well as data for this program, and it runs the
program against the data to produce some results.
One example is the Unix shell interpreter, which
runs operating system commands interactively.
Note that both interpreters and compilers (like
any other program) are written in some high-level
programming language (which may be different from
the language they accept) and they are translated
into machine code.

39
Compilers

What is the Challenge?
Many variations
many programming languages
many programming paradigms
many computer architectures
many operating systems

40
Compilers

Qualities of a compiler (in order of importance)
the compiler itself must be bug-free
it must generate correct machine code
the generated machine code must run fast
the compiler itself must run fast (compilation
time must be proportional to program size)
the compiler must be portable (ie, modular,
supporting separate compilation)
it must print good diagnostics and error messages
the generated code must work well with existing
debuggers
must have consistent and predictable optimization.

41
Compilers

A typical real-world compiler usually has
multiple phases. This increases the compiler's
portability and simplifies retargeting. The front
end consists of the following phases:
scanning: a scanner groups input characters into
tokens
parsing: a parser recognizes sequences of tokens
according to some grammar and generates Abstract
Syntax Trees (ASTs)
semantic analysis: performs type checking (i.e.,
checking whether the variables, functions etc. in
the source program are used consistently with
their definitions and with the language
semantics) and translates ASTs into IRs
(intermediate representations)
optimization: optimizes IRs.

42
Compilers

The back end consists of the following phases:
instruction selection: maps IRs into assembly
code
code optimization: optimizes the assembly code
using control-flow and data-flow analyses,
register allocation, etc.
code emission: generates machine code from
assembly code.

43
Compilers

The generated machine code is written in an
object file. This file is not executable since it
may refer to external symbols (such as system
calls). The operating system provides the
following utilities to execute the code:
linking: A linker takes several object files and
libraries as input and produces one executable
object file. It retrieves from the input files
(and puts them together in the executable object
file) the code of all the referenced
functions/procedures and it resolves all external
references to real addresses. The libraries
include the operating system libraries, the
language-specific libraries, and, maybe,
user-created libraries.
loading: A loader loads an executable object file
into memory, initializes the registers, heap,
data, etc. and starts the execution of the program.

44
Lexical Analysis

The lexical analyzer is responsible for
reading in a stream of input characters, and
producing as output a sequence of tokens.
Upon a get-next-token request from the parser, the
analyzer reads as many characters as it needs to
form a single token and returns that token.

45
Lexical Analysis

A scanner groups input characters into tokens.
For example, if the input is
x = x*(b+1);
then the scanner generates the following sequence
of tokens
id(x) = id(x) * ( id(b) + num(1) ) ;
where id(x) indicates the identifier with name x
(a program variable in this case) and num(1)
indicates the integer 1.
Each time the parser needs a token, it sends a
request to the scanner. Then, the scanner reads
as many characters from the input stream as is
necessary to construct a single token.

46
Lexical Analysis

The scanner may report an error during scanning
(eg, when it finds an end-of-file in the middle
of a string).
Otherwise, when a single token is formed, the
scanner is suspended and returns the token to the
parser.
The parser will repeatedly call the scanner to
read all the tokens from the input stream or
until an error is detected (such as a syntax
error).
Tokens are typically represented by numbers. For
example, the token * may be assigned the number 35.

47
Lexical Analysis

Some tokens require some extra information.
For example, an identifier is a token (so it is
represented by some number) but it is also
associated with a string that holds the
identifier name.
For example, the token id(x) is associated with
the string, "x". Similarly, the token num(1) is
associated with the number, 1.
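
The on-demand behaviour described on these slides can be sketched as a Python generator, which is suspended after each token until the next one is requested. The token numbers and the sample input are assumptions made for this example.

```python
# Toy on-demand scanner. The token numbers and the input text are
# assumptions for illustration.

ID, NUM = 1, 2                      # token numbers for identifiers and integers
SYMBOLS = {"=": 30, "+": 31, "*": 35, "(": 40, ")": 41, ";": 42}


def scan(text):
    """Yield (token number, extra value) pairs, one per request."""
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
        elif ch.isalpha():
            start = pos
            while pos < len(text) and text[pos].isalnum():
                pos += 1
            yield (ID, text[start:pos])        # identifier carries its name
        elif ch.isdigit():
            start = pos
            while pos < len(text) and text[pos].isdigit():
                pos += 1
            yield (NUM, int(text[start:pos]))  # number carries its value
        elif ch in SYMBOLS:
            pos += 1
            yield (SYMBOLS[ch], None)          # operators need no extra data
        else:
            raise SyntaxError(f"unexpected character {ch!r} at {pos}")


tokens = scan("x = x*(b+1);")
first = next(tokens)   # the parser asks for one token at a time
```

Each call to `next(tokens)` plays the role of the parser's request: the scanner reads just enough characters to form one token, hands it back, and waits.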

48
Lexical Analysis

Tokens are specified by patterns, called regular
expressions.
For example, the regular expression
[a-z][a-zA-Z0-9]* recognizes all identifiers with
at least one alphanumeric letter whose first
letter is lower-case alphabetic.
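
A pattern of this kind can be tried out with Python's re module. This short sketch assumes the identifier pattern is [a-z][a-zA-Z0-9]*, that is, one lower-case letter followed by any number of letters or digits; the function name is made up for the example.

```python
import re

# Identifier pattern: one lower-case letter, then any number of letters or digits.
IDENTIFIER = re.compile(r"[a-z][a-zA-Z0-9]*")


def is_identifier(text):
    """True when the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None
```

Under this pattern "x" and "b1" are identifiers, while "X1" and "1x" are not, because their first character is not a lower-case letter.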

49
Parsing

Context-free Grammars
Consider the following input string
x+2*y
When scanned by a scanner, it produces the
following stream of tokens
id(x) + num(2) * id(y)

50
Parsing

Predictive Parsing
The goal of predictive parsing is to construct a
top-down parser that never backtracks. To do so,
we must transform a grammar in two ways
1. eliminate left recursion, and
2. perform left factoring.
These rules eliminate the most common causes of
backtracking, although they do not guarantee
completely backtrack-free parsing.
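
As an illustration of the first transformation, here is a rough sketch that removes immediate left recursion from a single nonterminal. The grammar representation (a dict of productions), the primed name for the new nonterminal and the empty list standing for the empty production are conventions assumed for this example.

```python
# Sketch: remove immediate left recursion from one nonterminal.
# A grammar is a dict: nonterminal -> list of productions (lists of symbols).
# The primed name (E') and the empty list for epsilon are conventions assumed here.


def eliminate_left_recursion(grammar, nonterminal):
    """Rewrite A -> A a | b as A -> b A' and A' -> a A' | epsilon."""
    recursive = []      # the "a" parts of A -> A a
    others = []         # the "b" alternatives that do not start with A
    for production in grammar[nonterminal]:
        if production and production[0] == nonterminal:
            recursive.append(production[1:])
        else:
            others.append(production)
    if not recursive:
        return dict(grammar)  # nothing to do

    tail = nonterminal + "'"
    result = dict(grammar)
    result[nonterminal] = [beta + [tail] for beta in others]
    result[tail] = [alpha + [tail] for alpha in recursive] + [[]]
    return result


grammar = {"E": [["E", "+", "T"], ["T"]]}
transformed = eliminate_left_recursion(grammar, "E")
```

For E -> E + T | T this gives E -> T E' and E' -> + T E' | (empty), which a top-down parser can follow without looping on E.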

51
Parsing

Bottom-up Parsing
The basic idea of a bottom-up parser is that we
use grammar productions in the opposite way (from
right to left).
As in predictive parsing with tables, here too
we use a stack to push symbols.
If the first few symbols at the top of the stack
match the rhs (right-hand side) of some rule, then
we pop these symbols from the stack and push the
lhs (left-hand side) of the rule. This is called a
reduction.
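
A naive sketch of shifting and reducing, assuming the small grammar E -> E + T | T, T -> id. It reduces whenever the top of the stack matches a right-hand side, trying the rules in a fixed order; a real bottom-up parser consults a parse table to decide between shifting and reducing.

```python
# Naive shift-reduce sketch for the assumed grammar
#   E -> E + T | T      T -> id
# It reduces whenever the top of the stack matches a right-hand side, trying
# rules in the listed order; real parsers use a table to choose.

RULES = [
    ("E", ["E", "+", "T"]),
    ("T", ["id"]),
    ("E", ["T"]),
]


def parse(tokens):
    """Return the list of reductions performed, or raise on a syntax error."""
    stack = []
    reductions = []
    remaining = list(tokens)
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                del stack[-len(rhs):]       # pop the right-hand side
                stack.append(lhs)           # push the left-hand side
                reductions.append(f"{lhs} -> {' '.join(rhs)}")
                break
        else:
            if not remaining:
                break
            stack.append(remaining.pop(0))  # shift the next input token
    if stack != ["E"]:
        raise SyntaxError(f"cannot reduce stack {stack}")
    return reductions


steps = parse(["id", "+", "id"])
```

Listing E -> E + T before E -> T matters in this sketch: with E + T on the stack, the longer right-hand side is reduced first, so the stack ends as a single E.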

52
Semantic Analysis

Abstract Syntax
