Title: The Early Methods of Data Processing
1The Early Methods of Data Processing
- Data as part of a program (Input ......Data)
- Data File (Open ......As...... , n)
- Sequential (Input n, Print n)
- Read data sequentially from beginning to end.
- Random I/O (Put n, Get n)
- Can read specific part of data from the file. But
it is still difficult to locate the exact part.
2Early Data Models
- Flat File
- Hierarchy Model (One to Many)
- Network Model (Many to Many)
- The above two models have relations being built
in the data. Difficult to maintain. - Relational
3An Example of Flat File
4Split Flat File
5Relations between the Tables
6RDBMS Products
- Desktop RDBMS
- dBASE, Paradox (Borland)
- FoxPro, Access (Microsoft)
- Mainframe (or Client/Server) RDBMS
- ORACLE (Oracle)
- MS SQL (Microsoft)
- DB2 (IBM)
- Sysbase, Informix
7Traditional vs Client - Server RDBMS ( I )
- Traditional
- Tables, Indices, Codes are in different files
- Relations, Integrity, Security, Consistency are
applied by the programs users developed - Using FIXED WIDTH to store data
- Non Traditional (Modern)
- All the related parts are in a file called
PROJECT - Relations, Integrity, Security, Consistency are
managed by RDBMS itself - Using VARIABLE WIDTH to store data
8Traditional vs Client - Server RDBMS ( II )
- Client-Server (Front-End)
- END databases and FRONT interfaces are
independent each other - Higher performance
- Better security
- MS Access A hybrid database product
- VB/MS Access A good example of client-server
design tool
9Database Trend
- Multiple Platform - Access EXTERNAL DB
- ISAM (Index Sequential Access Method)
- ODBC (Open Database Connectivity)
- OLE DB
- Data Warehouse / Data Mart
- Data Mining
10Database Structure Design
- A well designed database structure can
- Keep the project development on the right track.
- Ensure the system meets all the requirements,
including Flexibility, Security, Integrity and
consistency. - Save the system's resources by eliminating
unnecessary duplication. - Make programming easier.
11Data Normalization ( I )
- Definition
- the process of transforming existing data into
relational tables. - Objectives
- To eliminate duplicated fields (attributes)
- To accommodate future changes
- To minimize the impact of changes to structure on
the related programs
12Data Normalization ( II )
- Functional Dependency Analysis Group the related
data into tables according to their
dependencies. - B DEPEND ON A from B, you can only find one A,
not multiple A's - Name, Address, Birthday DEPEND ON Student ID
- Course NOT DEPEND ON Student ID
13Data Normalization ( III )
- First Normal Form Eliminates REAPEATING fields
and NON ATOMIC values. - Second Normal Form Requires each field to
depend on EVERY PART of the primary key. - Third Normal Form Requires all non-primary
fields SOLELY DEPEND on the primary key (Not
depend on other fields). - Don't include the fields which can be derived
from others.
14The Reasons of Violating Normalization
- A trade off of speed vs storing space,
flexibility vs efficiency - Repeating fields when the repeating number is
known - Non-atomic fields e.g. Address
- Derived fields when it can improve the speed a
lot - E.g. (Value price x quantity)
15Name Convention
- Reasons? -- Limitation from the software
Readability of the programs (both the yourself
and others). - Length of the Name
- File Name Not more than 8 characters.
- Field Name Not more than 10 characters.
- Characters Used in the Name
- Names should begin with a LETTER.
- Don't use control characters (-/gtlt,\)
- Use the name close to the real attribute
- (e.g., Name, Address, StudentID)
- Use certain prefix to group the classes.
- Use capital letters to separate the words.
16Table Design (I-1)
- Field Type
- Character (Text) Store text data. Each character
occupies 1 byte. Normally up to 255 digits. - Numerical
- Byte Store 1255, 1 byte.
- Integer Store integer -32,76832767, 2 bytes.
(Long Integer 4 bytes) - Single Store data with decimal, 4 bytes.
(Double 8 bytes).
17Table Design (I-2)
- Logical (Boolean) Store True/False. 1 byte.
- Date Store date/time. 8 bytes.
- Memo Store up to 64 MB text data.
- General Store up to 1 GB picture/sound.
- Tip Use the smallest field size because smaller
file can be processed faster and require less
memory.
18Table Design (II-1)
- Primary Key Unique value to identify each
record. - (in some RDBMS, such as FoxPro, using Record
Number) - To eliminate duplicated records.
- To build up the relationship with other tables.
- Index Assistant file to logically re-order the
records. - (Ascending vs Descending)
- To provide views in different order.
- To speed up the record search by using SELECT,
SEEK, FIND commands.
19Table Design (II-2)
- Primary /Index Key Can be an expression of
several fields (Compound key) - All the fields must be converted into same type
- There is limit to the total length
20Table Design (III-1)
- Index and Sort (Cluster Index)
- Index Logically change the records' order
- Sort Physically change the records' order
- Pros and Cons of indexes
- Improve the query performance
- Increase the work when the field is changed
- Validation Rule Specify the special requirement
for the entered data
21Table Design (III-2)
- Security Protect the database from illegal
operations. Two Methods - Assigning Password(SID, Security ID) To protect
whole database from illegal access. - Adding Different User Group To limit certain
users within allowed operations - Admins Group Have full access to the database.
- User Groups Can assign different permissions to
each group.
22Table Design (IV)
- Relationship To link the tables.
- Foreign Key The key in parent table used to link
the child table via its primary key. - Relation Types One - Many, Many - One, One -
One, Many - Many. - Referential Integrity The rules to ensure the
relationship between related tables are valid. - Basic integrity rule is It not permitted a child
record exist without its parent. - Three actions related to integrity DELETE,
INSERT, UPDATE.
23Table Design (V-1)
- Three integrity rules applied to the actions
- Cascade Update all child records' foreign key
with the new parent key primary value upon
INSERT, UPDATE or DELETE all child records whose
foreign key is same as the parent key value. - Restrict Restrict the key value changes
- To parent table Prohibit changing the parent
primary key to any existing foreign child key
value. - To child table Prevent changing child key to any
non-existing parent key value. - Ignore Perform no referential integrity checks.
24Table Design (V-2)
- Consistency, Replication and Synchronization.
- (Master vs Replica)
25SQL (Structured Query Language) -- A kind of
fourth generation languages (4GLs)
- First Generation Language Machine code, a kind
of binary language. Instructions are consisted
with a series of "0" and "1". - Second Generation Language Assembly languages.
Allows to use simple alphabetic codes. - Third Generation Language Procedure languages.
Using structured English words to write the
commands. Such as FORTRAN, BASIC, C). - Fourth Generation Language Non-procedure
languages (SQL and OOP).
26The difference between SQL and 3GLs
- SQL only tell computer what kind of results are
needed, not how to get the data. - One line of SQL command can have the same effects
of multi-line procedure in 3GLs. - SQL is a WEAKLY DATA TYPED language. Don't need
to specify the data type in SQL. The type of
these data rely on the column expression and
source data type.
27Basic Elements of SQL Command
- Verb Such as SELECT. Determine the type of
operation. - Predicate Variable List Specify the variables
affected by the operation. Normally is a list of
fields or field expressions. - Prepositional Clause Tell to which object the
operation will apply. Normally a list of tables.
Such as FROM.
28Simplest SQL
- SELECT ltvariable listgt FROM lttable listgt
- SELECT StudentID, StudentName from StudentInfo
29Practical Grammar of Simple SQL SELECT Statement
- SELECT ALLDISTINCT ltvariable listgt
- FROM lttable listgt
- WHERE ltcriteriagt
30Practical Grammar of Simple SQL SELECT Statement
- SELECT Specify the operation is a query, i.e.,
extracting the data from the targeted database.
Variable list can be consisted of - Table fields (StudentID, StudentName)
- Field expression (StreetNumStreetName)
- Wild character asterisk .
- ALL Tell to get all the rows in the targeted
database. DISTINCT can eliminate the duplicated
rows. - FROM specify the targeted tables.
- WHERE begins a clause that decides which records
will be selected.
31SQL Operators and Functions I
- Arithmetic operators , -, , /, (),
- Comparison operators
- Basic gt, gt, , lt, lt, ltgt(!)
- Range BETWEEN
- Pattern LIKE (with wild characters)
- Wild characters
- () - Replace the character with any length of
string. - ?( _ ) - Replace the character with any ONE
character. - IN x IN (v1,v2,v3,...). Whether expression x
equals to any v? value.
32SQL Operators and Functions II
- The result of a comparison is a logical value
(TRUE or FALSE) - Logical operators Link the comparisons
- AND x AND y - x and y both TRUE.
- OR x OR y - Either x or y is TRUE.
- NOT NOT x - If x is FALSE, then NOT x is TRUE.
33SQL Operators and Functions III
- The Order of Operators and the Usage of Brackets
- Arithmetic operators () -- , / -- , -,
- Logical operators NOT -- AND, OR
- Use the brackets () to change the default order.
34SQL Operators and Functions IV
- SQL Aggregate Functions
- AVG(x) Return the arithmetic average of the
field expression x. - COUNT(x) Return the number of rows containing
NOT NULL value of the field expression x. If
x"", it returns the number of non empty rows. - MIN(x) Return the smallest value of x.
- MAX(x) Return the largest value of x.
- SUM(x) Return the total value of x.
35SQL Operators and Functions V
- SELECT COUNT(), AVG(mark), MIN(mark), MAX(mark)
from StudentInfo - Some Other Useful Functions (There are some
difference for different databases) - LEFT(string, length) Left part of a string.
- RIGHT(string, length) Right part of a string.
- MID(string, start, length) Middle part of a
string. Some software using SUBSTR().
36SQL Operators and Functions VI
- Trim String Get rid of tailing spaces.
- RTRIM(string) Get rid of right end spaces.
- LTRIM(string) Get rid of leading spaces.
- TRIM(string) Get rid of both ends spaces.
- Data Converting Convert the data into required
type. - CSTR(x) STR(x, len, dec)
- CINT(y) VAL(y)
- CDATE(d) CTOD(d)
37SQL Operators and Functions VII
- Constant Expressions
- Numerical No delimiter. (2500, 4.75)
- String Normally using single or double quotation
as delimiters. - ('string', "STRING").
- Logical Some software using dot as delimiters (
.TRUE. , .FALSE. ), some using no delimiter.
38SQL Operators and Functions VIII
- Date Normally using number sign as delimiters.
- (5/10/1998)
- Note There are two formats of date US format
(mm/dd/yy) and UK format (dd/mm/yy). In SQL, if
using to express a date constant, then ONLY US
format is accepted).
39Join Tables I
- CROSS JOIN Return the a cross product of two
tables. If Table1 has n rows and Table2 has m
rows, then Table1 CROSS JOIN Table2 creates a
table with n x m rows. - SELECT Publishers.Name, Authors.Author FROM
Publishers, Authors - A very dangerous situation, must avoid
- Equi-Join or INNER JOIN Creates a table with the
rows in which the key values exist in both
tables. - Syntax1 SELECT ltListgt FROM Table1 INNER JOIN
Table2 - ON Table1.keyTable2.key
- Syntax2 SELECT ltListgt FROM Table1, Table2 WHERE
Table1.keyTable2.key - SELECT Publishers.Name, Titles.ISBN, Titles.Title
FROM Publishers INNER JOIN Titles ON
ublishers.PubIDTitles.PubID
40Join Tables II
- Multiple Equi-Joins If 3 or more tables are
involved in SQL selection, then these tables need
to be joined by pairs. - Syntax1 SELECT ltListgt FROM Table1 INNER JOIN
(Table2 INNER JOIN Table3 ON Table3.key3Table2.ke
y2) ON Table2.key2Table1.key1 ... ... ... - Syntax2 SELECT ltListgt FROM Table1, Table2,
Table3, WHERE Table1.key1Table2.key1,
Table2.key2Table3.key3, ... ... ...
41Inner Join Examples
- SELECT Authors.Author, Titles.Title FROM Titles
INNER JOIN (Authors INNER JOIN Title Author ON
Authors.Au_ID Title Author.Au_ID) ON
Titles.ISBN Title Author.ISBN - SELECT Authors.Author, Titles.Title FROM Authors,
Titles, Title Author WHERE Authors.Au_ID
Title Author.Au_ID AND Titles.ISBN Title
Author.ISBN
42Alias in SQL
- Field alias
- SELECT Field1 AS Alias1, Field2 As Alias2, ..
- Table alias
- FROM Table1 AS Alias1, Table2 AS Alias2,
- Example
- SELECT A.Author AS Author Name, B.Title AS
Book Name FROM Authors A, Titles B, Title
Author C WHERE A.Au_ID C.Au_ID AND B.ISBN
C.ISBN
43Join Tables III
- Outer Joins
- Left Join
- Table1 LEFT OUTER JOIN Table2 Returns all the
rows in Table1 and only the rows of Table2 with
matching key values. - Left outer join is useful to find "lone parent"
records.
44Join Tables IV
- Right Join
- Table3 RIGHT OUTER JOIN Table4 Returns all the
rows in Table4 and only the rows of Table3 with
matching key values. - Right outer join is useful to find "orphan"
records.
45Join Tables V
- Not Equal Join Find the records don't have the
same key value between the tables. The result
will be Cross Join minus Inner Join. Use ltgt in
WHERE clause. - Self Join Select records in a single table with
two fields' value being same.
46Join Tables VI
- Sub-query
- SELECT ltListgt FROM ltTable1gt WHERE ltKeygt IN
(SELECT ltKeygt FROM ltTable2gt WHERE ltCriteriagt) - When using sub-query in WHERE clause, only one
table's fields can be selected. - SELECT Name, PubID FROM Publishers WHERE PubID IN
(SELECT PubID FROM Titles WHERE Title LIKE
Database)
47Merge SELECT Results
- UNION ALL
- To merge SELECT results
- SELECT Statement1 UNION SELECT Statement2
- ALL Without this keyword, the records from two
SELECT results will be de-duplicated. With this
keyword, simply merge two results.
48Other SQL Clause I
- ORDER BY Sort the query result in a specified
order. - SELECT ... FROM ... WHERE ORDER BY ltField1gt
ASCDESC, ltField2gt ASCDESC, - SELECT ... ORDER BY Lname, Fname
49Other SQL Clause II
- GROUP BY Combine multiple records with identical
values in the specific field list into a single
record. It is a useful tool to produce summary
reports. - SELECT ... FROM ... WHERE GROUP BY ltGroupListgt
- Two important notes
- The select field expression list must have at
least one aggregate function. Actually, except
the fields are specified in GROUP BY clause, all
the other fields must be displayed in aggregate
functions. - GroupList can have up to 10 fields which can be
used to produce the summary record for sub
sections.
50Other SQL Clause III
- HAVING ltHavingCriteriagt Apply a filter ONLY to
GROUP BY output - SELECT ... FROM ... WHERE GROUP ltGroupListgt
HAVING ltHavingCriteriagt - HAVING clause can decide which GROUPED ROWS to
output. - WHERE clause provides a filter to decide which
records are used in the summary report.
51Example of GROUP BY and HAVING
- SELECT PubID, Year Published, OUNT(Title) AS
" of Books" FROM Titles WHERE Title LIKE
Database GROUP BY PubID, Year Published
HAVING Year Publishedgt1990
52Action SQL I
- SELECT ... INTO Make a table.
- SELECT ltListgt INTO ltNewTablegt FROM ltSourceTablegt
WHERE Criteria - (Example Make a copy. SELECT INTO CopyPub FROM
Publishers) - INSERT INTO Add records to a table.
- INSERT INTO ltTargetTablegt IN ltExternalDatabasegt
(FieldList) SELECT ltSourceFieldListgt FROM
SourceTable - INSERT INTO ltTargetTablegt IN ltExternalDatabasegt
(FieldList) VALUES (ConstantList)
53Action SQL Examples
- INSERT INTO CopyPublishers SELECT FROM
Publishers - Add all the records in Publishers into
CopyPublishers. All the fields being selected and
copied. - INSERT INTO CopyPublishers (PubID) SELECT PubID
FROM Publishers WHERE PubID624 - Add the records in Publishers which meet the
WHERE criteria into CopyPublishers. Only the
fields in SELECT list are copied.
54Action SQL Examples
- INSERT INTO CopyPublishers (PubID,Name) VALUES
(0,"Kingston Publishing") - Add ONE record into CopyPublishers. Only the
fields in the list are filled with the value.
55Action SQL II
- DELETE Delete records from a table.
- DELETE FROM ltTableNamegt WHERE ltCriteriagt
- DELETE FROM CopyPublishers WHERE NameSAMS"
56Action SQL V
- UPDATE Change the fields' value in a specified
table - UPDATE ltTablegt SET ltField1gtltValue1gt,
ltField2gtltValue2gt, WHERE Criteria - UPDATE Titles SET PricePrice1.15
- UPDATE CopyPublishers SET Name"I. B. D." WHERE
Name"I B D"
57Crosstab Query
- Syntax
- TRANSFORM Aggregate Function SELECT ltExpressiongt
AS ltAliasgt WHERE ltCriteriagt GROUP BY
ltGroupListgt PIVOT ltPivotFieldgt - It can transform the normal multiple row summary
report created by GROUP BY query into multiple
column tabular way. It is a useful tool to
produce crosstab.
58Crosstab Query Example
- TRANSFORM COUNT(Title) AS " of Books" SELECT
PubID FROM Titles WHERE Title LIKE "Database"
AND Year Publishedgt1990 GROUP BY PubID PIVOT
Year Published - Copmare with
- SELECT PubID, Year Published, COUNT() AS "
of Books" FROM Titles WHERE Title LIKE
"Database" AND Year Publishedgt1990 GROUP BY
PubID, Year Published
59Further Development of SQL
- Parameter in SQL
- Procedure in SQL
- PL/SQL of ORACLE
- Transact-SQL of MS SQL Server
60Basic Concepts of MS Access
- Access Jet (Joint Engine Technology)GUI
facilities - Components of Access
- Table Organize the data.
- Query Extract the data.
- Form Build up Graphic User interface (GUI).
- Report Present the data.
- Macro Automate the manipulations.
- Module Store the common procedures.
61Levels of Manipulation
- Manually Manipulate Manually open (close,
delete, insert etc.) tables, forms, queries,
reports and so on. - Create Macros A macro is a set of actions which
compose a whole operation. - Use Access Basic It is a special version of VBA
(Visual Basic for Application) which can be put
into the different kinds of events.
62Table Properties
- Description.
- Validation Rule The validation rule apply to the
record. - Validation Text Message when the record not
valid. - Filter Decide records are logically exist when
the table is open. - Sort By Provide the logical sequence for the
table.
63Field Properties
- Format Specify the data appearance when
displayed or printed. - Input Mask Limit the way the user inputs data
into the table. Only apply to TEXT and DATE type. - Caption Determine the column heading.
- Default Value The value be automatically filled
when a new record is created. Two advantages
Avoid the null value and save the typing time. - Validation Rule Decide whether the field is
properly entered. - Indexed Define SINGLE field index.
64Basic Concepts of MS Access Programming
- MS Access Programming is a kind of Object
Oriented Programming (OOP)
65What is an object?
- Object is a visual entity in user's interface.
- Objects can be divided into containers and
non-containers. - Containers The object can contain other objects,
such as forms, reports, option group. - Non-containers individual controls or Data
Access Objects (DAOs).
66Properties, Methods and Events
- Properties The appearance of objects.
- Typical properties Name, Caption, Visible,
Enabled, Font Name/Size/Weight, Tooltip, Help
Context ID, Record/Row/Control Source, etc. - Events Actions happen to the object.
- Typical events Click, Dbl Click, Enter (Open),
Exit (Close), Got /Lost Focus, etc. - Methods Procedures are associated with an
object. - Typical Methods SetFocus, Refresh, etc.
67What is OOP
- OOP is a kind of non-procedural programming. It
combines the concepts of objects, properties,
events and methods. Instead of programming from
beginning to the end through the whole process,
it puts PIECES of procedures into objects' event
snippets to achieve EVENT DRIVEN or OBJECT
ORIENTED programming.
68MS Access Name Convention
- Up to 64 characters
- No control characters and special characters
- No leading spaces
- Don't use the reserved words such as names of
properties, events and methods - Use ! (exclamation) to refer to objects or
fields Use . (Dot) to refer the properties,
events or methods - Example Forms!Orders!OrderID.Name
- Reports!Rpt1!.......
69Table Operations
- Delete Fields
- Delete Records
- Delete Index
- Delete Table
70Query
- Create new query using QBE (Query By Example)
- Click ltNewgt button
- Select Design View
- Highlight the table(s) and click ltAddgt
- Drag the field and drop into desired column
- Untick the column if it is not visible
- Organize the criteria by
- Same row AND
- Different rows OR
- Switch between Design View and Datasheet view
71Relationship
- From Menu Tools, choose Relationship
- Add necessary tables
- Drag and drop between two fields in two tables
- Check Enforce Referential Integrity if needed
- Right click on the relation to modify
72Form
- Example of a simple data display form
- Click ltNewgt button
- Choose Design View and choose a table
- In the form design window, click Menu ViewField
List. - Highlight all fields and drag-drop to the form
- Run form by clicking Menu ViewForm View