Title: Using Relational Databases and SQL
1Using Relational Databases and SQL
Lecture 1 Introduction
John Hurley Department of Computer
Science California State University, Los Angeles
2Introduction
- John Hurley
- Call me John, especially outside class.
- If thats too informal for you, you can call me
Instructor - hurley_j_at_sbcglobal.net
- xxx6aTWOb VI Xxx4cATEd
- d7xxx8eONEf5gFORE!hij
- (text preferred)
- Office hours listed on course page. I will often
be in A-310A (inside A-310) at other times, too.
3Attendance
4Administrative
- Course page
- http//www.calstatela.edu/faculty/jhurley2/classes
/cs122 - Syllabus
- Software download links
- Assignment dates
5Grading
- Grading A, B, C, (with and -), NC.
- If you are an undergraduate and dont get a C or
better, you get an NC - If you are a graduate student and dont get a B
or better, you get an NC - See the grading scale on the syllabus no curve
- In past terms, I have assigned all grades from A
to C as well as NC in this class. Median grade
is usually B or B, which is lower than the
median grade in my CS120 sections.
6Deadbeats Will Fail!
- About 10 of the all the course grades I have
ever given in CS122 were NCs. Everyone to whom I
have ever given an NC missed significant portions
of the coursework. - If you decide not to take the class, drop it
yourself. Dont expect me to drop you! - I cant drop anyone after the no-record drop
deadline - You will have your midterm grades before the
drop-with- W deadline
7Labs
- Assignments are in text files linked from the
course web page - Turn in on CSNS
- Posted before the weekly lecture
- Part A usually due at the end of the weeks lab
class - Part B usually due before the following weeks
lecture - Let me know in advance if you wont be able to
attend a lab for some good reason - I may give quizzes towards the end of lab periods
if attendance is poor. - Last labs before the midterm and final will be
ungraded, very realistic practice exams.
8Assignments
- Labs and exams will contain the following types
of questions - short answers and multiple choice
- 1-paragraph answers
- SQL to English
- English to SQL
9Quizzes
- Quizzes will be administered either during
lectures or labs - Quizzes usually unannounced but open-notes
- I may give one pre-announced closed-book quiz
which requires you to memorize a few very
important definitions - No makeups unless you provide a satisfactory
explanation in advance for why you wont be in
class.
10Cheating
- You may discuss general material about databases
and the techniques taught in this class with
other students - You may give or receive help understanding
assignments and debugging work
11Cheating
- You may copy examples from the lecture notes and
then change them to meet assignment requirements.
- Working programmers often solve problems in
similar ways. - Other instructors may not allow this. I am
only saying that it is OK and expected in my
sections of CS122. - You may not directly use language from the
lecture notes to answer short-answer questions
restate the answers in your own words. This is
difficult to do thats the point.
12Cheating Copying
- There are grey areas in cheating in CS, but
presenting an answer that is copied directly from
any source other than your brain is always over
the line. - You may not copy code from other students or
allow anyone to copy your code. - Few to none of my assignment questions are taken
from the textbook or other sources, so dont
bother copying published solutions to the
textbook exercises. - If you copy code posted by past students, you
will likely do poorly anyway, because I change
many questions slightly each term.
13Cheating on LAB WORK
- OK on lab work
- Copying examples from the lecture notes and
modifying them to fulfill the assignment - Examples of legitimate help for other students
- The problem with your query is that you forgot
to write the join condition - That isnt working because on a Mac the table
name is case sensitive - You accidentally copied a character from
Powerpoint that is invalid in mySQL - You need to use a float instead of an int for
that field because the values might not be an
integer - Cheating on lab work
- Copying code from other students or internet
sources - Copying text from other students for short-answer
or essay questions - Copying text from the internet or a book for
short-answer and essay questions
14Cheating on Exams and Quizzes
- OK on exams and open-book quizzes
- Consulting lecture notes, textbooks, your own
notes - Checking Wikipedia or other internet sources that
do not involve real-time communication with human
beings - Copying examples from the lecture notes and
modifying them to answer the questions - Cheating on exams and quizzes
- Copying code or text from other students or
internet sources - Answering short-answer questions with direct
quotes form the notes (restate them in your own
words!) - Communicating with any human being other than me
via email, chat, phone, or any other means
15Cheating Detection
- It is obvious to me when students answer
short-answer questions with text copied from
professional-level sources like Wikipedia and
textbooks. - Even for SQL code, there are only a few correct
answers to each question using the material we
cover. However, if you copy answers from other
students, you will sooner or later copy an
identifiable incorrect answer or trip up in some
other way. - I will be comparing all students lab and exam
papers using an automatic tool designed to
detect copying. I developed this application
specifically to detect cheating in CS122! - If I do detect copying, I will penalize all
students involved equally. If you understand the
material, it is foolish to take this risk by
letting other students copy your work. - People who do well on labs but poorly on exams
and quizzes receive careful scrutiny!
16Using Relational Databases and SQL
Part I Databases
17Database Definition
- Data (information) base (foundation)
- A database is a structured collection of
persistent data. - Structured organized according to a set of
rules. In this case, organized according to a
database model. - Persistent stored in permanent storage, not just
RAM. If you shut down the application or the
power goes off, the data is not lost.
18Database Definition
- Many definitions are like this one
- A collection of data, typically modelling the
activities of one or more related organizations
(Ramkrishnan and Gherke, Database Management
Systems.) - I dont like this definition, because databases
dont always model anything in particular.
Database designers dont always know what the
data will be used for.
19What is a Database?
- Structured using a database model
- No database model, no database!
- Often, not always, used to model organizational
activities - Examples
- Companies
- Stores
- Universities
20Database Skills
- Database skills are foundational in CS
- The great majority of modern applications use
databases to store information - You will put these skills together with your OOP
programming skills a little later if you are an
undergraduate, very soon if you are a grad
student - As a working software engineer, you will probably
use the skills you learn in this class every day
21Database Skills
- Some applications you are familiar with that rely
heavily on large databases - Wikipedia
- GET
- Amazon.com
- ITunes
22Tables?
- User can add and remove tables, get information
from them, update or delete information in them,
change them - These are the skills we will study in this class
23Database Background
- Storage was bulky, expensive, and slow in the old
days!
24Physical/Logical Separation
- Previous to the inventions of DBMS, one had to
write a program that traversed pointers at the
physical level to extract data from a database - By abstracting the physical level and writing a
program at the logical level instead, extracting
data from a database became much easier
25Database Models
- The Two Levels of a Database Model
- Physical Level (how data is stored)?
- The things we dont have to worry about
- Logical Level (how data is organized)?
- The things we do care about
- The Basic Models
- Hierarchical Model? (IBMs IMS) represented data
as a tree - Network Model (CODASYL)?
- Relational Model (ALPHA, SEQUEL)?
26Hierarchical Database Model
27Hierarchical Database Model
- Example of a query to retrieve info
- for book in (get_children("Programming/J.Smith))
- print book.field("Title"), book.field("Publisher")
- Mostly superseded by relational model
- Has an afterlife with XML
28XML Data
ltCATALOGgt ltCDgt ltTITLEgtWhen a Man Loves A
Womanlt/TITLEgt ltARTISTgtPercy Sledgelt/ARTISTgt ltC
OUNTRYgtUSAlt/COUNTRYgt ltCOMPANYgtAtlanticlt/COMPANYgt
ltPRICEgt8.70lt/PRICEgt ltYEARgt1987lt/YEARgt lt/CDgt
ltCDgt ltTITLEgtBlack Angellt/TITLEgt ltARTISTgtSavag
e Roselt/ARTISTgt ltCOUNTRYgtEUlt/COUNTRYgt ltCOMPANY
gtMegalt/COMPANYgt ltPRICEgt10.90lt/PRICEgt ltYEARgt199
5lt/YEARgt lt/CDgt lt/CATALOGgt
29Network Model
- Built on hierarchical model but allows multiple
parents and multiple children
30Relational Model
- Proposed by Edgar F. Codd (circa 1969)?
- Database is a collection of tables (relations)
- Relational comes from Relational
Algebra/Calculus and not from Relationships - Relational model is based on extensive
mathematical theory, which we will not cover in
this class - Dominant database model
- Oracle was the first to aggressively market a
commercial relational database product
31Dr. Edgar F(rank) Codd
- MA Mathematics, MA Chemistry
- MS and PhD in Communication Sciences
- ACM Turing Award (1981)
32Tables?
- Artists Table (Relation)?
- ArtistID, City, Region, ... Columns
(Attributes)? - Each row is called a Record (Tuple)?
33Using Relational Databases and SQL
Part II Database Management Systems
34Database Management Systems (DBMS)?
- A DBMS handles these functions
- Data definition Defining new data structures for
a database, removing data structures from the
database, modifying the structure of existing
data. - Update Inserting, modifying, and deleting data.
- Retrieval Obtaining information either for
end-user queries and reports or for processing by
applications. - Administration Registering and monitoring users,
enforcing data security, monitoring performance,
maintaining data integrity, dealing
with concurrency control, and recovering informati
on if the system fails. - Source Wikipedia
35Database Management Systems (DBMS)?
- Some common relational DBMSs
- MySQL, PostgreSQL (free, open source)?
- Oracle, MS SQL Server (commercial)?
36Database Schemas
- The definition of the database, where you define
- Tables
- Relationships
- Constraints
- Stored Functions and Procedures
- Views
- Indexes
- Schemas are typically represented by a schema
diagram see the Lyric diagram linked from the
course page
37Database Management Systems (DBMS)
- You can have multiple databases, each with a
single schema - A separate database for each application
- Toystore (First database)
- Bookstore (Second database)
- Furniture Store (Third database)
- Etc.
- You can also have a single database, with
multiple schemas
38Database Management Systems (DBMS)?
39Using Relational Databases and SQL
Part III Query Languages
40Query Languages
- Query question
- Query Language A computer language used to
extract data from a database - Data Sublanguage A computer language used to
extract and manipulate database data - SEQUEL/SQL (1974)? developed at IBM
41Query Languages
- Data Sublanguage Alpha (Codds original query
language) - Data Sublanguage SEQUEL (SQL)?
42SQL
- Stands for Structured Query Language
- A non-procedural, domain-specific language (not
like Java, C or C)? - An open ANSI and ISO standard
- Supported by most major DBMS
- Some variations in implementations
- Used by programmers, managers, and database
administrators
43SQL
- SQL is nonprocedural or declarative
- Procedural languages, like Java or C, require
programmers to implement an algorithm (a series
of instructions that will solve a problem in a
finite amount of time) to accomplish each task - Nonprocedural / declarative languages, like SQL,
require the programmer to describe what data
s/he wants. The platform (in this case, DBMS)
determines how to produce the data - This is an important distinction, but as we will
see, it is not as clear-cut for SQL as it is for,
say, HTML.
44SQL Functions
- View information from relational databases
- Single and multiple table selections
- Calculation and analysis
- Manipulate information in relational databases
- Insert and delete records
- Update records
- Create relational databases
- Create databases, tables, constraints, ...
45Nonstandard Features
- SQL is an open standard, but developers of DBMSs
often add additional features that are not part
of the standard - Differentiate their products from competitors
- Vendor lock-in
- What happens when you want to switch to a
different DBMS? - Is it a good idea to use features like this?
46Using Relational Databases and SQL
Part IV Lyric Database Discussion
47Primary Keys
- Primary key is used to uniquely identify every
record in a table - Must be a field or combination of fields with
unique values - What would happen if we needed to identify
individuals in the university DB and tried to do
this using first name? Last name? Both?
Height? DOB? - If more than one field is required, we have a
composite primary key
48The Lyric Database
- Database for a web-based company that provides
services to artists and the studios that they
work for - Before we start extracting data from a database,
we must understand the database completely first - Lets go over all the tables and attributes
49Primary Key Example
- What is the primary key of the Studios table?
- What is the primary key of the XRefArtistsMembers
table? (hint it may require more than one field
to make up a primary key!)
50Using Relational Databases and SQL
Part V MySQL
51MySQL
- For coursework, we will use MySQL, which you must
install on a USB drive. - Bring a USB drive to the next class meeting!
- You may also install it on your own laptop, but
note that you will have to use the lab computers
for the midterm and final exam, so be sure you
can run it from a USB drive before the midterm. -
52Downloading MySQL, Part I
- Go to CS122 web page and follow the links to
MySQL site - Get MySQL Community Server
- mysql-5.5.x has the MySQL database client and
server programs - Get the .zip files (not the MSIs) for your OS
(Windows vs. OSX) and processor (32 vs 64 bit).
- The files are labelled in a way that may confuse
you into downloading the source code, which you
dont need. Be careful to get the binaries
instead. MySQL 5.5.8-win32, for example, is 132
MB. The 27 MB file is the source code.
53Downloading MySQL, Part II
- Extract the zip files you will have two
directories - You may also want to use the MySQL Workbench,
which is a GUI tool for working with MySQL.
However, Workbench only works with the 32 bit
version and is buggy in any case. Please dont
ask me to help you with it until at least week 3,
after everyone is working smoothly with the main
MySQL software. - mysql-workbench-gpl-5.2.x.... is the MySQL GUI
Tools
54Mac Users
- An OSX version of MySQL is available, but some
students have had difficulties with it and I am
not very good at supporting OSX. - If you find it hard to work with, I can give you
an account on the cs1 server, which has MySQL. - You might also consider using the Windows version
with Boot Camp or an emulator.
55Using MySQL In Windows
- This process should only be slightly different in
OSX - Open up a Windows command line console
- Use the cd command to navigate to the
- mysql-5.x.xx-xx/bin directory
- If you add this directory to your PATH, you wont
have to navigate there every time. However, you
wont be able to add anything to the PATH on
the lab computers. - Type in the following to start the database
server - start mysqld
- Then type in the following to start the database
client - mysql u root
56Some MySQL Commands
- Once MySQL has started and you see the mysql
prompt - At mysqlgt prompt type in show databases
- At mysqlgt prompt type in create database lyric
- At mysqlgt prompt type in use lyric
- At mysqlgt prompt type in show tables
- You shouldnt see any yet
57Adding Data to a Database
- Now that the database is selected, let's load a
database script - Download lyric.sql from the course webpage
- At mysqlgt prompt type in source path
lyric.sql - Where path stands for the path to the location
where you saved lyric.sql. - If you put lyric.sql in mysqls bin directory,
all you will have to type is source lyric.sql - You should see a bunch of messages like
thisQuery OK, 1 rows affected (0.01 sec).
58Verify that the database is set up
- To check whether everything has worked correctly,
type - SELECT FROM Salespeople
- The output should look like this
- ---------------------------------------------
--------------- - SalesID FirstName LastName Initials
Base Supervisor - ---------------------------------------------
--------------- - 1 Bob Bentley bbb
100.00 4 - 2 Lisa Williams lmw
300.00 4 - 3 Clint Sanchez cls
100.00 1 - 4 Scott Bull sjb
NULL NULL - ---------------------------------------------
--------------- - 4 rows in set (0.39 sec)
59Using MySQL in the lab
- If you will be using your own laptop in the lab,
bring it to the next class meeting - If you will be using MySQL on a lab computer,
bring a USB drive to the lab on Wednesday