JDBC and Java Access to DBMS

Slides: 64
Provided by: ValuedGate1

1
JDBC and Java Access to DBMS
Introduction to Data Warehouses
  • University of California, Berkeley
  • School of Information
  • IS 257 Database Management

2
Lecture Outline
  • Review
  • Object-Relational DBMS
  • OR features in Oracle
  • OR features in PostgreSQL
  • Extending OR databases (examples from PostgreSQL)
  • Java and JDBC
  • Introduction to Data Warehouses

3
Lecture Outline
  • Object-Relational DBMS
  • OR features in Oracle
  • OR features in PostgreSQL
  • Extending OR databases (examples from PostgreSQL)
  • Java and JDBC
  • Introduction to Data Warehouses

4
Object Relational Data Model
  • Class, instance, attribute, method, and integrity
    constraints
  • OID per instance
  • Encapsulation
  • Multiple inheritance hierarchy of classes
  • Class references via OID object references
  • Set-Valued attributes
  • Abstract Data Types

5
Object Relational Extended SQL (Illustra)
  • CREATE TABLE tablename {OF TYPE Typename} | {OF NEW
    TYPE typename} (attr1 type1, attr2 type2, ..., attrn
    typen) {UNDER parent_table_name}
  • CREATE TYPE typename (attribute_name type_desc,
    attribute2 type2, ..., attrn typen)
  • CREATE FUNCTION functionname (type_name,
    type_name) RETURNS type_name AS sql_statement

6
Object-Relational SQL in ORACLE
  • CREATE (OR REPLACE) TYPE typename AS OBJECT
    (attr_name attr_type, ...)
  • CREATE TABLE tablename OF typename

7
Example
  • CREATE TYPE ANIMAL_TY AS OBJECT (Breed
    VARCHAR2(25), Name VARCHAR2(25), Birthdate DATE)
  • Creates a new type
  • CREATE TABLE Animal of Animal_ty
  • Creates Object Table

8
Constructor Functions
  • INSERT INTO Animal VALUES (ANIMAL_TY('Mule',
    'Frances', TO_DATE('01-APR-1997',
    'DD-MON-YYYY')))
  • Insert a new ANIMAL_TY object into the table

9
PostgreSQL Classes
  • The fundamental notion in Postgres is that of a
    class, which is a named collection of object
    instances. Each instance has the same collection
    of named attributes, and each attribute is of a
    specific type. Furthermore, each instance has a
    permanent object identifier (OID) that is unique
    throughout the installation. Because SQL syntax
    refers to tables, we will use the terms table and
    class interchangeably. Likewise, an SQL row is an
    instance and SQL columns are attributes.

10
Creating a Class
  • You can create a new class by specifying the
    class name, along with all attribute names and
    their types:
  • CREATE TABLE weather (
        city     varchar(80),
        temp_lo  int,   -- low temperature
        temp_hi  int,   -- high temperature
        prcp     real,  -- precipitation
        date     date
    );

11
PostgreSQL
  • Postgres can be customized with an arbitrary
    number of user-defined data types. Consequently,
    type names are not syntactical keywords, except
    where required to support special cases in the
    SQL92 standard.
  • So far, the Postgres CREATE command looks exactly
    like the command used to create a table in a
    traditional relational system. However, we will
    presently see that classes have properties that
    are extensions of the relational model.

12
Inheritance
  • CREATE TABLE cities (
        name        text,
        population  float,
        altitude    int   -- (in ft)
    );
  • CREATE TABLE capitals (
        state  char(2)
    ) INHERITS (cities);

13
Inheritance
  • In Postgres, a class can inherit from zero or
    more other classes.
  • A query can reference either
  • all instances of a class
  • or all instances of a class plus all of its
    descendants

14
Non-Atomic Values - Arrays
  • CREATE TABLE SAL_EMP (
        name            text,
        pay_by_quarter  int4[],
        schedule        text[][]
    );
  • The preceding SQL command will create a class
    named SAL_EMP with a text string (name), a
    one-dimensional array of int4 (pay_by_quarter),
    which represents the employee's salary by quarter,
    and a two-dimensional array of text (schedule),
    which represents the employee's weekly schedule.
  • Now we do some INSERTs; note that when appending
    to an array, we enclose the values within braces
    and separate them by commas.

15
PostgreSQL Extensibility
  • Postgres is extensible because its operation is
    catalog-driven
  • RDBMS store information about databases, tables,
    columns, etc., in what are commonly known as
    system catalogs. (Some systems call this the data
    dictionary).
  • One key difference between Postgres and standard
    RDBMS is that Postgres stores much more
    information in its catalogs
  • not only information about tables and columns,
    but also information about its types, functions,
    access methods, etc.
  • These classes can be modified by the user, and
    since Postgres bases its internal operation on
    these classes, this means that Postgres can be
    extended by users
  • By comparison, conventional database systems can
    only be extended by changing hardcoded procedures
    within the DBMS or by loading modules
    specially-written by the DBMS vendor.

16
Rules System
  • CREATE RULE name AS ON event
    TO object [WHERE condition]
    DO [INSTEAD] [action | NOTHING]
  • Rules can be triggered by any event (select,
    update, delete, etc.)

17
Views as Rules
  • Views in Postgres are implemented using the rule
    system. In fact there is absolutely no difference
    between a
  • CREATE VIEW myview AS SELECT * FROM mytab;
  • compared against the two commands
  • CREATE TABLE myview (same attribute list as for
    mytab);
  • CREATE RULE "_RETmyview" AS ON SELECT TO myview
    DO INSTEAD SELECT * FROM mytab;

18
Extensions to Indexing
  • Access Method extensions in Postgres
  • GiST: Generalized Search Trees
  • Joe Hellerstein, UC Berkeley

19
Indexing in OO/OR Systems
  • Quick access to user-defined objects
  • Support queries natural to the objects
  • Two previous approaches:
  • Specialized indices (ABCDEFG-trees)
  • redundant code: most trees are very similar
  • concurrency control, etc. are tricky!
  • Extensible B-trees and R-trees (Postgres/Illustra)
  • B-tree or R-tree lookups only!
  • E.g. WHERE movie.video < 'Terminator 2'

20
GiST Approach
  • A generalized search tree. Must be
  • Extensible in terms of queries
  • General (B-tree, R-tree, etc.)
  • Easy to extend
  • Efficient (match specialized trees)
  • Highly concurrent, recoverable, etc.

21
GiST Applications
  • New indexes needed for new apps...
  • find all supersets of S
  • find all molecules that bind to M
  • your favorite query here (multimedia?)
  • ...and for new queries over old domains
  • find all points in region from 12 to 2 o'clock
  • find all text elements estimated relevant to a
    query string

22
Lecture Outline
  • Review
  • Object-Relational DBMS
  • OR features in Oracle
  • OR features in PostgreSQL
  • Extending OR databases (examples from PostgreSQL)
  • Java and JDBC
  • Introduction to Data Warehouses

23
Java and JDBC
  • Java is probably the high-level language most
    used in instruction and development today. One
    of the earliest enterprise additions to Java was
    JDBC
  • JDBC is an API that provides mid-level access
    to a DBMS from Java applications
  • Intended to be an open cross-platform standard
    for database access in Java
  • Similar in intent to Microsoft's ODBC

24
JDBC Architecture
  • The goal of JDBC is to be a generic SQL database
    access framework that works for any database
    system with no changes to the interface code

[Diagram: Java Applications -> JDBC API -> JDBC Driver Manager -> one Driver each for Oracle, MySQL, Postgres]
25
JDBC
  • Provides a standard set of interfaces for any
    DBMS with a JDBC driver, using SQL to specify
    the database operations.

26
JDBC Simple Java Implementation
import java.sql.*;
import oracle.jdbc.*;

public class JDBCSample {
    public static void main(java.lang.String[] args) {
        try {
            // this is where the driver is loaded
            // Class.forName("jdbc.oracle.thin");
            DriverManager.registerDriver(new OracleDriver());
        } catch (SQLException e) {
            System.out.println("Unable to load driver Class");
            return;
        }
27
JDBC Simple Java Impl.
        try {
            // All DB access is within the try/catch block...
            // make a connection to ORACLE on Dream
            Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@dream.sims.berkeley.edu:1521:dev",
                "mylogin", "myoraclePW");
            // Do an SQL statement...
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT NAME FROM DIVECUST");
28
JDBC Simple Java Impl.
            // show the Results...
            while (rs.next()) {
                System.out.println(rs.getString("NAME"));
            }
            // Release the database resources...
            rs.close();
            stmt.close();
            con.close();
        } catch (SQLException se) {
            // inform user of errors...
            System.out.println("SQL Exception: " + se.getMessage());
            se.printStackTrace(System.out);
        }
    }
}
29
JDBC
  • Once a connection has been made you can create
    three different types of statement objects
  • Statement
  • The basic SQL statement as in the example
  • PreparedStatement
  • A pre-compiled SQL statement
  • CallableStatement
  • Permits access to stored procedures in the
    Database
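
As a minimal sketch of the second statement type, the hypothetical helper below runs a parameterized query through a PreparedStatement. The DIVECUST table and NAME column come from the earlier example; the class name, method names, and the LIKE filter are assumptions made for illustration. To keep the sketch runnable without a database, a small java.lang.reflect.Proxy stand-in plays the role of the driver and answers only the calls the helper makes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper (not from the lecture code).
final class PreparedQuerySketch {

    // Returns the NAME values that start with the given prefix.
    static List<String> namesStartingWith(Connection con, String prefix)
            throws SQLException {
        // The '?' is a placeholder; its value is bound separately below.
        String sql = "SELECT NAME FROM DIVECUST WHERE NAME LIKE ?";
        // try-with-resources closes the statement and the result set.
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, prefix + "%");
            try (ResultSet rs = ps.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rs.next()) {
                    names.add(rs.getString("NAME"));
                }
                return names;
            }
        }
    }

    // --- Stand-in for a real driver, for demonstration only ---

    static <T> T fake(Class<T> type, InvocationHandler handler) {
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    // A fake Connection whose prepared statement filters an in-memory list.
    static Connection fakeConnection(List<String> allNames) {
        String[] pattern = new String[1]; // value bound to the '?'
        PreparedStatement ps = fake(PreparedStatement.class, (proxy, method, args) -> {
            switch (method.getName()) {
                case "setString":
                    pattern[0] = (String) args[1];
                    return null;
                case "executeQuery":
                    return fakeResultSet(allNames, pattern[0]);
                default:
                    return null; // close() and anything else
            }
        });
        return fake(Connection.class, (proxy, method, args) ->
                method.getName().equals("prepareStatement") ? ps : null);
    }

    // A fake ResultSet over the names matching a "prefix%" pattern.
    static ResultSet fakeResultSet(List<String> allNames, String likePattern) {
        String prefix = likePattern.substring(0, likePattern.length() - 1); // drop '%'
        Iterator<String> it =
                allNames.stream().filter(n -> n.startsWith(prefix)).iterator();
        String[] current = new String[1];
        return fake(ResultSet.class, (proxy, method, args) -> {
            switch (method.getName()) {
                case "next":
                    if (!it.hasNext()) {
                        return false;
                    }
                    current[0] = it.next();
                    return true;
                case "getString":
                    return current[0];
                default:
                    return null; // close()
            }
        });
    }

    public static void main(String[] args) throws SQLException {
        Connection con = fakeConnection(List.of("Ann", "Bob", "Andy"));
        System.out.println(namesStartingWith(con, "An")); // prints [Ann, Andy]
    }
}
```

With a real driver, the Connection would come from DriverManager.getConnection(...) as in the earlier example. A CallableStatement follows the same bind-then-execute pattern, obtained through con.prepareCall(...).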

30
JDBC Resultset methods
  • next() to loop through rows in the resultset
  • To access the attributes of each row you need to
    know its type, or you can use the generic
    getObject() which wraps the attribute as an
    object

31
JDBC GetXXX() methods
SQL data type    Java type               getXXX() method
CHAR             String                  getString()
VARCHAR          String                  getString()
LONGVARCHAR      String                  getString()
NUMERIC          java.math.BigDecimal    getBigDecimal()
DECIMAL          java.math.BigDecimal    getBigDecimal()
BIT              Boolean                 getBoolean()
TINYINT          Byte                    getByte()
32
JDBC GetXXX() Methods
SQL data type    Java type               getXXX() method
SMALLINT         Integer (short)         getShort()
INTEGER          Integer                 getInt()
BIGINT           Long                    getLong()
REAL             Float                   getFloat()
FLOAT            Double                  getDouble()
DOUBLE           Double                  getDouble()
BINARY           byte[]                  getBytes()
VARBINARY        byte[]                  getBytes()
LONGVARBINARY    byte[]                  getBytes()
33
JDBC GetXXX() Methods
SQL data type    Java type               getXXX() method
DATE             java.sql.Date           getDate()
TIME             java.sql.Time           getTime()
TIMESTAMP        java.sql.Timestamp      getTimestamp()
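
As a rough sketch of how these pairings might be used in code, the hypothetical lookup below maps java.sql.Types constants to the name of the matching ResultSet getter. The class and method names are assumptions made for illustration; only the java.sql.Types constants and the getter names come from JDBC itself.

```java
import java.sql.Types;

// Hypothetical lookup (not part of JDBC): names the ResultSet getter
// that the getXXX() tables pair with each SQL type.
final class GetterLookup {

    static String getterFor(int sqlType) {
        switch (sqlType) {
            case Types.CHAR:
            case Types.VARCHAR:
            case Types.LONGVARCHAR:
                return "getString";
            case Types.NUMERIC:
            case Types.DECIMAL:
                return "getBigDecimal";
            case Types.BIT:
                return "getBoolean";
            case Types.TINYINT:
                return "getByte";
            case Types.SMALLINT:
                return "getShort";
            case Types.INTEGER:
                return "getInt";
            case Types.BIGINT:
                return "getLong";
            case Types.REAL:
                return "getFloat";
            case Types.FLOAT:
            case Types.DOUBLE:
                return "getDouble";
            case Types.BINARY:
            case Types.VARBINARY:
            case Types.LONGVARBINARY:
                return "getBytes";
            case Types.DATE:
                return "getDate";
            case Types.TIME:
                return "getTime";
            case Types.TIMESTAMP:
                return "getTimestamp";
            default:
                return "getObject"; // generic fallback for unlisted types
        }
    }

    public static void main(String[] args) {
        System.out.println(getterFor(Types.VARCHAR));   // prints getString
        System.out.println(getterFor(Types.TIMESTAMP)); // prints getTimestamp
    }
}
```
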
34
Large Object Handling
  • Large binary data can be read from a resultset as
    streams using
  • getAsciiStream()
  • getBinaryStream()
  • getUnicodeStream() (deprecated in later JDBC
    versions in favor of getCharacterStream())

// out is assumed to be an already-open OutputStream
ResultSet rs = stmt.executeQuery(
    "SELECT IMAGE FROM PICTURES WHERE PID = 1223");
if (rs.next()) {
    BufferedInputStream gifData =
        new BufferedInputStream(rs.getBinaryStream("IMAGE"));
    byte[] buf = new byte[4 * 1024]; // 4K buffer
    int len;
    while ((len = gifData.read(buf, 0, buf.length)) != -1) {
        out.write(buf, 0, len);
    }
}
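
The read loop on this slide does not depend on JDBC once the stream is in hand. A minimal sketch, assuming an in-memory byte array stands in for the stream that rs.getBinaryStream("IMAGE") would return (the class and method names are hypothetical):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: the same 4K-buffer copy loop, isolated from JDBC.
final class StreamCopySketch {

    // Copies everything from in to out; returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4 * 1024]; // 4K buffer
        long total = 0;
        int len;
        // read() returns -1 once the stream is exhausted.
        while ((len = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeImage = new byte[10_000]; // stands in for the IMAGE column
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(fakeImage));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out)); // prints 10000
    }
}
```
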
35
JDBC Metadata
  • There are also methods to access the metadata
    associated with a resultSet
  • ResultSetMetaData rsmd = rs.getMetaData();
  • Metadata methods include
  • getColumnCount()
  • getColumnLabel(col)
  • getColumnTypeName(col)
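
A minimal sketch of these three calls, assuming a hand-built javax.sql.rowset.RowSetMetaDataImpl stands in for the object that rs.getMetaData() would return from a live query. The class name, the column names, and the column types are illustrative assumptions, not taken from the lecture database.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.rowset.RowSetMetaDataImpl;

// Hypothetical helper: lists the columns described by a ResultSetMetaData.
final class MetaDataSketch {

    // Builds "label: type name" for every column; JDBC columns are 1-based.
    static List<String> describe(ResultSetMetaData rsmd) throws SQLException {
        List<String> columns = new ArrayList<>();
        for (int col = 1; col <= rsmd.getColumnCount(); col++) {
            columns.add(rsmd.getColumnLabel(col) + ": " + rsmd.getColumnTypeName(col));
        }
        return columns;
    }

    // Hand-built metadata standing in for rs.getMetaData() (assumed columns).
    static ResultSetMetaData sampleMetaData() throws SQLException {
        RowSetMetaDataImpl md = new RowSetMetaDataImpl();
        md.setColumnCount(2);
        md.setColumnLabel(1, "NAME");
        md.setColumnTypeName(1, "VARCHAR");
        md.setColumnLabel(2, "PID");
        md.setColumnTypeName(2, "INTEGER");
        return md;
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(describe(sampleMetaData())); // prints [NAME: VARCHAR, PID: INTEGER]
    }
}
```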

36
JDBC access to MySQL
  • The basic JDBC interface is the same; the only
    differences are in how the driver is loaded and
    in the connection URL

import java.sql.*;

public class JDBCTestMysql {
    public static void main(java.lang.String[] args) {
        try {
            // this is where the driver is loaded
            Class.forName("com.mysql.jdbc.Driver").newInstance();
        } catch (InstantiationException | IllegalAccessException i) {
            // newInstance() can also fail with IllegalAccessException
            System.out.println("Unable to load driver Class");
            return;
        } catch (ClassNotFoundException e) {
            System.out.println("Unable to load driver Class");
            return;
        }
37
JDBC for MySQL
        try {
            // All DB access is within the try/catch block...
            // make a connection to MySQL on Dream
            Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost/MyDatabase?user=MyLogin&password=MySQLPW");
            // Do an SQL statement...
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT NAME FROM DIVECUST");
  • Otherwise everything is the same as in the Oracle
    example
  • For connecting to the machine you are running
    the program on, you can use localhost instead
    of the machine name
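
The two connection strings used in these examples follow fixed patterns. A small sketch, assuming hypothetical helper names, that assembles them from their parts (values containing characters such as & or ? would need escaping, which is omitted here):

```java
// Hypothetical helpers (not part of JDBC or either driver): they only
// assemble the URL strings that DriverManager.getConnection() receives.
final class JdbcUrlSketch {

    // Oracle thin-driver form: jdbc:oracle:thin:@host:port:sid
    static String oracleThin(String host, int port, String sid) {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + sid;
    }

    // MySQL form: jdbc:mysql://host/database?user=...&password=...
    static String mysql(String host, String database, String user, String password) {
        return "jdbc:mysql://" + host + "/" + database
                + "?user=" + user + "&password=" + password;
    }

    public static void main(String[] args) {
        // prints jdbc:oracle:thin:@dream.sims.berkeley.edu:1521:dev
        System.out.println(oracleThin("dream.sims.berkeley.edu", 1521, "dev"));
        // prints jdbc:mysql://localhost/MyDatabase?user=MyLogin&password=MySQLPW
        System.out.println(mysql("localhost", "MyDatabase", "MyLogin", "MySQLPW"));
    }
}
```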

38
Demo JDBC for MySQL
  • Demo of JDBC code on Harbinger
  • Code is available on class web site

39
Lecture Outline
  • Review
  • Object-Relational DBMS
  • OR features in Oracle
  • OR features in PostgreSQL
  • Extending OR databases (examples from PostgreSQL)
  • Java and JDBC
  • Introduction to Data Warehouses

40
Overview
  • Data Warehouses and Merging Information Resources
  • What is a Data Warehouse?
  • History of Data Warehousing
  • Types of Data and Their Uses

41
Problem: Heterogeneous Information Sources
"Heterogeneities are everywhere"
[Figure: Personal Databases, World Wide Web, Scientific Databases, Digital Libraries]
  • Different interfaces
  • Different data representations
  • Duplicate and inconsistent information

Slide credit J. Hammer
42
Problem: Data Management in Large Enterprises
  • Vertical fragmentation of informational systems
    (vertical stove pipes)
  • Result of application (user)-driven development
    of operational systems

[Figure: stove-pipe systems for Sales Administration, Finance, Manufacturing, ..., with applications such as Sales Planning, Suppliers, Num. Control, Stock Mngmt, Debt Mngmt, Inventory, ...]
Slide credit J. Hammer
43
Goal: Unified Access to Data
[Figure: Personal Databases, Digital Libraries, Scientific Databases]
  • Collects and combines information
  • Provides integrated view, uniform user interface
  • Supports sharing

Slide credit J. Hammer
44
The Traditional Research Approach
  • Query-driven (lazy, on-demand)

[Figure: Clients query an Integration System (with Metadata), which reaches each Source through its own Wrapper]
Slide credit J. Hammer
45
Disadvantages of Query-Driven Approach
  • Delay in query processing
  • Slow or unavailable information sources
  • Complex filtering and integration
  • Inefficient and potentially expensive for
    frequent queries
  • Competes with local processing at sources
  • Hasn't caught on in industry

Slide credit J. Hammer
46
The Warehousing Approach
  • Information integrated in advance
  • Stored in WH for direct querying and analysis

Slide credit J. Hammer
47
Advantages of Warehousing Approach
  • High query performance
  • But not necessarily most current information
  • Doesn't interfere with local processing at
    sources
  • Complex queries at warehouse
  • OLTP at information sources
  • Information copied at warehouse
  • Can modify, annotate, summarize, restructure,
    etc.
  • Can store historical information
  • Security, no auditing
  • Has caught on in industry

Slide credit J. Hammer
48
Not Either-Or Decision
  • Query-driven approach still better for
  • Rapidly changing information
  • Rapidly changing information sources
  • Truly vast amounts of data from large numbers of
    sources
  • Clients with unpredictable needs

Slide credit J. Hammer
49
Data Warehouse Evolution
[Timeline figure, 1960 to 2000. Eras: Prehistoric Times, Middle Ages, Data Revolution, Information-Based Management. Milestones: Relational Databases, PCs and Spreadsheets, End-user Interfaces, Data Replication Tools, 1st DW Article, "Building the DW" (Inmon, 1992), DW Confs., Vendor DW Frameworks, Company DWs]
Slide credit J. Hammer
50
What is a Data Warehouse?
  • A Data Warehouse is a
  • subject-oriented,
  • integrated,
  • time-variant,
  • non-volatile
  • collection of data used in support of management
    decision making processes.
  • -- Inmon & Hackathorn, 1994; viz. Hoffer, Chap 11

51
DW Definition
  • Subject-Oriented
  • The data warehouse is organized around the key
    subjects (or high-level entities) of the
    enterprise. Major subjects include
  • Customers
  • Patients
  • Students
  • Products
  • Etc.

52
DW Definition
  • Integrated
  • The data housed in the data warehouse are defined
    using consistent
  • Naming conventions
  • Formats
  • Encoding Structures
  • Related Characteristics
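
One way to picture integration is a load step that reconciles each source's encoding into a single warehouse encoding. The sketch below is a hypothetical illustration: the yes/no flag and the three source encodings (Y/N, 1/0, true/false) are assumptions, not taken from the lecture.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical reconciler: maps several source encodings of a yes/no
// flag onto the one boolean encoding used in the warehouse.
final class EncodingReconciler {

    private static final Map<String, Boolean> FLAG_CODES = Map.of(
            "y", true, "n", false,          // assumed encoding of source A
            "1", true, "0", false,          // assumed encoding of source B
            "true", true, "false", false);  // assumed encoding of source C

    static boolean toWarehouseFlag(String sourceValue) {
        Boolean flag = FLAG_CODES.get(sourceValue.trim().toLowerCase(Locale.ROOT));
        if (flag == null) {
            // Unknown encodings are rejected rather than guessed at.
            throw new IllegalArgumentException("Unknown flag encoding: " + sourceValue);
        }
        return flag;
    }

    public static void main(String[] args) {
        System.out.println(toWarehouseFlag("Y"));     // prints true
        System.out.println(toWarehouseFlag("0"));     // prints false
        System.out.println(toWarehouseFlag("False")); // prints false
    }
}
```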

53
DW Definition
  • Time-variant
  • The data in the warehouse contain a time
    dimension so that they may be used as a
    historical record of the business

54
DW Definition
  • Non-volatile
  • Data in the data warehouse are loaded and
    refreshed from operational systems, but cannot be
    updated by end-users
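
A minimal sketch of the time-variant and non-volatile properties, assuming a hypothetical in-memory table: the load process appends dated snapshots, end users get a read-only view, and history can be queried as of any date. All names and values are illustrative.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical append-only table of dated snapshots.
final class AppendOnlyFactTable {

    // One loaded row: a business value plus the date it applies to.
    static final class Fact {
        final String customer;
        final double balance;
        final LocalDate asOf;

        Fact(String customer, double balance, LocalDate asOf) {
            this.customer = customer;
            this.balance = balance;
            this.asOf = asOf;
        }
    }

    private final List<Fact> rows = new ArrayList<>();

    // Used by the load/refresh process only: rows are added, never changed.
    void load(String customer, double balance, LocalDate asOf) {
        rows.add(new Fact(customer, balance, asOf));
    }

    // End users see a read-only view; attempts to modify it throw.
    List<Fact> rows() {
        return Collections.unmodifiableList(rows);
    }

    // Historical query: the latest balance on or before the date, or null.
    Double balanceAsOf(String customer, LocalDate date) {
        Fact best = null;
        for (Fact f : rows) {
            if (f.customer.equals(customer) && !f.asOf.isAfter(date)
                    && (best == null || f.asOf.isAfter(best.asOf))) {
                best = f;
            }
        }
        return best == null ? null : best.balance;
    }

    public static void main(String[] args) {
        AppendOnlyFactTable table = new AppendOnlyFactTable();
        table.load("acme", 100.0, LocalDate.of(2006, 1, 1));
        table.load("acme", 250.0, LocalDate.of(2006, 2, 1));
        System.out.println(table.balanceAsOf("acme", LocalDate.of(2006, 1, 15))); // prints 100.0
    }
}
```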

55
What is a Data Warehouse? A Practitioner's
Viewpoint
  • A data warehouse is simply a single, complete,
    and consistent store of data obtained from a
    variety of sources and made available to end
    users in a way they can understand and use it in
    a business context.
  • -- Barry Devlin, IBM Consultant

Slide credit J. Hammer
56
A Data Warehouse is...
  • Stored collection of diverse data
  • A solution to data integration problem
  • Single repository of information
  • Subject-oriented
  • Organized by subject, not by application
  • Used for analysis, data mining, etc.
  • Optimized differently from transaction-oriented
    db
  • User interface aimed at executive decision makers
    and analysts

57
Cont'd
  • Large volume of data (Gb, Tb)
  • Non-volatile
  • Historical
  • Time attributes are important
  • Updates infrequent
  • May be append-only
  • Examples
  • All transactions ever at WalMart
  • Complete client histories at insurance firm
  • Stockbroker financial information and portfolios

Slide credit J. Hammer
58
Warehouse is a Specialized DB
Standard DB                                  Warehouse
Mostly updates                               Mostly reads
Many small transactions                      Queries are long and complex
Mb - Gb of data                              Gb - Tb of data
Current snapshot                             History
Index/hash on p.k.                           Lots of scans
Raw data                                     Summarized, reconciled data
Thousands of users (e.g., clerical users)    Hundreds of users (e.g., decision-makers, analysts)

Slide credit J. Hammer
59
Summary
[Figure: architecture components: Business Information Guide, Business Information Interface, Data Warehouse, Data Warehouse Catalog, Data Warehouse Population, Operational Systems, Enterprise Modeling]
Slide credit J. Hammer
60
Warehousing and Industry
  • Warehousing is big business
  • $2 billion in 1995
  • $3.5 billion in early 1997
  • Predicted: $8 billion in 1998 [Metagroup]
  • Wal-Mart is said to have the largest warehouse
  • 1000-CPU, 583 Terabyte, Teradata system
    (InformationWeek, Jan 9, 2006)
  • Half a Petabyte in warehouse (Ziff Davis
    Internet, October 13, 2004)
  • 1 billion rows of data or more are updated every
    day (InformationWeek, Jan 9, 2006)
  • Some Government and Scientific databases are
    larger, however

Slide credit J. Hammer
61
Other Large Data Warehouses
  • Not including Wal-Mart and Ebay

(InformationWeek, Jan 9, 2006)
62
Types of Data
  • Business Data - represents meaning
  • Real-time data (ultimate source of all business
    data)
  • Reconciled data
  • Derived data
  • Metadata - describes meaning
  • Build-time metadata
  • Control metadata
  • Usage metadata
  • Data as a product - intrinsic meaning
  • Produced and stored for its own intrinsic value
  • e.g., the contents of a text-book

Slide credit J. Hammer
63
Next Time
  • More on Data Warehouses
  • Introduction to data mining