Title: Perl/DBI - accessing databases from Perl
1Perl/DBI - accessing databases from Perl
- Dr. Andrew C.R. Martin
- martin_at_biochem.ucl.ac.uk
- http//www.bioinf.org.uk/
2Aims and objectives
- Understand the need to access databases from Perl
- Know why DBI?
- Understand the structure of DBI
- Be able to write a Perl/DBI script to read from
or write to a database - PRACTICAL write a script to read from a database
3Why access a database from Perl?
4Why access a database from Perl?
Web browser
Web server
CGI can extract parameters sent with the page
request
5Why access a database from Perl?
- Populating databases
- Need to pre-process and re-format data into SQL
- Intermediate storage during data processing
- Reading databases
- Good database design can lead to complex queries
(wrappers) - Need to extract data to process it
6Why use DBI?
7Why Perl/DBI?
- Many relational databases available.
- Commercial examples
- Oracle
- DB/2
- SQLServer
- Sybase
- Informix
- Interbase
- Open source examples
- PostgreSQL
- mySQL
8Why Perl/DBI?
- Databases use a common query language
- structured query language (SQL)
- Queries can easily be ported between different
database software - Minor variations in more advanced features
- Proprietary extensions
9Why Perl/DBI?
- Can call command-line interface from within your
program.
result psql -tqc SELECT FROM table
_at_tuples split(/\n/, result) foreach
tuple (_at_tuples) _at_fields split(/\/,
tuple)
Inefficient new process for each database access
10Why Perl/DBI?
- Databases generally provide own APIs to allow
access from programming languages - e.g. C, Java, Perl
- Proprietary APIs all differ
- Very difficult to port software between databases
- Standardized APIs have thus become available
11Why Perl/DBI?
- Perl/DBI is the standardized API for Perl
- Easy to port Perl scripts from one database to
another
12DBI and ODBC
- ODBC (Open DataBase Connectivity)
- Consortium of vendors in early 1990s
- SQL Access Group
- October 1992 1993, draft standard
- Call Level Interface (CLI - an API)
- Never really adopted
- Microsoft embraced and extended it to create
ODBC
13DBI and ODBC
- DBPerl designed as database interface
specifically for Perl4 - September 1992 (i.e. pre-ODBC)
- Just before release, Perl5 announced with OO
facilities - DBPerl modified to support OO and loosely
modelled on CLI standard - This became DBI
14DBI and ODBC
ODBC
DBI
15DBI and ODBC
- There is an ODBC driver for DBI
- DBI can be used to access any ODBC database
16The structure of DBI
17DBI Architecture
DBI
18DBI Architecture
Perl script
Database Independent
DBI
DBD
Database API
Database Dependent
RDBMS
19DBI Architecture
Oracle
DBDOracle
DBDOracle
mySQL
Perl Script
DBI
Perl Script
Perl Script
DBDmysql
DBDOracle
DBDPg
DBDOracle
PostgreSQL
20DBI Architecture
_at_drivers DBI-gtavailable_drivers()
- Returns a list of installed (DBD) drivers
21DBI Architecture
DBDOracle Driver Handle
DBDpg Driver Handle
Database Handle
Database Handle
Database Handle
Statement Handle
Statement Handle
Statement Handle
Statement Handle
Statement Handle
Statement Handle
Statement Handle
22DBI Architecture
- Driver Handles
- References loaded driver(s)
- One per driver
- Not normally referenced in programs
- Standard variable name drh
23DBI Architecture
- Database Handles
- Created by connecting to a database
- References a database via a driver
- Can have many per database (e.g. accessing
different user accounts) - Can access multiple databases
- Standard variable name dbh
dbh DBI-gtconnect(datasource, ... )
24DBI Architecture
- Statement Handles
- Created by preparing some SQL
- Can have many per database handle (e.g. multiple
queries) - Standard variable name sth
sth dbh-gtprepare(sql)
25DBI Architecture
DBDpg Driver Handle
Database Handle
dbh DBI-gtconnect(datasource, ... )
Statement Handle
sth dbh-gtprepare(sql)
26SQL Preparation
Perl script
DBI and Driver
Database
prepare()
sth
Execute Statement
fetchrow_array()
27Writing Perl/DBI scripts
28Accessing DBI from Perl
- Must have the DBI package and appropriate DBD
package installed - DBDOracle, DBDPg, DBDmysql, etc.
use DBI
29Data sources
dbh DBI-gtconnect(datasource, username,
password)
- dbname mydatabase
- dbserver dbserver.cryst.bbk.ac.uk
- dbport 5432
- datasource dbiOracledbname
- datasource dbimysqldatabasedbnamehostdb
server - datasource dbiPgdbnamedbnamehostdbserve
rportdbport
30Username and password
dbh DBI-gtconnect(datasource, username,
password)
- username and password also optional
- Only needed if you normally need a
username/password to connect to the database. - Remember CGI scripts run as a special web-server
user. - Generally, nobody or apache.
- Database must allow access by this user
- or specify a different username/password
31SQL commands with no return value
- SQL commands other that SELECT dont return
values - may return success/failure flag
- number of entries in the database affected
- For example
- creating a table
- inserting a row
- modifying a row
32SQL commands with no return value
- e.g. insert a row into a table
INSERT INTO idac VALUES (LYC_CHICK, P00698)
sql INSERT INTO idac VALUES (LYC_CHICK,
P00698) dbh-gtdo(sql)
33SQL commands that return a single row
- Sometimes, can guarantee that a database query
will return only one row - or you are only interested in the first row
sql SELECT FROM idac WHERE ac
P00698 _at_values dbh-gtselectrow_array(sql)
34SQL commands that return a multiple rows
- Most SELECT statements will return many rows
- Three stages must be performed
- preparing the SQL
- executing it
- extracting the results
35SQL commands that return a multiple rows
sql SELECT FROM idac sth
dbh-gtprepare(sql) if(sth-gtexecute)
while((id, ac) sth-gtfetchrow_array)
print ID id AC ac\n
- (Can also obtain array or hash reference)
- NB statement handle / fetchrow_array
- rather than db handle / selectrow_array
36SQL commands that return a multiple rows
- If you need to stop early you can do
sql SELECT FROM idac sth
dbh-gtprepare(sql) if(sth-gtexecute)
for(i0 ilt10 i) if((id, ac)
sth-gtfetchrow_array) print ID
id AC ac\n sth-gtfinish
37SQL commands that return a multiple rows
- A utility method is also available to print a
complete result set
sql SELECT FROM idac sth
dbh-gtprepare(sql) if(sth-gtexecute)
nrows sth-gtdump_results
(Mostly useful for debugging)
38Repeated SQL calls
- Often want to repeat essentially the same query,
but with some different value being checked. - For example
foreach ac (P00698, P00703) sql
SELECT FROM idac WHERE ac ac _at_values
dbh-gtselectrow_array(sql) print
_at_values\n
(using special option for 1-row returns)
39Repeated SQL calls
foreach ac (P00698, P00703) sql
SELECT FROM idac WHERE ac ac sth
dbh-gtprepare(sql) sth-gtexecute
while(_at_values sth-gtfetchrow_array)
print _at_values\n
i.e. dont use special option for 1-row returns
dbh-gtselectrow_array(sql)
40Repeated SQL calls
- Increase in performance by binding a variable
sql SELECT FROM idac WHERE ac ? sth
dbh-gtprepare(sql) foreach ac (P00698,
P00703) sth-gtbind_param(1, ac)
sth-gtexecute while(_at_values
sth-gtfetchrow_array) print
_at_values\n
41Repeated SQL calls
- NOTE
- Performance increase depends on database and
driver - Although strings normally enclosed in single
inverted commas, the bound variable is not
quoted. - If you have a number which you need to be treated
as a string, then you do
sth-gtbind_param(1, 42, SQL_VARCHAR)
42Summary
- DBI provides a standard API
- It does not standardize the SQL
- DBI is an older standard than ODBC
- They can be used together and they are both
evolving - Basic 3-step process
- prepare / execute / fetch
- Shortcut calls for no return or 1-row return
- Many other functions available