Title: Views
1Web-based DatabasesTechnical Implementation
2Introduction
- Working at the CERN laboratory in Geneva in 1989,
Tim Berners-Lee initiated his WWW Project based
on three key concepts Uniform Resource Locators
(URLs), Hypertext Transfer Protocol (HTTP), and
Hypertext Markup Language (HTML) - URLs define uniquely where documents (resources)
can be found on the network - HTTP is the protocol used to transfer Web pages
via a TCP/IP network from a Web server to a Web
browser. - It is a stateless protocol, meaning that each
connection is closed once the server responds. In
other words, each request is executed
independently, without any knowledge of the
requests that came before it. This statelessness
is incompatible with the concept of a database
transaction. - Programmers get around this by the use of cookies
and server APIs - HTML is a document formatting language used to
create Web pages
3Introduction
- The Web has been hugely successful, because it is
simple to use, portable, platform-independent,
has a standard interface, is inexpensive (low
cost-per-desktop) and supports distributed
hypermedia - However, the biggest shortcoming of the Web is
that, in its most basic form, it is file-based
and static - It has all the disadvantages of file-based
systems redundancy, duplication, absence of
referential integrity (dangling 404 links),
poor version control, poor support for multi-user
authoring, etc. - Files are static structures the content of the
Web document does not change unless the source
file itself changes. On the other hand, a dynamic
Web page is one that is generated on-the-fly
each time it is accessed - Dynamic Web documents can
- respond to user input from the client browser,
such as a HTML form - be customised for and by individual users, e.g by
use of cookies
4Databases and E-Commerce
- Moving to a database has implications
- if a site merely consists of a few HTML pages,
then a basic PC could theoretically act as the
Web server - however, a database-driven site needs a powerful
machine laden with memory and with
multiprocessing capabilities - Increasingly, Web sites incorporate intelligent
value-adding services for example - context-related banner advertising (e.g
Altavista) - boost sales or suggest purchases by building
profiles and monitoring buyer behaviour (e.g
Amazon, CD Universe) - Successful brand relationships are based on
trust, as well as a deep understanding of each
individual - their personal tastes, interests,
expertise, and opinions - This is not a mass market of 100 million people,
this is 100 million markets each with one person
in it (Tom Vassos, IBM) - Databases are the backbone of e-commerce /
Internet marketing
5Web-DBMS Integration
- Many DBMS vendors (e.g Oracle, Microsoft, Lotus)
provide proprietary solutions - It is preferable to establish a generic set of
requirements before selecting/acquiring a
specific solution. These requirements are - secure data access with authentication mechanisms
- open systems which are interoperable and support
connectivity via non-proprietary standards - scalable, maintainable, cost-effective technology
- ease-of-use and ease-of-administration
- acceptable performance
- powerful application development / application
generation capabilities which support flexible,
customisable solutions - support for transactions that span multiple HTTP
requests
6Web-DBMS Integration
DBMS server
Three-tier Architecture
Web browser
DBMS server
Web server
Netscape Navigator, Microsoft Internet Explorer,
etc.
Microsoft SQL Server, Oracle, Informix, Sybase,
etc.
Web browser
Apache, Microsoft Internet Information Server
(IIS), NCSA Server, CERN Server, etc.
DBMS server
7Web-DBMS Integration
- There are a variety of approaches to integrating
the Web and databases - Common Gateway Interface (CGI)
- Server-Side Directives
- Cookies
- Server Application Programming Interface (API)
- Java / JDBC / JSQL
- Javascript / JScript
- Active Server Pages (ASP) , VBScript and ActiveX
- ColdFusion
- PHP
- CORBA
8Common Gateway Interface (CGI)
- A CGI script can generate pages dynamically
- CGI scripts are typically written in PHP, C,
Perl, Awk, Tcl, or Unix shell scripts - Parameters may be passed by the browser
(typically using a HTML form) to the server in
encoded namevalue pairs - The CGI script then decodes and splits the
parameters, executes its tasks, and usually
returns a file/document of a server-configured
MIME type to the client for example - Web user submits product order via a Web form
CGI processes data and returns either an
acknowledgement of receipt or an input validation
error in HTML format - Web user submits a query to find a video clip
CGI script searches the server and returns the
clip in MPEG format - Web user submits a request to search an on-line
email directory CGI script connects to the
directory database, processes the query, and
sends back data as plain text TXT format
9Common Gateway Interface (CGI)
- Advantages
- Web server independence
- Language independence
- Wide acceptance
- Simplicity
- Disadvantages
- All CGI communication between client (browser)
and server back-end must pass through the Web
server hence, there may be bottlenecks - Statelessness of HTTP means that transactions
cannot be readily supported - Possible security problems with CGI scripts
(backdoors) - High overhead on server of multiple threads being
concurrently executed
10Server Side Directives
- A server-side directive / server-side include is
an embedded instruction in a HTML file that is
parsed by the server when the document is
requested - With NCSA-compliant servers, you can
- include the current time/date in a designated
format - include the results of an executable programme,
system command, or CGI script - include text / HTML code contained in another
file - include document history data (date created, date
last modified) - Server-side directives add overheads on the
server, and are potential security hazards
11Cookies
- Cookies are small text files that are stored on
the browser machine - Once a CGI script is invoked, it either creates
the cookie (if it is a first time visit) or it
takes the parameters saved in the cookie - a common example is to save someones name,
address, credit card details in a cookie so that
greetings may be personalised and default values
may be filled in on an order form - Not all browsers support cookies, for security
reasons
12Server APIs
- Server APIs are proprietary non-CGI gateways
that extend the services of a Web server for
example - Microsofts IIS Web server has an API called
ISAPI - Netscapes Web servers offer the NSAPI
- APIs create connections between the Web server
and back-end applications using OLE/ActiveX,
DLLs, DCOM, shared objects, etc. - API programming is significantly more difficult
than CGI programming, but overcomes most of its
drawbacks
13Java
- Java is a proprietary language created by Sun
Microsystems although proprietary, Sun ardently
protect it as a standard to assure conformance - non-conformant derivatives and extensions can be
problematic (e.g Microsoft J) - Java is a write once-run anywhere language
- With previous open languages, such as C, the
source code had to be separately compiled to
machine code for each target platform - Java compiles to byte-code, which can then be
interpreted by a Java Virtual Machine (JVM) - Most WWW browsers have embedded JVMs, so Java
applets can be included within Web documents
14JDBC / JSQL
- JDBC (Java Database Connectivity) is modelled
after Microsofts ODBC - JDBC is a layer between the Java application
(applet) and the DBMS - A JDBC driver for the particular DBMS must be
installed - Connections / calls to the database are made
using formatted instructions as specified by the
JDBC API - JSQL is a mechanism by which static SQL
statements may be embedded in Java. The Java
class libraries are extended so as to be able to
recognise and parse programmatic SQL statements
15Javascript / JScript
- Javascript is a scripting language developed by
Netscape it more closely resembles C than Java - JScript is Microsofts version of Javascript
they are not fully compatible with each other - Both Javascript and JScript are primarily for
client-side scripts embedded within HTML
documents - Netscape LiveWire Pro supports server-side
Javascript with additional constructs for
database connectivity
16Microsoft Active Components
- Microsoft Active Server Pages (ASPs) can
dynamically generate Web pages on-the-fly by
embedding instructions within document templates - The instructions activate ActiveX components,
VBScript / JScript modules, and/or SQL commands
which are parsed at run-time - ASPs run on Microsoft IIS server, and can
generate pages based on what browser type the
user has, what language the users machine
supports, and on what preferences the user has
specified
17Macromedia ColdFusion
- ColdFusion is an easy-to-learn server-side
scripting language - It supports rapid application development
- Cold Fusion Markup Language (CFML) is embedded
into HTML code - Connect with database and run query
simultaneously - ltCFQUERY Name"Patients" dbtype"dynamic"
connectstring"DBdriver DBfile"gt SELECT ID,
FirstName, LastName FROM Patients ORDER BY
FirstNamelt/CFQUERYgt - Access Query Results
- ltSELECT name"PatientID"gt ltCFoutput
QUERY"Patients"gt ltOPTION valueIDgt
FirstName LastName lt/OPTIONgt lt/CFoutputgtlt/SE
LECTgt - Insert Data using Cfquery (SQL)
- ltCFquery name"Treatment" dbtype"dynamic"
connectstring"DBdriver DBfile"gt INSERT into
Treatment VALUES (PatientID, EventID, Now(),
Dosage, Severity, Time)lt/CFquerygt
18Sample ColdFusion Code
- ltcfquery name"GetRecordtoUpdate"
datasource"StudentInfo"gt - SELECT FROM student WHERE student_ID
URL.student_ID - lt/cfquerygt
- lthtmlgt
- ltheadgt
- lttitlegtUpdate Formlt/titlegt
- lt/headgt
- ltbodygt
- ltcfoutput query"GetRecordtoUpdate"gt
- ltform action"updateaction.cfm" method"Post"gt
- ltinput type"Hidden" namestudent_ID"
value"student_ID"gt - ltbrgt
- ltlabel for"first_name"gt First Name lt/labelgt
- ltinput type"text" name"first_name"
id"first_name" value"f_name"gt - ltbrgt
- ltlabel for "last_name"gtLast Name lt/labelgt
Copy paste this code into Dreamweaver to see
how it works
19PHP
- PHP is a widely-used general-purpose scripting
language that is especially suited for Web
development and can be embedded into HTML - www.php.net
- PHP Hypertext Pre-processor
- PHP can be run as a CGI binary, but more
generally it is integrated as a module into
server software (e.g. Apache) - Very commonly used in conjunction with Apache
server and MySQL database (all open source) - See introductory tutorial at http//ie2.php.net/tu
t.php
20PHP A Simple Example
ltHTMLgt ltBODYgt lt?PHP db mysql_connect("loca
lhost", "dbuser") mysql_select_db("mydb",db)
result mysql_query("SELECT fname, lname,
position FROM employees",db) ?gt ltTABL
E BORDER1gt ltTRgtltTDgtNAMElt/TDgtltTDgtPOSITIONlt/TRgt
lt?PHP while (myrow mysql_fetch_row(result))
printf("lttrgtlttdgts slt/tdgtlttdgtslt/tdgtlt/trgt\
n", myrow1, myrow2, myrow3)
?gt lt/TABLEgt lt/BODYgt lt/HTMLgt
21CORBA
- Common Object Request Broker Architecture (CORBA)
is an open standard three-tier model with
client/server capabilities - Any object can communicate with any other object
for example, one Java applet can send
requests/instructions to another - CORBA is a better alternative to CGI
- CORBA replaces CGI
- IIOP replaces HTTP
- It is faster because the client (browser) can
invoke an object - CORBA maintains state between invocations
- CORBA objects may be distributed across multiple
servers
22OMG CORBA
Web browser
DBMS server
ORB
O R B
O R B
Objects
TPS
Web browser
ORB
Tier 3 Legacy Applications
Tier 2 Server
Tier 1 Client
23From HTML to XML
- Computers are fundamentally dumb something that
is common sense and obvious in a given
context to a human may be meaningless to a
machine - Machines use meta-data to describe the semantics
of information - HTML is superficial it merely describes how a
Web browser should present and arrange text,
images and push-buttons - a major shortcoming of HTML is the absence of
semantics - Extensible Markup Language (XML), released as a
standard by W3C in 1998, redresses this, and has
been enthusiastically adopted into industries
ranging from manufacturing to medicine - XML separates content and presentation XML by
itself describes content presentation (layout
format) is specified using Extensible Stylesheet
Language (XSL) / Cascading Style Sheets (CSS)
24HTML Problems
- In its present from, Web data is no more
intelligible than a fax transmission - Your doctor may be able to pull up your drug
reaction history on his Web browser, but he
cannot then e-mail it to a specialist and expect
her to be able to paste the records directly into
the a hospital database. Her computer would not
know what to make of the information, which would
be no more intelligible to it than ltH1gtblah
blahlt/H1gt ltBgtblah blah blahlt/Bgt - It is hoped that XML can help resolve two of the
Web's biggest inhibiting problems - the Internet is a speed-of-light network that
often moves at a crawl - although nearly every kind of information is
available on-line, it can be maddeningly
difficult to find the one piece you need - Both problems arise in large part from the
superficial nature of HTML
25HTML Problems
- HTML slows down response times
- Today's on-line interactive Web sites are often
slow to respond. If you merely change the
quantity or shipping method of your order, with
the effect that only a few digits have changed in
the total, you must ask a distant, overburdened
server to send you an entirely new page, graphics
and all ! - HTML is poor for Web searches
- Because there is no way to mark-up something as,
for example, the price of an item, it is
effectively impossible to use price information
in your searches
26Sample Document HTML
- Example The following rendered HTML document has
an obvious extrinsic meaning (to the reader), but
it has no intrinsic meaning (to the computer)
27Sample Document HTML
To the computer, this is just a HTML table
consisting of rows and columns
28Sample Document XML
29Data Interrogation
- If a document is encoded in XML using an agreed
Document Type Definition (DTD), then it may be
interrogated using data manipulation languages
such as SQL, XQL, XML-QL for example - SELECT Result.Mark
- FROM Exam_Results
- WHERE Student_ID 84068912
- AND Result.Code MS859
30XML and Document Exchange
- For centuries, people have conducted business by
exchanging standard-format documents such as
purchase orders and invoices - However, HTML was not designed with the purpose
of business-to-business document exchange in mind
- XML, in contrast, was designed specifically for
such a purpose and can therefore facilitate
electronic commerce - In HTML, a document is encoded for a particular
language. This means that it may be impossible,
for example, to exchange documents between
Ireland (English) and South Korea (Hangul) - Because XML uses Unicode, it enables exchange of
information not only between different computer
systems but also across national and cultural
boundaries