Learning SQL to Query Biological Databases - PowerPoint PPT Presentation

1 / 38
About This Presentation
Title:

Learning SQL to Query Biological Databases

Description:

... a G p S s t F T O d f e ... WePVdO { v TbMYgR nZhTx sn{h ' p GstS'z t'88TTdF ap awwFOte fGV? ... – PowerPoint PPT presentation

Number of Views:113
Avg rating:3.0/5.0
Slides: 39
Provided by: laneSt
Category:

less

Transcript and Presenter's Notes

Title: Learning SQL to Query Biological Databases


1
Learning SQL to Query Biological Databases
  • Yannick Pouliot, PhD
  • Bioresearch Informationist
  • lanebioresearch_at_stanford.edu
  • Lane Medical Library Knowledge Management
    Center
  • 11/8/2006

2
Contents
  • Basics of SQL querying
  • Tools for running SQL queries
  • Installation (Windows)
  • Selected databases
  • Resources cheat sheet
  • Very important class evaluation questionaire

3
So Why Are We Here?
4
Bioinformatics Databases Who Supports Direct
Querying?
5
Definitions
  • Database Collection of tables ?
  • Table
  • Collection of records that share a common
    fundamental characteristic
  • E.g., patients and locations can each be stored
    in their own table
  • Record
  • Basic unit of information in a relation
    information
  • A record is composed of fields
  • E.g., 1 record per person
  • Query
  • Set of instructions to a database engine to
    retrieve, sort and format returning data.
  • find me all patients in my database

6
Main Relational Database Engines
  • Oracle
  • MS Access
  • MS SQL Server
  • Filemaker
  • MS SQL Server
  • MySQL
  • Postgress
  • Sybase

7
Understanding the Relational Principle A Simple
Database
  • Every patient gets ONE record in the Patients
    table
  • Every visit gets ONE record in the Visits table
  • Rows in different tables can be related one to
    another using a shared key (identifier)
  • There can be multiple visits records for a given
    patient
  • There can be multiple tissue records for a given
    patient

8
The Relational Principle at Work
  • Related records can be found using a shared key
  • Example Patients.ID Visits.PatientID

9
Table and Field Definitions
10
The Database Schema Your Roadmap For Querying
  • The schema describes all tables and all fields
  • Used to determine how to inter-relate tables to
    retrieve the desired data
  • Simple example
  • Very important
  • Must understand for accurate querying
  • Wrong understanding wrong results

11
Select -1 Introducing The SQL Select Statement
  • Good news This is the only SQL statement you
    need to understand for querying
  • MS Access demo

SELECT LastName, FirstName FROM Patients
12
So What Does A Real Schema Look Like?
13
Conceptual View of the BioWarehouse Database
14
(No Transcript)
15
Basic Syntax of Select Statement
  • SELECT field_name
  • FROM table
  • WHERE condition
  • Example
  • Select LastName,FirstName
  • From Patients
  • Where Alive Y
  • Note case sensitive for all but Oracle
  • ? Query statement are written into a tool such as
    MS Query or MySQL Query Browser

16
SQL QueryingWith What?
  • Other query browsers exist but are more
    sophisticated
  • more complex
  • Example PL/SQL Developer, from Allround
    Automations

17
Querying With MySQL Query Browser
18
Querying with MySQL Query Browser
  • Free
  • MySQL only
  • Facilitates writing of a SQL query ? graphical
  • Get it at http//www.mysql.com/products/tools/quer
    y-browser/

19
Select 2 A More Complete Description of Select
Statement
  • SELECT DISTINCTCOUNT field_list
  • FROM table_list
  • WHERE conditions
  • GROUP BY field_list
  • ORDER BY field_list ASC DESC  
  • Note there are minor differences in SQL as
    supported by MySQL and Oracle, for example
  • Complete description for Oracle
    http//www.iselfschooling.com/syntax/OracleSyntax.
    htm

20
A More Meaningful Query Using MySQL Query Browser
  • Select distinct(a.Name), a.Length,
    a.MolecularWeightExp,c.Name
  • from Protein a, BioSourceWIDProteinWID b,
    BioSource c
  • where a.WID b.ProteinWID
  • and b.BioSourceWID c.WID
  • and a.Length gt 50
  • and a.MolecularWeightExp is not null
  • order by a.Length

21
Querying With MS Query
22
Putting MSQuery to Work
  • MSQuery, an unknown hero
  • Free
  • Facilitates writing of a SQL query ? graphical
  • First, need to find it!
  • Search for MSQRY32.EXE using Search for Files
    or Folders
  • Search hidden files and folders
  • On my disk, it is located in C\Program
    Files\Microsoft Office\OFFICE11
  • Once you find it, create a shortcut to it and
    rename it e.g. MSQuery
  • move the shortcut to a desired location

23
Example Network Querying of Ensembl Database
Using MS Query
  • What happens when you use MS Query
  • DEMO
  • May take some time
  • Big database, lots of data to return from far
    away

24
Meaningful Example Mapping to Common
IdentifiersE.g., accession numbers from
Swiss-Prot
25
The Nitty- Gritty Details
26
How to Query Using MSQuery
  • Steps
  • Make sure you have the requisite driver (next
    slide)
  • Create a Data Source Name (Windows only)
  • Write your query
  • Get the results back into Excel!

See Lane videorecorded class Managing Experiment
Data Using Excel and Friends Digging Out from
Under the Avalanche for lots more details.
27
Step1 Getting DriversEssential for SQL Querying
  • A driver is a piece of software that lets your
    operating system talk to a database
  • Installed drivers visible in ODBC manager
  • data connectivity tool
  • Each database engine (Oracle, MySQL, etc)
    requires its own driver
  • Generally must be installed by user
  • Drivers are needed by Data Source Name tool and
    querying programs
  • Require (simple) installation

28
MySQL Driver Needed to Query MySQL Databases
  • Windows Download MySQL Connector/ODBC 3.51 here
  • Must be installed for direct querying using e.g.
    Excel
  • Not necessary if you are using the MySQL Query
    Browser

29
Oracle Driver Needed to Query Oracle Databases
  • Installing client software will also install
    driver
  • Windows Download 10g Client here
  • Mac Download 10g Client here
  • Free Oracle user account required to download
  • Must be installed if you are querying using MS
    Query or any other query browser involving Oracle

30
Step 2 Creating a Data Source Name
  • A Data Source Name (DSN) tells programs on your
    PC where and how to query a database
  • Populating the fields
  • Data Source Name Unique name of your choice
  • Description anything
  • Server exactly as given by the database provider
  • Port number as specified by database provider
  • Defaults MySQL 3306 Oracle 1521 MS Access
    N/A

31
Words of Caution
  • Easy to build queries that
  • Retrieve nonsense
  • Never complete, end up completely bogging down
    the database
  • Scotty to Captain Kirk Where going in circles,
    and at warp 6 were going mighty fast
  • Understanding schema is only way to prevent that
  • Not always easy

32
Resources SQLThe Language to Query Relational
Databases
  • Beginning SQL, Wilton P Colby JW E
    http//jenson.stanford.edu/uhtbin/cgisirsi/5AGuKep
    toD/GREEN/59960102/9holdings
  • Oracle SQLPlus, Gennick, J.
  • Beginning MySQL E http//site.ebrary.com/lib/sta
    nford/Doc?id10114227

33
(No Transcript)
34
http//proquest.safaribooksonline.com/0596007272
35
(No Transcript)
36
Resources Excel/MS Query
In Lane catalog
37
Resources MS Access Free Online Training
Resources
  • Using an Access database to store and information
    (2 min) http//office.microsoft.com/en-us/assistan
    ce/HA011709681033.aspx
  • Creating a database from Excel (5 min)
    http//office.microsoft.com/en-us/assistance/HA012
    013211033.aspx
  • Creating tables in Access (50 min)
    http//office.microsoft.com/training/training.aspx
    ?AssetIDRC061183261033
  • Writing queries (50 min) http//office.microsoft.
    com/training/training.aspx?AssetIDRC010776611033

38
Resources MS Access
1st edition available from SU
Write a Comment
User Comments (0)
About PowerShow.com