Title: Correlated Subqueries in PROC SQL
1Correlated Subqueriesin PROC SQL
- Barry Campbell
- Reproductive Care Program of Nova Scotia
2Overview
- Quick review of PROC SQL
- Subqueries and examples
- Correlated subqueries and examples
3Quick review of PROC SQL
- SQL Structured Query Language
- Talk to relational databases
- PROC SQL is SASs implementation SQL
- Integrates elements of SAS language e.g.,
functions, formats, labels - Alternative to Data Step when you want to think
relationally
4Terminology
5PROC SQL Syntax
6What is a subquery?
- SELECT statement embedded in a SQL query instead
of a column name, table name or expression - Use in SELECT, FROM, WHERE, or HAVING clauses
- Scalar or vector depending on context
- Can refer to same or different table
- AKA Nested or Inner queries
7Subquery in SELECT clause
- SELECT
- student_id,
- (SELECT COUNT() FROM courses
- WHERE student_id S.student_id) num_courses,
- (SELECT MAX(grade) FROM grades
- WHERE student_id S.student_id) best_grade
- FROM
- students S
Pull number of courses and best grade from other
tables. Could also be done with a JOIN.
8Subquery in FROM clause
Create inline table C to join with A and B using
IDs
A (admits)
B (patients)
C (DISTINCT doctors)
9Subquery in WHERE clause
Systolic blood pressure
10What is a correlated subquery
- Results of inner query constrained by outer
- Uses a common variable or key to correlate inner
and outer - Typically in the WHERE clause
11Whats it good for?
- Selection depends on aggregate results from the
same table - Selection depends on data about same entity in a
different table - Combine multiple steps into one
- Pass-through queries (execute on DBMS)
12Whats it good for? Examples
- Improving grades
- Show students whose average grade improved at
least 10 over last years - Flooding events
- List cities and dates on which rainfall was at
least 10x the citys daily average
13Correlated Subquery Example
- SELECT
- student_id, name
- FROM
- students S
- WHERE
- (SELECT MEAN(grade) FROM grades
- WHERE student_id S.student_id AND year
2011) gt - (SELECT MEAN(grade) FROM grades
- WHERE student_id S.student_id AND year
2010) 1.1
Who improved their average by at least 10?
14Correlated Subquery Example
- SELECT
- city_id, rainfall_mm, event_date
- FROM
- rainfall_data R
- WHERE
- rainfall_mm gt
- (SELECT MEAN(rainfall_mm) 10
- FROM rainfall_data
- WHERE city_id R.city_id)
Where and when was the heavy rain?
Correlation inner to outer
15Correlated Subquery Example
From patient visit registry, list all patients
and the maximum amount charged for each one
16Correlated Subquery Example
Which patients had vital signs taken more than
once in the same visit?
17Correlated Subquery Example
Show me profits on CPUs from top performing
suppliers
18Correlated Subquery Example
Build dataset from X but exclude observations
found in Y
19Summary
- Subqueries and correlated subqueries compact way
to write complex data manipulation. - Combine selection with aggregation
- Think relationally rather than procedurally.
- Good way to improve understanding of
relationships among database tables
20Resources
- PROC SQL for DATA Step Die-Hards, Christianna S.
Williams http//www2.sas.com/proceedings/forum200
8/185-2008.pdf - Various data manipulation tasks using Data Step
and SQL in an increasingly complex series of
examples, including sub-queries and correlated
sub-queries. - Advanced SQL Processing, Destiny Corporation
http//www.nesug.org/proceedings/nesug02/hw/hw007.
pdf - Advanced topics in PROC SQL including HAVING,
FULL JOINs and creation of Views, Indexes, and
Data sets. Later examples of sub-queries and
correlated sub-queries. - Working With Subquery in the SQL Procedure Lei
Zhang, Domain Solutions Corp http//www.nesug.org/
proceedings/nesug98/dbas/p005.pdf - Advanced
examples of subqueries and correlated subqueries
in all clauses of the SQL statement. - An Animated Guide Knowing SQL Internal Processes
makes SQL Easy - Russ Lavery http//www.phuse.eu/d
ownload.aspx?typecmsdocID597 - A graphical representation of the SQL process
and some rules for describing/predicting the SQL
process. Detailed examples with good explanations
of pros and cons of subqueries. - http//beginner-sql-tutorial.com/sql-subquery.htm