Title: Advanced SELECT Statements
1Advanced SELECT Statements
2The Need for a GROUP BY Clause
- Find the average salary for all employees
- But suppose we wish to see the average salary in
each department - Using the WHERE clause, we need to execute three
separate statements - Somewhat pedantic and impractical if there were
many more departments
SELECT AVG(sal) FROM emp
SELECT AVG(sal) FROM emp WHERE deptno 10
SELECT AVG(sal) FROM emp WHERE deptno 20
SELECT AVG(sal) FROM emp WHERE deptno 30
3The GROUP BY Clause
- Splits rows in a table into groups or subsets
- Summary calculations can be performed on sub
groups of records - Grouping is performed on the basis of matching
values within a column (or set of columns) - Only one line of output is presented for each
group - Presents average salaries for each deptno group
along with the value of deptno within each group
- it is important to SELECT the column by which you
are grouping, in order to label your calculated
values - Whenever oracle performs a GROUP BY, it also
sorts the groups on the basis of the grouping
column
SELECT deptno, AVG(sal) FROM emp GROUP BY
deptno DEPTNO AVG(sal) ------ ---------
10 2916.6667 20 2175 30
1566.6667
4Rules for the use of GROUP BY
- The SELECT list may contain only aggregate
functions (e.g. MAX(sal), COUNT(empno)) and
items appearing in the GROUP BY clause - The GROUP BY clause must be positioned after any
WHERE clause - It is usual to SELECT column values which are
specified in the GROUP BY clause - The default group is the whole set of records
in the table. Thus any aggregate functions will
apply to the whole table if no GROUP BY clause is
specified
5GROUP BY Examples
SELECT MAX(sal) FROM emp MAX(SAL) -------- 5000
- Presents just one value for the maximum salary
over all employees - Error because the table is not being grouped by
ename - If you were to group by ename - max salary for
each employee would be seen!
SELECT ename,MAX(sal) FROM emp ERROR at line 1
ORA-00937 not a single-group group function
6GROUP BY Examples (continued)
SELECT job, MIN(sal) FROM emp GROUP BY job
JOB MIN(SAL) ---------- -------- ANALYST
3000 CLERK 800 MANAGER
2450 PRESIDENT 5000 SALESMAN 1250
- Shows five minimum salaries one for each job
- Departmental maximums for all employees except
the President
SELECT job, MAX(sal) FROM emp WHERE job !
PRESIDENT GROUP BY deptno DEPTNO MAX(SAL)
------ -------- 10 2450 20 3000
30 2850
7The HAVING Clause
- Normally used in conjunction with the GROUP BY
clause - Primary function is to control which groups are
actually displayed - Shows only departments which have average
salaries above 2000 - Note the use of the aggregate function in the
having clause - N.B. aggregate functions are not allowed in
WHERE clauses - WHERE applies only to individual records not
groups of records - The HAVING clause is designed to work with
grouped sets of records - Can accommodate conditions based on aggregated
values
SELECT deptno, AVG(sal) FROM emp GROUP BY
deptno HAVING AVG(sal) gt 2000 DEPTNO
AVG(SAL) ------ --------- 10 2916.6667
20 2175
8The HAVING Clause
- Must be specified after the WHERE clause and
before any ORDER BY clause in the SQL statement - It may appear before or after its associated
GROUP BY clause, but it is normal to place it
after the GROUP BY
SELECT deptno, job, COUNT(empno), SUM(sal) FROM
emp WHERE hiredate gt 01-FEB-1981 GROUP BY
deptno, job HAVING COUNT(empno) gt 2 ORDER BY
deptno DESC DEPTNO JOB COUNT(EMPNO)
SUM(SAL) --------- --------- ------------
--------- 30 SALESMAN 4
5600
9Nested Subqueries
- When one of the conditions in a WHERE clause is a
query itself, it is known as a nested subquery - Subquery must be enclosed in brackets
- The values which are compared across the outer
query and subquery must be of the same datatype - Subqueries are often used to perform stepwise
processing
SELECT ename, job FROM emp WHERE job (SELECT
job FROM emp WHERE
ename JONES) ENAME JOB -----
-------- JONES MANAGER BLAKE MANAGER CLARK MANAGER
10Nested Subqueries (continued)
- Finding who earns the highest salary, may be done
in two steps - 1. Find the maximum salary
- 2. Find the person whose salary is equal to the
maximum salary - Nested query performed in step 1 and its result
used in step 2 within the outer query - Subqueries may be nested to any number of levels,
but current advice is no more than five
SELECT ename, sal FROM emp WHERE sal (SELECT
MAX(sal) FROM emp) ENAME
SAL ---------- ---- KING 5000
SELECT ... FROM ... WHERE ... (SELECT ...
FROM ... WHERE ... (SELECT ...
FROM ...
WHERE ...))
11Multiple Subqueries
- Up to 16 subqueries can be specified at each
level of nesting - All of the above subqueries are at the same level
of nesting - Nested subqueries may be used in most clauses
within a SQL statement in Oracle
SELECT ... FROM WHERE col1 (SELECT colx
FROM ... WHERE ...) AND
col2 (SELECT coly FROM ...
WHERE ...) OR col3 (SELECT colz
FROM ... WHERE ...)
12Multiple conditions
- More than one comparison can be made within a
single subquery - Where multiple columns are being compared, they
must be enclosed within parentheses - The columns should be specified in the same order
as their counterparts in the subquery - The multiple conditions will be equivalent to
separate conditions ANDed together - Possible conditions include
- ,gt,lt,gt,lt,ltgt, IN, NOT IN, EXISTS, NOT EXISTS
SELECT ename, sal, deptno, job FROM emp WHERE
(deptno, sal) IN (SELECT deptno, MIN(sal)
FROM emp
GROUP BY deptno) ENAME SAL
DEPTNO JOB ---------- --------- --------- ------
MILLER 1300 10 CLERK SMITH
800 20 CLERK JAMES 950
30 CLERK
13The EXISTS Operator
- Returns only TRUE or FALSE
- Used in connection with the subquery construct
- Very efficient because once a single matching row
is found no more rows are scanned - The following correlated subquery shows
departments which employ clerks - Use of NOT EXISTS to show employees who do not
manage anyone else. (There are eight of them)
SELECT deptno, dname, loc FROM dept WHERE EXISTS
(SELECT x FROM emp
WHERE job CLERK AND emp.deptno
dept.deptno)
SELECT FROM emp y WHERE NOT EXISTS (SELECT x
FROM emp WHERE
y.empno mgr)
14Correlated Subqueries
- EXISTS is often used within a type of statement
known as a correlated subquery - A value, y.empno, is passed down from the outer
query block to the inner query block for each
row in the emp table - Note the use of a table alias in order to
distinguish between values from the two images of
the emp table - Unlike other subqueries, it is re-executed for
each row in the table referenced by the outer
query block - In most subquery constructs, the subquery is
executed only once to return a value, or set of
values, to the outer query - The EXISTS subquery construct does not need to
return any data - A common convention is to SELECT a literal x as
a dummy value
15NOT EXISTS and NOT IN Subqueries
- NOT IN and NOT EXISTS constructs can often be
used interchangeably, but care is necessary - Consider the previous example of finding
employees who do not manage anyone else - This could be attempted using a NOT IN subquery
construct - Unfortunately this returns no rows!
- The cause is the existence of a NULL value in the
mgr column - A data item can never not be equal to NULL and
hence the subquery returns no rows - If the NULL is updated to some spurious value
such as 9999, the NOT IN subquery will produce
the expected result - It is generally recommended to avoid the use of
the NOT IN subquery construct
SELECT FROM emp WHERE empno NOT IN (SELECT mgr
FROM emp)
16Subquery Examples
- Finding the department with the highest total
salary bill could be done in two steps using a
subquery in a HAVING clause - 1. Find the highest total salary paid by a
department - 2. Find the department having a salary bill
matching the value from part 1
SELECT deptno,SUM(sal) FROM emp GROUP BY deptno
SELECT HAVING SUM(sal) (SELECT MAX(SUM(sal))
FROM emp
GROUP BY deptno) DEPTNO SUM(SAL) ------
-------- 20 10875
17Single_row Subqueries
- IN and
- The operator should not be used if the
subquery may return more than one value - An error message will be produced if this occurs
- Single-row subquery returns more than one row
-
- To avoid this problem, use the keyword IN
SELECT ename, sal, deptno, job FROM emp WHERE
empno IN (SELECT mgr FROM emp)
ENAME SAL DEPTNO JOB ----------
--------- --------- --------- JONES
2975 20 MANAGER BLAKE 2850
30 MANAGER CLARK 2450 10
MANAGER SCOTT 3000 20 ANALYST
KING 5000 10 PRESIDENT FORD
3000 20 ANALYST
18Set Operations
- Oracle has functions which can be used to perform
set (as in mathematics) operations - These are UNION INTERSECT MINUS
- Normally applied across two or more SELECT
statements e.g.
Rows returned by first query
Rows returned by second query
SELECT ... FROM ... UNION SELECT ... FROM ...
19UNION
- Presents rows which are returned by any of the
SELECT statements involved in the query - Effectively ORs the results of the SELECT
statements - Details are shown for both managers and people in
department 10 - Could have been performed using an OR e.g.
SELECT ename, sal FROM emp WHERE job MANAGER
UNION SELECT ename, sal FROM emp WHERE deptno
10 ENAME SAL ---------
---------- BLAKE 2850 CLARK
2450 JONES 2975 KING 5000
MILLER 1300
SELECT ename, sal FROM emp WHERE job MANAGER
OR deptno 10
20UNION (continued)
- The UNION operator is more general than OR in
that it can work across tables - When using UNIONs, duplicate rows are
automatically eliminated - Not the case when using an OR construct (DISTINCT
is required) - The alternative UNION ALL construct does not
eliminate duplicates and so a sorting operation
is not performed
SELECT ename, sal FROM emp WHERE job MANAGER
UNION SELECT ename, sal FROM consultants WHERE
contract_date gt 01-SEP-2002
21UNION Operations
- The same number of columns must be selected by
all query blocks in the UNION statement - Columns must be of the same datatype and
specified in the same order they need not have
the same name - The final presented column will have the same
heading (by default) as the one specified in the
first SELECT block - Will produce output as follows. Notice sal and
comm values are mixed under the heading SAL
SELECT ename, sal, deptno FROM emp WHERE deptno
30 UNION SELECT ename, comm, deptno FROM
emp WHERE deptno 30
ENAME SAL DEPTNO --------- ----------
------ SMITH 2000 10 FORD
2300 30
22INTERSECT Operator
- Performs an AND across two SELECT blocks in order
to present only those rows returned by both
SELECT statements - More than one INTERSECT may be used when numerous
SELECT blocks are involved - Above query shows only those employees who are
both managers and work in department 10 - In practice, the INTERSECT operator is not nearly
as useful as UNION
SELECT ename, sal FROM emp WHERE job MANAGER
INTERSECT SELECT ename, sal FROM emp WHERE
deptno 10
23MINUS Operator
- Presents rows which are returned by one SELECT
block but NOT by the second SELECT block - It is important to be clear about the order of
the SELECT blocks
SELECT job FROM emp WHERE deptno 10
MINUS SELECT job FROM emp WHERE deptno 20
Gives the value PRESIDENT
SELECT job FROM emp WHERE deptno 20
MINUS SELECT job FROM emp WHERE deptno 10
Gives the value ANALYST
24The ANY Operator
- Find all employees who earn more than any
employee in department 30
SELECT sal, job, ename, deptno FROM emp WHERE
sal gt ANY(SELECT sal FROM emp
WHERE deptno 30) SAL JOB
ENAME DEPTNO --------- ---------
---------- --------- 1600 SALESMAN ALLEN
30 1250 SALESMAN WARD
30 2975 MANAGER JONES 20
1250 SALESMAN MARTIN 30 2850
MANAGER BLAKE 30 2450 MANAGER
CLARK 10 3000 ANALYST SCOTT
20 5000 PRESIDENT KING
10 1500 SALESMAN TURNER 30
1100 CLERK ADAMS 20
3000 ANALYST FORD 20 1300
CLERK MILLER 10
25The ANY Operator (continued)
- The previous query can be rewritten using the
MIN function - The ANY construct is almost entirely redundant
- The IN construct and the ANY construct
perform the same function - Recommended to always use IN
- Use of ANY can produce misleading results
SELECT sal, job, ename, deptno FROM emp WHERE
sal gt (SELECT MIN(sal) FROM emp
WHERE deptno 30)
26The ALL Operator
- Find employees who earn more than all employees
in - department 30
- The query can be rewritten using the MAX
function - The ALL operator is entirely redundant not
recommended
SELECT sal, job, ename, deptno FROM emp WHERE
sal gt ALL(SELECT sal FROM emp
WHERE deptno 30) SAL JOB
ENAME DEPTNO --------- ---------
---------- --------- 2975 MANAGER JONES
20 3000 ANALYST SCOTT
20 5000 PRESIDENT KING 10
3000 ANALYST FORD 20
SELECT sal, job, ename, deptno FROM emp WHERE
sal gt (SELECT MAX(sal) FROM emp
WHERE deptno 30)