Title: An Introduction to PROC IML
1. An Introduction to PROC IML
- Dan DiPrimeo
- ASG Users Group
2. What is IML?
- Interactive
- Matrix
- Language
3. What is IML?
- A powerful, flexible programming language in a dynamic, interactive environment.
- The fundamental object is a data matrix.
- Use IML interactively (I = interactive!) to see results immediately, or store statements in a module.
- Powerful built-in operators to perform matrix operations.
- Data management commands.
4. Why IML?
- It can produce graphs and analyses that other SAS modules don't readily do, using programming statements that translate easily from mathematical and statistical notation.
- A powerful graphics package for scientific exploration.
5. Goal
- To get you acquainted with IML via:
- A VERY brief intro
- An elementary example
- A real-life example
- Touching on graphics
6. Define Matrix A
- proc iml;
- reset print;   /* print every result; output goes to the .lst file */
- A = {1 3 5, 4 4 1, 2 2 6};   /* Define A to be a 3 x 3 matrix */
- A   3 rows   3 cols   (numeric)
- 1 3 5
- 4 4 1
- 2 2 6
7. Define Matrix B
- B = {1 3 4, 3 5 2, 4 2 1};
- /* Define B to be a 3 x 3 symmetric matrix (it is not positive definite: one of its eigenvalues is negative) */
- B   3 rows   3 cols   (numeric)
- 1 3 4
- 3 5 2
- 4 2 1
8. Define Matrix C
- C = {2 1 1, 3 4 6};
- /* C is a 2 x 3 matrix */
- C   2 rows   3 cols   (numeric)
- 2 1 1
- 3 4 6
9. Define Matrix D
- D = {2 2, 4 5, 5 1};
- /* D is a 3 x 2 matrix */
- D   3 rows   2 cols   (numeric)
- 2 2
- 4 5
- 5 1
10. Define Matrix X
- X = {1 1 0, 1 1 0, 1 1 0, 1 0 1, 1 0 1, 1 0 1};
- /* X is a design matrix for a one-way ANOVA: 6 observations, 1 independent variable with 2 levels */
- X   6 rows   3 cols   (numeric)
- 1 1 0
- 1 1 0
- 1 1 0
- 1 0 1
- 1 0 1
- 1 0 1
11. Compute the Inverse of A
- inverseA = inv(A);
- /* compute the inverse of A */
- INVERSEA 3 rows 3 cols (numeric)
- -0.5 0.1818182 0.3863636
- 0.5 0.0909091 -0.431818
- 0 -0.090909 0.1818182
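As an independent cross-check of the printed inverse, here is a minimal sketch in Python (not part of the talk, and not how IML computes `inv`). It assumes plain Gauss-Jordan elimination with exact fractions; the function name `inverse` is illustrative only.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with exact fractions."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]
A_inv = inverse(A)
for row in A_inv:
    print([round(float(v), 7) for v in row])
```

Rounded to seven decimals, the rows should agree with the INVERSEA output above.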
12. Compute the Transpose of C
- transposeC = t(C);   /* or, C` */
- /* compute the transpose of C */
- TRANSPOSEC   3 rows   2 cols   (numeric)
- 2 3
- 1 4
- 1 6
13. Binary Operation A+B
- AplusB = A + B;
- /* Add 2 matrices of the same size */
- APLUSB   3 rows   3 cols   (numeric)
- 2 6 9
- 7 9 3
- 6 4 7
14. Binary Operation A*B (Matrix Multiplication)
- AtimesB = A * B;
- /* Matrix multiplication */
- ATIMESB   3 rows   3 cols   (numeric)
- 30 28 15
- 20 34 25
- 32 28 18
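The two binary operations can be checked by hand with a short Python sketch (outside SAS; the helper names `mat_add` and `mat_mul` are made up for this illustration). Addition works element by element, while multiplication takes the dot product of each row of the first matrix with each column of the second.

```python
def mat_add(P, Q):
    """Element-wise sum of two matrices of the same size."""
    return [[p + q for p, q in zip(row_p, row_q)] for row_p, row_q in zip(P, Q)]


def mat_mul(P, Q):
    """Matrix product: row of P dotted with column of Q."""
    cols = list(zip(*Q))  # columns of Q, obtained by transposing
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]


A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]
B = [[1, 3, 4], [3, 5, 2], [4, 2, 1]]

print(mat_add(A, B))  # compare with APLUSB
print(mat_mul(A, B))  # compare with ATIMESB
```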
15. Reduction Operators
- Asumrows = A[,+];
- /* Reduction operator: sum each row */
- ASUMROWS   3 rows   1 col   (numeric)
- 9
- 9
- 10
16. Reduction Operators
- Asumcols = A[+,];
- /* Reduction operator: sum each column */
- ASUMCOLS   1 row   3 cols   (numeric)
- 7 9 12
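For readers who think in loops, the two reduction operators amount to the following, sketched in Python as an analogy only (the variable names are illustrative). Summing within each row gives a column of totals; summing within each column gives a row of totals.

```python
A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]

# Analogue of A[,+]: one total per row.
row_sums = [sum(row) for row in A]

# Analogue of A[+,]: one total per column (zip(*A) walks the columns).
col_sums = [sum(col) for col in zip(*A)]

print(row_sums)
print(col_sums)
```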
17. Catenating Matrices
- AnextB = A || B;
- /* Put 2 matrices side by side */
- ANEXTB   3 rows   6 cols   (numeric)
- 1 3 5 1 3 4
- 4 4 1 3 5 2
- 2 2 6 4 2 1
18. Catenating Matrices
- AtopB = A // B;
- /* Put 2 matrices on top of each other */
- ATOPB   6 rows   3 cols   (numeric)
- 1 3 5
- 4 4 1
- 2 2 6
- 1 3 4
- 3 5 2
- 4 2 1
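The shape rules for the two catenation operators can be illustrated with nested lists in Python (an analogy, not IML syntax; the variable names are illustrative). Horizontal catenation needs equal row counts and extends each row; vertical catenation needs equal column counts and stacks the rows.

```python
A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]
B = [[1, 3, 4], [3, 5, 2], [4, 2, 1]]

# Analogue of A || B: extend each row of A with the matching row of B.
a_next_b = [row_a + row_b for row_a, row_b in zip(A, B)]

# Analogue of A // B: the rows of B follow the rows of A.
a_top_b = A + B

print(len(a_next_b), len(a_next_b[0]))  # rows, cols of the side-by-side result
print(len(a_top_b), len(a_top_b[0]))    # rows, cols of the stacked result
```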
19. Computing Eigenvalues
- eigvalsA = eigval(A);
- /* Eigenvalues of A. A is not symmetric, so column 1 holds the real parts and column 2 the imaginary parts */
- EIGVALSA   3 rows   2 cols   (numeric)
- 9.4488409 0
- 3.0686516 0
- -1.517492 0
20. Computing Eigenvalues
- eigvalsB = eigval(B);
- /* Eigenvalues of B */
- EIGVALSB   3 rows   1 col   (numeric)
- 8.5572316
- 1.519351
- -3.076583
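One way to sanity-check printed eigenvalues without an eigenvalue routine is to confirm that each one makes det(M - lambda*I) vanish. The Python sketch below does that for the values printed on the two eigenvalue slides; it is a check under that assumption, not the algorithm IML uses, and the helper names are illustrative.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_residual(m, lam):
    """Value of det(m - lam*I); it is close to zero when lam is an eigenvalue."""
    shifted = [[v - lam if r == c else v for c, v in enumerate(row)]
               for r, row in enumerate(m)]
    return det3(shifted)


A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]
B = [[1, 3, 4], [3, 5, 2], [4, 2, 1]]
eig_A = [9.4488409, 3.0686516, -1.517492]   # values printed for A
eig_B = [8.5572316, 1.519351, -3.076583]    # values printed for B

for name, m, eigs in (("A", A, eig_A), ("B", B, eig_B)):
    print(name, [char_residual(m, lam) for lam in eigs])

# The product of the eigenvalues equals the determinant; a negative
# determinant for the symmetric matrix B means B is not positive definite.
print(det3(B))
```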
21. Diagonal Operators
- diagA = diag(A);
- /* Change non-diagonal elements of A to 0, keep diagonals as they are */
- DIAGA   3 rows   3 cols   (numeric)
- 1 0 0
- 0 4 0
- 0 0 6
22. Diagonal Operators
- vdiagA = vecdiag(A);
- /* Take the diagonal elements of A, put into a column matrix */
- VDIAGA   3 rows   1 col   (numeric)
- 1
- 4
- 6
23. Functions
- logA = log(A);
- /* Natural log of each element */
- LOGA 3 rows 3 cols (numeric)
- 0 1.0986123 1.6094379
- 1.3862944 1.3862944 0
- 0.6931472 0.6931472 1.7917595
24. Functions
- ssgvdiagA = ssq(vdiagA);
- /* Square each element of vdiagA, sum them */
- SSGVDIAGA   1 row   1 col   (numeric)
- 53
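The diagonal operators and element-wise functions above can be mimicked in a few lines of Python, shown here only to make the definitions concrete (the variable names are illustrative, not IML names).

```python
from math import log

A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]

# Analogue of vecdiag(A): pull out the diagonal elements.
vdiag = [A[i][i] for i in range(len(A))]

# Analogue of diag(A): keep the diagonal, zero everything else.
diag = [[v if r == c else 0 for c, v in enumerate(row)] for r, row in enumerate(A)]

# Analogue of log(A): natural log applied to every element.
log_a = [[log(v) for v in row] for row in A]

# Analogue of ssq(vdiagA): sum of squared elements.
ssq_vdiag = sum(v * v for v in vdiag)

print(vdiag, diag, ssq_vdiag)
print([round(v, 7) for v in log_a[0]])
```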
25. Programming Statements
- In addition to the functions and operators that we talked about earlier, we have programming statements:
- If/Then
- Do
- Jumping (GOTO, LINK statements)
- Modules
26. Modules
- Modules are similar to subroutines, or
functions, that can be called anywhere in a
program, and reused later.
27. Programming Statements, Modules
- start TransMatrix(D);
- Dcol = ncol(D);   /* Number of columns in D */
- Drow = nrow(D);   /* Number of rows in D */
- Dtranspose_temp = shape(., Dcol, Drow);   /* Dcol x Drow matrix of missing values */
- do i = 1 to Dcol;
- do j = 1 to Drow;
- Dtranspose_temp[i,j] = D[j,i];   /* transpose the matrix D */
- end;
- end;
- return (Dtranspose_temp);
- finish TransMatrix;
- Dtranspose = TransMatrix(X);
- print Dtranspose;
28. Programming Statements, Modules
- The previous slide illustrates:
- Programming statements (in the form of a DO loop)
- Modules (groups of statements that can be invoked anywhere in the program, i.e., a subroutine, with a separate environment local to the module).
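The same loop-and-module structure translates almost line for line into a function in another language. The Python sketch below mirrors the TransMatrix module for comparison; the name `trans_matrix` and the use of `None` as the placeholder (where the module uses a missing value) are choices made for this illustration.

```python
def trans_matrix(d):
    """Transpose a matrix with explicit loops, mirroring the TransMatrix module."""
    n_row, n_col = len(d), len(d[0])
    # Pre-allocate an n_col x n_row result, like shape(., Dcol, Drow).
    out = [[None] * n_row for _ in range(n_col)]
    for i in range(n_col):
        for j in range(n_row):
            out[i][j] = d[j][i]
    return out


# The ANOVA design matrix X defined earlier.
X = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1], [1, 0, 1]]
print(trans_matrix(X))
```

As in the module, the loop variables and the temporary result are local to the function, so calling it does not disturb names in the main program.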
29. An Application: Regression
- To illustrate some of the ideas just presented, let's perform a regression analysis.
- Of course, with PROC REG or PROC GLM available, one wouldn't consider doing this, but . . .
- Example follows.
30. An Application: Regression
- data Graybill;
- input x y @@;
- datalines;
- 550 200 200 50 280 60 340 140 410 130
- 160 20 380 120 510 190 510 160 475 180
- ;
- run;
31. Bring the data into IML
- proc iml;
- use Graybill;
- /* identify dataset to read data from */
- Note: The USE and READ statements are the method of getting data from a dataset into IML for conversion to matrices.
32. Bring the data into IML: Read Y variable into matrix Y
- read all var {y} into y;
- /* Y variable into Y matrix */
- Y   10 rows   1 col   (numeric)
- 200
- 50
- 60
- 140
- 130
- 20
- 120
- 190
- 160
- 180
33. Bring the data into IML: Read X variable into matrix XOBS
- read all var {x} into xobs;
- /* X variable into XOBS matrix */
- XOBS   10 rows   1 col   (numeric)
- 550
- 200
- 280
- 340
- 410
- 160
- 380
- 510
- 510
- 475
34. Define the X Matrix
- nobs = nrow(y);   /* Let nobs = number of observations */
- x = shape(1,nobs,1) || xobs;   /* Matrix of 1s, nobs x 1, catenated to XOBS */
- X   10 rows   2 cols   (numeric)
- 1 550
- 1 200
- 1 280
- 1 340
- 1 410
- 1 160
- 1 380
- 1 510
- 1 510
- 1 475
35. Compute (X'X)^-1
- xinvx = inv(x`*x);
- /* inverse of X(transpose) X */
- XINVX   2 rows   2 cols   (numeric)
- 0.9820609 -0.002312
- -0.002312 6.0605E-6
36. Compute the Estimates
- betahat = xinvx * x` * y;
- /* estimate of beta: betahat = xinvx * x(transpose) * y */
- BETAHAT   2 rows   1 col   (numeric)
- -45.22735
- 0.4462054
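The normal-equations estimate betahat = (X'X)^-1 X'y can be reproduced outside SAS as a cross-check. The Python sketch below assumes the single-predictor case, where X'X is 2 x 2 and can be inverted by hand; the variable names are illustrative.

```python
x_obs = [550, 200, 280, 340, 410, 160, 380, 510, 510, 475]
y = [200, 50, 60, 140, 130, 20, 120, 190, 160, 180]
n = len(y)

# Entries of X'X and X'y when X = [column of 1s | x_obs].
sx, sy = sum(x_obs), sum(y)
sxx = sum(v * v for v in x_obs)
sxy = sum(a * b for a, b in zip(x_obs, y))

# Inverse of the 2 x 2 matrix [[n, sx], [sx, sxx]].
det = n * sxx - sx * sx
xtx_inv = [[sxx / det, -sx / det],
           [-sx / det, n / det]]

# betahat = (X'X)^-1 X'y
xty = [sy, sxy]
beta_hat = [sum(a * b for a, b in zip(row, xty)) for row in xtx_inv]

print(xtx_inv)   # compare with XINVX
print(beta_hat)  # compare with BETAHAT
```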
37. Other Calculations: Degrees of Freedom
- nobs = nrow(y);
- nind = ncol(x);
- dftot = nobs - 1;
- dfmod = nind - 1;
- dferr = dftot - dfmod;
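38. Other Calculations: Residuals and Error
- yhat = x * betahat;   /* fitted values, needed before the residuals can be formed */
- resid = y - yhat;
- sserr = ssq(resid);
- mserr = sserr / dferr;
- rootmse = sqrt(mserr);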
39. Other Calculations: Sums of Squares
- ybar = sum(y)/nobs;
- sstot = ssq(y-ybar);
- ssmod = sstot - sserr;
- msmod = ssmod/dfmod;
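40. Other Calculations: Test of Hypotheses
- stdbeta = sqrt(vecdiag(xinvx)*mserr);
- t = betahat/stdbeta;
- probt = 1-probf(t#t,1,dferr);   /* t#t squares each element, giving a two-sided p-value from the F(1, dferr) distribution */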
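The degrees of freedom, sums of squares, and t statistics from these slides can be traced end to end in a short Python sketch (a cross-check, with illustrative names). It stops at the t statistics because the Python standard library has no F distribution; the p-value step is left to SAS.

```python
from math import sqrt

x_obs = [550, 200, 280, 340, 410, 160, 380, 510, 510, 475]
y = [200, 50, 60, 140, 130, 20, 120, 190, 160, 180]
n_obs = len(y)
n_ind = 2  # columns of X: intercept and slope

# Normal equations for one predictor, written out by hand.
sx, sy = sum(x_obs), sum(y)
sxx = sum(v * v for v in x_obs)
sxy = sum(a * b for a, b in zip(x_obs, y))
det = n_obs * sxx - sx * sx
xinvx_diag = [sxx / det, n_obs / det]  # diagonal of inv(X'X)
b0 = (sxx * sy - sx * sxy) / det
b1 = (n_obs * sxy - sx * sy) / det

# Degrees of freedom.
df_tot = n_obs - 1
df_mod = n_ind - 1
df_err = df_tot - df_mod

# Residuals and error.
y_hat = [b0 + b1 * v for v in x_obs]
resid = [a - b for a, b in zip(y, y_hat)]
ss_err = sum(r * r for r in resid)
ms_err = ss_err / df_err

# Sums of squares.
y_bar = sy / n_obs
ss_tot = sum((v - y_bar) ** 2 for v in y)
ss_mod = ss_tot - ss_err

# Standard errors and t statistics for intercept and slope.
std_beta = [sqrt(d * ms_err) for d in xinvx_diag]
t = [b / s for b, s in zip([b0, b1], std_beta)]

print(df_err, ss_tot, ss_mod, ms_err, t)
```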
41. Compute the Estimates
- We can do further analyses, such as R-squared, CIs, etc.
- This example can be extended to multiple regression.
- Some of these are spelled out in the program attached here, and the online doc gives more details of a regression example to illustrate these points.
42. The question is . . .
- Why?
- Proc Reg already does this!
43. A real-life example
- Consider the following table
44. A real-life example
- The problem is to test the equality of the proportions responding between the two treatment groups, controlling for stratum.
- We choose not to run the CMH test in PROC FREQ because the test isn't appropriate.
- The desired test can be found in Mehrotra and Railkar, Statistics in Medicine, 2000.
- How to proceed?
45. Defining the X, N matrices
46. Compute row sums, point estimates
47. Compute point estimates, weights
48. Compute point estimates of proportion (weighted) in each group, and estimated variance
49. Compute the weighted difference of proportions between two groups, and the estimated variance
50. Confidence intervals for proportion of each treatment group, and for the difference in proportions
51. p-values for testing the hypothesis
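Since the table and IML code for these steps are not reproduced here, the following Python sketch only illustrates the general shape of the calculation: a weighted difference of proportions across strata, its variance, a confidence interval, and a p-value. Everything specific in it is an assumption: the counts are hypothetical, the weights are simple sample-size weights rather than the minimum risk weights of Mehrotra and Railkar, and the test is a plain normal-approximation (Wald-type) test.

```python
from math import sqrt
from statistics import NormalDist

# HYPOTHETICAL data, not the table from the talk. One row per stratum,
# columns are (treatment 1, treatment 2).
X = [[12, 8], [20, 15], [9, 5]]      # number of responders
N = [[40, 40], [50, 52], [30, 28]]   # number of subjects

# ASSUMED weights: proportional to n1*n2/(n1+n2) in each stratum.
# These are NOT the minimum risk weights from the cited paper.
raw = [n1 * n2 / (n1 + n2) for n1, n2 in N]
w = [r / sum(raw) for r in raw]

# Point estimates of the response proportion per stratum and group.
p = [[x / n for x, n in zip(x_row, n_row)] for x_row, n_row in zip(X, N)]

# Weighted difference of proportions and its estimated variance.
diff = sum(wi * (pi[0] - pi[1]) for wi, pi in zip(w, p))
var = sum(wi ** 2 * (pi[0] * (1 - pi[0]) / ni[0] + pi[1] * (1 - pi[1]) / ni[1])
          for wi, pi, ni in zip(w, p, N))
se = sqrt(var)

# 95% confidence interval and two-sided p-value (normal approximation).
z_crit = NormalDist().inv_cdf(0.975)
ci = (diff - z_crit * se, diff + z_crit * se)
z = diff / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(diff, ci, p_value)
```

Swapping in a different weighting scheme only changes how `w` is computed; the remaining steps keep the same form.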
52. Other Uses
- Graphics.
- Extremely powerful.
- Two examples with output only.
- I'll refer you to programs.
53. Mosaic Plots
54. Mosaic Plots
55. Mosaic Plots (data not shown)
56. For more reference
- See the SAS/IML User's Guide, or Online Documentation.
- Mosaic Plots: http://www.math.yorku.ca/SCS/Papers/moshist.pdf
- Mehrotra, D., and Railkar, R. Minimum risk weights for comparing treatments in stratified binomial trials. Statistics in Medicine 2000; 19: 811-825.
- Graybill, F. Theory and Application of the General Linear Model. Wadsworth & Brooks, Pacific Grove, 1976.
57. Summary
- IML is a powerful tool for data analysis, statistics, and graphics.
- Consider it instead of DATA step programming if the opportunity presents itself.
- It is not always the best alternative, but knowing it helps inform the decision making.