Title: Calculation and customized output of summary statistics using SAS: A few macros that changed my life
1Calculation and customized output of summary
statistics using SAS A few macros that changed
my life
- Asli Memisoglu, ScD
- Director, Biostatistics and Data Management
- WorldCare Clinical, LLC
- amemisoglu_at_wcclinical.com
2Task
- Generate summary tables describing many variables
- descriptive statistics (frequency, percents,
means, median etc.) - p-values from appropriate tests for categorical
and continuous variables - Final table needs to look pretty
- Needs to be in an electronic format accessible to
non-SAS users
3Solution
- Three relatively simple macros
- Shared by two saintly individuals
- They will never know how grateful I am
Generating Customized Analytical Reports from SAS
Procedure Output Brinda Bhaskar and Kennan
Murray, RTI International NESUG 2004
4Variables
5Where we are heading
6CATEG
- Macro to summarize categorical variables
7Categ MacroStep 1-Generate source datasets for
cell counts by group and p-value
- macro categ(pred,i,data)
- proc freq data data
- tables pred group / chisq sparse outpct out
outfreqi - output out statsi chisq
- run
8Categ Macro outfreqi and statsi
9Categ MacroStep 2a and 2b-Get number and
percent by predictor for all
- proc means data outfreqi noprint
- by pred
- var COUNT
- output outmoutfreqai (keeppred totala
rename(predvariable)) sumtotala - run
- proc means data outfreqi noprint
- by pred
- var PERCENT
- output outmoutfreqbi (keeppred totalb
rename(predvariable)) sumtotalb - run
10Categ MacroStep 3-Combine datasets
- data moutfreqi
- merge moutfreqai moutfreqbi
- by variable
- total compress(totala)
- " (
- compress(put((round(totalb,.1)),5.1))
- ")"
- varnamevlabel(pred)
- run
11Categ macro moutfreqi
12Categ MacroStep 4- number and percent by
predictor by group
- data routfreqi
- set outfreqi
- length varname 20.
- rcount put(count,3.)
- rcount compress(left(rcount))
- col_pct3 "("compress(put((round(pct_col,.
1)),5.1))")" - pctnum rcount" "col_pct3
- varnamevlabel(pred)
- indexi
- variablepred
- keep variable pctnum index varname group
- run
- proc sort
- by variable
- run
13Categ MacroStep 5-Transpose data into proper
configuration
- proc transpose outtransposei prefixgroup
- by variable
- var pctnum
- id group
- run
14Categ Macro Step 6-Obtain p-values
- data rstatsi
- set statsi
- length p_value 8.
- p_valuecompress(put((round(p_pchi,.001)),5.3))
- if p_pchi lt0.05 then p_valuep_value ""
- keep p_value index
- index i
- run
15Categ MacroStep 8-Merges begin
- data tempi
- merge moutfreqi transposei
- by variable
- indexi
- run
16Categ MacroStep 8-Final Merge
- data finali
- merge tempi rstatsi
- by index
- alltotal
- if first.index then statistic'n '
- if not first.index then do
- p_value " "
- varname" "
- end
- length p_final 8.
- p_finalp_value
- keep varname variable statistic all group1 group2
index p_final - run
17CONTINUOUS
- Macro to summarize continuous variables
18continuousStep 1
- macro continuous(cpred,i,data)
- proc means datadata n mean stddev median min
max p25 p75 - class group
- var cpred
- output outmeansi nn meanmean stddevstddev
medianmedian minmin maxmax p25p25 p75p75 - run
19continuous Step 2
- proc npar1way datadata
- class group
- var cpred
- output outpvali
- run
20Output Statistics Available with NPAR1WAY
- Ansari-Bradley Test
- Standard analysis of variance
- Kolmogorov-Smirnov Test
- Cramer-von Mises Test
- Kuiper Test for Two-Sample Data
- Klotz Test
- Median Test
- Mood Test
- Savage Test
- Test Using Input Data as Scores
- Siegel-Tukey Test
- Wilcoxon Test for Two-Sample Data
- Kruskal-Wallis Test
- Van der Waerden Test
21p-value from proc t-test (details available in
referenced paper)
- Ods listing close
- proc ttest datadummy
- class group
- var age
- ods output ttestst_test_data
- Run
22continuous Step 3
- data pvali
- set pvali
- length p_final 8.
- keep pval p_final
- p_finalcompress(put((round(p_kw,.001)),5.3))
- run
23continuous Step 4
- data widei
- set meansi
- medianbcompress(put((round(median,.1)),5.1))
- minmax"("compress(put((round(min,.1)),5.1))",
"compress(put((round(max,.1)),5.1))")" - meansdcompress(put((round(mean,.1)),5.1))"
("compress(put((round(stddev,.1)),5.1))")" - pctl"("compress(put((round(p25,.1)),5.1))",
"compress(put((round(p75,.1)),5.1))")" - keep group n medianb minmax meansd pctl
- run
24continuous widei
25continuous Step 5
- proc transpose datawidei outnarrowi
- var n meansd medianb minmax pctl
- run
- data narrowi
- set narrowi
- varnamevlabel(cpred)
- run
26continuous narrowi
27continuous Step 6-Merge and pretty it up
- data finali (rename(_name_statistic col1All
col2group1 col3group2)) - merge narrowi pvali
- run
- data finali
- set finali
- by varname
- if first.varname then do
- varname"cpred"
- indexi
- end
- else do
- varname" "
- p_final""
- end
- keep varname statistic all group1 group2
p_final index - run
28continuous finali
29NAMES
- One tiny little macro that is very handy
30names will concatenate similarly named datasets
into one
Starting index number
- macro names(j,k,dataname)
- do ij to k
- datanamei
- end
- mend names
Ending index number
31ExampleStep 1 Call in macros for each
predictor
- continuous(age,1,dummy)
- categ(sex,2,dummy)
- categ(race,3,dummy)
- continuous(sbp,4,dummy)
- continuous(dbp,5,dummy)
- categ(diab,6,dummy)
32ExampleCombine datasets from all predictors and
format row labels
- data final_all
- set names(1,6,final)
- length statistic2 15. varnameb 40.
- if statistic"n" then statistic2"n"
- if statistic"meansd" then statistic2"Mean
(sd)" - if statistic"minmax" then statistic2"(Min,
Max)" - if statistic"pctl" then statistic2"(25th,
75th)" - if statistic"n " then statistic2"n "
- if statistic"medianb" then statistic2"Median"
- if varname"age" then varnameb"Age (years)"
- if varname"sex" then varnameb"Gender"
- if varname"sbp" then varnameb"Systolic Blood
Pressure (mm Hg)" - if varname"dbp" then varnameb"Diastolic Blood
Pressure (mm Hg)" - if varname"diab" then varnameb"Diabetes"
- if varname"race" then varnameb"Race"
- temp_var1
33Example Format Table and Output into Microsoft
Word
- ods listing close
- ods rtf stylejournal fileC\\....rtf sasdate
bodytitle - proc report datafinal_all nowd spacing1
Style(report)borderwidth0.5pt
cellspacing0.5pt cellpadding2pt
Style(header)protectspecialcharsoff - column varnameb statistic2 variable2 all group1
group2 p_final - define varnameb /STYLE(COLUMN)cellwidth1.5in
justl - STYLE(header)cellwidth1.50 in justc
"Variable" - .
- .
- Title1 jc "BASUG Presentation"
- .
- .
- Footnote1 jc"Source C\\--\\BASUG
Presentation\basug presentation.sas on "
"sysdate9." - run
- ods rtf close
- ods listing
34Were here!
35Thank you!