Introduction to Statistical Computing in Clinical Research - PowerPoint PPT Presentation

About This Presentation
Title:

Introduction to Statistical Computing in Clinical Research

Description:

Title: Introduction to the Class, and to the program Author: Mark Pletcher Last modified by: Mark Pletcher Created Date: 6/15/2004 7:06:30 PM Document presentation format – PowerPoint PPT presentation

Slides: 39
Provided by: MarkPl5

Transcript and Presenter's Notes

Title: Introduction to Statistical Computing in Clinical Research


1
Introduction to Statistical Computing in Clinical
Research
  • Biostatistics 212
  • Lecture 1

2
Today...
  • Course overview
  • Course objectives
  • Course details: grading, homework, etc.
  • Schedule, lecture overview
  • Where does Stata fit in?
  • Basic data analysis with Stata
  • Stata demos
  • Lab

3
Course Objectives
  • Introduce you to using STATA and Excel for:
  • Data management
  • Basic statistical and epidemiologic analysis
  • Turning raw data into presentable tables, figures
    and other research
  • Prepare you for Fall courses
  • Start analyzing your own data

4
Course details
  • Introduction to Statistical Computing - 1 unit
  • Schedule: 7 lectures, 7 lab sessions, on 7
    Tuesdays in a row
  • Dates: July 31, August 7, 14, 21, 28, September
    4, 11
  • Lectures: 1:15-2:45
  • Labs: 3:00-4:00
  • All in China Basin (CBL 6702, 6704)
  • Final Project: Due 9/18/06

5
Course details
  • Introduction to Statistical Computing
  • Grading: Satisfactory/Unsatisfactory
  • Requirements:
  • - Hand in all five Labs (even if late)
  • - Satisfactory Final Project
  • - 80% of total points
  • Reading: Optional

6
Course details, cont
  • Course Director
  • Mark Pletcher, 514-8008
  • mpletcher_at_epi.ucsf.edu
  • Teaching Assistants
  • Jennifer Cocohoba
  • jcocohoba_at_nccc.ucsf.edu
  • biostat212_section1_at_yahoo.com
  • Sharon Chung
  • sharon.chung_at_ucsf.edu 
  • biostat212_section2_at_yahoo.com
  • Folashade Jose
  • josef_at_peds.ucsf.edu
  • Lab Instructors
  • Mandana Khalili
  • Alan Bostrom

7
Overview of lecture topics
  • 1- Introduction to STATA
  • 2- Do files, log files, and workflow in STATA
  • 3- Generating variables and manipulating data
    with STATA
  • 4- Using Excel
  • 5- Basic epidemiologic analysis with STATA
  • 6- Organizing a project, making a table
  • 7- Making a figure with STATA or Excel

8
Overview of labs
  • Lab 1: Load a dataset and analyze it
  • Lab 2: Learn how to use do and log files
  • Lab 3: Import data from Excel, generate new
    variables and manipulate data, document
    everything with do and log files
  • Lab 4: Using and creating Excel spreadsheets
  • Lab 5: Epidemiologic analysis using Stata
  • Last 2 lab sessions dedicated to working on the
    Final Project
  • - Labs 3 and 5 are significantly longer and
    harder than the others

9
Overview of labs, cont
  • Official Lab time is 3:00-4:00, but we will start
    right after lecture, and you can leave when you
    are done.
  • Lab sections led by Jennifer Cocohoba and Sharon
    Chung
  • Mac session
  • Labs also staffed by Fola Jose, Alan Bostrom,
    Mandana Khalili, and me

10
Overview of labs, cont
  • Labs are due the following week, prior to lecture.
    Labs turned in late (less than 1 week) will
    receive only half credit; after that, no points
    will be awarded. However, ALL labs must be
    turned in to pass the class (even if no points
    are awarded).
  • Lab 1 is paper
  • Labs 2-5 are electronic files, and should be
    emailed to your section leader's course email
    address: biostat212_section1_at_yahoo.com
    (Jennifer) or biostat212_section2_at_yahoo.com
    (Sharon)

11
Final Project
  • Create a Table and a Figure using your own data,
    document analysis using Stata.
  • Due 1 week after last lab session; 20 points
    docked for each day late.

12
Orientation to binder
  • Course Overview
  • Final Project
  • Lectures and Labs (just in time)
  • Other handouts

13
Getting started with STATA
  • Session 1

14
Types of software packages used in clinical
research
  • Statistical analysis packages
  • Spreadsheets
  • Database programs
  • Custom applications
  • Cost-effectiveness analysis (TreeAge, etc)
  • Survey analysis (SUDAAN, etc)

15
Software packages for analyzing data
  • STATA
  • SAS
  • S-PLUS and R
  • SPSS
  • SUDAAN
  • Epi Info
  • JMP
  • MATLAB
  • StatXact

16
Why use STATA?
  • Quick start, user friendly
  • Immediate results, response
  • You can look at the data
  • Menu-driven option
  • Good graphics
  • Log and do files
  • Good manuals, help menu

17
Why NOT use STATA?
  • SAS is used more often?
  • SAS does some things STATA does not
  • Programming easier with S-plus?
  • Complicated data structure and manipulation
    easier with SAS?
  • Epi-info (free) is even easier than STATA?

18
STATA Basic functionality
  • Holds data for you
  • Stata holds 1 flat file dataset only (.dta
    file)
  • Listens to what you want
  • Type a command, press enter
  • Does stuff
  • Statistics, data manipulation, etc
  • Shows you the results
  • Results window

19
Demo 1
  • Open the program
  • Load some data
  • Look at it
  • Run a command
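The steps of this demo can be sketched as a few typed commands. This is only an illustration: the dataset used in the actual demo is not named on the slide, so the sketch assumes the `auto` example dataset that ships with Stata.

```stata
* Load the example dataset that ships with Stata
* (an assumption; the slide does not say which data the demo used)
sysuse auto, clear

* Look at the first five observations of a few variables
list make price mpg in 1/5

* Run a command: summary statistics for two variables
summarize price mpg
```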

20
STATA - Windows
  • Two basic windows
  • Command
  • Results
  • Optional windows
  • Variable list
  • History of commands
  • Other functions
  • Data browser/editor
  • Do file editor
  • Viewer (for log, help files, etc)

21
STATA - Buttons
  • The usual open, save, print
  • Log-file open/suspend/close
  • Do-file editor
  • Browse and Edit
  • Break

22
STATA - Menus
  • Almost every command can be accessed via menu

23
Demo 2
  • Enter in some data
  • Look at it
  • Run a couple of commands
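Data can be typed into the Data Editor, or entered from the command line with `input`. The following is a minimal sketch of the command-line route; the variable names and the three observations are made up for illustration and are not from the course.

```stata
* Start with an empty dataset
clear

* Enter three made-up observations (id, age, and a one-character string for sex)
input id age str1 sex
1 34 "F"
2 51 "M"
3 46 "F"
end

* Look at the data, then run a couple of commands
list
summarize age
tabulate sex
```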

24
Menu vs. Command line
  • Menu advantages
  • Look for commands you don't know about
  • See the options for each command
  • Complex commands: easier to learn syntax
  • Command line advantages
  • Faster (if you know the command!)
  • Closer to the program
  • Only way to write do files
  • Document and repeat analyses
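As a rough illustration of how typed commands let you document and repeat an analysis, a do file might look like the sketch below. The file names are hypothetical, and the `auto` example dataset stands in for real study data.

```stata
* lab_example.do (hypothetical file name)

* Close any log left open by an earlier run; capture suppresses the error if none is open
capture log close

* Record everything that follows in a plain-text log file
log using lab_example.log, replace text

* The analysis itself: load data and run commands
sysuse auto, clear
summarize price mpg
tabulate foreign

* Stop logging
log close
```

Running `do lab_example.do` repeats the same analysis and rewrites the log each time.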

25
STATA commands: Describing your data
  • describe varlist
  • Displays variable names, types, labels
  • list varlist
  • Displays the values of all observations
  • codebook varlist
  • Displays labels and codes for all variables
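A short sketch of these three commands, assuming the `auto` example dataset that ships with Stata (the variable names come from that dataset, not from the course materials):

```stata
sysuse auto, clear

* Variable names, storage types, and labels
describe price mpg foreign

* Values of selected variables for the first five observations
* (without "in 1/5", list shows every observation)
list make price mpg foreign in 1/5

* Codes and value labels for a categorical variable
codebook foreign
```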

26
STATA commands: Descriptive statistics,
continuous data
  • summarize varlist, detail
  • obs, mean, SD, range
  • The detail option gets you more detail (median, etc.)
  • ci varlist
  • Mean, standard error of mean, and confidence
    intervals
  • Actually works for dichotomous variables, too.
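The sketch below illustrates these commands on the `auto` example dataset (an assumption; any dataset with a continuous variable would do). One caveat: the slide shows the older `ci varlist` form, whereas Stata 14.1 and later document the command as `ci means` and `ci proportions`, which is what the sketch uses.

```stata
sysuse auto, clear

* Number of observations, mean, SD, minimum, and maximum
summarize mpg

* Adds percentiles (including the median), skewness, and kurtosis
summarize mpg, detail

* Mean, standard error, and 95% confidence interval
* (older Stata versions: ci mpg)
ci means mpg

* For a dichotomous 0/1 variable, the proportion with its confidence interval
ci proportions foreign
```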

27
STATA commands: Graphical exploration, continuous
data
  • histogram varname
  • Simple histogram of your variable
  • graph box varlist
  • Box plot of your variable
  • qnorm varname
  • Quantile plot of your variable to check normality
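As an illustration, the three plots can be drawn for one continuous variable. The sketch assumes the `auto` example dataset and its `mpg` variable.

```stata
sysuse auto, clear

* Histogram of a single continuous variable
histogram mpg

* Box plot of the same variable
graph box mpg

* Quantiles of mpg plotted against quantiles of a normal distribution;
* points close to the reference line suggest approximate normality
qnorm mpg
```

Each graph command replaces the previous graph in the Graph window unless the graph is given a name or saved.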

28
STATA commands: Descriptive statistics,
categorical data
  • tabulate varname
  • Counts and percentages
  • (see also, table - this is very different!)
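A minimal sketch of a one-way tabulation, assuming the `auto` example dataset, in which `rep78` is a categorical repair-record variable with some missing values:

```stata
sysuse auto, clear

* Counts, percentages, and cumulative percentages for each category
tabulate rep78

* Same table, with missing values shown as their own category
tabulate rep78, missing
```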

29
STATA commands: Analytic statistics, 2
categorical variables

30
STATA commands: Analytic statistics, 2
categorical variables
  • tabulate var1 var2
  • Cross-tab
  • Descriptive options
  • , row (row percentages)
  • , col (column percentages)
  • Statistics options
  • , chi2 (chi-squared test)
  • , exact (Fisher's exact test)
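The options above can be combined in a single command. The sketch below assumes the `auto` example dataset, cross-tabulating repair record (`rep78`) against car origin (`foreign`).

```stata
sysuse auto, clear

* Plain cross-tabulation of two categorical variables
tabulate rep78 foreign

* Add row and column percentages
tabulate rep78 foreign, row col

* Add a chi-squared test and Fisher's exact test
tabulate rep78 foreign, chi2 exact
```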

31
Getting help
  • Try to find the command on the pull-down menus
  • Help menu
  • If you don't know the command - Search...
  • If you know the command - Stata command...
  • Try the manuals
  • more detail, theoretical underpinnings, etc

32
STATA commands: Analytic statistics, 1
categorical, 1 continuous

33
STATA commands: Analytic statistics, 1
categorical, 1 continuous
  • bysort catvar: summarize contvar
  • mean, SD, range of contvar within each subgroup
  • ttest contvar, by(catvar)
  • t-test
  • oneway contvar catvar
  • ANOVA
  • table catvar, contents(mean contvar)
  • Table of statistics
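A sketch of these commands on the `auto` example dataset (an assumption), with `foreign` and `rep78` as the categorical variables and `mpg` as the continuous one. Note that `bysort` needs a colon before the command it repeats, and that the `contents()` option of `table` belongs to older Stata versions; from Stata 17 onward the equivalent option is `statistic()`.

```stata
sysuse auto, clear

* Summary statistics of mpg within each level of foreign
bysort foreign: summarize mpg

* Two-sample t-test comparing mean mpg between the two groups
ttest mpg, by(foreign)

* One-way ANOVA of mpg across repair-record categories
oneway mpg rep78

* Table of group means (Stata 16 and earlier syntax, as on the slide)
* In Stata 17 or later, use: table foreign, statistic(mean mpg)
table foreign, contents(mean mpg)
```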

34
STATA commands: Analytic statistics, 2 continuous

35
STATA commands: Analytic statistics, 2 continuous
  • scatter var1 var2
  • Scatterplot of the two variables
  • pwcorr varlist, sig
  • Pairwise correlations between variables
  • sig option gives p-values
  • spearman varlist, stats(rho p)
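For illustration, the commands for two continuous variables might be run as follows, assuming the `auto` example dataset and its `mpg`, `weight`, and `price` variables:

```stata
sysuse auto, clear

* Scatterplot: first variable on the y-axis, second on the x-axis
scatter mpg weight

* Pairwise Pearson correlations, with p-values from the sig option
pwcorr mpg weight price, sig

* Spearman rank correlation, reporting rho and its p-value
spearman mpg weight, stats(rho p)
```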

36
Demo 3
  • Load a STATA dataset
  • Explore the data
  • Describe the data
  • Answer some simple research questions

37
In Lab Today
  • Familiarize yourself with Stata
  • Load a dataset
  • Use Stata commands to analyze data and fill in
    the blanks

38
Next week
  • Do files, log files, and workflow in Stata
  • Find a dataset!