Title: Collecting and Organizing Data (for ease of analysis and good results!)
1Collecting and Organizing Data (for ease of
analysis and good results!)
- Annie N. Simpson, MSc.
- Biostatistician
3Data Collection Considerations
- Should be investigated when your study is being planned
- Should be implemented before (or shortly after) subjects are recruited
- Computational collection tools should be proportionate to the size of your study (size = number of subjects, number of forms/collection instruments)
- A data collection schedule is often the best place to start!
4Case Report Forms
- If available, use a previously used and vetted form (e.g. HAM-D)
- All forms in a case book for a study should have the same header information
- Header information should capture patient ID, patient initials (if commonly used), visit number and/or type, and time of visit (if collecting things multiple times during one visit)
- Remember that you are creating forms that may be used more than once depending on your study design, so you need to know how to differentiate visits, etc.
5In Protocol
6For Data Collection During Study!
Can even be used as a face page for each subject's binder, where each visit/form can be checked off!
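A data collection schedule of this kind can also be kept electronically. The following is only a minimal sketch: the form names, visit names, and the `face_page` helper are hypothetical placeholders, not the schedule from this study.

```python
# Hypothetical schedule: which form is collected at which visit.
schedule = {
    "Demographics": ["Screening"],
    "HAM-D": ["Screening", "Week 4", "Week 8"],
    "Adverse Events": ["Week 4", "Week 8"],
}


def face_page(schedule, completed):
    """Return one (form, visit, done) entry per scheduled form/visit pair."""
    return [
        (form, visit, (form, visit) in completed)
        for form, visits in schedule.items()
        for visit in visits
    ]


# Hypothetical forms already collected for one subject.
completed = {("Demographics", "Screening"), ("HAM-D", "Screening")}

page = face_page(schedule, completed)
for form, visit, done in page:
    print(f"[{'x' if done else ' '}] {visit:<10} {form}")

# Anything not yet checked off for this subject.
outstanding = [(form, visit) for form, visit, done in page if not done]
```

The same structure could serve both purposes described on these slides: the schedule in the protocol and the per-subject checklist during the study.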
7Steps that I follow when I have a new study (from
my perspective)
- Create and review with the team (this is a very long but worthwhile meeting):
  - Updated form-based data collection schedule
  - Complete blank case report form (CRF) book
- Go through each page of the CRF book with your team and ask questions about (and let them ask about) anything that is not clear to everyone (include your statistician!).
- Review each person's responsibilities/roles
- Review the current timeline
8How to electronically capture your data to a
spreadsheet
- Not every form HAS to be entered; think about whether the information will be analyzed or is only for study coordination
- The patient identifier number should be a column on every spreadsheet and should be set up EXACTLY the same way (same length and type)
- Usually one spreadsheet per collection form
- Usually laid out vertically, i.e. one row for each patient for each visit time
- NEVER skip filling down the columns!
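The layout rules above can be checked automatically before analysis. This is a minimal sketch, assuming rows read from a spreadsheet as dictionaries of strings; the column names, the ID format ("P" plus three digits), and the `find_layout_problems` helper are all hypothetical.

```python
import re

# Hypothetical long-format rows: one row per patient per visit.
rows = [
    {"patient_id": "P001", "visit": "1", "hamd_total": "18"},
    {"patient_id": "P001", "visit": "2", "hamd_total": "12"},
    {"patient_id": "P002", "visit": "1", "hamd_total": "21"},
    {"patient_id": "", "visit": "2", "hamd_total": "15"},    # ID not filled down
    {"patient_id": "P3", "visit": "1", "hamd_total": "9"},   # wrong ID length
]

# Assumed ID format; replace with the one your study actually uses.
ID_PATTERN = re.compile(r"P\d{3}")


def find_layout_problems(rows, id_pattern=ID_PATTERN):
    """Return (row_number, message) pairs for blank cells, malformed IDs,
    and repeated patient/visit combinations."""
    problems = []
    seen = set()
    for number, row in enumerate(rows, start=1):
        for column, value in row.items():
            if value.strip() == "":
                problems.append((number, f"blank cell in column '{column}'"))
        pid = row["patient_id"]
        if pid.strip() and not id_pattern.fullmatch(pid):
            problems.append((number, f"patient_id '{pid}' does not match the expected format"))
        key = (pid, row["visit"])
        if pid.strip() and key in seen:
            problems.append((number, "duplicate patient/visit row"))
        seen.add(key)
    return problems


problems = find_layout_problems(rows)
for number, message in problems:
    print(f"row {number}: {message}")
```

Blank cells are reported rather than filled in automatically, because only the source documents can say which patient a row really belongs to.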
9Examples of bad data layouts
10How to think about how to begin analysis? 1st clean your data!
- Don't forget to first check your Ns for correctness: are they what you expect (for each form)?
- Also examine the extreme values (max and min) for each of your variables as the simplest way to check for incorrectly entered (i.e. dirty) data.
- Always have the original source documents (when you can), and don't neglect checking them against your spreadsheets!
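The N and extreme-value checks can be sketched in a few lines. The records, variable names, and plausible ranges below are hypothetical; in practice the ranges would come from your protocol, and every flagged value should be checked against the source documents rather than changed by guesswork.

```python
# Hypothetical entered data; None marks a value missing from the spreadsheet.
records = [
    {"patient_id": "P001", "age": 34, "weight_kg": 70.5},
    {"patient_id": "P002", "age": 41, "weight_kg": 82.0},
    {"patient_id": "P003", "age": 340, "weight_kg": 64.2},  # likely a data entry error
    {"patient_id": "P004", "age": 29, "weight_kg": None},
]

# Assumed plausible ranges for each variable (low, high).
plausible = {"age": (18, 90), "weight_kg": (35.0, 200.0)}


def check_variable(records, name, low, high):
    """Return N, min, max, and the IDs whose value falls outside [low, high]."""
    values = [r[name] for r in records if r[name] is not None]
    flagged = [
        r["patient_id"]
        for r in records
        if r[name] is not None and not (low <= r[name] <= high)
    ]
    return {"n": len(values), "min": min(values), "max": max(values), "flagged": flagged}


report = {
    name: check_variable(records, name, low, high)
    for name, (low, high) in plausible.items()
}
for name, summary in report.items():
    print(name, summary)
```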
11How to think about how to begin analysis?
- Think about what your Table 1 will contain and how it will look
- Should the table describe the total sample, or perhaps be split by gender? (This depends on the question or focus of your research.)
- Can use any simple software to do this
  - Excel, SPSS, Minitab
- For all continuous variables get N, mean, and standard deviation (STD)
- For all categorical variables get N, %, and total Ns
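The slide names Excel, SPSS, and Minitab; as an alternative illustration only, here is a rough Python sketch of a Table 1 for the total sample and by gender. The subjects, the variable names (`age`, `smoker`), and the helper functions are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical subjects with one continuous and one categorical variable.
subjects = [
    {"gender": "F", "age": 34, "smoker": "yes"},
    {"gender": "F", "age": 46, "smoker": "no"},
    {"gender": "M", "age": 40, "smoker": "no"},
    {"gender": "M", "age": 52, "smoker": "no"},
    {"gender": "M", "age": 58, "smoker": "yes"},
]


def continuous_summary(values):
    """N, mean, and sample standard deviation (needs at least two values)."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def categorical_summary(values):
    """(count, percent) for each level, keyed by level."""
    total = len(values)
    return {level: (count, 100 * count / total) for level, count in Counter(values).items()}


def table_one(subjects, group_by=None):
    """Summaries for the total sample and, optionally, for each group."""
    groups = {"Total": list(subjects)}
    if group_by is not None:
        for subject in subjects:
            groups.setdefault(f"{group_by}={subject[group_by]}", []).append(subject)
    return {
        label: {
            "age": continuous_summary([m["age"] for m in members]),
            "smoker": categorical_summary([m["smoker"] for m in members]),
        }
        for label, members in groups.items()
    }


table = table_one(subjects, group_by="gender")
for label, summary in table.items():
    print(label, summary)
```

Calling `table_one(subjects)` without `group_by` gives the total-sample column only.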
12Basic Analysis of Continuous Response Variables
- Numerical Descriptives
  - Mean
  - Median
  - Mode
  - Variance
  - Range
- Graphical Descriptives
  - Boxplots
  - Scatterplots
  - Histograms
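The numerical descriptives listed above are all available in Python's standard `statistics` module. A small sketch with made-up scores (the data are hypothetical):

```python
from statistics import mean, median, mode, quantiles, variance

# Hypothetical continuous responses, e.g. total scores on a rating scale.
scores = [12, 15, 15, 18, 21, 9, 15, 24]

descriptives = {
    "n": len(scores),
    "mean": mean(scores),
    "median": median(scores),
    "mode": mode(scores),
    "variance": variance(scores),          # sample variance (n - 1 denominator)
    "range": (min(scores), max(scores)),
    "quartiles": quantiles(scores, n=4),   # the three cut points a boxplot is built on
}

for name, value in descriptives.items():
    print(f"{name}: {value}")
```

Note that `variance` here is the sample variance; `statistics.pvariance` gives the population version.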
13Basic Analysis of Categorical Response Variables
- Numerical Descriptives
  - Frequencies
  - Percents/Proportions
- Graphical Descriptives
  - Bar Charts
  - Pie Graphs (not so common in biomedical research)
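Frequencies and percents for a categorical variable can be produced with a simple counter. The sketch below uses hypothetical severity ratings and prints a crude text bar chart next to each row.

```python
from collections import Counter

# Hypothetical categorical responses.
responses = ["mild", "moderate", "mild", "severe", "mild", "moderate", "mild", "mild"]


def frequency_table(values):
    """Return (level, count, percent) rows, most frequent level first."""
    total = len(values)
    return [
        (level, count, 100 * count / total)
        for level, count in Counter(values).most_common()
    ]


table = frequency_table(responses)
for level, count, percent in table:
    bar = "#" * count  # one mark per subject
    print(f"{level:<10}{count:>3}{percent:>7.1f}%  {bar}")
```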
14Other data considerations
- Large multi-center clinical trials will usually have a centralized data collection and coordinating center.
- You, as a clinical site, would be responsible for error correction with source documentation.
- Training of entry/coordination staff is very important (example: in a 5-year study, the statistician received the data at the end and found that the study group had not been recorded anywhere, and it wasn't on the source documents either!)
- Your study is only as good as the data that you collect; pre-planning is the key.