Title: Materials and Methods
1 Materials and Methods
Abe E. Sahmoun, Ph.D. Assistant Professor
Epidemiology Department of Internal Medicine
2 Contents
- Materials and Methods Section
- Function
- Content
- Materials
- Methods
- Data analysis
- Length
- Examples
- What variables should be collected and why?
- Commonly Used Statistical Tests
- Correlation coefficient
- T-test
- Chi-Square test
- P-value
3 Descriptive Study
- Formulate a question
- Decide on study design
- Define the population
- Obtain clinical information
- Age, race, sex
- Treatments, outcomes
4 Materials and Methods (MM): Function
- The aim of the MM section is to tell the reader what
experiments you did to answer your question.
- The MM section should include sufficient detail and
references to permit a scientist to evaluate your
work fully or to repeat the experiments exactly
as you did.
5 Materials and Methods (MM): Content - Methods
- What you did
- Study design. This should include the following
information:
- Independent variables
- Dependent variable
- All controls: baseline, placebo, other
- Sample size
- What the experiment consisted of
- Order
- Of the interventions
- Of the measurements
- Duration
- Of the interventions
- Of the measurements
6 Materials and Methods (MM): Content - Materials
- The primary content of the MM section consists of
the following information:
- State how you calculated derived variables (e.g.
BMI, drug).
- Human subjects: Give enough information about
age, gender, race, BMI, disease, and specific
medical and surgical management to be of use to
researchers who want to compare your data with
theirs, or to clinicians who want to see if your
findings are applicable to their patients.
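As an illustration of stating how a derived variable was calculated, here is a minimal Python sketch of the standard BMI formula (weight in kilograms divided by the square of height in meters). The function name and the example patient values are hypothetical and are not taken from any study discussed here.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI = weight (kg) / height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical patient: 80 kg, 1.75 m tall.
bmi = body_mass_index(80.0, 1.75)
print(f"BMI = {bmi:.1f} kg/m^2")
```

In a Methods section the same information would be given in one sentence, for example: "BMI was calculated as weight in kilograms divided by height in meters squared."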
7 Materials and Methods (MM): Content - 1. Analysis of
Data
- State how you summarized your data.
- Provide information about both the magnitude of
the data and the variability.
- When data are normally distributed, we can use
the mean and standard deviation to summarize the
data. The mean provides a description of the
overall magnitude of the data. The standard
deviation provides a measure of the variability
in the sample.
- If the data have a skewed distribution, you should
report the median and the interquartile range
(the range between the 25th and 75th percentiles).
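The two summaries described above can be sketched in Python with the standard library alone. The function name `summarize` and both data sets are hypothetical, chosen only to contrast a roughly symmetric sample with a right-skewed one. Quartile values depend on the interpolation convention; this sketch uses the default of `statistics.quantiles`.

```python
import statistics


def summarize(values, skewed=False):
    """Summarize a sample as mean/SD, or as median/IQR when it is skewed."""
    if skewed:
        # quantiles(n=4) returns the three quartile cut points; the exact
        # values depend on the interpolation method (default: "exclusive").
        q1, _, q3 = statistics.quantiles(values, n=4)
        return {"median": statistics.median(values), "q1": q1, "q3": q3}
    return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}


# Hypothetical data, for illustration only.
systolic_bp = [118, 122, 125, 130, 121, 127, 119, 124]  # roughly symmetric
length_of_stay = [1, 2, 2, 3, 3, 4, 5, 8, 21]           # right-skewed

print(summarize(systolic_bp))
print(summarize(length_of_stay, skewed=True))
```

In the skewed sample the single long stay pulls the mean well above the median, which is why the median and interquartile range describe it better.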
8 Materials and Methods (MM): Content - 2. Analysis of
Data
- State which software you used to analyze your
data (including version or release number).
- State the p-value at which you considered differences
statistically significant.
- A p-value is not always sufficient to determine
whether you fail to reject or reject a
hypothesis. A difference can be statistically
significant because the sample size is large
rather than because a treatment has a large
effect.
- We assess the size of the difference in
comparison with the variability in the data
sample by calculating the 95% C.I.
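To make the point about sample size concrete, the sketch below computes a 95% confidence interval for a difference between two means using the large-sample normal approximation. The function name and all numbers are hypothetical; a real analysis with small samples would normally use a t-based interval instead.

```python
import math
from statistics import NormalDist


def diff_in_means_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Large-sample (normal approximation) CI for mean1 - mean2."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# Hypothetical: the same 0.5 mmHg difference (SD 10) at two sample sizes.
large = diff_in_means_ci(120.5, 10, 10_000, 120.0, 10, 10_000)
small = diff_in_means_ci(120.5, 10, 100, 120.0, 10, 100)

print("n = 10,000 per group:", tuple(round(x, 2) for x in large))
print("n = 100 per group:   ", tuple(round(x, 2) for x in small))
```

With 10,000 subjects per group the interval excludes zero, so the difference is statistically significant, yet the interval also shows that the difference is clinically tiny. With 100 per group the same difference gives an interval that includes zero.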
10 Materials and Methods (MM): Content - Length
- The methods section should be as long as
necessary to describe fully and accurately what
was done and how it was done.
- Methods are reported in the past tense (e.g. we
measured ...).
11 Example: The right way
Magura L, Blanchard R, Hope B, Beal JR, Schwartz
GG, Sahmoun AE. Hypercholesterolemia and prostate
cancer: a hospital-based case-control study.
Cancer Causes Control. Epub 2008 Aug 13.
12 Methods
- We performed a retrospective analysis of medical
charts of patients newly diagnosed with prostate
cancer between 2004 and 2006. Cases were
identified from the cancer registry of Meritcare
hospital, North Dakota, USA. Controls were
identified from the primary care database of the
same hospital. This facility serves the Fargo
Metropolitan Area comprising all of Cass County,
North Dakota and Clay County, Minnesota. Its
population, according to the 2006 estimate, is
approximately 200,000. The majority (96%) of the
population served in this area is White. The
North Dakota Cancer Registry releases annual
cancer statistics when the registry's data is
estimated to be 95% complete for any given
cancer-reporting year. The study was approved by
the Institutional Review Boards of the Hospital
and the University of North Dakota.
13 Study Design
- Data on age, family history of prostate cancer,
histology, stage at diagnosis (TNM system), body
mass index, occupation, smoking status, Prostate
Specific Antigen (PSA), Gleason score, lipid
profiles, statins use, non-steroidal
anti-inflammatory drug (NSAID) use, comorbidities, and
multivitamin use were abstracted using electronic
records and medical charts. Covariates
information was obtained for the period prior to
diagnosis for cases and prior to exam for
controls. Inclusion and exclusion criteria were
as follows:
14 Study Design
- The inclusion criteria for cases were men with
incident, histologically confirmed prostate cancer
as a primary site with cancer diagnosed between
2004 and 2006 using a pathology report present in
the medical records, age between 50 and 74 and
date of lipid profiles tests within a year prior
to the diagnosis of prostate cancer. The
exclusion criteria included diagnosis of any
cancer other than primary prostate cancer and
race other than Caucasian (excluded because of
small numbers; <6% of residents of Fargo-Moorhead
are non-Caucasian).
15 Study Design
- The inclusion criteria for controls were men who
had an annual physical exam between 2004 and 2006
at the same hospital as cases, age between 50 and
74, without cancer seen at the same hospital as
cases, and date of lipid profiles tests within a
year of the annual physical exam. The exclusion
criteria included diagnosis of any cancer,
prostate specific antigen ≥4 ng/l (in order to
exclude undiagnosed prostate cancer), and race
other than Caucasian.
16 Exposure Definition
- We used the National Cholesterol Education
Program (NCEP) definition of hypercholesterolemia
as total cholesterol greater than 5.17 mmol/l
[23]. For comparison with previous studies, the
prevalence of hypercholesterolemia was also
calculated using a cutpoint of 6.2 mmol/l.
- Statin use was classified as hydrophobic only
users (lovastatin, simvastatin, atorvastatin, or
fluvastatin) or hydrophilic only users
(pravastatin or rosuvastatin) as reported
elsewhere [24]. No other lipid lowering agents
were in use among this study population.
- Factors that may confound the association between
cholesterol and prostate cancer, such as family
history of prostate cancer, body mass index,
statins use, smoking, type 2 diabetes, and
multivitamin use were included in our analyses as
potential confounders.
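An exposure definition like the one above is precise enough to be written as code, which is a useful check that a Methods section is complete. The sketch below is only an illustration and not the authors' code: the function names, the "both" and "none" labels, and the strict "greater than" comparison at the 6.2 mmol/l cutpoint are assumptions.

```python
NCEP_CUTPOINT = 5.17        # mmol/l, from the exposure definition above
ALTERNATIVE_CUTPOINT = 6.2  # mmol/l, used for comparison with other studies

HYDROPHOBIC = {"lovastatin", "simvastatin", "atorvastatin", "fluvastatin"}
HYDROPHILIC = {"pravastatin", "rosuvastatin"}


def has_hypercholesterolemia(total_cholesterol, cutpoint=NCEP_CUTPOINT):
    """True if total cholesterol (mmol/l) is greater than the cutpoint.

    The strict comparison follows the NCEP wording; applying it to the
    6.2 mmol/l cutpoint as well is an assumption of this sketch.
    """
    return total_cholesterol > cutpoint


def classify_statin_use(drug_names):
    """Classify a patient's statins; 'both' and 'none' are hypothetical labels."""
    names = {name.lower() for name in drug_names}
    uses_hydrophobic = bool(names & HYDROPHOBIC)
    uses_hydrophilic = bool(names & HYDROPHILIC)
    if uses_hydrophobic and uses_hydrophilic:
        return "both"
    if uses_hydrophobic:
        return "hydrophobic only"
    if uses_hydrophilic:
        return "hydrophilic only"
    return "none"


# Hypothetical patient record.
print(has_hypercholesterolemia(5.5))                        # NCEP cutpoint
print(has_hypercholesterolemia(5.5, ALTERNATIVE_CUTPOINT))  # 6.2 cutpoint
print(classify_statin_use(["Simvastatin"]))
```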
17 Statistical analyses
- Odds ratios (OR) and 95% confidence intervals
(CI) were estimated using unconditional multiple
logistic regression, including terms for age,
family history of prostate cancer, body mass
index (BMI), smoking, type 2 diabetes and
multivitamin use. All p-values are two-sided. All
two-way interactions involving hypercholesterolemia
were assessed. Tests for interaction were
assessed by introducing a multiplicative term
between the two variables in the multivariable
model using a Wald test. Analyses were performed
using SAS software V9.1.3 (SAS Institute, Cary,
NC, USA).
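For readers who have not met odds ratios before, the sketch below computes a crude (unadjusted) odds ratio and its 95% confidence interval from a 2x2 table, using the standard log-based (Woolf) interval. This is not the multivariable logistic regression described above, which adjusts for covariates and was run in SAS. The function name and the counts are hypothetical.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a log-based (Woolf) 95% confidence interval."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts, not data from the study:
# 60 of 100 cases and 45 of 100 controls were exposed.
or_, lower, upper = crude_odds_ratio(60, 40, 45, 55)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```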
18 Examples: Need improvements
19 METHODS Study Population: The patients reviewed
were those who presented to a local clinic and
received a RADT during the months of March,
April, and May of 2004. The patients were
categorized by ages of <15, 15-45, and >45 years.
Of the 211 subjects, 37.4% were <15 years old.
The majority of patients, 53.6%, fell
into the 15-45 year age group. And only 9% were
>45 years old. Of the patients to receive an
RADT, 24.1% of those less than 15 years old
tested positive. 19.9% of those in the 15-45 age
range tested positive. And 10.5% of the patients
older than 45 years of age were positive.
20 METHODS Data obtained for this study was taken
from the North Dakota Department of Health,
Division of Vital Statistics birth records from
January 1, 1996 through December 31, 2003.
During this timeframe, 63,344 live births
occurred; 53,416 of these records were
included in this study's data set due to
exclusions.
21 ACCP Guidelines
22 Example: Codification of the variables
26 Why should we collect other variables in addition
to the exposure?
27 Mortality in Area A and Area B
Suppose you surveyed how many people died per
year in areas A and B (both population = 10,000),
and the results were as indicated in the table. Do
people in area B have a higher risk of death?
What do you think? Would you recommend that people in
area B move to area A?
28 If you categorize the populations by age and
compare mortality of Area A with that of Area
B
If you categorize the populations by
age, populations A and B have the same risk under the
age of 60 years. There are no people over 60
years old in area A. The difference in
mortality in the previous slide is due to the
difference in the age distribution of the populations.
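The comparison of crude and age-specific mortality can be sketched as follows. The table from the slide is not reproduced in this transcript, so every count below is HYPOTHETICAL and chosen only to reproduce the pattern described: both areas have 10,000 people, area A has nobody over 60, and the under-60 risk is the same in both areas.

```python
def rate_per_1000(deaths, population):
    """Deaths per 1,000 people per year."""
    return 1000 * deaths / population


# HYPOTHETICAL (deaths, population) by age group; not the slide's table.
areas = {
    "A": {"under_60": (20, 10_000), "60_plus": (0, 0)},
    "B": {"under_60": (12, 6_000), "60_plus": (80, 4_000)},
}


def crude_rate(strata):
    """Overall mortality, ignoring age."""
    deaths = sum(d for d, _ in strata.values())
    people = sum(p for _, p in strata.values())
    return rate_per_1000(deaths, people)


def stratum_rates(strata):
    """Age-specific mortality, skipping empty age groups."""
    return {age: rate_per_1000(d, p) for age, (d, p) in strata.items() if p > 0}


for name, strata in areas.items():
    print(name, "crude:", crude_rate(strata), "by age:", stratum_rates(strata))
```

With these numbers the crude rate in area B is several times that of area A, while the under-60 rates are identical, so the crude difference reflects age distribution rather than a higher risk of living in area B.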
29 Example: The subjects were 1,000 smokers and 1,000
non-smokers of the same age range and sex ratio. Their
lung cancer status was observed for 5 years.
30 Results in 5 years
50 lung cancer cases in smokers and 10 in
non-smokers were observed. If we compare the
morbidity as smokers/non-smokers = 5%/1% = 5, it
means that smokers are 5 times more likely to
have lung cancer than non-smokers. Generally
speaking, if the p-value is less than 0.05, the
result is considered statistically significant.
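The ratio in this example is a risk ratio (relative risk). The sketch below computes it from the counts given in the example (50 of 1,000 smokers and 10 of 1,000 non-smokers) and adds a 95% confidence interval using the standard log-based formula. The confidence interval and the function name are additions for illustration and do not appear on the slide.

```python
import math


def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Risk ratio with a log-based 95% confidence interval."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    se_log_rr = math.sqrt(1 / cases_exposed - 1 / n_exposed
                          + 1 / cases_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts from the example: 50/1,000 smokers and 10/1,000 non-smokers.
rr, lower, upper = risk_ratio(50, 1000, 10, 1000)
print(f"Risk ratio = {rr:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

The interval lies well above 1, which agrees with the conclusion that the association is statistically significant at this sample size.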
31 If the sample size of both groups is not 1,000,
but 100
Suppose that the sample size of both groups is
not 1,000 but 100, and 5 lung cancer cases in
smokers and 1 in the non-smokers group were observed.
In this case, the ratio smokers/non-smokers
= 5%/1% = 5 stays the same. However, if a
statistical test is conducted, the p-value is
0.212. From this study, you cannot conclude
that there is a significant association between
smoking and lung cancer. How did this happen? In
an epidemiologic study, relatively large sample
sizes are required to detect a given
difference in outcome. If the study is conducted with
a small sample like this, a true association
may not be detected. In such a case, the study is
little more than a waste of time and money. We
have to be careful about sample size when we
conduct a study.
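The effect of sample size on the p-value can be checked directly. The slide does not say which test produced 0.212; the sketch below assumes a two-sided Fisher's exact test, implemented with the standard library only, and that assumption gives a value close to the one quoted. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalized hypergeometric probability of k cases in row 1.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    k_min, k_max = max(0, col1 - row2), min(col1, row1)
    # Sum over all tables that are no more probable than the observed one.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: smokers, non-smokers. Columns: lung cancer, no lung cancer.
p_large = fisher_exact_two_sided(50, 950, 10, 990)  # 1,000 per group
p_small = fisher_exact_two_sided(5, 95, 1, 99)      # 100 per group

print(f"n = 1,000 per group: p = {p_large:.2e}")
print(f"n = 100 per group:   p = {p_small:.3f}")
```

The risk ratio is 5 in both tables, but only the larger study gives a p-value below 0.05.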
32 The factors that should be taken into
consideration when you look at the data: age,
gender, race, ...
The previous slides indicate that when you look at
observed data, you have to consider
differences in age. If you ignore
differences in age, the data can look
different from the truth. There are other factors in
addition to age that should be considered when
you compare data: sex, race, year of birth,
education, and so on.
33 Commonly Used Statistical Tests
- Correlation: a linear relationship
- T-test: association between means
- Chi-square: association between proportions
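As an example of the chi-square entry in the list above, here is a minimal sketch of Pearson's chi-square test for a 2x2 table, without continuity correction. The p-value uses the fact that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x/2)). The function name is hypothetical, and the counts are those of the smoking example (50 of 1,000 smokers and 10 of 1,000 non-smokers with lung cancer).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    n = a + b + c + d
    if 0 in rows or 0 in cols:
        raise ValueError("every row and column total must be positive")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_2x2(50, 950, 10, 990)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2e}")
```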
34 Correlation coefficient
- The correlation coefficient is a summary of the
strength of a linear association between two
variables. If the variables tend to go up and
down together, the correlation coefficient will
be positive. If the variables tend to go up and
down in opposition, with low values of one
variable associated with high values of the
other, the correlation coefficient will be
negative.
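The sign behavior described above can be seen in a short sketch of the Pearson correlation coefficient, written out from its definition. The function name and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises with x, z falls as x rises.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
z = [9.5, 8.2, 6.1, 3.8, 2.2]

print(f"r(x, y) = {pearson_r(x, y):.3f}")  # positive
print(f"r(x, z) = {pearson_r(x, z):.3f}")  # negative
```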
39- Essentials of Writing Biomedical Research Papers.
Mimi Zeiger, 2nd edition
40 Questions
- Dr. Abe E. Sahmoun
- asahmoun_at_medicine.nodak.edu
- Dr. James R. Beal
- jrbeal_at_medicine.nodak.edu