Title: Data Handling, Presentation, and Record Keeping
1. Data Handling, Presentation, and Record Keeping
- Jon Lundberg
- Nicole Aly
- With Special Guest Lecturer Liz Davis
2. Data Handling, Presentation, and Record Keeping
- Finding Data
- Cleaned data
- Record Data Sources
- Databases
- Merging databases
- Programs/Methodology
- Code-book
- Adjustments (e.g., variable/data deletion)
- Equations
- Save
- Multiple locations
- Email, flash drive, CD-RW, etc.
- Silvia: more space on network drive
- Set Deadlines
- Keep to them
- Takes longer than you think
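One way to keep the record of data sources and adjustments alongside the files themselves is a small log with a checksum per file, so that copies saved in multiple locations can be verified against each other. The sketch below is only an illustration: the function names, the JSON layout, and the field names are assumptions, not something prescribed in the slides.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_source(log_path, data_path, origin, adjustments):
    """Append one entry describing a data file to a JSON log.

    origin and adjustments are free-text notes (hypothetical fields),
    e.g. where the file came from and which variables were dropped.
    """
    log_path = Path(log_path)
    entries = json.loads(log_path.read_text()) if log_path.exists() else []
    entries.append({
        "file": Path(data_path).name,
        "sha256": sha256_of(data_path),
        "origin": origin,
        "adjustments": adjustments,
        "recorded": date.today().isoformat(),
    })
    log_path.write_text(json.dumps(entries, indent=2))
    return entries
```

Comparing the stored checksum with one recomputed from a backup copy shows whether the two files are byte-for-byte identical.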
3. Data Reporting Example
- New York Times Article
- Calcium Study
- NO BENEFIT???
4. Econometrician Criticisms
- "If you torture the data long enough, Nature will confess."
- "Econometricians, like artists, tend to fall in love with their models."
- "There are two things you are better off not watching in the making: sausages and econometric estimates."
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
5. What to do?
- Discard the goal of objective inference
- Logical conclusion based on a set of facts
- Asking "Is it fact or opinion?" is counterproductive
- Deciding if error terms are correlated: could it depend on what I had for breakfast?
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
6. What to Report?
- Sensitivity Analysis
- Record inferences implied by alternative sets of opinions
- Show how an inference changes as variables are added to or deleted from the equation
- Do assumptions within the set lead to different inferences?
- Work harder to narrow the set of assumptions
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
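A minimal sketch of "show how an inference changes as variables are added or deleted": re-estimate one coefficient of interest under every subset of optional control variables and tabulate the results. The helper names (`ols`, `sensitivity_table`), the variable names, and the simulated data are all assumptions made for illustration; this is plain OLS via NumPy, not a method taken from the slides or from Leamer's paper.

```python
from itertools import combinations

import numpy as np


def ols(y, X):
    """Return OLS coefficients and standard errors for y = X b + e."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))


def sensitivity_table(y, focus, controls):
    """Estimate the focus coefficient under every subset of controls.

    controls maps a (hypothetical) variable name to a 1-D array.
    Returns a list of (included_names, coefficient, std_error) rows.
    """
    names = sorted(controls)
    rows = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            cols = [np.ones(len(focus)), focus] + [controls[n] for n in subset]
            beta, se = ols(y, np.column_stack(cols))
            rows.append((subset, beta[1], se[1]))
    return rows


# Simulated data (made up): z is correlated with x and also affects y.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.0 * x + 2.0 * z + rng.normal(size=n)
for subset, coef, se in sensitivity_table(y, x, {"z": z}):
    print(subset, round(coef, 2), round(se, 2))
```

Each printed row is one set of assumptions about which variables belong in the equation, paired with the inference it implies for the coefficient on `x`.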
7. Fragility
- An inference is not believable if it is fragile
- Can it be reversed by minor changes in assumptions?
- Can it stand up to researchers' opposing opinions?
- Do your own detailed sensitivity analysis
- Report the mapping from assumptions to inferences
- Anticipate the opinions of the consuming public
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
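The question "can it be reversed by minor changes in assumptions?" can be checked mechanically once the mapping from assumptions to estimates has been written down. The sketch below is a simple min/max summary, not Leamer's full extreme bounds analysis; the function name `fragility_report`, the labels, and the numbers in the example are hypothetical.

```python
def fragility_report(estimates):
    """Summarize how a coefficient varies across sets of assumptions.

    estimates maps a label for each set of assumptions to the
    coefficient estimated under it (values supplied by the analyst).
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    low_label = min(estimates, key=estimates.get)
    high_label = max(estimates, key=estimates.get)
    low, high = estimates[low_label], estimates[high_label]
    return {
        "lower_bound": (low_label, low),
        "upper_bound": (high_label, high),
        # True when some assumptions give a negative and others a positive estimate.
        "sign_reverses": low < 0 < high,
    }


# Made-up estimates for illustration only.
report = fragility_report({
    "baseline": 0.9,
    "add control A": 0.4,
    "drop outliers": -0.2,
})
print(report["sign_reverses"])  # prints True
```

An inference whose sign reverses across plausible sets of assumptions is one the audience has little reason to believe without further narrowing of those assumptions.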
8. Mapping Problems
- Space of assumptions is infinite
- Example: measuring the effects of fertilizer on crops
- If you include light level, what about rainfall?
- If you include rainfall, what about temperature?
- If you include temperature, what about soil depth?
- If you include depth, what about soil grade?
- And so on, and so on, and so on, and so on
- YOU GET THE PICTURE!?
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
9. Important Question
- The coefficient is negative and you think it should be positive. You find another variable to add that makes it positive. Have you found evidence that the coefficient is truly positive?
- Think about what you would do before examining the data.
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
10. Murder Rate Example
- Simple regression results: each additional execution deters 13 murders (S.E. 7)
- Is this conclusion fragile?
- Different viewpoints imply different subsets of variables
Source: Leamer, Edward E., "Let's Take the Con Out of Econometrics"
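As a rough reading of the headline estimate, assuming the usual normal approximation (an assumption; the slide gives only the point estimate and its standard error), the t-ratio and an approximate 95% interval work out as follows:

```latex
t = \frac{\hat{\beta}}{\mathrm{S.E.}} = \frac{13}{7} \approx 1.86,
\qquad
\hat{\beta} \pm 1.96 \times \mathrm{S.E.} = 13 \pm 13.72 = [-0.72,\ 26.72]
```

Under that approximation the interval includes zero, so the estimate is imprecise even before asking how it changes under different sets of assumptions.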
11. Data Confidentiality
- What types of data are private or confidential?
- What laws and regulations protect the privacy of data used by researchers?
- What responsibilities do researchers have to protect the data?
- What do we mean by ethical and responsible use of data?
12. Which data might be sensitive?
- Social Security Numbers
- Often SSNs are one of the few identifiers available to match individuals across data sets.
- Names and addresses
- Individuals, companies, public officials, students
- Geographic identifiers
- County, city, census tract, geocode
- Employment and earnings
- Both employers and employees are concerned
- Health data, government benefits, tax data
13. Laws and regulations
- Census Bureau confidentiality rules
- Minnesota Government Data Practices Act (MGDPA)
- HIPAA Privacy Rule
- Confidentiality agreements with government
agencies or private companies or with survey
respondents
14. Protected information, for purposes of this agreement, includes any or all of the following:
- Private data (as defined in Minnesota Statutes 13.02, subd. 12), confidential data (as defined in Minn. Stat. 13.02, subd. 3), welfare data (as governed by Minn. Stat. 13.46), medical data (as governed by Minn. Stat. 13.384), and other non-public data governed elsewhere in the Minnesota Government Data Practices Act (MGDPA), Minn. Stats. Chapter 13
- Medical records (as governed by the Minnesota Medical Records Act, Minn. Stat. 144.335)
- Chemical health records (as governed by 42 U.S.C. 290dd-2 and 42 CFR 2.1 to 2.67)
- Protected health information (PHI) (as defined in and governed by the Health Insurance Portability and Accountability Act (HIPAA), 45 CFR 164.501), and
- Other data subject to applicable state and federal statutes, rules, and regulations affecting the collection, storage, use, or dissemination of private or confidential information.
15. Responsibilities of Data Users
- 1) Know the rules. Ignorance is not an acceptable defense.
- 2) Understand the purpose and objectives of data privacy.
- 3) Protect data from unauthorized access or disclosure.
- 4) Do not report data that can identify individuals or specific companies unless it is public information.
- 5) Destroy confidential data when no longer needed.
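Responsibility 4 is often put into practice by suppressing small cells in published tables, since a count of one or two people in a narrow group can identify them. The sketch below assumes a hypothetical threshold of 5 and hypothetical group labels; the actual rule comes from the governing law or data agreement, not from this code.

```python
def suppress_small_cells(counts, threshold=5):
    """Replace counts below the threshold with None before reporting.

    counts maps a group label to the number of individuals in it.
    The default threshold of 5 is a placeholder (an assumption).
    """
    return {
        group: (n if n >= threshold else None)
        for group, n in counts.items()
    }


# Made-up table for illustration only.
table = {"County A": 124, "County B": 3, "County C": 17}
print(suppress_small_cells(table))
# prints {'County A': 124, 'County B': None, 'County C': 17}
```

If row or column totals are also published, a suppressed cell can sometimes be recovered by subtraction, so additional cells may need to be withheld as well.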
16. Ethical Issues in Reporting and Presenting Results
- Potential disclosure of individual or sensitive data.
- Cleaning data and eliminating outliers.
- Reporting findings that support one's hypotheses and suppressing those that don't.
- Statistically insignificant results.
- Subgroup analyses and multiple hypothesis tests.
- Data mining.
17. Questions?