The Certainty of Citations - PowerPoint PPT Presentation

Provided by: beausha
1
The Certainty of Citations
  • A proposal for an objective method of measuring
    certainty

2
Genealogy Background
Notice the light at the top of the picture.
3
The FM Bobo Story
Grandmother
Grandfather of grandmother
4
(No Transcript)
5
(No Transcript)
6
1860 Census
7
(No Transcript)
8
(No Transcript)
9
1870 Census
10
(No Transcript)
11
(No Transcript)
12
Marriage Record
Carroll County Arkansas Marriage Records, Eastern District, Grooms Index 1869-1930
Book/Page  Groom            Age  Bride             Age  Date
A 63       BOBO FRANCES M.  19   LITTRELL MATILDA  16   6/02/1872
Note 3 year gap in age.
http://www.rootsweb.com/~arcchs/MARB.html
13
1880 Census
14
Remember Jarrett for later
15
(No Transcript)
16
1920 Census
17
(No Transcript)
18
Jarrett's Funeral Book
19
Record Summary
Record Date  Record Type  Birth Reported  Age Reported  Implied Birth  Death Reported  Census Age Date
8/23/1860    CEN          -               8             1852           -               1-Jun
7/14/1870    CEN          -               18            1852           -               1-Jun
6/2/1872     MAR          -               19            1853           -               -
6/17/1880    CEN          -               25            1855           -               1-Jun
1/22/1920    CEN          -               71            1849           -               1-Jan
2/12/1951    FUN          -               -             -              11/17/1932      -
1/1/1955     GRAV         10/1/1845       -             1845           11/10/1931      -
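The Implied Birth column appears to be the record year minus the reported age. A minimal Python sketch of that arithmetic, using the rows from the summary (the function name `implied_birth_year` is illustrative, and the sketch ignores the census age date):

```python
from datetime import date

def implied_birth_year(record_date: date, age_reported: int) -> int:
    """Birth year implied by an age reported on a dated record."""
    return record_date.year - age_reported

# (record date, record type, age reported), taken from the Record Summary
records = [
    (date(1860, 8, 23), "CEN", 8),
    (date(1870, 7, 14), "CEN", 18),
    (date(1872, 6, 2), "MAR", 19),
    (date(1880, 6, 17), "CEN", 25),
    (date(1920, 1, 22), "CEN", 71),
]

implied = [implied_birth_year(d, age) for d, _kind, age in records]
spread = max(implied) - min(implied)  # disagreement between records, in years
```

The spread between the earliest and latest implied year is one simple way to see how far the records disagree.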
20
Let's talk about that
Note person partially in picture.
21
The Information Flow Diagram
  • Event: an association of an action, place, time,
    and person(s)

EVENT
Dick Eastman at GENTECH2, January 1994
22
The Information Flow Diagram
  • Reporter: a person who creates a record about an
    event.
  • We can measure confidence or bias.

EVENT
John Wylie, president of GENTECH for 5 years
REPORTER
23
The Information Flow Diagram
  • Record: a report about an event, which may not
    be complete or accurate
  • Measure granularity.

EVENT
RECORD
REPORTER
24
What's Granularity?
       Small                    Medium         Large
NAME   James Powell Sharbrough  J Sharbrough   Sharbrough
DATE   June 2, 1872             June, 1872     1872
PLACE  123 Elm St               Harris County  Texas
25
Granularity Examples
       Case 1                     Case 2               Case 3
Name   FM Bobo - 2                Francis M Bobo - 3   Bobo - 1
Date   1953 - 1                   June 1853 - 2        2 Jun 1872 - 3
Place  153 Elm St, Tulsa, OK - 3  Carroll Co, Ark - 2  Ark - 1
Total  6                          7                    5
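The totals on this slide are the sum of the name, date, and place scores, each from 1 (coarse) to 3 (fine). A short Python sketch of that sum, with the three cases from the slide (the function name `granularity_score` is illustrative):

```python
def granularity_score(name: int, date: int, place: int) -> int:
    """Sum the per-field granularity levels (each 1, 2, or 3)."""
    for level in (name, date, place):
        if level not in (1, 2, 3):
            raise ValueError(f"granularity level must be 1, 2, or 3, got {level}")
    return name + date + place

# (name, date, place) levels for the three cases on the slide
cases = {
    "Case 1": (2, 1, 3),
    "Case 2": (3, 2, 2),
    "Case 3": (1, 3, 1),
}

totals = {label: granularity_score(*levels) for label, levels in cases.items()}
```

With three fields scored 1 to 3, the total always falls between 3 and 9.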
26
The Information Flow Diagram
ER Gap
  • Reviewer: a person who reviews records and draws
    conclusions.
  • Evaluate ER Gap, evaluate Reporter.

EVENT
RECORD
REPORTER
REVIEWER
Tony Burroughs, NGS 2001, Portland OR
27
The Information Flow Diagram
  • Conclusion: a statement by a reviewer about a
    collection of records related to an event
  • Report: a collection of conclusions.

EVENT
RECORD
REPORT
REPORTER
REVIEWER
28
ER Gap
Far
All Records about my family - 0
Secondary Record - 1
Primary Record - 2
Near
29
Features of EVIDENCE: The Record
  • Granularity
  • Mind the Gap - ER Gap
  • Reporter

30
CONCLUSION: Rate It
  • 1 - Believe
  • 2 - Know
  • 3 - Can Prove
  • 0 - No claim
  • Negative numbers: -1, -2, -3

31
TRUST: The Report
  • Do this like eBay

32
(No Transcript)
33
(No Transcript)
34
(No Transcript)
35
So many formulas
  • so few examples.
  • Record granularity measurement: 3 to 9
  • ER Gap: 0, 1, or 2
  • Reviewer evaluation of reporter: -1 to 10
  • Reviewer confidence: -3 to 3
  • Trust number: positive feedback ratio
  • Granularity / 5 ER Gap Report Eval / 5
    Reviewer Confidence Trust ratio / 0.5

36
The Death Certificate
Demographic Info
Medical Info
37
It's What-if Time
What if we could make the future however we like?
38
Mechanical Certainty
  • Finding Needles in Really Big Haystacks

39
Record Linking
  • Building Indices
  • Finding larger patterns

40
  • Where
  • x indicates the identifier and its value on the
    record from the file initiating the search
    (record A)
  • y indicates the identifier and its value on the
    record from the file being searched (record B)
  • LINKED pairs may refer either to all linked
    pairs, or to a defined subset of these; and
  • UNLINKABLE pairs may refer either to all
    unlinkable pairs, or to a defined subset,
    provided the linked and the unlinkable sets (or
    subsets) are otherwise strictly comparable with
    each other.
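The formula these definitions annotate is not reproduced in the transcript. One form commonly used in probabilistic record linkage, assumed here rather than taken from the slide, compares how often a comparison outcome (x, y) occurs among LINKED pairs with how often it occurs among UNLINKABLE pairs, and takes the base-2 logarithm of that ratio. A Python sketch under that assumption (the function name and the counts in the usage lines are hypothetical):

```python
import math

def outcome_weight(linked_with_outcome: int, linked_total: int,
                   unlinkable_with_outcome: int, unlinkable_total: int) -> float:
    """log2 of (outcome frequency in LINKED pairs / frequency in UNLINKABLE pairs).

    Assumed form of the weight; positive values favour a link,
    negative values argue against one.
    """
    if min(linked_with_outcome, unlinkable_with_outcome) <= 0:
        raise ValueError("outcome counts must be positive")
    freq_linked = linked_with_outcome / linked_total
    freq_unlinkable = unlinkable_with_outcome / unlinkable_total
    return math.log2(freq_linked / freq_unlinkable)

# Hypothetical counts: an outcome seen in 900 of 1000 linked pairs
# but only 50 of 1000 unlinkable pairs.
agreement_weight = outcome_weight(900, 1000, 50, 1000)
# The complementary outcome: 100 of 1000 linked, 950 of 1000 unlinkable.
disagreement_weight = outcome_weight(100, 1000, 950, 1000)
```

The "strictly comparable" requirement in the last bullet matters here: the ratio is only meaningful if both frequencies are measured over the same kind of pairs.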

41
Examples
  • FIRST INITIALS
      - AGREEMENT
      - DISAGREEMENT
      - LETTER Q
  • YEAR OF BIRTH
      - SIMILARITY (difference 1 year)
      - DISSIMILARITY (difference 11 years)
  • GIVEN NAMES
      - SIMILARITY (first 3 letters agree, none disagree,
        e.g. Sam vs Samuel)
      - SIMILARITY/DISSIMILARITY (first 3 letters
        agree, 4th disagrees, e.g. Samuel vs Sampson)
  • DIFFERENT BUT LOGICALLY RELATED IDENTIFIERS
      - PLACE of WORK vs PLACE of DEATH (Provo vs Salt
        Lake City)
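The given-name outcomes above can be sketched as a small comparison function. This is an illustration in Python, not the presenter's algorithm: the function name and the "AGREEMENT" and "DISAGREEMENT" labels for exact matches and mismatched prefixes are assumptions, as is the "UNCLASSIFIED" label for cases the slide does not cover.

```python
def compare_given_names(a: str, b: str) -> str:
    """Classify a pair of given names using the slide's first-3-letter rule."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return "AGREEMENT"
    if len(a) < 3 or len(b) < 3 or a[:3] != b[:3]:
        return "DISAGREEMENT"
    shorter, longer = sorted((a, b), key=len)
    if longer.startswith(shorter):
        # First 3 letters agree and no later letter disagrees (Sam vs Samuel).
        return "SIMILARITY"
    if a[3] != b[3]:
        # First 3 letters agree, 4th disagrees (Samuel vs Sampson).
        return "SIMILARITY/DISSIMILARITY"
    # Disagreement after the 4th letter: not covered by the slide.
    return "UNCLASSIFIED"

sam_samuel = compare_given_names("Sam", "Samuel")
samuel_sampson = compare_given_names("Samuel", "Sampson")
```

Each outcome label would then get its own weight, so a partial match contributes less than full agreement.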

42
Some more examples
43
Discrimination
  • A lookup table containing the frequencies of
    values for identifiers, as they appear in the
    file being searched.
  • SURNAMES: Brown (0.39), Aube (0.014), and Skuda
    (0.00004).
  • FIRST NAMES: John (5.30), Axel (0.020), and Ulder
    (0.0045).
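A rare value discriminates better than a common one: two records that agree on Skuda are much less likely to agree by chance than two that agree on Brown. The Python sketch below turns the lookup table into a weight. It assumes the figures are percentages of the file being searched, and it uses log2(1/p) as the weight; both the unit and the formula are assumptions, not stated on the slide.

```python
import math

# Frequencies from the slide, ASSUMED to be percentages.
SURNAME_FREQ_PERCENT = {"Brown": 0.39, "Aube": 0.014, "Skuda": 0.00004}
FIRST_NAME_FREQ_PERCENT = {"John": 5.30, "Axel": 0.020, "Ulder": 0.0045}

def rarity_weight(value: str, table: dict[str, float]) -> float:
    """log2(1/p), where p is the value's share of the file (assumed weight form)."""
    p = table[value] / 100.0  # percentage -> proportion
    return math.log2(1.0 / p)

surname_weights = {name: rarity_weight(name, SURNAME_FREQ_PERCENT)
                   for name in SURNAME_FREQ_PERCENT}
```

Under these assumptions, agreement on Skuda earns a far larger weight than agreement on Brown.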

44
Competing Hypotheses
Record Date  Record Type  Birth Reported  Age Reported  Implied Birth  Death Reported  Census Age Date  Rate
8/23/1860    CEN          -               8             1852           -               1-Jun            60
7/14/1870    CEN          -               18            1852           -               1-Jun            60
6/2/1872     MAR          -               19            1853           -               -                40
1/22/1920    CEN          -               71            1849           -               1-Jan            40
6/17/1880    CEN          -               25            1855           -               1-Jun            25
2/12/1951    FUN          -               -             -              11/17/1932      -                10
1/1/1955     GRAV         10/1/1845       -             1845           11/10/1931      -                5
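One way to compare the competing birth-year hypotheses is to add up the Rate values of the records that support each year. The slide does not say how the rates are meant to be combined, so the summation in this Python sketch is an assumption; the funeral book row is left out because it implies no birth year.

```python
from collections import defaultdict

# (record, implied birth year, rate), taken from the table above
records = [
    ("CEN 1860", 1852, 60),
    ("CEN 1870", 1852, 60),
    ("MAR 1872", 1853, 40),
    ("CEN 1920", 1849, 40),
    ("CEN 1880", 1855, 25),
    ("GRAV 1955", 1845, 5),
]

# ASSUMPTION: support for a hypothesis is the sum of its records' rates.
support: dict[int, int] = defaultdict(int)
for _record, birth_year, rate in records:
    support[birth_year] += rate

# Highest support first; ties broken by earlier year.
ranked = sorted(support.items(), key=lambda item: (-item[1], item[0]))
best_year, best_support = ranked[0]
```

Under this tally the two early censuses together make 1852 the best-supported year, well ahead of the gravestone's 1845.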
45
The Digital Research Assistant
  • Search for records on internet
  • Evaluate their relevance to assignment
  • Evaluate their granularity, confidence, etc
  • Evaluate patterns, such as families
  • Report matches
  • Let me set the knobs for the parameters

46
The DRA will have ...
  • A hierarchy of useful comparison algorithms
  • A method of searching across the Internet - and
    paying for it
  • A method of documenting the source of that search
    that satisfies the rules of preserving
    intellectual property and academic research

47
Who knows what the formula will be?
  • We are asking which dragons must be slain, but we
    aren't saying how it must happen.
  • We are talking about possible ways to accomplish
    our goal.
  • That goal is connecting to new information, with
    confidence.

48
Summary
  • Any type of review
  • Measurements of Records
  • Measurement of conclusions
  • Rating of publishers
  • Mechanical searches
  • Record Linking
  • Smart Searches
  • Groupwork and Rights

49
Never forget to have fun