Controlled Vocabularies: Name Authority Control - PowerPoint PPT Presentation

Slides: 45
Provided by: RayR

Transcript and Presenter's Notes

Title: Controlled Vocabularies: Name Authority Control


1
Controlled Vocabularies: Name Authority Control
  • University of California, Berkeley
  • School of Information Management and Systems
  • SIMS 202: Information Organization and Retrieval

2
Review
  • Mapping to the relational model
  • Database Design: Normalization
  • ER Diagrams and Assignment

3
Normalization
  • Unnormalized relations
  • First normal form: functional dependency of nonkey attributes on
    the primary key; atomic values only
  • Second normal form: full functional dependency of nonkey
    attributes on the primary key
  • Third normal form: no transitive dependency between nonkey
    attributes
  • Boyce-Codd and higher: all determinants are candidate keys;
    single multivalued dependency

4
Unnormalized Relations
  • First step in normalization is to convert the
    data into a two-dimensional table
  • In unnormalized relations data can repeat within
    a column

5
Unnormalized Relation
6
First Normal Form
  • To move to First Normal Form a relation must
    contain only atomic values at each row and
    column.
  • No repeating groups
  • A column or set of columns is called a Candidate
    Key when its values can uniquely identify the row
    in the relation.
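
The move to First Normal Form can be sketched in a few lines of Python. This is only an illustration: the patient rows, the field names, and the helper functions are invented here and are not the tables shown on the slides. The candidate-key check only inspects the sample rows, so it can show that a column set fails to be a key but cannot prove that it is one in general.

```python
# Hypothetical unnormalized rows: the "surgeries" column holds a repeating group.
unnormalized = [
    {"patient_no": 1, "name": "Patient A", "surgeries": ["Surgery X", "Surgery Y"]},
    {"patient_no": 2, "name": "Patient B", "surgeries": ["Surgery X"]},
]


def to_first_normal_form(rows):
    """Emit one row per (patient, surgery) so every cell holds an atomic value."""
    flat = []
    for row in rows:
        for surgery in row["surgeries"]:
            flat.append(
                {"patient_no": row["patient_no"], "name": row["name"], "surgery": surgery}
            )
    return flat


def is_candidate_key(rows, columns):
    """True when the values in `columns` are unique across all sample rows."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)


first_nf = to_first_normal_form(unnormalized)
print(len(first_nf))
print(is_candidate_key(first_nf, ["patient_no"]))
print(is_candidate_key(first_nf, ["patient_no", "surgery"]))
```

After flattening, patient_no alone no longer identifies a row, which is why the key has to grow to include the surgery column.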

7
First Normal Form
8
Second Normal Form
  • A relation is said to be in Second Normal Form
    when every nonkey attribute is fully functionally
    dependent on the primary key.
  • That is, every nonkey attribute needs the full
    primary key for unique identification
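
One way to make "fully functionally dependent" concrete is to test dependencies against sample data. The sketch below assumes a hypothetical relation whose primary key is (patient_no, surgeon_no); the rows and the `determines` helper are invented for illustration. A check on sample rows can refute a dependency but cannot establish it for all possible data.

```python
def determines(rows, lhs, rhs):
    """True if, in this sample, equal `lhs` values always imply equal `rhs` values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical relation with composite primary key (patient_no, surgeon_no).
rows = [
    {"patient_no": 1, "surgeon_no": 10, "patient_name": "Patient A", "surgery": "Surgery X"},
    {"patient_no": 1, "surgeon_no": 20, "patient_name": "Patient A", "surgery": "Surgery Y"},
    {"patient_no": 2, "surgeon_no": 10, "patient_name": "Patient B", "surgery": "Surgery Z"},
]

# patient_name is fixed by patient_no alone: a partial dependency, so not 2NF.
print(determines(rows, ["patient_no"], ["patient_name"]))
# surgery is not fixed by either half of the key on its own...
print(determines(rows, ["patient_no"], ["surgery"]))
print(determines(rows, ["surgeon_no"], ["surgery"]))
# ...but it is fixed by the full key.
print(determines(rows, ["patient_no", "surgeon_no"], ["surgery"]))
```

In this sample, patient_name would move to a separate patient relation keyed by patient_no, leaving only attributes that need the whole key.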

9
Second Normal Form
10
Second Normal Form
11
Third Normal Form
  • A relation is said to be in Third Normal Form if
    there is no transitive functional dependency
    between nonkey attributes
  • When one nonkey attribute can be determined with
    one or more nonkey attributes there is said to be
    a transitive functional dependency.
  • The side effect column in the Surgery table is
    determined by the drug administered
  • Side effect is transitively functionally
    dependent on drug so Surgery is not 3NF
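
The drug and side effect case can be sketched as a decomposition in Python. The column names and all values below are hypothetical stand-ins for the Surgery table, which is shown only as an image in the slides.

```python
# Hypothetical Surgery rows; side_effect is determined by drug, not by the key.
surgery = [
    {"patient_no": 1, "surgery_date": "2000-01-01", "drug": "Drug A", "side_effect": "rash"},
    {"patient_no": 2, "surgery_date": "2000-01-02", "drug": "Drug A", "side_effect": "rash"},
    {"patient_no": 3, "surgery_date": "2000-01-03", "drug": "Drug B", "side_effect": "fever"},
]

# Split the transitive dependency drug -> side_effect into its own relation.
surgery_3nf = [
    {k: row[k] for k in ("patient_no", "surgery_date", "drug")} for row in surgery
]
drug = {row["drug"]: row["side_effect"] for row in surgery}

# Joining on drug reconstructs the original rows, so no information was lost.
rejoined = [dict(row, side_effect=drug[row["drug"]]) for row in surgery_3nf]
print(rejoined == surgery)
print(len(drug))
```

Each side effect is now stored once per drug rather than once per surgery, so correcting it means changing a single row.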

12
Third Normal Form
13
Third Normal Form
14
Joins
15
More on Assignment and ER
  • Just what is this Cookie database?
  • What sort of ways might it be used?
  • What are those ER symbols again?

16
Original Assignment
  • Examine the Cookie database using Access and look
    at the ER Diagram for it posted on the
    assignments page.
  • Consider the possibilities of Book publications
  • What are the problems with the database?
  • What new fields would you add to the database,
    and where?
  • Draw a new ER diagram showing your design.

17
Cookie ER diagram
Entities: BIBFILE, LIBFILE, CALLFILE, INDXFILE, SUBFILE
Relationships: Has call, Has copy, Has index, Has subject
Linking attributes: pubid, accno, libid, subcode
Note: the diagram contains only attributes used for linking.
18
Cookie Database
  • Cookie is a bibliographic database that contains
    information about a hypothetical union catalog of
    several libraries
  • There are currently 5 main types of entities in
    the database (and one linking relation)
  • Books (bibfile)
  • Local Call numbers (callfile)
  • Libraries (libfile)
  • Publishers (pubfile)
  • Subject headings (subfile)
  • Links between subject and books (indxfile)

19
BIBFILE
  • Books (BIBFILE) contains information about
    particular books. It includes one record for each
    book. The attributes are
  • accno -- an accession or serial number
  • author -- The author's name
  • title -- The title of the book
  • loc -- Location of publication (where published)
  • date -- Date of publication
  • price -- Price of the book
  • pagination -- Number of pages
  • ill -- What type of illustrations (maps, etc) if
    any
  • height -- Height of the book in centimeters

20
CALLFILE
  • CALLFILE contains call numbers and holdings
    information linking particular books with
    particular libraries. Its attributes are
  • accno -- the book accession number
  • libid -- the id of the holding library
  • callno -- the call number of the book in the
    particular library
  • copies -- the number of copies held by the
    particular library

21
LIBFILE
  • LIBFILE contains information about the libraries
    participating in this union catalog. Its
    attributes include
  • libid -- Library id number
  • library -- Name of the library
  • laddress -- Street address for the library
  • lcity -- City name
  • lstate -- State code (postal abbreviation)
  • lzip -- zip code
  • lphone -- Phone number
  • mop through suncl -- Library opening and closing
    times for each day of the week.

22
PUBFILE
  • PUBFILE contains information about the publishers
    of books. Its attributes include
  • pubid -- The publisher's id number
  • publisher -- Publisher name
  • paddress -- Publisher street address
  • pcity -- Publisher city
  • pstate -- Publisher state
  • pzip -- Publisher zip code
  • pphone -- Publisher phone number
  • ship -- standard shipping time in days

23
SUBFILE
  • SUBFILE contains each unique subject heading that
    can be assigned to books. Its attributes are
  • subcode -- Subject identification number
  • subject -- the subject heading/description

24
INDXFILE
  • INDXFILE provides a way to allow many-to-many
    mapping of subject headings to books. Its
    attributes consist entirely of links to other
    tables
  • subcode -- link to subject id
  • accno -- link to book accession number
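
The many-to-many mapping can be sketched with Python's built-in sqlite3 module. The table and column names follow the slides; the column types, the composite primary key on INDXFILE, and the placeholder rows are assumptions made for this illustration and do not describe the actual Access database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bibfile  (accno INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE subfile  (subcode INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE indxfile (
        subcode INTEGER REFERENCES subfile(subcode),
        accno   INTEGER REFERENCES bibfile(accno),
        PRIMARY KEY (subcode, accno)
    );
    -- Placeholder rows, invented for this sketch.
    INSERT INTO bibfile  VALUES (1, 'Book One'), (2, 'Book Two');
    INSERT INTO subfile  VALUES (10, 'Higher education'), (20, 'Libraries');
    INSERT INTO indxfile VALUES (10, 1), (10, 2), (20, 1);
    """
)


def titles_for_subject(subject):
    """Follow subfile -> indxfile -> bibfile to list the titles under a subject."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM subfile s
        JOIN indxfile i ON i.subcode = s.subcode
        JOIN bibfile  b ON b.accno = i.accno
        WHERE s.subject = ?
        ORDER BY b.title
        """,
        (subject,),
    )
    return [title for (title,) in rows]


print(titles_for_subject("Higher education"))
print(titles_for_subject("Libraries"))
```

Book One appears under both subjects and "Higher education" covers both books, which is the many-to-many case a single foreign key in either table could not express.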

25
Some examples of Cookie Searches
  • Who wrote Microcosmographia Academica?
  • How many pages long is Alfred Whitehead's The
    Aims of Education and Other Essays?
  • Which branches in Berkeley's public library
    system are open on Sunday?
  • What is the call number of Moffitt Library's copy
    of Abraham Flexner's book Universities: American,
    English, German?
  • What books on the subject of higher education are
    among the holdings of Berkeley (both UC and City)
    libraries?
  • Print a list of the Mechanics' Library holdings,
    in descending order by height.
  • What would it cost to replace every copy of each
    book that contains illustrations (including
    graphs, maps, portraits, etc.)?
  • Which library closes earliest on Friday night?
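
A few of these searches can be sketched as SQL run through Python's sqlite3 module. Table and column names follow the slides, but only a subset of columns is included. Apart from the title and author of Microcosmographia Academica, every value below (prices, heights, library names, call numbers, copy counts) is invented for the sketch, and storing NULL in ill for unillustrated books is an assumption about how the data is coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bibfile (accno INTEGER PRIMARY KEY, author TEXT, title TEXT,
                          price REAL, ill TEXT, height INTEGER);
    CREATE TABLE libfile (libid INTEGER PRIMARY KEY, library TEXT);
    CREATE TABLE callfile (accno INTEGER, libid INTEGER, callno TEXT, copies INTEGER);
    INSERT INTO bibfile VALUES
        (1, 'Cornford, F. M.', 'Microcosmographia Academica', 10.0, NULL, 19),
        (2, 'Author B', 'Placeholder Title B', 20.0, 'maps', 24);
    INSERT INTO libfile VALUES (1, 'Library One'), (2, 'Library Two');
    INSERT INTO callfile VALUES
        (1, 1, 'CALL-1', 1), (2, 1, 'CALL-2', 2), (2, 2, 'CALL-3', 1);
    """
)

# Who wrote Microcosmographia Academica?
author = conn.execute(
    "SELECT author FROM bibfile WHERE title = ?", ("Microcosmographia Academica",)
).fetchone()[0]

# Holdings of one library, in descending order by height.
holdings = [
    title
    for (title,) in conn.execute(
        """
        SELECT b.title FROM bibfile b
        JOIN callfile c ON c.accno = b.accno
        JOIN libfile  l ON l.libid = c.libid
        WHERE l.library = ?
        ORDER BY b.height DESC
        """,
        ("Library One",),
    )
]

# Cost to replace every copy of each illustrated book.
cost = conn.execute(
    """
    SELECT SUM(b.price * c.copies) FROM bibfile b
    JOIN callfile c ON c.accno = b.accno
    WHERE b.ill IS NOT NULL
    """
).fetchone()[0]

print(author, holdings, cost)
```

The first search touches one table, the second needs all three, and the third combines a join with an aggregate, which is roughly the range of difficulty the slide's list covers.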

26
ER Diagram Symbols
  • Attribute: ovals are used to indicate the attributes
    associated with an entity or relationship (that is, the
    pieces of information recorded in the database about the
    entity or relationship).
  • Primary key: an underlined name indicates that the
    attribute is a primary key (that is, it can uniquely
    identify the entity).
  • Entity: rectangles are used to indicate entities (that is,
    the representatives or records describing persons, things,
    or events in the database).
  • Relationship: diamonds are used to indicate relationships
    between entities (that is, some association between the
    data records of different entities).

27
Cookie ER diagram
Entities: BIBFILE, LIBFILE, CALLFILE, INDXFILE, SUBFILE
Relationships: Has call, Has copy, Has index, Has subject
Linking attributes: pubid, accno, libid, subcode
Note: the diagram contains only attributes used for linking.
28
Assignment Goal
  • The main intent is to have you start thinking
    about how databases are structured, and what
    types of information can or should be included
    when designing a database
  • The main task is to look for MISSING elements in
    the current design, or badly designed elements
    given the particular data
  • What attributes and/or new relations need to be
    added to the database?

29
And now for something completely different...
30
Today
  • Controlled vocabularies
  • Choice of names
  • Form of names
  • Name Authority files

31
Controlled Vocabularies
  • Vocabulary control is the attempt to provide a
    standardized and consistent set of terms (such as
    subject headings, names, classifications, etc.)
    with the intent of aiding the searcher in finding
    information.

32
Controlled Vocabularies
  • Names and name authorities (Today)
  • Cognitive basis of categorization and subject
    classification (Thursday)
  • Design of controlled vocabularies for subject
    access -- Thesaurus design (next week)

33
Names
  • Cutter's objectives of bibliographic description
  • To enable a person to find a document of which
    the author is known.
  • To show what the library has by a given author.
  • First serves access.
  • Second serves collocation.

34
Problems with Names
  • How many names should be associated with a
    document?
  • Which of these should be the main entry?
  • What form should each of the names take?
  • What references should be made from other
    possible forms of names that haven't been used?

35
The problem
  • Proliferation of the forms of names
  • Different names for the same person
  • Different people with the same names
  • Examples
  • from Books in Print (semi-controlled but not
    consistent)
  • ERIC author index (not controlled)

36
Rules for description
  • AACR II and other sets of descriptive cataloging
    rules provide guidelines for
  • Determining the number of name entries
  • Choosing a main entry
  • Deciding on the form of name to be used
  • Deciding when to make references

37
Authority control
  • Authority control is concerned with the creation and
    maintenance of a set of terms that have been
    chosen as the standard representatives (also known
    as established) based on some set of rules.
  • If you have rules, why do you need to keep track
    of all of the headings?

38
Conditions of Authorship?
  • Single person or single corporate entity
  • Unknown or anonymous authors
  • Shared responsibility
  • Collections or editorially assembled works
  • Works of mixed responsibility (e.g. translations)
  • Related Works

39
Added Entries
  • Personal names
  • Collaborators
  • Editors, compilers, writers
  • Translators (in some cases)
  • Illustrators (in some cases)
  • Other persons associated with the work (such as
    the honoree in a Festschrift).
  • Corporate Names
  • Any prominently named corporate body that has
    involvement in the work beyond publication,
    distribution, etc.

40
Choice of Name
  • AACR II says that the predominant form of the
    name used in a particular author's writings
    should be chosen as the form of name.
  • References should be made from the other forms of
    the name.

41
Form of the Name
  • When names appear in multiple forms, one form
    needs to be chosen. Criteria for choice are
  • Fullness (e.g. Full names vs. initials only)
  • Language of the name.
  • Spelling (choose predominant form)
  • Entry element
  • John Smith or Smith, John?
  • Mao Zedong or Zedong, Mao? (Mao Tse Tung?)

42
Name Authority Files
ID:NAFL8057230 ST:p EL:n STH:a MS:c UIP:a TD:19910821174242
KRC:a NMU:a CRC:c UPN:a SBU:a SBC:a DID:n DF:05-14-80
RFE:a CSC: SRU:b SRT:n SRN:n TSS: TGA:? ROM:? MOD: VST:d 08-21-91
Other Versions earlier
040    DLC $c DLC $d DLC $d OCoLC
053    PR6005.R517
100 10 Creasey, John
400 10 Cooke, M. E.
400 10 Cooke, Margaret, $d 1908-1973
400 10 Cooper, Henry St. John, $d 1908-1973
400 00 Credo, $d 1908-1973
400 10 Fecamps, Elise
400 10 Gill, Patrick, $d 1908-1973
400 10 Hope, Brian, $d 1908-1973
400 10 Hughes, Colin, $d 1908-1973
400 10 Marsden, James
400 10 Matheson, Rodney
400 10 Ranger, Ken
400 20 St. John, Henry, $d 1908-1973
400 10 Wilde, Jimmy
500 10 $w nnnc $a Ashe, Gordon, $d 1908-1973
43
Name Authority Files
ID:NAFO9114111 ST:p EL:n STH:a MS:n UIP:a TD:19910817053048
KRC:a NMU:a CRC:c UPN:a SBU:a SBC:a DID:n DF:06-03-91
RFE:a CSC:c SRU:b SRT:n SRN:n TSS: TGA:? ROM:? MOD: VST:d 08-19-91
040    OCoLC $c OCoLC
100 10 Marric, J. J., $d 1908-1973
500 10 $w nnnc $a Creasey, John
663    Works by this author are entered under the name used in the
       item. For a listing of other names used by this author,
       search also under $b Creasey, John
670    OCLC 13441825 His Gideon's day, 1955 $b (hdg. Creasey, John
       usage J.J. Marric)
670    LC data base, 6/10/91 $b (hdg. Creasey, John usage J.J. Marric)
670    Pseuds. and nicknames dict., c1987 $b (Creasey, John,
       1908-1973 British author pseud. Marric, J. J.)
44
Name Authority Files
ID:NAFL8166762 ST:p EL:n STH:a MS:c UIP:a TD:19910604053124
KRC:a NMU:a CRC:c UPN:a SBU:a SBC:a DID:n DF:08-20-81
RFE:a CSC: SRU:b SRT:n SRN:n TSS: TGA:? ROM:? MOD: VST:d 06-06-91
Other Versions earlier
040    DLC $c DLC $d DLC $d OCoLC
100 10 Butler, William Vivian, $d 1927-
400 10 Butler, W. V. $q (William Vivian), $d 1927-
400 10 Marric, J. J., $d 1927-
670    His The durable desperadoes, 1973.
670    His The young detective's handbook, c1981 $b t.p. (W.V. Butler)
670    His Gideon's way, 1986 $b CIP t.p. (William Vivian Butler
       writing as J.J. Marric)
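
The way these three records cooperate can be sketched as a small lookup in Python. The headings and references are a subset transcribed from the records above, reading 100 as the established heading, 400 as a "see from" reference, and 500 as a "see also" reference. The dictionary layout and the `resolve` and `candidates` helpers are hypothetical and are not how any actual authority system stores its data.

```python
# Headings and references taken from the three records above (subset only).
AUTHORITY = {
    "Creasey, John": {
        "see_from": ["Cooke, M. E.", "Gill, Patrick, 1908-1973", "Wilde, Jimmy"],  # 400
        "see_also": ["Ashe, Gordon, 1908-1973"],  # 500
    },
    "Marric, J. J., 1908-1973": {
        "see_from": [],
        "see_also": ["Creasey, John"],
    },
    "Butler, William Vivian, 1927-": {
        "see_from": ["Butler, W. V. (William Vivian), 1927-", "Marric, J. J., 1927-"],
        "see_also": [],
    },
}


def resolve(name):
    """Return (established heading, see-also headings) for a heading or variant."""
    for heading, refs in AUTHORITY.items():
        if name == heading or name in refs["see_from"]:
            return heading, refs["see_also"]
    return None


def candidates(prefix):
    """Established headings reachable from any name form starting with `prefix`."""
    found = set()
    for heading, refs in AUTHORITY.items():
        for form in [heading] + refs["see_from"]:
            if form.startswith(prefix):
                found.add(heading)
    return sorted(found)


print(resolve("Gill, Patrick, 1908-1973"))
print(candidates("Marric, J. J."))
```

The second lookup returns two headings because the name J. J. Marric was used by two different people, Creasey (1908-1973) and Butler (1927-). Only the dates and the cross-references in the authority file keep them apart.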