concordance - PowerPoint PPT Presentation

Slides: 20
Provided by: lew5
Learn more at: https://cs.nyu.edu

Transcript and Presenter's Notes

1
concordance
2
  • A library historically indexed its collection in
    a very space- and time-consuming manner. The
    index consisted of a physical card catalogue
    organized by three key fields (subject, author,
    title) and a few cross-references, such that
    every book had three separate cards, each
    meticulously filed in its own catalogue.

3
  • The system was difficult to manage even with a
    modest collection of books, numbering in the tens
    of thousands, and would certainly not scale up
    well to the volume of information being published
    today.

4
  • Furthermore, the old catalogue system was
    relatively shallow in its information content.
    If a patron wanted to find books that featured
    stories about a specific topic, such as
    aardvarks, and the word "aardvark" did not occur
    in the title or subject fields, then the
    catalogue would be unlikely to yield the full
    range of materials even if they existed in the
    collection.

5
  • It would be nice, from the point of view of the
    aardvark enthusiast, to have the capacity to get
    a list of all books that mentioned aardvarks, and
    to know something about the relative frequency of
    use so as to determine whether the source might
    be a significant source of information or if the
    mention was just a passing reference.

6
  • The preparation of such a textual analysis is
    based upon the generation of a concordance of
    the texts, wherein each word of the text is
    indexed, and the result is a list of words along
    with their frequency of use. The production of a
    concordance is a long and painstaking process
    when working from printed texts; it becomes quite
    easy with an electronic form of that same text
    and the power of a modern computer.
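
As a rough illustration of how little code the electronic version needs, here is a minimal Python sketch of a word-frequency concordance. The function name build_concordance, the sample sentence, and the rule for what counts as a word (runs of letters, optionally joined by apostrophes) are assumptions made for this example, not part of the original slides.

```python
import re
from collections import Counter

def build_concordance(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # Assumption: a "word" is a run of letters, optionally with
    # internal apostrophes (so "doesn't" stays one word).
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)

# Hypothetical sample text, used only to exercise the function.
sample = "The aardvark dug a burrow. The burrow kept the aardvark cool."
concordance = build_concordance(sample)

# Print the word list alphabetically with each word's frequency.
for word, count in sorted(concordance.items()):
    print(f"{word:10} {count}")
```

A real concordance would also record where each word occurs; this sketch keeps only the counts described on the slide.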

7
  • A concordance is an interesting tool, with a
    well-established history in the field of textual
    analysis. The first known example of a
    concordance was created in the 13th century,
    using the books of the Bible. Another well-known
    example is the concordance of the works of
    William Shakespeare.

8
  • In both of these cases, the concordance data was
    used to facilitate the cross-referencing of
    people, places and events, and to help
    investigate the use of particular phrases or
    literary allusions.

9
  • A concordance can also be used as a means to
    authenticate texts or to ascertain the authorship
    of a particular text. This kind of use makes the
    newspapers every so often, as when a researcher
    claims that an old work was actually written by
    Shakespeare, or that Shakespeare's works were
    really written by someone else, or when someone
    tries to unmask the identity of a criminal (Jack
    the Ripper, for example) based upon notes left
    behind at the scenes of the crimes.

10
  • The output of a concordance program, in digital
    form, could easily be scanned for key words (or,
    in more advanced forms, for phrases and words in
    close proximity as well).
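
One way to picture the keyword and proximity scanning is with a positional index, sketched below in Python. The names positional_index and near, the max_gap parameter, and the sample sentence are all hypothetical; the slides do not say how such a scan would actually be implemented.

```python
import re
from collections import defaultdict

def positional_index(text):
    """Map each lowercased word to the list of positions where it occurs."""
    index = defaultdict(list)
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    for position, word in enumerate(words):
        index[word].append(position)
    return index

def near(index, first, second, max_gap=3):
    """True if `first` and `second` occur within `max_gap` words of each other."""
    return any(
        abs(p - q) <= max_gap
        for p in index.get(first, [])
        for q in index.get(second, [])
    )

# Hypothetical sample text.
index = positional_index(
    "Arnold the aardvark lived near the river. Ants feared Arnold."
)

print("aardvark" in index)                 # simple key-word lookup
print(near(index, "aardvark", "arnold"))   # proximity check, default gap
print(near(index, "aardvark", "river"))    # four words apart, beyond the gap
```

Phrase search could be built on the same index by requiring consecutive positions rather than a maximum gap.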

11
  • The ready access to the original document and its
    entire vocabulary list would provide a richer and
    deeper capacity to help determine that, in this
    example, the (fictional) book "The Life and Times
    of Arnold" is a likely source of information
    about aardvarks, in spite of the fact that the
    word "aardvark" doesn't appear in the title, and
    the subject might be listed as "biography".

12
  • One of the most famous examples of a concordance
    making a big splash in the news occurred in 1991,
    with the 'unapproved' publication of the Dead Sea
    Scrolls.
  • The Dead Sea Scrolls were discovered in 1947, in
    the Middle East (near the Dead Sea, of course).
    There were approximately 800 scrolls,
    representing most of the books of the Old
    Testament. They were the oldest and thus most
    authentic of any known examples of biblical
    texts.

13
  • They soon became the center of controversy. The
    scrolls and their contents were kept out of
    general circulation, seen and studied only by a
    select group of scholars. The limited
    distribution was purportedly set up out of
    concern that the translation and interpretation
    of the scrolls were too important to be done in a
    careless or insensitive manner.

14
  • But as decades passed, and few of the scrolls had
    been published, there was growing impatience
    amongst others in the field who were upset that
    some of the most important documents in history
    were being deliberately withheld, with no
    definitive plan for their ultimate publication.

15
  • The situation changed dramatically when the text
    of the Dead Sea Scrolls was released by a
    graduate student from Hebrew Union College in
    Cincinnati, Ohio. A full concordance of the
    scrolls had previously been prepared, complete
    with listings of every word (in Hebrew and
    Aramaic) and where it occurred in each of the
    documents, and the school possessed a printed
    copy.

16
  • The significance of the concordance was not lost
    on the graduate student, who transcribed the
    information into digital form and used a desktop
    computer to re-assemble the original text from
    the concordance data.
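
The reason a concordance can be turned back into the original text is that it records where every word occurs. The Python sketch below shows the idea under a simplifying assumption: each word is listed with its running word positions in a single document. The actual scroll concordance used its own reference scheme, and the function name rebuild_text and the sample data are hypothetical.

```python
def rebuild_text(concordance):
    """Invert a {word: [positions]} concordance back into running text."""
    placed = {}
    for word, positions in concordance.items():
        for position in positions:
            # Each position in the document holds exactly one word.
            placed[position] = word
    # Read the words back in position order.
    return " ".join(placed[i] for i in sorted(placed))

# Hypothetical concordance: every word with the positions where it occurs.
concordance = {
    "aardvark": [1],
    "ants": [4],
    "ate": [2],
    "the": [0, 3],
}

print(rebuild_text(concordance))
```

Punctuation and capitalization are lost in this sketch because the toy concordance does not record them.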

17
  • A complete concordance would index every word in
    a document. This includes all parts of speech,
    such as definite and indefinite articles,
    pronouns, and conjunctions. While such words are
    important in a full concordance, they are not so
    useful for indexing files or for distinguishing
    amongst documents in the context of a search.

18
  • The implementation of a web indexing strategy
    would include a mechanism by which specific words
    and/or entire parts of speech would be excluded
    from the index. This could be done using an
    explicit list or table of excluded words, or more
    ambitiously by using structural analysis to
    identify the parts of speech and exclude entire
    categories of words.
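
The simpler of the two approaches, an explicit list of excluded words, might look like the Python sketch below. The contents of STOP_WORDS, the function name index_terms, and the sample sentence are assumptions chosen for illustration; a real index would use a much longer list. The more ambitious part-of-speech analysis is not shown.

```python
import re
from collections import Counter

# Assumption: a tiny, hand-picked exclusion list of articles,
# pronouns, conjunctions, and prepositions.
STOP_WORDS = frozenset(
    {"a", "an", "the", "and", "or", "but", "he", "she", "it", "they", "of", "in"}
)

def index_terms(text, stop_words=STOP_WORDS):
    """Count the words in `text`, skipping any word on the exclusion list."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(word for word in words if word not in stop_words)

# Hypothetical sample text.
terms = index_terms(
    "The aardvark and the anteater met in the burrow, and it was crowded."
)
print(sorted(terms))
```

Passing the list as a parameter keeps the exclusion policy separate from the indexing logic, so the table of excluded words can be changed without touching the code.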

19
  • The aardvark case is but a single small example.
    Consider the potential if every document were
    available in digital form and indexed in its
    entirety (not just a few key words). The
    comprehensiveness of a search for topical
    documents would be significantly improved, though
    some caution would be needed to filter out
    extraneous information and somehow prioritize the
    results.
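
One simple way to prioritize results, in the spirit of the "relative frequency of use" idea from earlier in the talk, is to rank documents by the share of their words that match the search term. The Python sketch below assumes that scoring rule; the function names, document titles, and texts are hypothetical, and real search engines use more elaborate weighting.

```python
import re
from collections import Counter

def relative_frequency(text, term):
    """Fraction of the words in `text` that are exactly `term`."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    if not words:
        return 0.0
    return Counter(words)[term.lower()] / len(words)

def rank_documents(documents, term):
    """Titles of documents mentioning `term`, most relevant first."""
    scored = [
        (relative_frequency(text, term), title)
        for title, text in documents.items()
    ]
    # Drop documents that never mention the term; sort the rest by score.
    return [title for score, title in sorted(scored, reverse=True) if score > 0]

# Hypothetical collection of three short documents.
documents = {
    "The Life and Times of Arnold": "Arnold the aardvark ate ants. An aardvark digs.",
    "Garden Notes": "Weeds, roses, and one passing aardvark among many other visitors this week.",
    "Cooking Basics": "Boil water and add salt.",
}

print(rank_documents(documents, "aardvark"))
```

A passing reference scores low because it is diluted by the rest of the text, which separates significant sources from incidental mentions.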