Title: Information Retrieval: Basic Concepts
1Information Retrieval: Basic Concepts
Yaşar Tonta, Hacettepe University
tonta_at_hacettepe.edu.tr
yunus.hacettepe.edu.tr/tonta/
DOK324/BBY220 Bilgi Erişim İlkeleri (Principles of Information Retrieval)
2Outline
- Definition of information
- Definition of document
- Logical structure of information retrieval systems
- Basic concepts
- Retrieval rules
- Performance measures
3Knowledge in Philosophy
- Knowledge:
- The activity of knowing
- The output obtained as a result of this activity
- Knowledge activities:
- perceiving
- understanding
- thinking
- reasoning
- interpreting
- explaining
- verifying
- evaluating
Source: Kuçuradi, 1995, p. 97
4Information in Information Studies
- Information-as-process
- Information-as-knowledge
- Information-as-object
5Different Perspectives on Information
Source: Buckland, 1991, p. 6
6Document
- docere: to teach, to inform
- -ment: means
- All concrete and symbolic indexical signs preserved or recorded in order to represent, reconstruct, or prove a physical or intellectual phenomenon (Suzanne Briet)
- Examples of documents: clay tablet, sculpture, papyrus, map, manuscript, book, journal, picture, film, cassette, CD-ROM, DVD, Web page, digital documents, etc.
7The Document in Different Disciplines
- Document: form, sign, medium
- Form: calligraphers, music and film producers, pattern recognition experts, librarians, archivists, museum curators
- Sign: linguists, computer scientists, artificial intelligence experts
- Medium: archivists, historians, legal scholars, diplomatics scholars, publishers, librarians, et al.
8Information Management
- The application of management principles to the acquisition, organization, control, dissemination, and use of information relevant to the effective operation of organizations of all kinds
- Providing the right information, in the right form, to the right person, at the right cost, at the right time, in the right place, in order to make the right decision
9Knowledge Management
- A management practice based on using an organization's intellectual capital to fulfil the organization's mission
- Intellectual capital: the knowledge derived from the experience, services, and products that the organization's employees have developed or accumulated
- Knowledge:
- Explicit (information-as-object)
- Tacit (information-as-knowledge)
10What Does an Information Manager Manage?
- The tacit knowledge stored in the human brain?
- The objects (documents) that are assumed to carry information?
- Or both?
- Librarianship
- Archival science
- Documentation
- Document management, records management, administrative documentation
- Data management, information resources management, information technology management
- Information science, information studies
- Information management (the management of documents that carry information)
11Information Management
- Acquisition, organization, maintenance, use, preservation, and archiving of documents
- Identifying and meeting users' information needs
- Designing, building, and operating information systems
- Information technology management
12Information Retrieval
- The technique and process of collecting, classifying, cataloguing, and storing information, searching large amounts of data, and producing (or displaying) the desired information from those data
13The Fundamental Dilemma of Information Retrieval
- The need to describe something you do not know in order to find information about it (Hjerppe)
14Discovering, Describing, Organizing, and Accessing Information
[Diagram: Discovery, Description, Organization, Access]
15Logical Organization of a Document Retrieval System
[Diagram labels: Documents; Users; Thesaurus - Dictionary; Query formulation; Indexing; Index records; Formal query statement; Retrieval rule]
Source: Maron, 1984
16The Ideal Information Retrieval System
- Should retrieve all of the relevant documents and only the relevant documents
- The concept of relevance
- Objective relevance
- Subjective relevance
- Bringing similar information together and separating dissimilar information
17Background Concepts for IR
- User Information Needs
- Controlled Vocabularies (Pre- and Postcoordination)
- Indexing Languages
- IR definitions and concepts
- Documents
- Queries
- Collections
- Evaluation
- Relevance
18User Information Need
- Why build IR systems at all?
- People have different and highly varied needs for information
- People often do not know what they want, or may not be able to express it in a usable form
- Boulding's Image
- How to satisfy these user needs for information?
19Controlled Vocabularies
- Vocabulary control is the attempt to provide a standardized and consistent set of terms (such as subject headings, names, classifications, etc.) with the intent of aiding the searcher in finding information.
- Controlled vocabularies are a kind of metadata
- Data about data
- Information about information
20Pre- and Postcoordination
- Precoordination relies on the indexer (librarian, etc.) to construct some adequate representation of the meaning of a document.
- Postcoordination relies on the user or searcher to combine more atomic concepts in the attempt to describe the documents that would be considered relevant.
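The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy collection, the document ids, and the function names `build_index` and `search_all` are hypothetical and do not come from the lecture. Each document is indexed with atomic terms, and the searcher combines them at query time with a Boolean AND, which is the postcoordinate case.

```python
from collections import defaultdict

# Hypothetical toy collection: document id -> atomic index terms.
DOCS = {
    "d1": {"libraries", "automation"},
    "d2": {"libraries", "history"},
    "d3": {"museums", "automation"},
}


def build_index(docs):
    """Build an inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def search_all(index, *terms):
    """Postcoordinate search: intersect the postings of every query term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_index(DOCS)
print(sorted(search_all(index, "libraries", "automation")))  # ['d1']
```

In a precoordinated system the indexer would instead have assigned a combined heading (for example a single entry for library automation) when the document was catalogued, and the searcher would look up that one heading rather than combine terms.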
21Structure of an IR System
Search Line
Adapted from Soergel, p. 19
22Uses of Controlled Vocabularies
- Library Subject Headings, Classification and Authority Files.
- Commercial Journal Indexing Services and databases
- Yahoo, and other Web classification schemes
- Online and Manual Systems within organizations
- SunSolve
- MacArthur
23Types of Indexing Languages
- Uncontrolled Keyword Indexing
- Indexing Languages
- Controlled, but not structured
- Thesauri
- Controlled and Structured
- Classification Systems
- Controlled, Structured, and Coded
- Faceted Classification Systems
24Thesauri
- A Thesaurus is a collection of selected
vocabulary (preferred terms or descriptors) with
links among Synonymous, Equivalent, Broader,
Narrower and other Related Terms
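A minimal sketch of how such links might be held in memory is given below. The class names `Term` and `Thesaurus`, the method names, and the example vocabulary are all assumptions made for illustration; they are not taken from any thesaurus standard or software. Synonyms point to a preferred term, and broader/narrower and related links are stored symmetrically.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A preferred term (descriptor) and its links to other terms."""
    name: str
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)


class Thesaurus:
    def __init__(self):
        self.terms = {}  # preferred term name -> Term
        self.use = {}    # non-preferred entry term -> preferred term name

    def add_term(self, name):
        return self.terms.setdefault(name, Term(name))

    def add_synonym(self, entry, preferred):
        """Record that `entry` should be replaced by `preferred`."""
        self.add_term(preferred)
        self.use[entry] = preferred

    def add_broader(self, narrow, broad):
        """Link two terms hierarchically, in both directions."""
        self.add_term(narrow).broader.add(broad)
        self.add_term(broad).narrower.add(narrow)

    def add_related(self, a, b):
        """Link two terms associatively, in both directions."""
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def preferred(self, word):
        """Map an entry term to its preferred term (or return it unchanged)."""
        return self.use.get(word, word)

    def expand(self, word):
        """Return the preferred term plus its immediate narrower terms."""
        term = self.terms.get(self.preferred(word))
        if term is None:
            return set()
        return {term.name} | term.narrower


thesaurus = Thesaurus()
thesaurus.add_synonym("cars", "automobiles")
thesaurus.add_broader("automobiles", "vehicles")
thesaurus.add_broader("sports cars", "automobiles")
thesaurus.add_related("automobiles", "roads")
print(sorted(thesaurus.expand("cars")))  # ['automobiles', 'sports cars']
```

Expanding a query with narrower terms is one common use of these links; whether to follow broader or related links as well is a design choice that trades precision for recall.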
25Thesauri (cont.)
- National and International Standards for Thesauri
- ANSI/NISO Z39.19-1994 -- American National Standard Guidelines for the Construction, Format and Management of Monolingual Thesauri
- ANSI/NISO Draft Standard Z39.4-199x -- American National Standard Guidelines for Indexes in Information Retrieval
- ISO 2788 -- Documentation -- Guidelines for the establishment and development of monolingual thesauri
- ISO 5964 -- Documentation -- Guidelines for the establishment and development of multilingual thesauri
26Development of a Thesaurus
- Term Selection.
- Merging and Development of Concept Classes.
- Definition of Broad Subject Fields and Subfields.
- Development of Classificatory structure
- Review, Testing, Application, Revision.
27Categorization Summary
- Processes of categorization underlie many of the issues having to do with information organization
- Categorization is messier than our computer systems would like
- Human categories have graded membership, consisting of family resemblances.
- Family resemblance is expressed in part by which subset of features are shared
- It is also determined by underlying understandings of the world that do not get represented in most systems
28Classification Systems
- A classification system is an indexing language
often based on a broad ordering of topical areas.
Thesauri and classification systems both use this
broad ordering and maintain a structure of
broader, narrower, and related topics.
Classification schemes commonly use a coded
notation for representing a topic and its place
in relation to other terms.
29Classification Systems (cont.)
- Examples
- The Library of Congress Classification System
- The Dewey Decimal Classification System
- The ACM Computing Reviews Categories
- The American Mathematical Society Classification
System
30Central Concepts in IR
- Documents
- Queries
- Collections
- Evaluation
- Relevance
31Documents
- What do we mean by a document?
- Full document?
- Document surrogates?
- Pages?
- Buckland: "What is a Document?", "What is a Digital Document?"
- Are IR systems better called Document Retrieval systems?
- A document is a representation of some aggregation of information, treated as a unit.
32Collection
- A collection is some physical or logical aggregation of documents
- A database
- A Library
- An index?
- Others?
33Queries
- A query is some expression of a user's information needs
- Can take many forms
- Natural language description of need
- Formal query in a query language
- Queries may not be accurate expressions of the information need
- Differences between conversation with a person and formal query expression
34Evaluation
- Why Evaluate?
- What to Evaluate?
- How to Evaluate?
35Why Evaluate?
- Determine if the system is desirable
- Make comparative assessments
- Others?
36What to Evaluate?
- How much of the information need is satisfied.
- How much was learned about a topic.
- Incidental learning
- How much was learned about the collection.
- How much was learned about other topics.
- How inviting the system is.
37What to Evaluate?
- What can be measured that reflects the user's ability to use the system? (Cleverdon 66)
- Coverage of Information
- Form of Presentation
- Effort required/Ease of Use
- Time and Space Efficiency
- Recall: proportion of relevant material actually retrieved
- Precision: proportion of retrieved material actually relevant
- Effectiveness (recall and precision)
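The two effectiveness measures can be written out as a short calculation. The sketch below assumes that both the retrieved set and the relevant set are known as collections of document ids; the function name `precision_recall`, the ids, and the choice to return 0.0 when a set is empty are illustrative assumptions, not part of the lecture.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    Returning 0.0 for an empty denominator is a convention assumed here.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 4 documents retrieved, 5 relevant in the collection.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]
p, r = precision_recall(retrieved, relevant)
print(p, r)  # 0.5 0.4
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were retrieved (recall 0.4).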
38Relevance
- In what ways can a document be relevant to a query?
- Answer precise question precisely.
- Partially answer question.
- Suggest a source for more information.
- Give background information.
- Remind the user of other knowledge.
- Others ...
39Relevance
- Intuitively, we understand quite well what relevance means. It is a primitive "y' know" concept, as is information, for which we hardly need a definition. ... if and when any productive contact in communication is desired, consciously or not, we involve and use this intuitive notion of relevance.
- Saracevic, 1975, p. 324
40Relevance
- How relevant is the document
- for this user, for this information need.
- Subjective, but
- Measurable to some extent
- How often do people agree a document is relevant to a query?
- How well does it answer the question?
- Complete answer? Partial?
- Background Information?
- Hints for further exploration?
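One way to make "how often do people agree" measurable is to compare the judgements of two assessors over the same documents. The sketch below is an illustration only: it computes simple percentage agreement and Cohen's kappa, a chance-corrected agreement statistic that is brought in here as an example and is not named in the lecture. The judgement lists are made-up data (1 = relevant, 0 = not relevant).

```python
def agreement(a, b):
    """Fraction of documents on which two assessors give the same judgement."""
    if len(a) != len(b) or not a:
        raise ValueError("judgement lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = agreement(a, b)
    labels = set(a) | set(b)
    # Chance agreement from each assessor's own label frequencies.
    expected = sum((a.count(label) / n) * (b.count(label) / n) for label in labels)
    if expected == 1.0:
        return 1.0  # both assessors used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical judgements by two assessors on eight documents.
assessor_1 = [1, 1, 0, 1, 0, 0, 1, 0]
assessor_2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(agreement(assessor_1, assessor_2))    # 0.75
print(cohen_kappa(assessor_1, assessor_2))  # 0.5
```

The assessors agree on six of eight documents, but because each marks half the documents relevant, half of that agreement would be expected by chance, which is why kappa is lower than the raw figure.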
41Relevance Research and Thought
- Review to 1975 by Saracevic
- Reconsideration of user-centered relevance by Schamber, Eisenberg and Nilan, 1990
- Special Issue of JASIS on relevance (April 1994, 45(3))
42Saracevic
- Relevance is considered as a measure of effectiveness of the contact between a source and a destination in a communications process
- System's view
- Destination's view
- Subject Literature view
- Subject Knowledge view
- Pertinence
- Pragmatic view
43Define your own relevance
- Relevance is the (A) gage of relevance of an (B) aspect of relevance existing between an (C) object judged and a (D) frame of reference as judged by an (E) assessor
- Where the options for (A) through (E) are listed on the slides that follow
From Saracevic, 1975 and Schamber 1990
44A. Gages
- Measure
- Degree
- Extent
- Judgement
- Estimate
- Appraisal
- Relation
45B. Aspect
- Utility
- Matching
- Informativeness
- Satisfaction
- Appropriateness
- Usefulness
- Correspondence
46C. Object judged
- Document
- Document representation
- Reference
- Textual form
- Information provided
- Fact
- Article
47D. Frame of reference
- Question
- Question representation
- Research stage
- Information need
- Information used
- Point of view
- Request
48E. Assessor
- Requester
- Intermediary
- Expert
- User
- Person
- Judge
- Information specialist
49Schamber, Eisenberg and Nilan
- Relevance is the measure of retrieval performance in all information systems, including full-text, multimedia, question-answering, database management and knowledge-based systems.
- Systems-oriented relevance: Topicality
- User-Oriented relevance
- Relevance as a multi-dimensional concept
50Schamber, et al. Conclusions
- Relevance is a multidimensional concept whose meaning is largely dependent on users' perceptions of information and their own information need situations
- Relevance is a dynamic concept that depends on users' judgements of the quality of the relationship between information and information need at a certain point in time.
- Relevance is a complex but systematic and measurable concept if approached conceptually and operationally from the user's perspective.
51Froehlich
- Centrality and inadequacy of Topicality as the basis for relevance
- Suggestions for a synthesis of views
52Janes' View of Relevance