Title: Discipline-based Visualization
1Discipline-based Visualization
- Howard D. White
- Xia Lin
- Katherine W. McCain
- Drexel University
2Comparing 2 Views of Information Science Authors
- Produced with similar (not identical) data sets
from the same research project.
- Lin's is a Kohonen feature map.
- White and McCain's map results from SPSS
multidimensional scaling and clustering.
3Lin's Kohonen map
- Represents full set of 120 most-cited authors in
information science, 1972-95, based on how they
are co-cited with each other.
- Shaded areas show subject regions. Neighboring
regions are intellectually close.
- Input was Pearson r correlation coefficients for
each author pair--the measure of similarity
between authors.
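
As a rough illustration of the input described above, the sketch below turns a co-citation count matrix into Pearson r values for each author pair. The counts are made up, and the choice to leave out the pair's own two cells when correlating authors i and j is an assumption; the slides do not say how the diagonal was handled.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return cov / sqrt(var_x * var_y)

def correlation_matrix(counts):
    """Correlate the co-citation profiles (rows) of every author pair."""
    n = len(counts)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            # Assumption: skip columns i and j, the pair's own cells.
            keep = [k for k in range(n) if k not in (i, j)]
            row.append(pearson_r([counts[i][k] for k in keep],
                                 [counts[j][k] for k in keep]))
        matrix.append(row)
    return matrix

# Hypothetical co-citation counts for five authors (not data from the study).
counts = [
    [0, 12, 10, 1, 2],
    [12, 0, 9, 2, 1],
    [10, 9, 0, 1, 1],
    [1, 2, 1, 0, 8],
    [2, 1, 1, 8, 0],
]
r = correlation_matrix(counts)
```

Authors who are co-cited with the same third parties end up with a high r even if their raw counts differ in size, which is why a correlation can serve as the similarity measure.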
4White & McCain's MDS Map
- Shows 100 most cited authors in information
science for 1988-95, based on how they are
co-cited with each other.
- Differs from Lin's map because of SPSS
limitations: not 120 authors, not 1972-95.
- Is a re-oriented version of the map in the White & McCain
article in April 1998 JASIS.
5White & McCain's MDS Map (cont.)
- Enhanced with lines that reflect a
complete-linkage clustering of authors.
Boundaries show broad subject regions.
- SPSS programs used were ALSCAL for
multidimensional scaling and CLUSTER for
complete-linkage clustering.
- Input was Pearson r correlation coefficients for
each author pair--the measure of similarity
between authors.
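
To make the complete-linkage step concrete, here is a minimal generic sketch. It is not the SPSS CLUSTER procedure itself; the correlations are invented, and converting similarity to distance as 1 - r is an assumption.

```python
def complete_linkage(dist, labels):
    """Agglomerative clustering; returns the merge history.

    dist is a symmetric matrix of pairwise distances. At each step the two
    clusters whose FARTHEST members are closest are merged.
    """
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance = maximum member distance.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((sorted(labels[i] for i in merged), d))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical correlations among four authors (not values from the study).
labels = ["A", "B", "C", "D"]
r = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
# Assumption: turn similarity into distance as 1 - r.
dist = [[1.0 - v for v in row] for row in r]
merges = complete_linkage(dist, labels)
for members, height in merges:
    print(members, round(height, 2))
```

The merge heights give natural places to draw cluster lines on a map: cutting the history at a chosen height yields the groups to enclose.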
6RESULTS
- Kohonen algorithm and multidimensional scaling
produce broadly similar maps, both highly
intelligible.
7Both Maps
- Isolate two main sub-disciplines of information
science: INFORMATION RETRIEVAL (top
half) and DOMAIN ANALYSIS (bottom half)
8Clockwise in both maps
- Top left: Hard retrievalists -- quantitative
retrieval theory, algorithmic solutions to
document retrieval problems, IR systems
evaluation
- Top right: Soft retrievalists -- practical
aspects of online searching, library
automation, user cognition and behavior
9Clockwise in both maps (cont.)
- Mid-right: Theoretical gurus -- external to the
discipline
- Lower right/bottom: Domain analysts
-- specialists in scientific and technical
communication, sociology of science, studies
of specific literatures, often through citation
analysis
10Clockwise in both maps (cont.)
- Lower left: Bibliometricians -- mathematical
modelers of properties of literatures
- Mid-left: Dual contributors -- feet in both
camps, bibliometrics and information retrieval
11Both Maps
- Reveal clusters of authors in specialties around
the edges and relatively few authors in the central
region. (The latter implies that not many authors
are seen as integrating IS specialties.)
- Show much the same progression of specialties
around the edges.
12Both Maps
- Have largely the same, good transitional
authors between specialties.
- Show retrievable literatures represented by a
display of interrelated authors.
13BUT
- Kohonen feature map has advantages over
cluster-enhanced MDS map.
- It took 6 minutes to create on a Sun workstation,
given the input correlation matrix.
- It can accept potentially far more authors than
ALSCAL's limit of 100 or the 120 in Lin's present
map.
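
For readers unfamiliar with the technique, a Kohonen feature map of the general kind discussed here can be sketched as below. This is a toy version under stated assumptions: the grid size, learning-rate schedule, Gaussian neighborhood and input vectors are all illustrative choices, not the settings used for Lin's map.

```python
import math
import random

def best_unit(grid, v):
    """Grid coordinates of the node whose weights are closest to v."""
    best, best_d = None, float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(v, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(vectors, rows, cols, epochs=200, seed=0):
    """Train a small rectangular Kohonen map; returns the grid of weights."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                   # learning rate decays
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighborhood shrinks
        for v in rng.sample(vectors, len(vectors)):
            br, bc = best_unit(grid, v)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    # Gaussian neighborhood: nodes near the winner move more.
                    h = math.exp(-d2 / (2.0 * radius ** 2))
                    w = grid[r][c]
                    for k in range(dim):
                        w[k] += rate * h * (v[k] - w[k])
    return grid

# Hypothetical similarity profiles for four authors (values in [0, 1]).
vectors = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.9],
]
grid = train_som(vectors, rows=3, cols=3)
positions = [best_unit(grid, v) for v in vectors]
```

Each author is then drawn at the grid cell in `positions`. Because the cost per training pass grows with the number of input vectors and grid nodes rather than with a fixed program limit, adding more authors mainly costs running time.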
14Further advantages
- It is an electronic document that can rapidly be
developed, through Java programming, as an
interface for retrieval.
- In Lin's present Website version, visitors can
draw a rectangle with the cursor around groups of
author-points. This produces a pop-up window
with the authors' full names. When these are
clicked on, an Alta Vista search is launched for
documents related to the author on the Web (e.g.,
his or her home page).
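
The select-then-search interaction described above can be sketched in a few lines. This is a Python illustration, not the Java applet's code; the screen coordinates and the search base URL are placeholders, and the function names are hypothetical.

```python
from urllib.parse import urlencode

def authors_in_rectangle(points, x0, y0, x1, y1):
    """Return names of author-points inside the dragged rectangle."""
    # The drag may go in any direction, so normalize the corners first.
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    return [name for name, (x, y) in points.items()
            if left <= x <= right and top <= y <= bottom]

def search_url(author, base="https://search.example.org/query"):
    """Build a phrase-query URL for one author (base URL is a placeholder)."""
    return base + "?" + urlencode({"q": '"%s"' % author})

# Hypothetical screen coordinates, not positions from the actual map.
points = {
    "Nick Belkin": (120, 80),
    "Raya Fidel": (140, 95),
    "Eugene Garfield": (400, 310),
}
selected = authors_in_rectangle(points, 160, 120, 100, 60)
urls = [search_url(name) for name in selected]
```

Here `selected` holds the two authors inside the rectangle, and each entry in `urls` is what a click on the corresponding name in the pop-up list would open.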
15Implications
- Future maps could be developed to retrieve
valuable caches of documents by or about authors,
or documents citing them.
- Might become part of electronic publishing;
would assist in retrieving documents from the Web
or on CD-ROMs.
16The Java Interface
- Interactive
- control the number of authors shown on the screen
- control the number of Labels on the screen
- Labels are currently defined manually
- select authors with mouse drags
- can also type in names to search
17The Java Interface
- Distributed
- It is an applet on the web
- It is a front-end to web-search engines
- Clicking on an author's name will retrieve the
author's related work available on the web
- It can be a front-end to multiple databases
18Interface Design Issues -1
- If an author is involved in three subject areas,
where should he/she be located?
- A. In one major area selected
- B. In the middle of the three areas
- C. In all three areas
- The algorithms typically choose answer B.
- Users like answer C.
19Interface Design Issues -2
- When there are many authors in an area
- A. Display all the names
- B. Display selective names only
- C. Display dot icons only
- Names would overlap each other in answer A.
- Answers B and C both lose some visual power.
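
One common way to implement answer B is a greedy pass that keeps a label only if it does not collide with a label already placed. The sketch below assumes each author has a priority (for example, a citation rank) and a fixed-width font; the names, positions, priorities and box sizes are all hypothetical.

```python
def overlaps(a, b):
    """True if two (x, y, width, height) boxes intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def choose_labels(authors, char_width=7, height=12):
    """Keep labels greedily, highest priority first, skipping collisions."""
    shown, boxes = [], []
    for name, x, y, priority in sorted(authors, key=lambda a: -a[3]):
        # Assumption: label width is proportional to the name length.
        box = (x, y, char_width * len(name), height)
        if not any(overlaps(box, other) for other in boxes):
            shown.append(name)
            boxes.append(box)
    return shown

# Hypothetical (name, x, y, priority) records.
authors = [
    ("Author A", 10, 10, 3),
    ("Author B", 20, 15, 1),    # would collide with Author A's label
    ("Author C", 200, 150, 2),
]
visible = choose_labels(authors)
```

Authors whose labels are dropped can still be drawn as dot icons, which combines answers B and C.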
20Interface Design Issues -3
- How to draw boundaries of subject areas?
- A. Draw clear lines between subject areas.
- B. Leave some blurry areas among subject areas.
- C. Never clearly specify the subject areas.
- Subject areas are in the eyes of gazers.
- Depends on the data and the users.
21Interface Design Issues -4
- How detailed should the labels be?
- Should the map serve as
- A. the table of contents
- B. the back-of-the-book index
- C. both
- Users seem to want both.
- But a map cannot be everything.
22Future Research
- To continue exploring algorithms for
literature/discipline mapping
- automatically identify keyword labels/subject
areas
- combine data of keyword indexing and citation
indexing
- use multi-discipline data
- reduce mapping time
- from 6 minutes to 6 seconds?
23Future Research
- To continue developing the interface
- Better graphical layouts and visual design
- more interactive functions
- multi-levels of displays
- user studies
24Future Research
- To improve the connection between the map
interface and databases
- Increasing search precision
- Create our own databases for ISI data
- Add other authors to the map
- Every one of you should be on the map
- if you cite any of the authors on the map, and
- if a live calculation of your citation patterns
can be done.
25URL for the map Interface
- http://research.cis.drexel.edu/citation/index.html
26Describing a Discipline
- By subject areas/topics
- By people involved
- By keywords
- By its relationships with other disciplines
- By dimensionality or structures of the discipline
- By all of the above in a map?
27[Map figure; region labels: (Online retrieval), Retrieval, User, (IR theories), (Communication), (General), Citation, Document, (Bibliometrics)]
28Results of Kohonen Mapping of the Data
- Overall structures are similar to MDS map
- Labels generated through factor analysis can be
easily put on the map
- Clusters identified on MDS map are also visible
on Kohonen map
- Many more identifiable sub-areas
- Color codes for sub-areas
- More than two dimensions identified
29Where are they located?
- Donald Kraft
- Edward Fox
- Raya Fidel
- Eugene Garfield
- Robert Hayes
30Where are you located?
- Every one of you should be on the map
- if you cite any of them
- if we can do a live calculation of your
citation patterns
- currently under development
31Example
Online Searching
Searching Behaviors
Information Retrieval
Question: Where should Nick Belkin be?
32Answer
He should be in all three, and more!
Online Searching
Borgman
Belkin
Fidel
Meadow
Searcher Behaviors
Information Retrieval
Ingwersen
...
User Theories
IR Theories
33Future development
34The First Set of Data
- 120 highly cited authors in Information Science
- A matrix of 120 by 120 of their co-citation
counts
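
A co-citation count matrix of this kind can be built from the reference lists of citing papers, as the sketch below suggests. The reference lists are invented, "Smith" stands for any cited author outside the chosen set, and the function name is hypothetical.

```python
from itertools import combinations

def cocitation_counts(reference_lists, authors):
    """Count, for each author pair, the citing papers that cite both."""
    index = {name: i for i, name in enumerate(authors)}
    n = len(authors)
    counts = [[0] * n for _ in range(n)]
    for refs in reference_lists:
        # Each citing paper contributes at most once per author pair.
        cited = sorted({index[a] for a in refs if a in index})
        for i, j in combinations(cited, 2):
            counts[i][j] += 1
            counts[j][i] += 1
    return counts

authors = ["Belkin", "Fidel", "Garfield"]
# Hypothetical reference lists, one per citing paper.
reference_lists = [
    ["Belkin", "Fidel"],
    ["Belkin", "Fidel", "Garfield"],
    ["Garfield"],
    ["Fidel", "Belkin", "Smith"],
]
counts = cocitation_counts(reference_lists, authors)
```

With 120 authors the same procedure yields the 120 by 120 symmetric matrix; the diagonal is left at zero here, which is an assumption about how self-pairs are treated.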
35Results of Multidimensional Scaling on the data
- The map shows a clearly intelligible framework
for IS
- Two sub-disciplines
- Clusters within the disciplines
- Two dimensions
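
To show how pairwise dissimilarities become a two-dimensional map, here is a minimal metric MDS sketch using the SMACOF (Guttman transform) update. It is a stand-in for illustration only: SPSS ALSCAL uses a different fitting procedure, and the dissimilarities below are hypothetical.

```python
import math
import random

def pairwise(points):
    """Euclidean distances between all pairs of 2-D points."""
    return [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points]
            for p in points]

def stress(target, points):
    """Raw stress: squared error between target and map distances."""
    d = pairwise(points)
    n = len(points)
    return sum((target[i][j] - d[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def smacof(target, iterations=200, seed=0):
    """Metric MDS in two dimensions via repeated Guttman transforms."""
    rng = random.Random(seed)
    n = len(target)
    x = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(iterations):
        d = pairwise(x)
        new_x = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j or d[i][j] == 0.0:
                    continue
                # Pull or push point i along the line to j so that the
                # map distance moves toward the target dissimilarity.
                ratio = target[i][j] / d[i][j]
                for k in range(2):
                    new_x[i][k] += ratio * (x[i][k] - x[j][k])
        x = [[v / n for v in row] for row in new_x]
    return x

# Hypothetical dissimilarities (e.g. 1 - r) among four authors.
target = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
coords = smacof(target)
```

Plotting `coords` places strongly co-cited authors close together; the stress value reports how much the flat map distorts the original dissimilarities.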
36Users
Documents
soft domain specialists
hard domain specialists