Title: Information Seeking Behavior of Scientists
1. Information Seeking Behavior of Scientists
- Brad Hemminger
- bmh_at_ils.unc.edu
- School of Information and Library Science
- University of North Carolina at Chapel Hill
2. Contributors
- Assisting Researchers
- Jackson Fox (web survey)
- Steph Adams (participant recruiter)
- Dihui Lu (initial descriptive statistical analysis)
- Billy Saelim (continued statistical analysis)
- Chris Weisen (Odum Institute, statistical consultant)
- Feedback on Survey Design
- UNC Libraries: Bill Burke (Botany), David Romito (Zoology), Jimmy Dickerson (Chemistry), Zari Kamarei (Math/Physics)
- KT Vaughan (Health Sciences Library)
- Cecy Brown (University of Oklahoma)
- Supported by
- UNC Libraries
- Carolina Center for Genome Sciences
- Basic Science Department chairs
- RENCI P20 grant
3. Why Study Information Seeking Behavior of Scientists
- Goal is to improve scholarly communications. Other areas of my research involve presentation aspects (visualization/computer human interaction) and the storage and communication of scholarly information (digital libraries, institutional repositories, virtual communities of practice).
- To do this we need to understand how people search out and use information currently, and why. As part of investigating this we found that there has been a significant change in the last 5-10 years.
- So we're studying ISB both to understand it and to look at recent changes.
4. How to Study the Information Seeking Behavior of Scientists?
- Survey
- Reach many people
- Address common questions
- Produce lots of feedback for libraries
- Quantitative, models of variance (positivist approach)
- Interviews
- In-depth coverage of selected groups (bioinformatics)
- Use grounded theory and critical incident techniques to capture more qualitative, contextual experiences
- Develop models of information processing and use
5. Survey--Long Term Plan
- Conduct an initial survey study at UNC. Develop survey instrument and interview methodologies that work here, but could easily be applied on a larger scale.
- From the results of the initial UNC study, draft a national version (with feedback from national sites).
- Run the national study. Set it up so that other sites only have to recruit subjects; the entire survey runs off of the UNC website. Hopefully this results in a large number of sites and participants for minimal experimental costs.
6. Survey Sampling Technique
- Census
- Need to be able to reach all members
- Best if a response can be obtained from a large segment of the population
- Results in potentially more input from wider audiences, especially for the open comment questions.
- Subject to bias (e.g. only computer users take it)
- Random sample
- Statistically, generally a better choice
- Higher cost and significantly more work due to identifying and following up with individual subjects
7. Questions
- Questions were based on
- Prior studies with which we wished to correlate our results. This is facilitated by authors who have published their surveys (in papers as an appendix, e.g. Cecy Brown), and especially by those who have put their collections of surveys online (e.g. Carol Tenopir).
- This allows us to compare results over time, as well as to clarify current practices (for instance, whether print or electronic formats are used, and breaking this out into two questions: retrieval versus reading).
- Issues that our librarians were concerned about
- Questions were developed over several drafts that were reviewed by representatives from all libraries on campus.
8. Survey Instrument Choices
- Paper
- Phone
- Email
- Web-based. While these can require more effort than anticipated, if the number of survey respondents is over several hundred it is generally more cost effective. This seemed the best choice since our pilot survey was of several thousand subjects, and our national survey was planned for tens of thousands. Since we have web and database expertise, we were able to automate the process with minimal startup costs.
- Schonlau 2001, Conducting Research Surveys via E-mail and the Web.
9. Data Acquisition Details
- PHP Surveyor was used for the web-based survey. Another common choice at our school for simpler surveys is Survey Monkey. PHP Surveyor allowed us to ask multi-part questions and to constrain answers to specific response formats.
- PHP Surveyor dumps data directly into a MySQL database.
- Data is cleaned up and then fed into SAS for analysis. (Data cleaning is still a significant manual effort! Examples were determining Dept/CB and handling browsers that didn't validate data types on forms properly.)
10. Subjects and Recruitment
- Subjects are university faculty, grad students and research staff.
- We approached all science department chairs to get support first.
- Contact
- Initial contact was by email giving motivation for study, indication of support by depts/campus, and link to web-based survey.
- Follow-ups by letter, then two emails
- Flyers in department, Pizza Party Rewards
11. Look at Survey
- 902 participants from recruited departments, which were classified as either science or medicine.
- Participation rate was 26%.
- Participants by Department
- Survey
12. Analysis
- For the quantitative response variables, standard descriptive statistics (mean, min, max, standard deviation) are computed, and histograms are used to visualize the distribution.
- Categorical variables are reported as counts and percentages for each category, and displayed as frequency tables.
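As a rough illustration of the two kinds of summaries described above, here is a minimal Python sketch. It is not the SAS analysis used in the study. The `visits` list is made-up illustrative data and the helper names `describe` and `frequency_table` are hypothetical; the category counts are copied from the Distance to Library slide.

```python
import statistics

# Hypothetical quantitative responses (e.g. library visits per year).
# These values are illustrative only; they are NOT survey data.
visits = [0, 2, 4, 6, 12, 15, 25, 40]


def describe(values):
    """Return the descriptive statistics named on the slide."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Category counts copied from the "Distance to Library" slide.
distance_counts = {
    "Same building": 175,
    "1/4 mile": 570,
    "1/2 mile": 88,
    "1 mile or more": 69,
}


def frequency_table(counts):
    """Return (label, count, percentage) rows for a categorical variable."""
    total = sum(counts.values())
    return [(label, n, round(100 * n / total, 2)) for label, n in counts.items()]


summary = describe(visits)
table = frequency_table(distance_counts)

print(summary)
for label, n, pct in table:
    print(f"{label:<16}{n:>5}{pct:>8.2f}")
```

The percentages computed this way agree with the ones reported for the distance question (for example, 175 of 902 is 19.40%).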
13. Analysis Correlations
- Categorical vs Categorical
- Chi-square
- Categorical vs Quantitative
- Analysis of Variance
- Quantitative vs Quantitative
- Correlation
- Examples are by-department analysis of other features, and age vs preferred interface (Google or Library)
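To make the categorical-vs-categorical case concrete, the following sketch computes a Pearson chi-square statistic by hand for gender by area (science or medicine), using the counts from the Gender slide. It is only an illustration, not the study's own analysis: the function name `chi_square` is hypothetical, no continuity correction is applied, and the result is compared against the standard 0.05 critical value for one degree of freedom (3.841) rather than converted to a p-value.

```python
# Observed counts copied from the "Gender" slide.
# Rows: Female, Male. Columns: Science, Medicine.
observed = [
    [179, 280],
    [286, 157],
]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


stat, dof = chi_square(observed)

# Chi-square critical value for alpha = 0.05 with 1 degree of freedom.
CRITICAL_05_DF1 = 3.841

print(f"chi-square = {stat:.2f}, dof = {dof}")
print("association at the 0.05 level:", stat > CRITICAL_05_DF1)
```

The same function works for larger tables (for example, position by area), but the critical value then has to be looked up for the corresponding degrees of freedom.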
14. Participants
Position | Science (n) | Science (%) | Medicine (n) | Medicine (%) | Total (n) | Total (%)
professor | 58 | 12.47 | 39 | 8.92 | 97 | 10.75
associate professor | 23 | 4.95 | 41 | 9.38 | 64 | 7.10
assistant professor | 40 | 8.60 | 46 | 10.53 | 86 | 9.53
research staff/adjunct | 15 | 3.23 | 17 | 3.89 | 32 | 3.55
post graduate/fellow | 46 | 9.89 | 37 | 8.47 | 83 | 9.20
others | 19 | 4.09 | 48 | 10.98 | 67 | 7.43
doctoral student | 246 | 52.90 | 179 | 40.96 | 425 | 47.12
masters student | 18 | 3.87 | 30 | 6.86 | 48 | 5.32
15. Gender
Gender | Science (n) | Science (%) | Medicine (n) | Medicine (%) | Total (n) | Total (%)
Female | 179 | 38.49 | 280 | 64.07 | 459 | 50.89
Male | 286 | 61.51 | 157 | 35.93 | 443 | 49.11
16. Distance to Library
Distance to Library | Count | Percentage
Same building | 175 | 19.40
1/4 mile | 570 | 63.19
1/2 mile | 88 | 9.76
1 mile or more | 69 | 7.65
17. Simple Questions
- Ninety-one percent of the participants had access to the internet in their office or lab.
- "Do you maintain a personal article collection?" Most participants (85.4%) responded that they did, while only 14.6% did not.
- "Do you maintain a personal bibliographic database for print and/or electronic references?" 52.2% of the participants did maintain one, while 47.8% did not.
18. How often do you use
Source | Daily or weekly (%) | Daily | Weekly | Monthly | Quarterly | Annually | Never
book | 24 | 60 | 157 | 241 | 223 | 148 | 73
journal | 87 | 509 | 277 | 72 | 22 | 6 | 16
preprint | 18 | 57 | 105 | 155 | 109 | 72 | 404
conference | 2 | 4 | 14 | 37 | 193 | 492 | 162
proceeding | 5 | 14 | 37 | 79 | 168 | 273 | 331
webpage | 70 | 362 | 277 | 132 | 67 | 19 | 45
online database | 67 | 293 | 311 | 119 | 49 | 32 | 98
personal communication | 52 | 241 | 228 | 132 | 114 | 64 | 123
other | 1 | 5 | 7 | 3 | 0 | 2 | 885
19. Most Important Individual Sources
Basic Science Journals | Count | Medicine Journals | Count
Science | 99 | Science | 45
Nature | 90 | Nature | 39
Cell | 36 | JAMA | 38
Journal of the American Chemical Society | 34 | UpToDate | 30
Journal of Cell Biology | 20 | New England Journal of Medicine | 28
Journal of Biological Chemistry | 19 | Journal of Immunology | 18
Analytical Chemistry | 18 | American Journal of Epidemiology | 17
PNAS | 13 | Cell | 16
Journal of Neuroscience | 12 | Lexi-Comp | 15
Evolution | 11 | Journal of Biological Chemistry | 14
Neuron | 11 | Epidemiology | 13
Development | 10 | AIDS | 12
Journal of Organic Chemistry | 10 | PNAS | 12
Organic Letters | 10 | Journal of Virology | 11
Biometrics | 9 | Nature Immunology | 11
20. Important Alerts
Basic Science Alerts | Count | Medical Alerts | Count
PubMed | 40 | PubMed | 53
Faculty of 1000 | 27 | Medscape | 11
ISI | 14 | Nature | 10
ACS Journal Alert | 11 | Faculty of 1000 | 9
Nature | 10 | PubCrawler | 9
ScienceDirect | 9 | ISI | 7
Science | 7 | ePocrates | 6
PubCrawler | 4 | ASHP | 5
Biomail | 3 | NEJM | 5
COS | 3 | MDLinx | 4
J Biol Chem | 3 | Science | 4
ACM | 2 | ScienceDirect | 4
ArXiv | 2 | ADA Daily Knowledge | 3
BMC alerts | 2 | JAMA | 3
Cancer Research | 2 | Kaiser listserv | 3
21. Tools for Searching Information
Search tool type | Frequency | Percentage
Citation index database | 1084 | 47.25
General web search engine | 694 | 30.25
Fulltext digital library | 156 | 6.80
Personal search tool | 125 | 5.45
Knowledgebase web portal | 93 | 4.05
Others | 69 | 3.01
Online or local database | 52 | 2.27
Library collection | 21 | 0.92
22. Types of Information Sources
Source | Science | Medicine | Total
(electronic) library subscribed journal | 20.17 | 19.89 | 20.03
(electronic) open (free) access journal or institutional repository or digital library | 7.86 | 9.29 | 8.57
(print) library subscribed journal | 4.48 | 3.61 | 4.05
(electronic) web site (author's website) | 4.36 | 3.31 | 3.89
(print) personally subscribed journal | 3.44 | 4.01 | 3.73
(print) copy of colleague's print copy | 1.07 | 5.00 | 3.00
(electronic) personal subscribed journal | 3.10 | 2.65 | 2.88
(electronic) personal digital library | 2.89 | 1.97 | 2.43
(electronic) lab subscribed journal | 2.72 | 1.14 | 1.97
(electronic) copy of colleague's electronic copy | 1.60 | 1.98 | 1.79
(print) lab subscribed journal | 2.05 | 0.79 | 1.43
(print) interlibrary loan | 0.59 | 0.55 | 0.57
(print) document delivery service | 0.13 | 0.19 | 0.16
other | 0.02 | 0.13 | 0.07
23. Articles in Personal Collection
Number of Articles | Print (n) | Print (%) | Electronic (n) | Electronic (%)
none | 45 | | 104 |
1-49 | 154 | 21.24 | 259 | 38.89
50-99 | 160 | 22.07 | 127 | 19.07
100-499 | 280 | 38.62 | 210 | 31.53
500-999 | 81 | 11.17 | 44 | 6.61
1000 | 50 | 6.90 | 26 | 3.90
24. Articles in Personal Article Collection that have annotations
Percentage of entries with notes | Total count | Total percentage
<10 | 327 | 36.25
11-20 | 75 | 8.31
21-30 | 82 | 9.09
31-40 | 30 | 3.33
41-50 | 126 | 13.97
51-60 | 19 | 2.11
61-70 | 26 | 2.88
71-80 | 100 | 11.09
81-90 | 47 | 5.21
>90 | 70 | 7.76
25. Preferred Search Method
Search method | Science (n) | Science (%) | Medicine (n) | Medicine (%) | Total (n) | Total (%)
Electronic versions of databases and journals | 443 | 95.27 | 429 | 98.17 | 872 | 96.67
Print versions of databases and journals | 22 | 4.73 | 8 | 1.83 | 30 | 3.33
26. Preferred Viewing Method
Viewing method | Science (n) | Science (%) | Medicine (n) | Medicine (%) | Total (n) | Total (%)
Both/it depends | 292 | 62.80 | 260 | 59.50 | 552 | 61.20
electronic (computer) only | 63 | 13.55 | 52 | 11.90 | 115 | 12.75
print (hard copy) only | 110 | 23.66 | 125 | 28.60 | 235 | 26.05
27. Number of Visits to the Library in the past 12 Months
Visits | Science (n) | Science (%) | Medicine (n) | Medicine (%) | Total (n) | Total (%)
0-2 | 101 | 21.72 | 107 | 24.49 | 208 | 23.06
3-5 | 75 | 16.13 | 99 | 22.65 | 174 | 19.29
6-10 | 77 | 16.56 | 71 | 16.25 | 148 | 16.41
11-20 | 84 | 18.06 | 55 | 12.59 | 139 | 15.41
21-50 | 85 | 18.28 | 67 | 15.33 | 152 | 16.85
51-100 | 34 | 7.31 | 19 | 4.35 | 53 | 5.88
101-200 | 7 | 1.51 | 13 | 2.97 | 20 | 2.22
>200 | 2 | 0.43 | 6 | 1.37 | 8 | 0.89
28. Reasons for Visiting the Library
Reason | Science (n) | Science (%) | Medicine (n) | Medicine (%) | Total (n) | Total (%)
photocopy | 256 | 22.54 | 274 | 22.81 | 530 | 22.68
get assistance from a librarian | 65 | 5.72 | 96 | 7.99 | 161 | 6.89
use computers | 59 | 5.19 | 112 | 9.33 | 171 | 7.32
perform searches | 81 | 7.13 | 117 | 9.74 | 198 | 8.47
read current journals or other materials | 161 | 14.17 | 156 | 12.99 | 317 | 13.56
quiet reading space | 156 | 13.73 | 179 | 14.90 | 335 | 14.33
meeting | 45 | 3.96 | 73 | 6.08 | 118 | 5.05
browse | 99 | 8.71 | 60 | 5.00 | 159 | 6.80
pick up/drop off materials | 214 | 18.84 | 134 | 11.16 | 348 | 14.89
29. Factors Affecting Choice of Journal to Publish In
Factor | Science | Medicine | Total
Ability to include links, color, graphics, multimedia | 1.38 | 1.24 | 2.31
audience | 3.52 | 3.38 | 4.45
author having to pay cost of publication | 1.51 | 1.54 | 2.53
availability on campus | 1.79 | 1.88 | 2.83
editorial board | 2.11 | 1.95 | 3.03
page charges for long articles or color figures | 1.40 | 1.45 | 2.42
speed of publication | 2.42 | 2.27 | 3.35
standing of journal in your field | 3.77 | 3.61 | 4.70
support of open access to journal articles | 2.09 | 2.17 | 3.13
30. Google vs Library Search Page
- The survey asked "Which interface would you rather use to begin your search process?" with the possible responses "Google search page" and "Your library's home page". Overall, a slight majority of users preferred Google (53.3%) over the library page (46.7%); however, the difference was substantially larger for basic science researchers (Google 58.5% versus Library 41.5%) compared to medical researchers (Google 52.2% versus Library 47.8%).
31. Google vs Library Search Page
- This difference might have been larger had the question asked which style or type of interface the users preferred, as many of the comments in the survey indicated a strong preference for a single metasearch tool where the user could enter a single search string that would search all content in all resource collections (as opposed to manually identifying resource collections and searching them individually).
32. Summary
33. We never leave our chairs
- Almost all information seeking and use interactions occur on the researcher's computer in their office.
- As a result, library visits have dramatically declined, and the reasons for visiting the library have changed.
- Researchers read in both electronic and print form, but print (paper) is still the most preferred form.
34. Single Text Box MetaSearch
- Researchers prefer a single text box for initial searching that covers all resources.
- This is most evidenced by the preference for Google Scholar over library web page interfaces.
35. More than just text
- Researchers are making increasing use of content contained in online databases like GenBank, or web pages of research labs.
- For the scientists in our survey, this type of access has surpassed personal communications and is close to journal articles in frequency of use.
36. Transformative Changes
- Transformative collaborative group communications have already taken place in the consumer marketplace, and are finding their way into scholarly communications. Examples include folksonomies supporting community tagging (Del.icio.us), comment and review systems like Amazon's rankings, Flickr, etc. Similar changes are in their initial stages for scholarly communities, for instance Faculty of 1000 and the Connotea application for online sharing of bibliographic databases and annotations by scientists.
37. What might the future hold?
- In the future, researchers may all maintain their scholarly knowledge online and make it accessible to others as they see fit. Having scholars' descriptions and annotations of the digital scholarly materials, as well as the materials themselves, available on the web will allow online communities and community review systems to blossom, just as the availability of online journal articles has transformed the basic information seeking of science scholars today.
38. Future Work
- Upcoming papers from UNC survey
- Correlations, information seeking behavior predictions from demographics
- By department/research area comparisons
- Review and reflection on major changes (with Cecy Brown, Don King, Carol Tenopir)
- Textual analysis of library comments (Meredith Pulley, KT Vaughan)
- ICIS tool for visualizing comments within schema
- New work being proposed by other researchers using this data (if you think the data from this study might help you in your research, come talk to me).
- National Study (Florida, Oklahoma, others to start soon)
- Interview Studies (labs, individuals)
- bmh_at_ils.unc.edu