Title: The Network Structure of Sociology Production
1The Network Structure of Sociology Production
James Moody Duke University Stanford University
Colloquium March, 2007
2Introduction
- Outline
- Networks and Science: Two Questions, 4 networks
- How do scientific fields evolve?
- Where do good ideas come from?
- Data Sources and Methods
- Results
- Where does sociology fit? Journal co-citation networks
- What do sociologists study? Topic networks
- Who produces sociology? Social science collaboration networks
- Discussion
3Networks Science Two Questions 4 networks
"Science, carved up into a host of detailed
studies that have no link with one another, no
longer forms a solid whole." (Durkheim, 1933)
4Networks Science Two Questions 4 networks
- The extent to which science is "carved up into a host of detailed studies that have no link with one another" is a question of network cohesion.
- A fractured discipline will be dominated by tight clusters based on specific research problems, while an integrated discipline will have strong connections bridging research problems.
5Networks Science Two Questions 4 networks
- 1) How do scientific fields evolve?
- Is there a coherent logic to the ebb and flow of topics studied?
- How does the success or failure of ideas depend on the social community in which they are embedded?
- (How) Does the evidentiary basis of a field shape its logic of discovery?
- The descriptive answer is given by mapping the field in network space.
- The analytic answer will come by modeling the emergence, growth and decline of scientific subfields.
6Networks Science Two Questions 4 networks
- 2) Where do good ideas come from?
- What is a good idea?
- Ideas that change a scientific field. Indexed by (a) citations and (b) the relevant topography of the networks within which the idea was originally embedded. Ideas are not inherently good; they are recognized as good by their effect on a field.
- How do disciplines produce new ideas?
- Intersection: Good ideas are produced by combining ideas of others in unique ways (Burt).
- Development: Good ideas arise naturally from either the progressive error-reduction process of good normal science (Popper) or the accepted practices of a scientific community (Crane).
- Peer Influence and Recognition: Any idea is a good idea if others think so, and thinking so is influenced by the network (Gould).
- Resource competition: Search for prestige conditioned by organizational structure (Fuchs).
- Will model this by examining how citations are affected by field dynamics (and vice versa).
7Networks Science Two Questions 4
networks Theoretical approaches to scientific
development
- Normal Science, accumulation and revolution
- Science is problem driven and evidence based
- Consensus emerges through a competition of ideas against data
- (though lab ethnography repeatedly shows that consensus is often more socially constructed than evidentiary)
- Scientific "star systems" reward prior success, and stars shape research agendas
- Boundary Specification: Science as a profession
  - Motivated by prestige and competition for resources
  - Competition will lead to both vanquishing and niche filling
- Disciplinary identity and coherence become a key issue
- Contested fields lead to chaotic outcomes (Abbott 2001)
8Networks Science Two Questions 4
networks Theoretical approaches to scientific
development
- Invisible Colleges
- Informal communities create acceptable scientific standards
- Boundaries are defined socially through interaction
- Scientific (social) Movements
- Combination of many of these ideas under a social movement frame
- Coherence becomes a framing and grievance issue used to shape resource allocation
9Networks Science Two Questions 4
networks Theoretical approaches to scientific
development
We are thus left with multiple action frames to guide our understanding:
- Truth: Ideas run their error-reduction course (Popper)
- Prestige: Actors seek the greatest visibility (Merton)
- Resource competition: To the victor go the spoils (Fuchs)
- Boundary Protection (Gieryn)
- Fractal Development (Abbott)
- Community Influence (SSK: Collins, etc.)
- Peer magnification (Gould)
- Power (JL Martin)
For entire fields, these mechanisms are largely unknown and underspecified.
- Need to extend beyond particular lab studies
- Take a large-scale "satellite" view of science dynamics
- Link action frames to specific patterns in 4 science networks
10Networks Science Two Questions 4
networks Theoretical approaches to scientific
development
- Four relevant networks
- 1) Citation networks: a direct trace of scientific recognition and production
- 2) Topic networks: clusters of scientific products related to the same subject
- 3) Collaboration networks: invisible communities of social interaction that produce scientific products
- 4) Research Communities: people linked through common research topics (substantively a derivative of 2 and 3)
11Networks Science Two Questions 4 networks
Scientific Environments
Evidentiary Basis: How do we array disciplines with respect to evidence?
Two Dimensions: Objectivity and Control
- Objectivity is taken from Popper: the extent to which a given knowledge claim is independent of the knower.
- Control refers to the ability of scientists to directly manipulate the object of study. Lab science, with complete ability to control apparatus (and thus environment), represents the strongest ability, while observation represents the other extreme.
Cases:
- Chemistry (Lab Science: High Objectivity, High Control)
- Paleontology (Field Science: High Objectivity, Low Control)
- Sociology (Social Science: Moderate Objectivity, Low Control)
- Cultural Anthropology (Low Objectivity, Low Control)
This approach is very similar to Fuchs (1993).
12Networks Science Two Questions 4 networks
13Networks Science Two Questions 4
networks Focusing on Sociology as a current case
- The field of sociology can thus be thought of as the intersection of multiple networks.
- The shape of these networks differs across scales and over time.
- Differences between local and global visions of the network shape our perceptions of scientific coherence.
- We tend to perceive coherence in our own specialty fields and incoherence for the entire discipline.
- A globally federated structure that cannot easily exclude empirical topics might still be socially coherent if scientific mixing cross-cuts empirical problems.
- We can see this structure by examining these 4 networks at large scale and over time.
14Data Sources
- Citation Networks
- Compiled from the ISI Web of Science journal citation tables
- Covers 1681 social science journals indexed in 2003
- Will eventually
  - fill this series from 1950 to present across all fields.
  - add a sample of paper-level citations to model performance.
- Topic and Collaboration Networks (for Sociology)
- Compiled from Sociological Abstracts
- 281,163 papers published between 1963 and 1999
- A sub-sample of sociology-only papers published in a select set of non-specialty sociology journals: about 35% of the total (roughly 100K)
- Contains information on title, abstract, keywords, author(s), tables, journal citation
- Will use similar indexes for Chemistry, Geology and Lit Crit
15Where does sociology fit?
- Perennial debates over the existence of a theoretical core
- Rapid growth in the internal diversity of topics sociologists study
16Where does sociology fit?
- Perennial debates over the existence of a theoretical core
- Rapid growth in the number of journals relevant to sociologists
17Where does sociology fit?
- This growth and diversity has been seen as evidence for the ultimate emptiness of sociology as a scientific discipline.
- But disciplines are shaped by the connections between ideas, not the number of ideas.
- That is, we recognize fields by who they speak to as much as by what they speak about.
- The clearest empirical trace of this communication is citation.
- Disciplines can then be defined as clusters of work that speak more to each other than to anyone else, which we trace with co-citation networks.
18Where does sociology fit?
Building co-citation networks
Links in a co-citation network are constructed by measuring how similar each journal is to every other journal. Similarity is gauged by correlating the pattern of citations received by each journal from every other journal. Comparing across columns tells us whether the two journals are recognized by others as similar.
19Where does sociology fit?
Building co-citation networks
[Figure: journal-by-journal similarity matrix for AJS, ASR, AER and JER, with 1.0 on the diagonal and High, Med or Low similarity in the off-diagonal cells.]
This creates a valued network of ties between pairs of journals. I use a cosine similarity score developed in bibliometrics, keeping ties with similarity > 0.45 between journals sharing at least 2% of their citation volume. Source: Loet Leydesdorff
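As a rough illustration of the similarity step, the sketch below computes cosine similarity between the columns of a journal-by-journal citation matrix and keeps pairs above the 0.45 cutoff. The citation counts are invented for the example, the function names are hypothetical, and the additional filter on shared citation volume is left out; this is a minimal sketch, not the procedure actually used for the maps.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cocitation_ties(cites, journals, threshold=0.45):
    """cites[i][j] = citations from journal i to journal j.

    Column j is the pattern of citations received by journal j,
    so two journals are similar when their columns are similar.
    """
    n = len(journals)
    columns = [[cites[i][j] for i in range(n)] for j in range(n)]
    ties = {}
    for a in range(n):
        for b in range(a + 1, n):
            score = cosine(columns[a], columns[b])
            if score > threshold:
                ties[(journals[a], journals[b])] = score
    return ties


# Hypothetical counts (rows cite, columns are cited); not real data.
JOURNALS = ["AJS", "ASR", "AER", "JER"]
CITES = [
    [50, 40, 2, 1],
    [35, 60, 3, 1],
    [1, 2, 55, 30],
    [0, 1, 25, 45],
]

ties = cocitation_ties(CITES, JOURNALS)
for pair, score in sorted(ties.items()):
    print(pair, round(score, 3))
```

With these toy counts the two sociology journals end up tied to each other and the two economics journals to each other, with no tie across the pairs.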
20Where does sociology fit?
Economics co-citation similarity network
Density = 0.197, N = 152, Isolates (not shown) = 5
Node size proportional to log(degree)
21Where does sociology fit?
Political Science co-citation similarity network
Density = 0.160, N = 69, Isolates (not shown) = 10
Node size proportional to log(degree)
22Where does sociology fit?
Sociology co-citation similarity network
Density = 0.140, N = 69, Isolates = 7
23Where does sociology fit?
24Where does sociology fit?
25Where does sociology fit?
26Where does sociology fit?
- Sociology fits at the center of the social sciences. We are not as internally cohesive as Economics or Law, but more so than many (anthropology, allied health fields).
- This represents a tradeoff. We have traded unique dominance of a topic (markets, politics, mind, space) for diversity and thus centrality.
- Sociology is an interstitial discipline (Abbott, 2004) in at least two senses:
- There is no content topic we can reasonably exclude
- We pull together, and generate, the ideas and topics covered by specialty disciplines.
- This makes us uniquely positioned to provide insights on many different empirical questions.
How have the topics sociologists study shifted over time?
27How does this look in the Physical Sciences?
28What do sociologists study?
- How do we capture the internal organization of research problems?
- Could use paper-level citation networks (see Hargens 2000), but data are difficult and expensive to obtain for large-scale networks.
- Can examine the network of papers formed by the topics they write about.
- Directly taps scientific content
- Purely endogenous creation of topics that allows new topic areas to emerge and old ones to die over time
- Tractability: data can be extracted from information held in Sociological Abstracts
- Multiple levels
- Coarse grained: Focus solely on keywords (Light 2005)
- Fine grained: Use all information available (title, abstract, keywords)
29What do sociologists study?
A fine grained view
- Data Selection and Manipulation
- Index entries contain title, abstract and keywords that summarize the paper's content.
- Sample all papers indexed within four 3-year windows between 1970 and 1999.
- Construct a paper-by-word matrix, where the ij cell lists how many times word i is used to describe paper j.
- Word set is stemmed to get at root words
- A stop-list is used to minimize inclusion of low-information content words ("the", "and", "is", etc.) or words commonly found in the data source ("Tables", "Figure", "References")
- Construct a network by linking the most highly correlated papers
- Use correlation of 0.40 or better
- Ties are treated as valued in the network analyses
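The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the stop-list and the three paper descriptions are made up, stemming is skipped, Pearson correlation is assumed as the correlation measure, and the function names are hypothetical rather than taken from the original analysis. The matrix is stored with papers as rows, which is the transpose of the word-by-paper indexing on the slide.

```python
import math
from collections import Counter

# Illustrative stop-list only; the real list is not given in the slides.
STOP_WORDS = {"the", "and", "is", "of", "a", "in", "tables", "figure", "references"}


def tokenize(text):
    """Lowercase, strip punctuation, drop stop words (no stemming here)."""
    words = [w.strip(".,;:?!()\"'").lower() for w in text.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def paper_word_matrix(papers):
    """Return (vocabulary, rows); rows[p][w] counts word w in paper p."""
    counts = [Counter(tokenize(text)) for text in papers]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[w] for w in vocab] for c in counts]


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def topic_edges(papers, threshold=0.40):
    """Valued ties between papers whose word profiles correlate >= threshold."""
    _, rows = paper_word_matrix(papers)
    edges = {}
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            r = pearson(rows[i], rows[j])
            if r >= threshold:
                edges[(i, j)] = r
    return edges


# Invented descriptions standing in for title + abstract + keywords.
PAPERS = [
    "Job information and social networks",
    "Social networks and job search information",
    "Suicide rates in rural communities",
]

edges = topic_edges(PAPERS)
print(edges)
```

The correlation is kept as the tie value, in line with treating ties as valued rather than binary.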
30What do sociologists study?
A fine grained view
- Analysis and Presentation: General approach is quantitatively inductive
- Construct a low-dimensional map of the network, using contour sociograms. These allow for full information in the network structure.
- Use cluster analysis to identify distinct topics
- Use a variant of Moody's RNM algorithm to cluster the network
- This clustering routine
- (a) is efficient: allows clustering on 10s of thousands of nodes
- (b) automatically specifies the optimal number of clusters
- (c) allows that some cases can fall between clusters
- I set a minimum cluster size of 12 papers published over the 3-year window.
- Evaluate the clustered papers for content and label the maps.
31What do sociologists study?
A fine grained view
Analysis and Presentation: General approach is quantitatively inductive
- Compare the maps over time qualitatively, looking for general changes in the frequency and alliance of topics.
- Examine shifts in structural indicators of the extent of clustering and cluster size distributions.
32What do sociologists study?
A fine grained view
Example: One-step neighborhood of "More information, better jobs?"
33What do sociologists study?
A fine grained view
Example: One-step neighborhood of "More information, better jobs?"
34What do sociologists study?
A fine grained view Content (all journals)
35What do sociologists study?
A fine grained view Content (all journals)
36What do sociologists study?
A fine grained view Content (all journals)
37What do sociologists study?
A fine grained view Content (all journals)
38What do sociologists study?
A fine grained view Content (all journals)
- The cluster content of the topic network has evolved slowly
- Some clearly central specialties have remained prominent over the entire period. This includes larger areas such as
- Class and Stratification
- Race and Ethnicity
- Education
- Gender (Strongest from 1980s on)
- Family (Strongest from the 1980s on)
- Crime
- As well as clearly distinct, though numerically smaller, bodies of research related to
- Suicide
- Sociology of Science, Technology and Reflexive sociology
- Unions
39What do sociologists study?
A fine grained view Content (all journals)
- The cluster content of the topic network has evolved slowly
- The clearest change has been the rapid growth of social research on health.
- Dominated by a very large body of research related to HIV/AIDS
- Other areas of relative growth include
- Family topics were most prominent in the 1980s
- A strong presence of research on sex and sexuality emerged in the 1980s and 90s
- Relative declines have come in areas such as
- Groups
- Interaction
- Radical studies
- Elite studies
- Summary: A move away from basic social processes toward studying social problems, with a growing uniqueness of theory and method
40What do sociologists study?
A fine grained view (all journals)
Cluster Size Distribution
41What do sociologists study?
A fine grained view Content (Restricted Sample)
42What do sociologists study?
A fine grained view Content (Restricted Sample)
43What do sociologists study?
A fine grained view Content (Restricted Sample)
44What do sociologists study?
A fine grained view Content (Restricted Sample)
45What do sociologists study?
A fine grained view Content (Restricted Sample)
- The cluster content of the restricted topic network has evolved similarly to the wider social science field
- The subfield structure is less dominated by the purely applied work on HIV/AIDS in the 90s, but there is still a clear association of topics around sexuality, health and AIDS.
- Health, Family, Education, Gender, and Race are always prominent and large.
- The relative prominence of reflexive sociology is much higher
- These topics cannot be published elsewhere, and the resulting tight cluster looks proportionately larger in the smaller sample.
46What do sociologists study?
A fine grained view (Core Soc journals)
Cluster Size Distribution
47What do sociologists study?
A fine grained view Content
We can measure the degree of consensus in words used to describe papers with
$C = \sum_i p_i^2$
where $p_i$ is the proportion of times word i is used.
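A small worked example may help fix the scale of this score: it is the sum of squared word proportions, so it equals 1 when every description uses the same word and falls toward 1/k when k words are used equally often. The function name and the word lists below are illustrative assumptions.

```python
from collections import Counter


def word_consensus(words):
    """Sum of squared proportions of each distinct word in `words`."""
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) ** 2 for c in counts.values())


# Everyone uses the same word: complete consensus.
same = word_consensus(["family"] * 4)

# Four words used equally often: consensus of 1/4.
spread = word_consensus(["family", "health", "race", "class"])

print(same, spread)
```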
48What do sociologists study?
A fine grained view Content
[Figure: Word Consensus Scores, 1970-1999. Y-axis: C (x 100). Series: All SA Journals and Soc Only.]
49What do sociologists study?
A fine grained view (Core Soc)
[Figure: Proportion of papers falling inside a cluster, shown for clusters with n > 12 and for clusters with n > 100. Series in each case: Total and Restricted.]
50What do sociologists study?
A fine grained view (Core Soc journals)
- Largest Clusters
- 1970
- Culture (126,3.7)
- Organizations (134,4.0)
- Race (Black) (137,4.1)
- Students (137,4.1)
- Community (138,4.1)
- 1980
- Education (167,2.6)
- Sex Roles (182,2.9)
- Research (191,3.1)
- Sociology (225,3.6)
- Family (273,4.3)
- 1990
- Women (246,2.7)
- Schools (259,2.8)
- Sociology (427,3.6)
- Health (427,4.65)
- Family (273,4.3)
- 1997
- Children (302, 2.89)
- Women (346, 3.32)
- Critical Sociology (359, 3.32)
- Education (380, 3.64)
- Health (714, 6.8)
51What do sociologists study?
A fine grained view
We can measure the extent that ties fall within clusters with the modularity score
$M = \sum_s \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right]$
where s indexes clusters in the network, $l_s$ is the number of lines in cluster s, $d_s$ is the sum of the degrees of the nodes in s, and L is the total number of lines.
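To make the quantities l_s, d_s and L concrete, here is a minimal sketch that assumes the standard form of the score, M = sum over clusters s of [ l_s / L - (d_s / 2L)^2 ], and treats ties as unweighted (the analyses in the talk use valued ties). The toy graph, the cluster labels and the function name are all illustrative assumptions.

```python
def modularity(edges, cluster_of):
    """Modularity of a partition of an undirected, unweighted graph.

    edges      : list of (u, v) pairs, one per line in the network
    cluster_of : dict mapping each node to its cluster label
    """
    total_lines = len(edges)  # L
    if total_lines == 0:
        return 0.0
    within = {}  # l_s: lines with both ends in cluster s
    degree = {}  # d_s: sum of degrees of nodes in cluster s
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            within[cu] = within.get(cu, 0) + 1
    return sum(
        within.get(s, 0) / total_lines - (d / (2 * total_lines)) ** 2
        for s, d in degree.items()
    )


# Two triangles joined by a single bridging line (c-d).
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
TWO_CLUSTERS = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
ONE_CLUSTER = {node: 1 for node in TWO_CLUSTERS}

m_split = modularity(EDGES, TWO_CLUSTERS)
m_single = modularity(EDGES, ONE_CLUSTER)
print(round(m_split, 4), round(m_single, 4))
```

Putting every node in one cluster gives a score of zero, which is why the score is read as the excess of within-cluster ties over what the degrees alone would produce.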
52What do sociologists study?
A fine grained view
[Figure: Network Modularity, 1970-1999. Y-axis: Modularity Score. Series: All SA Journals and Soc Only.]
53What do sociologists study?
A fine grained view
[Figure: Proportion of all ties within cluster, 1970-1999. Y-axis: In-cluster ties / Total ties. Series: All Journals and Soc Only.]
54What do sociologists study?
A fine grained view
[Figure: Number of Clusters, 1970-1999. Y-axis: Total Number of Clusters. Series: All Journals and Soc Only.]
55What do sociologists study?
A fine grained view
[Figure: Mean Cluster Size, 1970-1999. Y-axis: Mean Size of Clusters. Series: All Journals and Soc Only.]
56What do sociologists study?
A fine grained view
- The cluster structure of the topic network
- The vast majority of papers can be assigned to clear clusters, with slight growth in this proportion over time.
- The number of clusters has increased rapidly, though slightly slower within core sociology than in the broader field of social science.
- There has been significant growth in the tails of the distribution: the size distribution is more skewed in later periods.
- The modularity of the network has increased over time, though most of this change is between the 1970 and 1980 periods.
- This meshes with our intuition of "separate worlds" in the social sciences: larger, more distinct topical production of science work.
57What do sociologists study?
A fine grained view
- Next steps
- Build a continuous moving window to fill in the dates from 1960 to 2005.
- Link clusters across time periods, so we can track exactly the relative growth and decline of each subfield.
- Model this growth as a function of connections to other fields, author composition and disciplinary environment.
- Build this network's dual: scientists connected through topics.
58What do sociologists study?
- A clustered topic structure focused strongly on practical problem solving has a hint of Durkheim's concern: Is there any integration across these topic clusters?
- We shouldn't jump too quickly to the "fractured" conclusion
- Topic clusters are formed from papers, and papers typically have well encapsulated ideas. They have a small "maximum digestible unit"
- Scientific integration is really about how scientists bridge these multiple topics.
- If authors write and collaborate across these topics, ideas can quickly disseminate as well.
- What is the structure of the collaboration graph? If it is highly clustered, it would signal potential fragmentation. Next: Who produces sociology?
59Who produces sociology?
- Science is typically produced through collaboration, both formally and informally (Crane 1972, Crane Small 2000, Friedkin 1998).
- The best empirical trace of collaboration for large communities of science is coauthorship.
- Misses the less intense collaborations recognized in acknowledgements, discussions, and colleagues reading each other's work
- But should provide the strongest test of a fractionalization hypothesis, since the set of people we write with should be more like us than the set of people we have lunch with or discuss work with informally.
- There are differences across subfields in formal collaboration rates, which, if anything, should magnify the extent of observed fragmentation.
60Who produces sociology?
Coauthorship Trends in Sociology: Sociological Abstracts and ASR
[Figure: Proportion of papers with > 1 author (y-axis, 0 to 0.75) by Year (x-axis, 1930 to 2000). Series: Sociological Abstracts and ASR.]
61Who produces sociology?
Distribution of Coauthorship Across Journals: Sociological Abstracts, 1963-1999
[Figure: Proportion of papers with > 1 author (y-axis, 0 to 1) by Coauthorship Rank (x-axis, 0 to 1100). Labeled journals: Child Development, Soc. Forces, J. Health Soc. Beh., ASR, AJS, J. Am. Statistical A., Acta Politica, Soc. Theory, Signs, J. Soc. History.]
62Who produces sociology?
63Who produces sociology?
Construct a collaboration network by assigning an
edge between any pair of people who coauthored a
paper together.
Example: Paths 3 steps from Stan Wasserman (N = 361)
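The construction rule above is simple enough to sketch directly: every pair of authors on the same paper gets an edge, and a breadth-first search then picks out everyone within a given number of steps of a focal author. The author labels and paper lists are placeholders, and the function names are hypothetical; this is an illustrative sketch rather than the code behind the figure.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(papers):
    """Adjacency sets linking every pair of authors who share a paper."""
    adj = {}
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            adj.setdefault(a, set())  # sole authors stay in as isolates
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def within_steps(adj, start, max_steps):
    """Map each author reachable in <= max_steps to their distance from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == max_steps:
            continue  # do not expand past the step limit
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Placeholder author lists, one per paper.
PAPERS = [["A", "B"], ["B", "C"], ["C", "D"], ["D", "E"], ["F"]]

graph = coauthor_graph(PAPERS)
neighborhood = within_steps(graph, "A", 3)
print(sorted(neighborhood.items()))
```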
64Who produces sociology?
Example: Paths 3 steps from Stan Wasserman (N = 361)
Node size proportional to log of degree
65Who produces sociology?
The simplest summary test for a fragmented network is to measure the extent of clustering in the network. Watts's work on the small-world problem suggests that if the collaboration network is a small-world network it might be fractured.
Small-world (SW) graphs: C is large, L is small
- C: High relative probability that a node's contacts are connected to each other.
- L: Small relative average distance between nodes
66Who produces sociology?
In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically.
Watts demonstrates that small-world properties can occur in graphs with a surprisingly small number of shortcuts.
67Who produces sociology?
Locally clustered graphs are a good model for coauthorship when there are many authors on a paper.
[Figure: example coauthorship clusters formed by Paper 1 through Paper 5]
Newman (2001) finds that coauthorship among natural scientists fits a small-world model.
I test this model on the sociology coauthorship network, using all authors from 1963 to 1999.
68Who produces sociology?
             Observed   Random
Clustering   0.194      0.206
Distance     9.81       7.57
The sociology network is less clustered than would be expected by chance and has somewhat longer overall distances. This suggests that it does not have a small-world structure.
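The comparison of clustering and distance against a random baseline can be sketched as follows. Several things here are assumptions rather than the method behind the reported numbers: clustering is computed as the share of closed triples, distance as the mean shortest path over reachable pairs, and the baseline is a simple random graph with the same number of nodes and edges. The slides do not say how the random expectation was obtained, and all names and the toy graph are illustrative.

```python
import random
from collections import deque


def build_adj(nodes, edges):
    """Undirected adjacency sets; nodes without edges stay as isolates."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def clustering(adj):
    """Share of connected triples whose two end points are also tied."""
    closed = triples = 0
    for nbrs in adj.values():
        nb = list(nbrs)
        for i in range(len(nb)):
            for j in range(i + 1, len(nb)):
                triples += 1
                if nb[j] in adj[nb[i]]:
                    closed += 1
    return closed / triples if triples else 0.0


def mean_distance(adj):
    """Average shortest-path length over all reachable ordered pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def random_graph(nodes, n_edges, rng):
    """Simple random graph with the given number of distinct edges."""
    nodes = list(nodes)
    if n_edges > len(nodes) * (len(nodes) - 1) // 2:
        raise ValueError("too many edges for a simple graph")
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(nodes, 2)
        edges.add((min(u, v), max(u, v)))
    return build_adj(nodes, edges)


# Toy "observed" graph: a triangle (0, 1, 2) with a two-step tail (2-3-4).
observed = build_adj(range(5), [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)])
baseline = random_graph(range(5), 5, random.Random(0))

print("observed:", clustering(observed), mean_distance(observed))
print("random:  ", clustering(baseline), mean_distance(baseline))
```

In practice one would average the baseline over many random draws rather than rely on a single one.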
69Who produces sociology?
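The network has a broad core-periphery structure
[Figure: nested core-periphery diagram with regions labeled Bicomponent (29,462), Component, Unconnected and Structurally Isolated; the other counts shown are 38,823, 59,866 and (68,923).]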
70Who produces sociology?
Largest Bicomponent, g = 29,462
71Who produces sociology?
Internal Structure of the Coauthorship Core
Health
General Sociology
72Who produces sociology?
73Who produces sociology?
Internal Structure of the largest bicomponent
                                     Group 1   Group 2
Size                                 3667      987
In-group / out-group ties            3.24      2.86
% male                               67        52
Years in discipline                  8.46      4.67
Number of co-authored publications   5.32      3.24
74Who produces sociology?
Internal Structure of the largest bicomponent
75Who produces sociology?
76Who produces sociology?
- Strong specialty effects for ever-coauthored
- Unlikely
- History and Theory
- Sociology of Knowledge
- Radical / Marxist Sociology
- Feminist / Gender Studies
- Likely
- Social psychology
- Family
- Health and Medicine
- Social Problems
- Social Welfare
77Who produces sociology?
- Weak specialty effects for network embeddedness
- Large number of coauthors increases embeddedness
- Large number of people on any given paper
decreases embeddedness
78Who produces sociology?
Graph Connectivity, Cumulative 1963 - 1999
[Figure: Percent (y-axis, 0 to 0.6) by Years, 1963 to date (x-axis, 1965 to 2000). Series: % in Giant Component and % of connected in bicomponent.]
79Who produces sociology?
Evolution of Network Cohesion, 5-year moving window
[Figure: Percent (left axis, 0 to 0.4) and Connectivity (right axis, 2 to 2.25) by Year (x-axis, 1975 to 2000). Series: Connectivity, Bicomponent and Component.]
80Summary discussion
- Social Science Citation Structure
- Economics, Law, Psychology, Business/Management, Linguistics are most cohesive
- They are also peripheral in that they speak to a relatively limited set of problems
- Sociology is at least as cohesive as Political Science, and more cohesive than fields such as Anthropology, Social Work, Education or allied health fields that all have more limited empirical domains
- Our position represents a tradeoff between internal cohesion and external centrality.
81Summary discussion
- Scientific Topic Network
- Big-Picture: A general progression towards problem solving and the specialization of work on theory and methods (Light 2005).
- Fine-grained structure
- A federated topic structure that has largely retained that form since the 1970s, though there have been shifts in substantive topics.
- Key content areas have remained largely constant
- Race, Family, Class, Gender, Science, and Health
- A decrease in focus on general foundation problems
- Group structure, community, interaction
- An increase in work on social problems
- Health and HIV/AIDS-related topics
- Some (minor) evidence for greater homogeneity in topics discussed
82Summary discussion
- Scientific Collaboration Network
- The network is not divided into small research-area based clusters.
- There is no partition that strongly separates scientists.
- This has to imply that authors bridge topic clusters.
- This is good for social cohesion, and probably good for theoretical cohesion.
- Caveat: There is evidence for a division based on research method, with largely quantitative work more likely to be coauthored, though there is no such simple division in the topics network.
83Summary discussion
- Combined, these models suggest a discipline that is integrated socially and locally cohesive topically.
- Discipline-wide integration will likely only increase as pressures for collaboration push more scientists to work together across topics.
- However, the perception of disintegration will likely continue
- because most of us are only exposed outside our areas by work that appears in the general journals.
- But almost all of the topical cohesion is due to normal science work occurring in specialty journals.