DEPARTMENT OF SOCIOLOGY

1
DEPARTMENT OF SOCIOLOGY
The Assumptions You Don't Realise You Are Making
Are The Ones That Will Do You In: Simulation,
Social Science and Appropriate Data
Edmund Chattoe-Brown (ecb18_at_le.ac.uk)
2
Plan of talk
  • Some background.
  • A very simple (but revealing) example.
  • Sociological data collection methods.
  • Simulation methodology and types of simulation.
  • A case study: Modelling drug trends.
  • Conclusions.

3
Some relevant distinctions
  • Not 'gaming' or 'role playing': Student United
    Nations.
  • Not the post modernist thing (Baudrillard?)
    whatever that is.
  • Instrumental versus descriptive simulation: Not
    just a technical tool (doing the sums quicker)
    but a new way of understanding (explaining)
    social behaviour.
  • A social process described as a computer
    programme rather than a narrative or a
    statistical model.
  • Other disciplines, other approaches: Experiments,
    content analysis, GIS.

4
Intellectual biography
  • Started as a chemist: Follow the scientific
    method.
  • Moved to social science: Was particularly
    interested in situations where analytical models
    fail (oligopoly).
  • Studied Artificial Intelligence where simulating
    something is considered a normal way of
    supporting the claim you have understood it.
  • Became a sociologist because they are allowed to
    collect many different kinds of data
    (statistical, observational, cognitive).

5
Current position
  • A social simulator since 1994. Have built 5 or
    6 different working simulations: Uncommon
    experience in a new field.
  • Trying to simulate social systems, particularly
    those connected with social decision, networks,
    communication and innovation.
  • Trying to link social scientific data with
    simulation to build properly falsifiable models.
  • Trying to raise the bar institutionally:
    Getting away from toy models based on
    unsystematic reading and with a wishful
    relationship to data.
  • Trying to identify and bridge the intellectual
    gap between mainstream social science and
    simulation.

6
Spatial segregation (Schelling)
  • Agents live on a square grid (like a US city) so
    each has eight neighbours.
  • There are two types of agents (red and green)
    and some spaces in the grid are vacant. Initially
    agents and vacancies are distributed randomly.
  • All agents decide what to do in the same very
    simple way.
  • Each agent has a preferred proportion (PP) of
    neighbours of its own kind (a PP of 0.5 means that
    you want at least 4 neighbours out of 8 to be your
    own kind, but you would be happy with up to 8,
    i.e. PP is a minimum).
  • If an agent is in a position that satisfies its
    PP then it does nothing.
  • If it is in a position that does not satisfy its
    PP then it moves to an unoccupied position chosen
    at random.
  • A time period is defined as the time it takes for
    each agent (chosen in random order) to take a
    turn at deciding and possibly moving.
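
The rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation shown in the talk. The function names are hypothetical, and two details are assumptions not stated on the slide: the grid wraps at its edges (so every position really has eight neighbours), and PP is measured against all eight neighbouring positions, so vacancies count against an agent.

```python
import random

def make_grid(size, n_red, n_green, rng):
    """Scatter red ('R') and green ('G') agents and vacancies (None) at random."""
    cells = ['R'] * n_red + ['G'] * n_green
    cells += [None] * (size * size - len(cells))
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]

def same_kind_count(grid, r, c):
    """Count the neighbours (out of eight) that match the agent at (r, c)."""
    size = len(grid)
    kind = grid[r][c]
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            # Assumption: the grid wraps around at its edges.
            if (dr, dc) != (0, 0) and grid[(r + dr) % size][(c + dc) % size] == kind:
                count += 1
    return count

def satisfied(grid, r, c, pp):
    """PP is a minimum: a PP of 0.5 needs at least 4 of 8 neighbours of the same kind."""
    return same_kind_count(grid, r, c) >= pp * 8

def step(grid, pp, rng):
    """One time period: each agent, in random order, takes one turn.

    Returns the number of agents that moved.
    """
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    agents = [(r, c) for r, c in cells if grid[r][c] is not None]
    vacant = [(r, c) for r, c in cells if grid[r][c] is None]
    rng.shuffle(agents)
    moves = 0
    for r, c in agents:
        if vacant and not satisfied(grid, r, c, pp):
            i = rng.randrange(len(vacant))      # unoccupied position chosen at random
            nr, nc = vacant[i]
            grid[nr][nc], grid[r][c] = grid[r][c], None
            vacant[i] = (r, c)                  # the old position is now vacant
            moves += 1
    return moves

def mean_same_share(grid):
    """Average share of same-kind neighbours over all agents (a crude clustering measure)."""
    size = len(grid)
    shares = [same_kind_count(grid, r, c) / 8
              for r in range(size) for c in range(size) if grid[r][c] is not None]
    return sum(shares) / len(shares)

rng = random.Random(1)
grid = make_grid(20, 160, 160, rng)
print("before:", round(mean_same_share(grid), 2))
for _ in range(30):
    step(grid, 0.5, rng)
print("after: ", round(mean_same_share(grid), 2))
```

Changing the `0.5` passed to `step` is one way to explore how the outcome depends on PP.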

7
Initial state
8
Two questions
  • What is the smallest PP (i.e. a number from 0 to 1) that
    will produce clusters?
  • What happens when the PP is 1?

9
Simple individuals but complex system
10
Deconstructing this example
  • Clearly unrealistic in some senses: Property
    values, decision processes, space, communication,
    neighbourhood knowledge.
  • However, not unrealistic in the important sense
    that the simulation contains no arbitrary
    parameters and no impossible global knowledge
    (non computable, recursive). The only
    parameters in the model are individual PP
    values.
  • The simulation also generates unintended
    consequences (PP = 1) and patterns that were not
    built in. For example, is the distribution of
    empty sites random or buffering? This emergence
    (surprise) allows the possibility of genuine
    falsification.
  • Complex systems also have heuristic fertility:
    What do we mean by compatible desires?

11
Quantitative data collection approach
  • Collect survey data: Cross sectional, time series
    or whatever.
  • Choose a model and accept/reject it on grounds of
    statistical fit (adequate random sample, absence
    of non-normality in data).
  • Model coefficients are results conditional on an
    acceptable model.
  • In what sense do models explain observed
    patterns?
  • What is the scientific status of coefficients?
    (Descriptive/generative.)
  • Technical problems: Explanatory range depends on
    sample size.
  • Basic problem doesn't go away even with fancier
    techniques like time series/MLM: A description
    isn't an explanation.
  • Rarely heuristically fertile.

12
Deriving a quantitative coefficient
[Figure: number of strikes (units; axis ticks at 50 and 80) plotted against unemployment (millions; axis ticks at 1 and 2).]
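
As a rough illustration of what "deriving a coefficient" means here, the sketch below computes the slope of a straight line through two points. The pairing of the axis values is an assumption, since the figure itself is not in the transcript: 80 strikes at 1 million unemployed and 50 strikes at 2 million. The opposite pairing would simply flip the sign. The function name `slope` is hypothetical.

```python
def slope(p1, p2):
    """Change in y per unit change in x along the line through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

# Assumed pairing of axis values: (unemployment in millions, number of strikes).
coefficient = slope((1, 80), (2, 50))
print(coefficient)  # -30.0, i.e. 30 fewer strikes per extra million unemployed
```

The number describes the fitted line. By itself it says nothing about how unemployment would generate fewer strikes.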
13
Quantitative example
  • The most important empirical findings of this
    study can be summarized as follows:
  • Contrary to Hypothesis 1, there is a moderate
    tendency for individuals with higher service
    class origins to be more likely than others to
    enrol in PhD programmes.
  • The estimated effect of class drops to zero when
    controlling for parents' education and employment
    in research or higher education.
  • The overall implication of these findings is that
    the transition from graduate to doctoral studies
    is influenced by social origins to a considerable
    degree. Thus, the notion that such effects
    disappear at transitions at higher educational
    levels - due either to changes over the life
    course or to differential social selection - is
    not supported. (Mastekaasa, Acta Sociologica,
    2006, 49(4), pp. 448-449.)

14
Qualitative data collection approach
  • Collect data (cognitive, behavioural, structural)
    by observation and interrogation.
  • Try (though surprisingly rarely) to induce an
    overarching pattern from the data: Example of the
    addiction cycle and compare with
    amount/frequency account of drug use.
  • Result is rich coherent narrative(s): What heroin
    addiction means from the inside and in a
    particular context.
  • Are the results generalisable? (What is N?)
  • Can we correctly envisage the consequences of
    complex social interaction sequences presented
    using narratives? (Compare Schelling case.)
  • Often heuristically fertile.

15
Qualitative example
  • Turkish interviewees do not include themselves
    when they are evaluating the status of Turkish
    women in general. While referring to Turkish
    women, most Turkish interviewees use the pronoun
    'they':
  • Turkish women are more home-oriented. I think
    that they are left in the backstage because they
    do not have education, because they are not given
    equal opportunities with men. (T3)
  • One of the Turkish interviewees stated that it
    was difficult for her to answer the questions
    related to her status as a woman, because:
  • I don't think of myself as a Turkish women, but
    as a Turkish person. I mean I never think about
    what kind of role I have in the society as a
    woman. (T1)
  • Most Norwegian interviewees, on the other hand,
    identify with Norwegian women in general, and
    they refer to Norwegian women as 'we':
  • I think that in a way Norwegian women, that is
    we, at least have our rights on paper. We have
    equal rights for education and we have good
    welfare arrangements (N1) (Sümer, Acta
    Sociologica, 1998, 41(1), p. 122)

16
The Gilbert and Troitzsch box
17
Ideal simulation methodology
  • Choose a target system: Ethnic segregation in
    cities.
  • Build a simulation of the target system and
    calibrate it, typically on micro level data:
    Ethnography and experiments? How do agents make
    relocation decisions and where do they go?
  • Run simulation and look for regularities and
    their preconditions: Do we observe clusters
    (always, never, only with high PP, fixed,
    identical, moving) or buffer zones?
  • Compare these regularities perhaps with
    statistical data on real residential patterns.
    What tests do we have?
  • If there is a good match then we haven't yet
    falsified the claim that the simulation
    generates the target system and therefore
    explains it.

18
A metaphor
  • Think of the target system as a three dimensional
    object that casts shadows (data) depending on its
    orientation. Our simulation is an object that
    should cast the same shadows.
  • Because we cannot hold the object all ways at
    once, there are always some orientations that we
    will not have tried.
  • A regression coefficient or line of best fit has
    lower dimensionality than the target system. This
    means that although these methods can nearly
    always imitate shadows at fixed orientations,
    they don't match the shadows at any arbitrary
    orientation.
  • By recreating the dynamic structure of the target
    system, a simulation doesn't just imitate
    arbitrary shadows but actually mirrors the object
    itself.

19
What is going on here?
  • Qualitative research tells us how people interact
    and make decisions but can't usually tell us what
    large scale patterns result.
  • Quantitative research tells us what the large
    scale patterns are but may not really explain
    them (ground them in micro foundations).
  • Simulation attempts to bridge the gap between the
    levels of description with a generative social
    theory expressed as a computer programme.
  • To do this, it needs to be ontologically clear
    about what different kinds of data contribute
    (cognitive, behavioural, structural, statistical)
    and avoid arbitrary parameter values. (Ideally,
    all parameters in a simulation should be
    fittable/fitted empirically?)

20
The catch
  • Different approaches to simulation (types of
    simulation) incorporate (often tacitly) different
    behavioural assumptions.
  • For example, a strict cellular automaton just
    has states and transition rules (no movement like
    that found in Schelling). This may be great for
    snowflake formation but is usually nothing like
    either social or geographic space. (Example: CA
    fitting GIS data.)
  • These tacit behavioural assumptions may impact on
    our ability to falsify simulations effectively
    either because they introduce arbitrary
    parameters or foreclose the collection of
    relevant data on how people actually
    behave/decide.
  • Something like model choice in statistics: One
    can use expertise and social intuition but not
    test the choice directly.
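
To make the contrast with Schelling concrete, here is a minimal sketch of a strict cellular automaton: cells have states and a transition rule, and nothing moves. The rule is an assumed majority-vote rule (each cell adopts the majority opinion among itself and its eight neighbours). The edge wrapping and the function name `vote_step` are also assumptions made for illustration.

```python
import random

def vote_step(grid):
    """Synchronous update: every cell adopts the majority opinion (0 or 1)
    of the nine cells made up of itself and its eight neighbours."""
    size = len(grid)
    new = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            # Edges wrap around (an assumption) so every cell has eight neighbours.
            ones = sum(grid[(r + dr) % size][(c + dc) % size]
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            new[r][c] = 1 if ones >= 5 else 0
    return new

rng = random.Random(0)
grid = [[rng.randint(0, 1) for _ in range(20)] for _ in range(20)]
for _ in range(10):
    grid = vote_step(grid)
print(sum(map(sum, grid)), "of 400 cells hold opinion 1")
```

Note what the formalism builds in without saying so: every cell is always occupied, every cell consults exactly the same neighbourhood, and all cells update at once.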

21
Voting Cellular Automata (CA)
22
Case study: Drug trends (DTI Foresight)
  • How does drug use evolve over time?
  • Comparing two approaches: broadly agent based
    and broadly system dynamics.
  • The Caulkins et al. model of drug use involves
    (sort of) system dynamics: Pools of non users,
    light users and heavy users and various fixed
    transition probabilities between them.
  • The DrugChat/DrugTalk simulations are (unusually)
    broadly based on ethnographic data (Michael
    Agar): Users may source and share drugs, transmit
    information about experiences and thus become
    more or less positive about drug use. They can
    also become addicted.

23
The Caulkins et al. model
[Diagram: boxes for NON USERS, LIGHT USERS and HEAVY USERS, linked by arrows labelled I, a, b and g.]
$L(t+1) = (1 - a - b)L(t) + I(t)$, $H(t+1) = (1 - g)H(t) + bL(t)$
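
Reading the slide's equations as L(t+1) = (1 - a - b)L(t) + I(t) and H(t+1) = (1 - g)H(t) + bL(t), the model can be iterated directly. The sketch below is illustrative only. The function name `simulate` is hypothetical, and the parameter values and initiation series are made up for the example, not taken from Caulkins et al.

```python
def simulate(light0, heavy0, initiation, a, b, g):
    """Iterate the two-pool model.

    a: share of light users who quit each period
    b: share of light users who become heavy users each period
    g: share of heavy users who quit each period
    initiation: sequence I(t) of new light users per period
    """
    light, heavy = [light0], [heavy0]
    for new_users in initiation:
        l, h = light[-1], heavy[-1]
        light.append((1 - a - b) * l + new_users)
        heavy.append((1 - g) * h + b * l)
    return light, heavy

# Hypothetical values: 1000 new light users per period for 50 periods.
light, heavy = simulate(0.0, 0.0, [1000.0] * 50, a=0.15, b=0.02, g=0.02)
print(round(light[-1]), round(heavy[-1]))
```

With constant initiation I, the pools settle at L* = I/(a + b) and H* = bL*/g, so the long-run picture is fixed entirely by the four constants.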
24
Deconstructing the model
  • What is the status of the constant transition
    probabilities? Do these describe historical
    transitions (and thus require constant refitting)
    or generate transitions? If so, how?
  • What determines the number of boxes and arrows?
    (What about ex-users?) Is there something
    independent of fit quality? (If not, there is a
    danger of data mining/over fitting.)
  • Technical problem: Do we have adequate
    statistical tests for fitting this kind of model
    (rather than, say, a regression)?
  • How falsifiable is the model? Will it fit any
    data and only visibly fail if outflow from
    heavy users appears to be greater than outflow
    from light users? Minimal behavioural
    plausibility.

25
The DrugTalk/DrugChat simulations
  • Based on ethnographic work by Michael Agar.
  • DrugChat is a LISP replication of DrugTalk (in
    NetLogo) for a DTI Foresight exercise in
    approaches to modelling drug trends.
  • Agents structured in networks (many with few ties
    and few with many).
  • Types (non users, users and addicts) defined
    behaviourally rather than in terms of levels of
    drug use: Users and addicts differ in drug
    sharing behaviour and users and non-users differ
    in the kind of information transmitted and its
    credibility. (This is ethnographic knowledge.)
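
A network with "many with few ties and few with many" can be generated in several ways. The sketch below uses preferential attachment, which is a swapped-in technique chosen for brevity: the slides do not say how the DrugTalk/DrugChat networks were built. The function name is hypothetical.

```python
import random
from collections import Counter

def preferential_network(n, rng):
    """Grow a network one agent at a time. Each newcomer forms one tie to an
    existing agent chosen with probability proportional to its current ties."""
    ties = {0: {1}, 1: {0}}
    ends = [0, 1]                      # each agent appears once per tie it holds
    for new in range(2, n):
        target = rng.choice(ends)      # well-connected agents are picked more often
        ties[new] = {target}
        ties[target].add(new)
        ends += [new, target]
    return ties

network = preferential_network(500, random.Random(0))
degree_counts = Counter(len(friends) for friends in network.values())
for degree in sorted(degree_counts):
    print(degree, "ties:", degree_counts[degree], "agents")
```

Whether real drug-sharing networks look like this is an empirical question, which is the kind of calibration issue the talk raises.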

26
Simulation assumptions
  • Doses distributed differing by use status:
    probability and number.
  • Decision process involves comparing attitude to
    risk (fixed) and attitude to drugs (socially
    influenced in several ways).
  • Users party (share) but addicts use privately
    as a first approximation.
  • Dose use (binges?): Experiences can be good and
    bad.
  • Running experience count kept and updates drug
    attitude: Diminishing marginal returns to
    experience and bad experiences register more.
  • Communication: Addicts have no communicative
    credibility but are themselves a warning. Current
    users influence directly by their attitude to
    drugs from experience. Former users or non users
    gossip (transmit good and bad experience counts
    to others) which has a much smaller (and
    indirect) effect.
  • Addiction after five doses: Addicts don't
    listen, i.e. change attitude to drugs.
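
Some of these assumptions can be written down as a small agent class. This is a hedged sketch, not the DrugTalk/DrugChat code. The five-dose threshold, the fixed risk attitude, and the ideas of diminishing returns and heavier weighting of bad experiences come from the slide. The specific functional forms (a 1/n step size, a weight of 2 for bad experiences, a simple "greater than" decision rule, the `hear` update) and all names are assumptions for illustration.

```python
from dataclasses import dataclass

ADDICTION_THRESHOLD = 5   # from the slide: addiction after five doses
BAD_WEIGHT = 2.0          # assumed: bad experiences register twice as strongly

@dataclass
class Agent:
    risk_attitude: float  # fixed
    drug_attitude: float  # changes with experience and social influence
    doses: int = 0
    good: int = 0
    bad: int = 0

    @property
    def addicted(self):
        return self.doses >= ADDICTION_THRESHOLD

    def will_use(self):
        # Assumed decision rule: use when attitude to drugs outweighs attitude to risk.
        return self.addicted or self.drug_attitude > self.risk_attitude

    def take_dose(self, was_good):
        """Record one experience and update the attitude to drugs."""
        self.doses += 1
        if was_good:
            self.good += 1
        else:
            self.bad += 1
        if self.addicted:
            return                     # addicts no longer change attitude
        n = self.good + self.bad
        # Diminishing marginal returns: the nth experience moves attitude by 1/n.
        self.drug_attitude += (1.0 if was_good else -BAD_WEIGHT) / n

    def hear(self, speaker_attitude, weight=0.1):
        """Move a little towards a speaker's attitude; addicts do not listen."""
        if not self.addicted:
            self.drug_attitude += weight * (speaker_attitude - self.drug_attitude)

agent = Agent(risk_attitude=0.5, drug_attitude=0.6)
for outcome in (True, True, False, True, True):
    agent.take_dose(outcome)
print(agent.addicted, round(agent.drug_attitude, 3))
```

Each assumed constant here is a parameter that would need an empirical source, which is exactly the calibration problem the talk is concerned with.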

27
Deconstructing the simulation
  • Clearly oversimplified: Static networks (key
    result in question), decision process,
    communication content and so on.
  • Ethnographic data needed: User biographies,
    levels of availability, sharing behaviour, stash
    sizes. Can the simulation be effectively
    calibrated? (Do data collection methods exist for
    each parameter? If not, why not?)
  • Methods appropriate for real time dynamic
    change, i.e. attitudes.
  • Can the model be falsified in terms of
    statistical data (recorded addicts, recorded
    deaths from overdose and so on)? How hard is it
    to generate an S shaped innovation curve? How
    hard is it to generate a population of
    plausible addict biographies?

28
What do we mean by agent based?
  • Deconstructing the tacit homogeneity assumptions
    in Schelling.
  • Different decision making with different inputs
    and behaviours.
  • Different attributes (wealth).
  • Different local perceptions, experiences and
    memories.
  • Different/diverse environmental features: houses
    with different costs/facilities, travel to jobs,
    ease of access and so on.
  • Fundamental question: Just how similar are
    people? Economic models (and Schelling) at one
    extreme and journalism/biography at the other.
    The agent based approach minimises the amount of
    built in similarity relative to other
    approaches.

29
What can we do with this simulation?
  • Multiple empirically accessible outputs (another
    falsification opportunity): aggregate data,
    biographies.
  • Exploring data quality issues: See paper.
  • Sensitivity analysis: See paper.
  • Examine plausibility of potential reductions
    for the simulation: Does a simulation with this
    level of social complexity demonstrate stable
    regularities in terms of variables or
    transition probabilities?
  • Similar argument to Hendry in econometrics: Start
    with general model that is statistically adequate
    and then know how much you throw away by
    simplification. S to G and G to S are not
    symmetrical processes.
  • Important not to draw the wrong conclusion from
    this exercise but improves on the futile debate
    between realism and simplicity. (Realists cheat
    by not offering general conclusions. Simplicity
    types cheat by not stating how easy their
    models are to falsify. A mean is not a model.)

34
Design principles
  • Do assumptions of the simulation approach reflect
    what we know about the social phenomenon? Is it
    predominantly spatial, network, local/global,
    communicative, cognitive/reactive or whatever?
  • What is the status of simulation parameters? Are they
    theoretical (discount rate), empirical (number of
    friends), descriptive (birth rate), generative
    (Schelling) or what?
  • Do simulation style and data collection programme
    permit falsification? How tough a test is it?
    (Clusters? Out of sample prediction?) What degree
    of toughness is reasonable here?
  • Don't let rigour challenges stop you:
    Unmeasured parameters are not unmeasurable.
    (Compare statistical approach of proxies and just
    fitting the data you've got.)

35
Weaknesses/challenges?
  • Is this a naïve view of falsification?
    (Philosophy of science says there are always
    ceteris paribus clauses.)
  • How do we use existing knowledge systematically
    to calibrate and falsify simulations? Because
    simulation is new, it has a backlog of data to
    tackle which is a unique situation.
  • What new methods should we be developing (head
    cameras) or adapting (experiments) to gather
    missing data?
  • How can we afford and co-ordinate this kind of
    research? Are we in a Catch-22?
  • How does a discipline pick itself up by its own
    bootstraps in terms of methodological quality?
    Does it?

36
Encouraging thought
  • To the man who has only a hammer, everything
    looks like a nail (Abraham Maslow).
  • Have we really, for all the technical and
    empirical challenges, found a new science and a
    radically new place to stand? (It's a new
    paradigm. Yawn!)

37
Further resources
  • NetLogo <http://ccl.northwestern.edu/netlogo/>.
    Free and cross platform. Rapidly becoming a
    standard.
  • Gilbert and Troitzsch (2005) Simulation for the
    Social Scientist (Open University Press). NOTE:
    Get the second edition with the exercises in
    NetLogo rather than LISP.
  • JASSS (Journal of Artificial Societies and Social
    Simulation) <http://jasss.soc.surrey.ac.uk/JASSS.html>.
    Interdisciplinary peer reviewed free
    online journal devoted to social simulation.
  • Chattoe, Hickman and Vickerman: Drugs Futures
    2025? Modelling Drug Use
    <http://www.dti.gov.uk/files/file15388.pdf>.