1
GeoComputation
  • Gordon Green
  • Hunter College
  • Department of Geography
  • 11/2006

2
What is GeoComputation?
  • "The eclectic application of computational
    methods and techniques to portray spatial
    properties, to explain geographic phenomena, and
    to solve geographical problems" (Couclelis, 1998)
  • "The application of a computational science
    paradigm to study a wide range of problems in
    geographical and earth systems" (Openshaw, 2000)
  • "The universe of computational techniques
    applicable to spatial problems" (Rees and Turton,
    1998)

3
What is GeoComputation?
  • Its champions consider it a new discipline.
  • Others consider it just a toolbox of
    computationally intensive techniques.

4
What is GeoComputation?
  • Geo - emphasizes geographical and spatial
    aspects. Some other approaches entail spatial
    additions to methods that arose in non-spatial
    fields. GeoComputational methods tend to be
    inherently spatial.
  • Computation - equally important as the spatial
    aspect. The approaches tend to come out of
    computer science, rather than geography.
  • Traditional statistics tends to reduce and
    summarize information; GeoComputation tends to
    retain or generate more complexity in its
    operations.

5
What is GeoComputation?
  • Fotheringham differentiates between Weak GC,
    where the computer is used to augment existing
    quantitative methods and Strong GC, where the
    computer "drives" the form of analysis.
  • Confirmatory versus exploratory techniques -- GC
    tends to fall in the latter category.
  • The challenge of GC is to develop the ideas, the
    methods, the models, and the paradigms able to
    use the increasing computer speeds to do useful,
    worthwhile, innovative and new science in a
    variety of geo-contexts. (Openshaw)

6
What is GeoComputation?
  • Aspects of GeoComputation
  • Multi-Agent Modeling
  • Cellular Automata
  • Fuzzy Modeling
  • Genetic Programming
  • Neural Networks

7
Agent-Based Models
  • An agent-based model is one in which the basic
    unit of activity is a software agent.
  • Usually a model will contain many agents (at
    least tens, occasionally many thousands).
  • Its outcomes are determined by the interactions
    of the agents.
  • An agent is an identifiable unit of computer
    program code which is autonomous and
    goal-directed.
  • (from Schelhorn, O'Sullivan, Haklay, and
    Thurstain-Goodwin, 1999)

8
Agent-Based Models
  • An early example of agent-based modeling is Boids
    (Reynolds, 1987), a model that describes flocking
    behavior.
  • Explaining flocking using functional models is
    complex.
  • Modeling autonomous agents that follow simple
    rules made it easier to create an explanatory
    model.
  • Agent models typically work by maintaining a
    collection of agents, and a timer that advances
    the state of each agent with each tick, each
    agent updating its state based on characteristics
    and location of the other agents and other
    entities in the landscape.
  • Agent models are best implemented in an
    object-oriented fashion
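The collection-plus-timer structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Agent class, its single "drift toward the others" rule, and the spread helper are all hypothetical, and the rule is far simpler than the Boids rules.

```python
import random


class Agent:
    """Hypothetical agent that drifts toward the centroid of the other agents."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def step(self, others, rate=0.1):
        # Simple rule: move a fraction of the way toward the centroid of the others.
        if not others:
            return
        cx = sum(a.x for a in others) / len(others)
        cy = sum(a.y for a in others) / len(others)
        self.x += rate * (cx - self.x)
        self.y += rate * (cy - self.y)


def run(agents, ticks):
    """Timer loop: every tick, each agent updates from the state of the others."""
    for _ in range(ticks):
        for agent in agents:
            agent.step([a for a in agents if a is not agent])
    return agents


def spread(agents):
    """Largest coordinate range: a crude measure of how dispersed the agents are."""
    xs = [a.x for a in agents]
    ys = [a.y for a in agents]
    return max(max(xs) - min(xs), max(ys) - min(ys))


rng = random.Random(0)  # fixed seed so the run is repeatable
agents = [Agent(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(20)]
before = spread(agents)
run(agents, ticks=50)
after = spread(agents)
```

Comparing before and after shows the group contracting, a collective outcome that no single agent's rule states explicitly.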

9
Agent-Based Models
  • Each agent has a simple set of rules defining its
    behavior. Creating many instances of the agent
    and adjusting the rules makes it possible to
    model flocking behavior concisely.
  • Boids animation
  • More elaborate example

10
Agent-Based Models
  • These animations exhibit the phenomenon of
    emergence, whereby complex behavior emerges from
    the behavior of simple components.
  • Another example of agent-based modeling is
    pedestrian modeling.
  • Pedestrian model: note the formation of lanes.
  • Other sample pedestrian models

11
Agent-Based Models
  • Pedestrian model implementation (Batty, 1999)

12
Agent-Based Models
  • Pedestrian model implementation (Batty, 1999)

13
Agent-Based Models
  • Pedestrian model implementation (Batty, 1999)

14
Agent-Based Models
  • Pedestrian model implementation (Batty, 1999)

15
  • Pedestrian model design (Schelhorn, O'Sullivan,
    Haklay, and Thurstain-Goodwin, 1999)

16
(No Transcript)
17
Software for agent-based modeling (from Najlis,
Janssen, and Parker, 2001)
18
Agent-Based Models
  • When to use agents? Axtell (1999) describes three
    basic cases in which agent-based models might be
    used:
  • 1. Agent models as simulation of mathematical
    models: agent models can be used to implement
    Monte Carlo simulations of problems that can also
    be solved using other numerical methods. Even
    though the problem can be solved in a functional
    manner, an agent-based implementation can act as
    an alternative implementation that validates the
    results.
  • Similarly, an agent-based model can be useful as
    an illustration of a complex numerical model.
    Even though the agent-based model may not be
    necessary to come up with a solution, it may be
    very helpful in illustrating the results to a
    general audience.

19
Agent-Based Models
  • An example of this first use can be found in the
    classic problem of modeling a bank-teller line.
  • The queuing process is commonly simulated using
    the Monte Carlo method to arrive at a
    distribution of waiting times.
  • This can be equivalently modeled using agents
    that each have different arrival times and other
    parameters, and running the model with many
    agents to similarly build up the resulting
    distribution function.
  • (Axtell, 1999)
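A minimal sketch of the bank-teller line with customers as agents follows. The single teller, the exponential arrival and service times, and the names Customer and simulate_line are assumptions made for illustration; they are not taken from Axtell.

```python
import random
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical customer agent with its own arrival and service time."""
    arrival: float
    service: float
    wait: float = 0.0


def simulate_line(n_customers, mean_interarrival, mean_service, seed=0):
    rng = random.Random(seed)
    clock = 0.0
    customers = []
    for _ in range(n_customers):
        # Assumption: exponential gaps between arrivals and exponential service times.
        clock += rng.expovariate(1.0 / mean_interarrival)
        customers.append(Customer(clock, rng.expovariate(1.0 / mean_service)))

    teller_free_at = 0.0
    for c in customers:
        # A customer starts service on arrival or when the teller frees up.
        start = max(c.arrival, teller_free_at)
        c.wait = start - c.arrival
        teller_free_at = start + c.service
    return customers


customers = simulate_line(10_000, mean_interarrival=2.0, mean_service=1.5)
waits = [c.wait for c in customers]
mean_wait = sum(waits) / len(waits)
```

The list waits is the empirical distribution of waiting times; sorting it or binning it into a histogram gives the distribution function the slide refers to.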

20
Agent-Based Models
  • 2. Agent models as complementary to mathematical
    models: a mathematical model may adequately
    model some aspects of a problem but not others,
    or may be awkward or incapable of a comprehensive
    solution. Or a mathematical model may be known,
    but so complex as to be practically insoluble.
  • An artificial stock market is an example of such
    a model. Trading agents choose among
    investments with varying stochastic levels of
    risk, and adapt their behavior based on the
    results of prior trading.
  • Even though the basic features of such a model
    may be functionally describable, an agent-based
    model can evolve over time into a system that
    actually replicates some of the complexity of a
    real stock market. These complexities are
    impractical to model using a mathematical model.
    (Axtell, 1999)

21
Agent-Based Models
  • 3. Agent models as substitutes for mathematical
    models: some problems are not amenable to
    mathematical modeling.
  • Examples include the behavior of individuals in
    an animal population, or groups of pedestrians as
    described earlier.

22
Agent-Based Models
  • Pros
  • An argument from economics, which could also be
    applied to geography:
  • The reason why large scale computable general
    equilibrium problems are difficult for economists
    to solve is that they are using the wrong
    hardware and software. Economists should design
    their computations to mimic the real economy,
    using massively parallel computers and
    decentralized algorithms that allow competitive
    equilibria to arise as emergent
    computations...The most promising way for
    economists to avoid the computational burdens
    associated with solving realistic large scale
    general equilibrium models is to adopt an
    agent-based modeling strategy where equilibrium
    prices and quantities emerge endogenously from
    the decentralized interactions of agents. (Rust,
    1996, in Axtell)
  • The quality of emergent behavior doesn't
    correspond to any part of a traditional
    statistical analysis. Agent-based models
    uniquely provide the opportunity to model a
    process, and see what happens when it runs
    (Axtell).

23
Agent-Based Models
  • Cons
  • When dealing with agent models, we are quickly
    involved in a world where everything - both the
    agents and their environment - is designed, and
    the reliable scientific analysis of the real
    world may be compromised by the complexity of the
    models (paraphrased from Couclelis, "Why I No
    Longer Work with Agents", discussing
    human/environment agent models).
  • Agent models don't have a convenient way of
    measuring their accuracy, unlike statistical
    models. This can only be overcome by running the
    agent model many times, varying the parameters to
    discover the robustness of the results. There is
    a limit as to how many such variations can be
    executed (although this is increasing with
    increases in computer power) (Axtell).

24
Cellular Automata
  • Cellular automata are closely related to
    agent-based models.
  • Instead of autonomous agents being given
    simple behavioral rules, the states of cells in a
    surface are dictated by similarly simple rules.
  • Those rules use the state of surrounding cells in
    a grid to determine the state of any given cell
    in the next iteration of the model.

25
Cellular Automata
  • CAs develop in space and time.
  • Space and time are defined in discrete steps.
  • Cells are lined up in a string for
    one-dimensional automata, or arranged in a
    lattice of two or more dimensions for
    higher-dimensional automata.
  • The number of states of each cell is finite.
  • The states of each cell are discrete.
  • All cells are identical.
  • The future state of each cell depends only on the
    current state of the cell and the states of the
    cells in the neighborhood, determined by rules
    (from Alexander Schatten).

26
Cellular Automata
  • The simplest CAs are one-dimensional. For
    example, here is a set of rules and the results
    of a few iterations (from David R. Green):
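The rule set shown on the original slide is not in this transcript, so the sketch below assumes one instead: elementary rule 90, where a cell's next state is the exclusive-or of its left and right neighbors. The function name step_1d and the wrap-around edges are also assumptions.

```python
def step_1d(cells, rule=90):
    """Advance a one-dimensional binary CA by one time step.

    The rule number encodes the next state for each of the eight possible
    (left, center, right) neighborhoods; edges wrap around (an assumption).
    """
    n = len(cells)
    out = []
    for i in range(n):
        left = cells[(i - 1) % n]
        center = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out


row = [0, 0, 0, 1, 0, 0, 0]
for _ in range(3):
    print("".join("#" if c else "." for c in row))
    row = step_1d(row)
```

Passing a different rule number between 0 and 255 selects a different lookup table without changing the code.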

27
Cellular Automata
  • For geographical applications, CAs are usually
    2-dimensional, often using one of these cell
    neighborhoods

28
Cellular Automata
  • Conway's Game of Life rules
  • Mathematician John Conway devised the Game of
    Life, the best-known CA, first described in a
    1970 Scientific American article. (CAs themselves
    date back to the work of von Neumann and Ulam in
    the 1940s.)
  • A cell that is dead at time step t becomes
    alive at time t+1 if exactly three of the eight
    neighboring cells at time t were alive.
  • A cell that is alive at time t dies at time t+1
    if at time t fewer than two (loneliness) or more
    than three (overcrowding) of its neighbors are
    alive. (Alexander Schatten)
  • From these very simple rules, very complex
    behaviors can be modeled.
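The birth and death rules above fit in one short function. This sketch assumes an unbounded grid stored as a set of (row, column) pairs for the live cells; the name life_step is hypothetical.

```python
from collections import Counter


def life_step(live):
    """Return the set of live cells at time t+1 given the live cells at time t."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth with exactly three neighbors; survival with two or three.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The glider pattern: after four steps it reappears shifted one cell diagonally.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
```

Storing only live cells keeps the sketch short and avoids choosing a grid size or edge rule.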

29
Cellular Automata
  • Sample game of life pattern (glider)

30
Cellular Automata
  • There are many other examples available on the
    web, such as
  • http://www.ibiblio.org/lifepatterns/
  • These simple rules can be expanded to generate
    many, many patterns of change
  • http://www.collidoscope.com/modernca/

31
Cellular Automata
  • There are many enhancements involved in
    generating more complex evolutionary patterns
  • Probabilistic CAs, where, instead of having
    binary rules (and binary states), the changes are
    described by probabilities, give an increased
    level of control over how a CA develops. The
    rules can express the chance of a cell changing
    state, and each step of the CA involves selecting
    the state of each cell based on a random number
    falling within its probability range.
  • CAs can also be self-modifying, that is, they can
    respond to changes in the states of the grid as
    it develops.
  • Irregular grid cell sizes
  • Action at a distance, instead of influence from
    just immediately adjacent cells.
  • Incorporating agents within CA landscapes
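A probabilistic CA of the kind described in the first item can be sketched as follows. The spread rule (an inactive cell with at least one active 4-neighbor becomes active with probability p) and the name spread_step are assumptions chosen for illustration, loosely in the spirit of a fire or urban-growth front.

```python
import random


def spread_step(grid, p, rng):
    """One step of a hypothetical probabilistic spread CA.

    A cell in state 0 that has at least one 4-neighbor in state 1 switches
    to state 1 when a random draw falls below p. Cells in state 1 stay in state 1.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                continue
            neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            has_active = any(
                0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1
                for nr, nc in neighbors
            )
            if has_active and rng.random() < p:
                new[r][c] = 1
    return new


rng = random.Random(0)  # fixed seed so the run is repeatable
grid = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
after_one = spread_step(grid, p=0.5, rng=rng)
```

Setting p to 1 recovers a deterministic rule and setting it to 0 freezes the grid, which makes the role of the probability easy to check.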

32
Cellular Automata
  • These more sophisticated CAs have been used to
    model diverse geographic phenomena, such as
  • 1. Wildfire
  • 2. Formation of a Megalopolis
  • 3. West Nile Virus Infection Risk
  • 2 from Paul Torrens,
    http://geosimulation.org/simulating-sprawl/;
    3 from Sean Ahearn.

33
Cellular Automata
  • Cons (Batty and Xie, 2005)
  • Most CA rules have little relationship to the
    actual causes of the phenomenon being studied.
    Even if the model happens to reproduce the
    process successfully, it doesn't really prove
    anything, and is difficult to use as the basis of
    any kind of policy decision.
  • They tend not to model spatial interaction well.
  • In practice, CA cells tend not to match units of
    the phenomenon under study.

34
Fuzzy Modeling
  • A frequent problem in geography is how to
    identify a geographic entity.
  • Geographic phenomena tend to be described by
    vague terms.
  • For example, a study concerns "the major cities
    in Europe". Each of the terms "major", "city",
    "in", and "Europe" is somewhat vague.
  • Fuzzy set theory is an attempt to deal with the
    problems posed to traditional set and logic
    theory by vagueness.
  • (section paraphrased from Fisher, 2000)

35
Fuzzy Modeling
  • Sorites paradox
  • If a grain of sand is placed on a surface, is
    there a heap?
  • If a second grain of sand is placed next to it,
    is there a heap?
  • If a third grain of sand is placed next to the
    second, is there a heap?
  • If a ten-millionth grain of sand is placed next
    to the 9,999,999th, is there a heap?
  • "Heap" is a vague concept. Other examples are
    "tall", "near", "far", etc.
  • Many geographic values such as vegetation
    coverage, soil types, etc., are similarly
    Sorites-susceptible.

36
Fuzzy Modeling
  • Classical sets follow the logic described in
    the familiar Venn diagram (examples drawn from
    the behavior of boolean search terms)

37
Fuzzy Modeling
  • These can also be expressed using linear graphs

38
Fuzzy Modeling
  • Membership in a set is determined by some
    threshold value. This value can be subject to
    error, and can be assigned a probability. Those
    probabilities can then be used in set-based
    calculations.

39
Fuzzy Modeling
  • Boolean set theory is used throughout most
    conventional set and statistical analysis. For
    example, does the result of a test reject the
    null hypothesis? We set a threshold value for the
    t-test statistic and respond based on whether
    the result falls above or below that threshold.
    In many cases, there is actually a continuous
    probability of the null hypothesis being
    correct.
  • Zadeh (1965) put forward the concept of fuzzy
    sets in response to shortcomings of boolean sets.

40
Fuzzy Modeling
  • Fuzzy set membership is at the core of fuzzy set
    theory. Boolean sets are encoded with 0 or 1,
    wherein an entity is or is not part of a set.
    Fuzzy set membership is defined by any value
    between (and including) 0 and 1.
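The difference between the two encodings can be sketched with a linear membership function. The "major city" reading, the population figures, and both function names are hypothetical values chosen for illustration.

```python
def crisp_membership(x, threshold):
    """Boolean set: an entity is either in the set (1) or not (0)."""
    return 1 if x >= threshold else 0


def linear_membership(x, low, high):
    """Fuzzy set with a linear ramp: 0 at or below low, 1 at or above high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Hypothetical example: degree to which a city of a given population is "major".
populations = [50_000, 250_000, 500_000, 2_000_000]
crisp = [crisp_membership(p, 500_000) for p in populations]
fuzzy = [linear_membership(p, 0, 1_000_000) for p in populations]
```

The crisp version jumps from 0 to 1 at the threshold, while the fuzzy version grades the cities in between.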

41
Fuzzy Modeling
  • Fuzzy set membership can take varied forms

42
Fuzzy Modeling
  • These then result in boolean calculations that
    take into account partial set membership

43
Fuzzy Modeling
  • Boolean operations become functions rather than
    boolean values
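One common choice for these functions is Zadeh's original set of operators: minimum for AND, maximum for OR, and one minus the membership for NOT. The sketch below assumes that choice; the original slides may have shown a different family of operators.

```python
def fuzzy_and(a, b):
    """Intersection: membership in both sets is the smaller of the two grades."""
    return min(a, b)


def fuzzy_or(a, b):
    """Union: membership in either set is the larger of the two grades."""
    return max(a, b)


def fuzzy_not(a):
    """Complement: membership outside the set."""
    return 1.0 - a


# Hypothetical grades for one cell: 0.75 "forest", 0.5 "steep".
forest, steep = 0.75, 0.5
forest_and_steep = fuzzy_and(forest, steep)
forest_or_steep = fuzzy_or(forest, steep)
not_forest = fuzzy_not(forest)
```

When the grades are restricted to 0 and 1, these functions reduce to the ordinary boolean operations.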

44
Fuzzy Modeling
45
Fuzzy Modeling
Example of fuzzy versus crisp classifications
46
Fuzzy Modeling
The concept of fuzziness can also be applied to
polygon boundaries
47
Fuzzy Modeling
  • Key question: how do you choose the membership
    function (MF)? Is it linear or curved?
  • The semantic import (SI) approach estimates it
    based on domain expertise and iterative
    evaluation of the results.
  • The fuzzy k-means approach uses iterative
    mathematical estimates of the best function.

48
Fuzzy Modeling
  • Fuzzy K-Means
  • Usually starts with random allocation of objects
    into k clusters.
  • The center of each cluster is calculated
  • Objects are re-allocated based on similarity of
    attributes using a similarity index, usually a
    distance measurement.
  • This process is repeated until a stable solution
    is reached.
  • Membership in each cluster is calculated as a
    value ranging from 0 to 1, instead of exactly 0
    or 1 as in crisp k-means (Burrough and McDonnell)
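The steps above can be sketched for one-dimensional data. This follows the widely used fuzzy c-means update equations, which is an assumption about the variant intended here; the fuzziness exponent m, the tolerance, and the function name are also assumed rather than taken from Burrough and McDonnell.

```python
import random


def fuzzy_k_means(values, k, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy k-means on a list of numbers; returns (centers, memberships)."""
    rng = random.Random(seed)
    n = len(values)
    # Start from a random allocation: each row of memberships sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([w / total for w in row])

    centers = [0.0] * k
    for _ in range(max_iter):
        old = centers[:]
        # Center of each cluster: mean of the values weighted by membership**m.
        for j in range(k):
            weights = [u[i][j] ** m for i in range(n)]
            centers[j] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        # Re-allocate: membership falls off with distance to each center.
        for i, x in enumerate(values):
            inv = [(abs(x - c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            u[i] = [w / total for w in inv]
        # Stop once the centers no longer move (a stable solution).
        if max(abs(a - b) for a, b in zip(centers, old)) < tol:
            break
    return centers, u


data = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
centers, memberships = fuzzy_k_means(data, k=2)
```

Each row of memberships holds one object's grade in every cluster, so an object halfway between two centers would receive about 0.5 in each rather than being forced into one.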

49
Fuzzy Modeling
  • Fuzzy K-Means

50
Fuzzy Modeling
  • Fuzzy modeling is often used in conjunction with
    genetic algorithms, which provide a way of
    iteratively refining the results of automated
    models.
  • The following few slides will review this before
    concluding the section on fuzzy modeling.

51
Genetic Algorithms
  • First developed in the 1970s.
  • The basic idea is to model a problem with many
    parameters by treating those parameters like
    genes.
  • These are selected over time by iteratively
    applying a measure of fitness, selecting the
    fittest elements, and applying the process again.

52
Genetic Algorithms
  • Steps in an evolutionary algorithm
  • Create a new population of alternatives using
    random values.
  • Select individuals from the population weighting
    towards the individual that represents the best
    solution so far.
  • Use them as the basis of a new set of
    alternatives, by combining them (crossover) or
    randomly changing them (mutation).
  • Continue this process until some terminating
    condition is met.
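These steps can be sketched on a toy problem. Everything specific here is an assumption: bit-string individuals, tournament selection as the way of weighting toward better solutions, one-point crossover, bit-flip mutation, a fixed number of generations as the terminating condition, and counting the 1 bits as the fitness measure.

```python
import random


def evolve(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Minimal genetic algorithm over bit strings; returns (best, history)."""
    rng = random.Random(seed)
    # Step 1: a new population built from random values.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):  # Step 4: stop after a fixed number of rounds.
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # keep the best individual unchanged
        while len(next_pop) < pop_size:
            # Step 2: selection weighted toward fitter individuals.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            # Step 3: crossover at a random cut point, then mutation.
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness: the number of 1 bits in the individual.
best, history = evolve(sum)
```

Because the best individual is always carried forward, the values in history never decrease from one generation to the next.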

53
Genetic Algorithms
  • Sample chromosome
  • A contrived simple problem
  • Given the digits 0 through 9 and the operators
    +, -, *, and /, find a sequence that will
    represent a given target number. The operators
    will be applied sequentially from left to right
    as you read.
  • Encoding

54
Genetic Algorithms
  • Encoded solution
  • Crossover
  • Mutation: randomly changing bits
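The encoding shown on the original slide is not in this transcript, so the sketch below assumes one: four bits per gene, with 0000 to 1001 standing for the digits 0 to 9 and 1010, 1011, 1100, 1101 for the operators +, -, * and /. The operator set, the handling of out-of-order genes, and all function names are assumptions.

```python
import random

# Assumed gene table: 4 bits per symbol.
GENES = {format(i, "04b"): str(i) for i in range(10)}
GENES.update({"1010": "+", "1011": "-", "1100": "*", "1101": "/"})


def decode(bits):
    """Split a bit string into 4-bit genes and drop genes with no meaning."""
    usable = len(bits) - len(bits) % 4
    symbols = (GENES.get(bits[i:i + 4]) for i in range(0, usable, 4))
    return [s for s in symbols if s is not None]


def evaluate(symbols):
    """Apply operators strictly left to right, skipping out-of-order symbols."""
    result, pending = None, None
    for s in symbols:
        if s.isdigit():
            value = float(s)
            if result is None:
                result = value
            elif pending is not None:
                if pending == "+":
                    result += value
                elif pending == "-":
                    result -= value
                elif pending == "*":
                    result *= value
                elif value != 0:  # skip division by zero
                    result /= value
                pending = None
        elif result is not None:
            pending = s
    return result


def crossover(a, b, point):
    """Swap the tails of two chromosomes after the given bit position."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability rate."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < rate else b for b in bits
    )


# 6 + 5 * 4 / 2 + 1, read left to right: ((6 + 5) * 4) / 2 + 1
chromosome = "011010100101110001001101001010100001"
value = evaluate(decode(chromosome))
```

A fitness function for this problem could then score each chromosome by how close value comes to the target number.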

55
Genetic Algorithms
  • While not necessarily spatial per se, a genetic
    approach to model evaluation and selection can be
    applied to many problems that require selecting
    among many possible models.

56
Fuzzy Modeling
  • Using fuzzy modeling to classify coverage in
    remotely sensed data involves fuzzy logic in
    combination with a genetic classification process
    (from TNTMips user guide)

57
Fuzzy Modeling
58
Fuzzy Modeling
  • The left-hand image shows the first-cut results
    of an unsupervised image classification. The
    right-hand image shows the extent to which each
    cell differs from the center of its assigned
    spectral class, so darker areas are more likely
    to be in the assigned class.

59
Fuzzy Modeling
  • The Fuzzy classification method uses rules of
    fuzzy logic, which recognize that class
    boundaries may be imprecise or gradational.
  • It creates an initial set of prototype classes,
    then determines a membership grade for each
    class for every cell.
  • The grades are used to adjust the class
    assignments and calculate new class centers, and
    the process repeats until the iteration limit is
    reached. (TNTMips user guide).

60
Fuzzy Modeling
  • Champions of fuzzy modeling consider probability
    theory to be a special case of fuzzy modeling.
  • Others consider fuzzy modeling to be unnecessary,
    and replaceable by Bayesian probability.

61
Neural Networks
  • The last geocomputational technique we will cover
    is neural networks, or neurocomputing.
  • Definition: a computational neural network (CNN)
    is a parallel distributed information structure
    consisting of a set of adaptive processing
    (computational) elements and a set of
    unidirectional data connections (Fischer and
    Abrahart).
  • These models have been inspired by neuroscience,
    but in no way actually model biological or
    neurological phenomena.

62
Neural Networks

63
Neural Networks
  • Typical neural network diagram

64
Neural Networks
  • A Neural Network consists of various layers.
  • Each layer can have any number of neurons in it.
  • The first layer of the network is called an input
    layer, and it is here we apply the input.
  • The last layer is called the output layer, and it
    is from here we take the output.
  • A neural network can have any number of hidden
    layers, between the input and output layer.
  • In most neural network models, a neuron in one
    layer is connected to all neurons in the next
    layer (from a description by Anoop Madhusudanan).

65
Neural Networks
  • Here is a diagram of the simplest possible neural
    network. N1 and N2 are inputs, N3 and N4 are in
    a hidden layer, and N5 is the output

66
Neural Networks
  • Within each node, or neuron, there is a
    function that determines the output of the node
    based on the inputs
  • Input data are typically specified to take values
    in any range, whereas output is given in limited
    ranges.

67
Neural Networks
  • Each input has a weight. The weighted sum of the
    inputs constitutes an activity level. When the
    activity level reaches a certain threshold, the
    output changes.
  • The output may also change continuously with
    changes in input, for example following a curve.
    This curve represents the transfer function, by
    which inputs are converted to outputs
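Both kinds of output can be sketched for a single neuron, followed by a forward pass through a five-node network with two inputs, two hidden neurons, and one output. The logistic (sigmoid) curve is an assumed choice of transfer function, and the function names and weights are hypothetical.

```python
import math


def activity(inputs, weights):
    """Activity level: the weighted sum of the inputs."""
    return sum(x * w for x, w in zip(inputs, weights))


def step_output(inputs, weights, threshold):
    """Output switches from 0 to 1 when the activity reaches the threshold."""
    return 1 if activity(inputs, weights) >= threshold else 0


def sigmoid_output(inputs, weights, bias=0.0):
    """Output changes continuously along a logistic curve between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(activity(inputs, weights) + bias)))


def forward(n1, n2, hidden_weights, output_weights):
    """Forward pass: inputs N1, N2 -> hidden N3, N4 -> output N5."""
    n3 = sigmoid_output([n1, n2], hidden_weights[0])
    n4 = sigmoid_output([n1, n2], hidden_weights[1])
    return sigmoid_output([n3, n4], output_weights)


# Hypothetical weights; a negative weight makes a connection inhibitory.
n5 = forward(1.0, 0.0, hidden_weights=[[2.0, -1.0], [-1.5, 0.5]],
             output_weights=[1.0, -2.0])
```

Whatever range the inputs take, the sigmoid keeps each neuron's output between 0 and 1.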

68
Neural Networks
  • Neural networks must be trained. Training
    consists of adjusting the weighting of inputs
    depending on the value of outputs.
  • This can be accomplished with a feedback function,
    which, for example, may take the output and feed
    it back to the hidden or input layers. The
    hidden layer may in turn feed back to the
    input layer.
  • The feedback function adjusts the weighting so
    that correct results become more likely with
    further iterations of the network. Note that
    weights can be negative, so neurons may be either
    inhibitory or excitatory.
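The idea of adjusting weights according to the output can be sketched with the classic perceptron learning rule on a single neuron. This is a deliberately simpler technique than feeding errors back through hidden layers, and the function names, learning rate, and logical-AND training set are all assumptions.

```python
def predict(weights, bias, inputs):
    """Threshold neuron: output 1 when the weighted sum plus bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, rate=1.0):
    """Perceptron rule: nudge each weight in proportion to the output error."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # No change when the output is already correct (error == 0).
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Supervised training data: inputs paired with expected outputs (logical AND).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(samples)
outputs = [predict(weights, bias, inputs) for inputs, _ in samples]
```

After training, the bias ends up negative, so a single active input is not enough to fire the neuron.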

69
Neural Networks
  • For example, a neural network designed to detect
    the pattern of the number four might have an
    input node for every pixel in a grid.
  • It would be trained (weights adjusted) using
    multiple images of the number four

70
Neural Networks
  • Training may be supervised or unsupervised. Most
    implementations are supervised.
  • In supervised training, the training data
    consists of inputs together with expected
    outputs.
  • Neural networks work best with boolean or ordinal
    or real numeric data. Non-ordinal set-based data
    types tend not to work because they do not lend
    themselves to weighting.
  • With enough training data where the inputs and
    outputs are known, the network implicitly begins
    to model a function which can be applied to
    unknown inputs.
  • The training data selected depends on user
    knowledge and intuition about the problem domain.
    Neural networks take into account existing
    knowledge about a problem via the training
    process.
  • (Paraphrased from StatSoft manual).

71
Neural Networks
  • Neural networks work best when there is a lot of
    training data. For most practical problems,
    hundreds or thousands of training cases are
    required. The exact number depends on the nature
    of the problem.
  • Neural networks tend to be tolerant of noisy
    input, but there can be problems if the training
    data does not include outliers found in the
    subject data. In these cases the outliers may be
    ignored.
  • They tend to work well with regression problems
    (where the output will be a specific number) and
    with classification problems (where the output
    may be a boolean value or a set of values).
    Pattern recognition is one common application.

72
Neural Networks
  • Applications in GIS
  • Image classification: neural networks can be
    used to classify remotely sensed images.
  • Feature detection: the pattern-recognition
    strengths of neural networks could be used to
    identify features in remotely sensed data.
  • Transportation route selection: for example,
    route and congestion data can be used as inputs
    to neurons that optimize for shortest routes.
    Optimal routing could be the output.

73
Neural Networks
  • Transportation example (from Thurston, GIS
    Café.com)
  • A is starting point and B is destination.
    Intersections are nodes in the network. The red
    square is the location of an accident.

74
Neural Networks
  • The weighting of node inputs can be calibrated to
    the traffic capacity of each. The accident
    causes the affected nodes to inhibit activation,
    resulting in the selection of non-affected nodes

75
Related Topics
  • Parallel processing and object-oriented software
    development are closely related to
    GeoComputation.
  • The following slides give a quick overview of
    these related topics.

76
Parallel Processing
  • What is it?
  • Parallel processing is using more than one CPU
    to perform computational tasks more quickly.
  • Types of parallel processing
  • SIMD: single instruction stream, multiple data
    streams.
  • MIMD: multiple instruction streams, multiple
    data streams.
  • SISD: single instruction stream, single data
    stream; this is the model in place in most
    current PCs.
  • MISD: multiple instruction streams, single data
    stream. This is really only a theoretical
    construct, because it is not practical to have
    multiple processors manage the same data.

77
Parallel Processing
  • SIMD: each processor runs the same program on
    different data (from Johnston, introduction to
    HPC)

78
Parallel Processing
  • MIMD: each processor runs different instructions
    on different data

79
Parallel Processing
  • MIMD has proven to be the most generally
    applicable parallel processing model.
  • Splitting up tasks into chunks that can be
    executed in parallel is widely applicable to
    geographic problems, where similar tasks need to
    be applied to different data.
  • This kind of processing model can be implemented
    on a SISD machine, and is a useful way of
    thinking about many geocomputational practices.
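Splitting a grid into chunks and applying the same task to each can be sketched with Python's standard concurrent.futures module. The grid, the tile_mean task, and the row-wise split are hypothetical; a thread pool is used here only to keep the sketch simple.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_mean(tile):
    """The task applied to every chunk: the mean of the cell values in a tile."""
    cells = [v for row in tile for v in row]
    return sum(cells) / len(cells)


def split_rows(grid, n_tiles):
    """Cut a grid into horizontal strips that can be processed independently."""
    size = -(-len(grid) // n_tiles)  # ceiling division
    return [grid[i:i + size] for i in range(0, len(grid), size)]


# Hypothetical 4 x 4 raster holding the values 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = split_rows(grid, 2)

# Same task, different data: one worker per tile, results kept in tile order.
with ThreadPoolExecutor(max_workers=2) as pool:
    means = list(pool.map(tile_mean, tiles))
```

For CPU-bound work, ProcessPoolExecutor from the same module offers the same map interface and spreads the tiles across processor cores.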

80
Parallel Processing
  • When does it make sense to use it?
  • Some tasks are easier than others to make
    parallel. Old cliché: nine women can't produce
    a baby in one month, but 90 women can produce 90
    babies in nine months.
  • Whether or not a model can be implemented in a
    parallel fashion depends not only on the nature
    of the problem, but also on the number of
    iterations needed.
  • Problems that are inherently serial, or that
    would be very difficult to make parallel
    (parallel code tends to take several times longer
    to write), are not suitable
  • But most problems in geography are likely to be
    parallel (e.g., you can usually subdivide a map,
    and a problem, into sectors that can be processed
    independently).

81
Parallel Processing
  • One interesting sidebar: distributed computing
    or task farming across multiple machines

82
Parallel Processing
  • For example, large distributed networks of
    volunteer machines can be used to process large
    parallel processing tasks such as global climate
    modeling.
  • BOINC software for distributed volunteer
    computing supports such processing

83
Parallel Processing
  • BOINC Applications
  • Climate Prediction Model

84
Object-Oriented Programming
  • Traditional programming consists of functions
    that roughly approximate the notion of a
    mathematical function.
  • Problems are modeled by breaking them down into
    layers of functions in a process called
    structured decomposition.
  • Object-oriented programming instead identifies
    units of behavior and data that can be grouped
    together to model some real-world entity.

85
Object-Oriented Programming
86
Object-Oriented Programming
87
Object-Oriented Programming
  • Parallel processing and object-oriented software
    development make it easy to approach
    computational problems in a bottom-up fashion,
    using simple components to model systems that
    would otherwise be extremely complex.
  • An object encapsulates its state and behaviors,
    making it easy to create complex systems composed
    of autonomous software entities.
  • An OOP and parallel-programming perspective
    applied to spatial modeling is a convenient
    approach for most GeoComputation techniques.