Title: Ecosystem Analysis Using Probabilistic Relational Modeling Bruce D'Ambrosio, Eric Altendorf, Jane Jor
1 Ecosystem Analysis Using Probabilistic Relational Modeling
Bruce D'Ambrosio, Eric Altendorf, Jane Jorgensen
- Presented by Iulia Oroian and Leonard Rodrigo
- Tuesday, Dec 2nd, CSCE 582, Fall 2003
- Instructor: Dr. Marco Valtorta
2 Definitions
- Ecosystems
- Systems composed of interacting populations of organisms and their environment
- Community-level ecosystem model
- An integrated model of the ecosystem as a whole
- Synthetic variables
- Variables derived from observational data
- Aggregator
- A count or value of a specific variable, included in the synthetic variable space
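The definitions of synthetic variables and aggregators can be illustrated with a minimal sketch. The table layout, column names, the `aggregate_by_survey` helper, and all values below are hypothetical and chosen only for illustration; the slides do not specify which aggregators the authors used beyond counts and values.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows: (survey_id, species, weight_g). Illustrative values only.
fish_specimens = [
    (1, "kokanee", 120.0),
    (1, "kokanee", 140.0),
    (1, "rainbow_trout", 300.0),
    (2, "rainbow_trout", 280.0),
    (2, "rainbow_trout", 320.0),
]

def aggregate_by_survey(rows):
    """Build synthetic variables per survey by applying simple aggregators."""
    grouped = defaultdict(list)
    for survey_id, species, weight in rows:
        grouped[survey_id].append((species, weight))

    synthetic = {}
    for survey_id, items in grouped.items():
        species_list = [s for s, _ in items]
        weights = [w for _, w in items]
        counts = {s: species_list.count(s) for s in set(species_list)}
        # Most frequent species; ties are broken alphabetically for determinism.
        dominant = max(sorted(counts), key=counts.get)
        synthetic[survey_id] = {
            "count": len(items),              # count aggregator
            "mean_weight": mean(weights),     # value aggregator
            "dominant_species": dominant,     # mode aggregator
        }
    return synthetic

result = aggregate_by_survey(fish_specimens)
```

Each entry of `result` is a row of derived variables that could be attached to the parent survey record and offered to a model learner alongside the directly observed variables.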
3 Goal
- To aid domain scientists in gaining insight into data.
- Controlled experimentation in an ecosystem is undesirable; it is therefore desirable to create comprehensive models from the vast amount of observational data available.
- Generally, individual, domain-specific teams apply traditional statistical methods to investigate correlations among variables in their separate datasets.
- Few methods exist for investigating the complex, noisy cross-disciplinary interactions that are crucial to understanding the ecosystem as a whole.
4 Abstract
- Application of relational model discovery methods to building comprehensive ecosystem models from data.
- In particular, two projects are considered:
- - Crater Lake Ecosystem
- - West Nile Virus Disease Transmission
- In both cases, relational probabilistic model discovery is applied to build community-level models of the ecosystems.
5 Project 1: Crater Lake
- Problem
- The National Park Service (NPS) is concerned about long-term changes in the clarity of Crater Lake, a national park and the clearest deep-water lake in the world.
- So far, an overall assessment of lake health that links the various domain-specific surveys has been lacking.
- Using relational model discovery methods, the authors try to derive parameters that account for variations in explicit variables, such as the clarity of the lake water.
6 Project 1: Crater Lake
- Data
- Data are obtained from long-term studies of the lake (some readings go back to 1880).
- These data have been collected in tables using various temporal and spatial scales.
- Examples: surface weather condition information, phytoplankton densities, weather data at altitude.
- Notice that the temporal and spatial granularity of the data varies: surface weather condition information is available on a daily basis, whereas phytoplankton densities are measured only once or twice a month, and weather data at altitude is rarely available.
7 Project 1: Crater Lake
- Method
- A set of temporal units was chosen to frame the analysis. Expert knowledge was used for this purpose.
- These units were time periods corresponding to observed patterns of lake clarity and for which data were available.
- In the project: Jun-Jul, Aug, Sep-Oct
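One way to picture how readings collected at different granularities can be brought onto the chosen temporal units is sketched below. The three unit labels come from the slide; the exact month boundaries, the `year_segment` and `mean_per_segment` helpers, the use of a mean as the summary, and the sample readings are all assumptions made for illustration.

```python
from datetime import date
from statistics import mean

# Unit labels are from the slide; mapping whole months to units is an assumption.
SEGMENTS = {6: "Jun-Jul", 7: "Jun-Jul", 8: "Aug", 9: "Sep-Oct", 10: "Sep-Oct"}

def year_segment(d):
    """Return (year, segment) for a date, or None if it falls outside Jun-Oct."""
    label = SEGMENTS.get(d.month)
    return (d.year, label) if label else None

def mean_per_segment(readings):
    """Average (date, value) readings within each (year, segment) unit."""
    buckets = {}
    for d, value in readings:
        key = year_segment(d)
        if key is not None:
            buckets.setdefault(key, []).append(value)
    return {key: mean(values) for key, values in buckets.items()}

# Hypothetical daily readings; not actual Crater Lake measurements.
daily_wind = [
    (date(2000, 6, 1), 4.0),
    (date(2000, 7, 15), 6.0),
    (date(2000, 8, 3), 5.0),
    (date(2000, 12, 1), 9.0),  # outside the chosen units, so it is dropped
]

wind_by_segment = mean_per_segment(daily_wind)
```

The same function can be applied to a series measured once or twice a month, so that daily and monthly tables end up keyed by the same `(year, segment)` pairs.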
8 Project 1: Crater Lake
- Challenges
- Problem: dealing with time, which wasn't explicitly reified; constructing paths like secchi.DesDepth.yrSegment.Phyto.density was therefore a problem.
- Solution: manually add a Season table.
- Problem: how to gain scientific insight into the data.
- Solution: learning models over not just the variables in the provided tables, but over their parents as well.
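The idea of reifying time as its own table can be sketched with an in-memory SQLite database. Only the existence of a Season table is stated in the slides; the schema, column names, the `phyto_density_for_secchi` helper, and the rows below are hypothetical, and the join is just one plausible reading of how a path from a Secchi reading to phytoplankton density might be followed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Time reified as a table: one row per (year, segment). Hypothetical schema.
CREATE TABLE Season (id INTEGER PRIMARY KEY, year INTEGER, segment TEXT);
CREATE TABLE Secchi (id INTEGER PRIMARY KEY,
                     season_id INTEGER REFERENCES Season(id),
                     des_depth REAL);
CREATE TABLE Phyto  (id INTEGER PRIMARY KEY,
                     season_id INTEGER REFERENCES Season(id),
                     density REAL);
-- Illustrative values only.
INSERT INTO Season VALUES (1, 2000, 'Jun-Jul'), (2, 2000, 'Aug');
INSERT INTO Secchi VALUES (1, 1, 30.0), (2, 2, 27.5);
INSERT INTO Phyto  VALUES (1, 1, 1.0), (2, 1, 3.0), (3, 2, 5.0);
""")

def phyto_density_for_secchi(connection):
    """Follow Secchi -> Season -> Phyto and average density per Secchi reading."""
    return connection.execute("""
        SELECT s.id, s.des_depth, AVG(p.density)
        FROM Secchi AS s
        JOIN Season AS t ON t.id = s.season_id
        JOIN Phyto  AS p ON p.season_id = t.id
        GROUP BY s.id, s.des_depth
        ORDER BY s.id
    """).fetchall()

rows = phyto_density_for_secchi(conn)
```

Because both data tables reference Season through a foreign key, the path between them becomes an ordinary chain of joins, and an aggregator (here `AVG`) collapses the many phytoplankton rows reached along the path into a single value.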
9 Project 1: Crater Lake
- A complete schema for the data tables related to the temporal tables is shown in Figure 1.
10 Project 1: Crater Lake
- After performing the analysis (that is, applying the relational model discovery method), the following essential elements appeared in the discovered model.
11 Project 1: Crater Lake Results
- One relationship that was discovered is that the dominant fish species in gill net catches was probabilistically dependent upon
- - Secchi descending depth (water clarity) in the current year
- - mean fish weight in the current year
- - Secchi descending depth in the previous year
- - dominant fish species two years previous
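The four parents listed above mix current-year and lagged values. A minimal sketch of how such a parent set could be assembled for one year is given below; the `lagged_parents` helper, the dictionary layout, and every value are hypothetical and are not taken from the Crater Lake data or the authors' implementation.

```python
# Hypothetical yearly summaries; illustrative values, not Crater Lake data.
yearly = {
    1990: {"secchi": 28.0, "mean_fish_weight": 150.0, "dominant_species": "kokanee"},
    1991: {"secchi": 30.5, "mean_fish_weight": 160.0, "dominant_species": "kokanee"},
    1992: {"secchi": 26.0, "mean_fish_weight": 210.0, "dominant_species": "rainbow_trout"},
    1993: {"secchi": 29.0, "mean_fish_weight": 180.0, "dominant_species": "kokanee"},
}

def lagged_parents(summaries, year):
    """Collect the four parent values for `year`, or None if a lagged year is missing."""
    needed = (year, year - 1, year - 2)
    if any(y not in summaries for y in needed):
        return None
    return {
        "secchi_t": summaries[year]["secchi"],
        "mean_fish_weight_t": summaries[year]["mean_fish_weight"],
        "secchi_t_minus_1": summaries[year - 1]["secchi"],
        "dominant_species_t_minus_2": summaries[year - 2]["dominant_species"],
    }

parents_1993 = lagged_parents(yearly, 1993)
parents_1991 = lagged_parents(yearly, 1991)  # 1989 is missing, so this is None
```

Rows built this way, paired with the dominant species observed in the same year, are the kind of input from which a conditional probability table for the dependency could be estimated.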
12 Project 1: Crater Lake Results
- Other findings
- Schools of Kokanee smolts swimming at the edges of the lake are preyed upon by Rainbow trout, and this phenomenon does not occur every year. A time lag of two years, discovered by the model, is consistent with experts' observations. The relation between this interaction and water quality was previously unknown.
- The centrality of water clarity (measured by the Secchi DesDepth parameter).
- The lack of a direct relationship between Zooplankton count and water clarity.
- These findings suggest that fish attributes may serve as a predictor of water clarity.
13 Project 1: Crater Lake Results
- Another important result
- Learning models over not just the variables in the provided tables but over their parents as well provides additional insight.
- An example for the FishSpecimen table is shown in Figure 3.
14 Project 2: West Nile Virus
- Data available
- Reports of dead birds testing positive
- Reports of breeding populations of mosquitoes testing positive
- Human case reports
- Landscape type
15 Project 2: West Nile Virus Database Types
- Static Type
- Presence of permanent mosquito breeding sites (tire disposal facilities, etc.)
- Landscape type
- Event Type
- Located in place and time
- Birds testing positive for West Nile
- Mosquitoes testing positive for West Nile
16 Project 2: West Nile Virus Modeling Method
- Attempt to create a model of the spread of the West Nile Virus in Maryland, 2001
- Selectors are used to relate the correct subset of values to other nodes.
17 Project 2: West Nile Virus Relating Different Databases
- Location and Time are continuous variables
- This is handled by creating a scale. The scale is determined by examining previous case studies, such as those on the life cycle of disease-carrying mosquitoes and the flight distance of competent bird hosts.
- In this particular study, the spatial / temporal scale consisted of 5 miles and 1 month.
- Selectors
- Implemented as boolean types: true for elements in the same range, and false for elements outside.
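A boolean selector over the 5 mile / 1 month scale might look like the sketch below. The thresholds come from the slide; the event layout `(lat, lon, date)`, the use of great-circle (haversine) distance, the approximation of one month as 30 days, and the function names are assumptions, and the coordinates are made up for illustration.

```python
import math
from datetime import date

MAX_MILES = 5.0               # spatial scale stated on the slide
MAX_DAYS = 30                 # "1 month" approximated as 30 days (assumption)
EARTH_RADIUS_MILES = 3958.8   # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def selector(event_a, event_b):
    """True when two (lat, lon, date) events fall within the same space/time range."""
    lat1, lon1, day1 = event_a
    lat2, lon2, day2 = event_b
    close_in_space = haversine_miles(lat1, lon1, lat2, lon2) <= MAX_MILES
    close_in_time = abs((day1 - day2).days) <= MAX_DAYS
    return close_in_space and close_in_time

# Hypothetical events; coordinates and dates are illustrative only.
bird = (39.00, -76.90, date(2001, 8, 1))
mosquito_near = (39.03, -76.90, date(2001, 8, 20))
mosquito_far = (39.50, -76.90, date(2001, 8, 20))
mosquito_late = (39.00, -76.90, date(2001, 10, 15))
```

Here `selector(bird, mosquito_near)` is true, while the other two pairings are false because one exceeds the distance threshold and the other the time threshold. Only the event pairs for which the selector is true would contribute values to the related node.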
18 Project 2: West Nile Virus Model Fragment
19 Project 2: West Nile Model Results
- The researchers found that there were too few human and horse cases to use them effectively in modeling the spread of the virus.
- The model was, however, reasonably accurate, which may imply that it is not necessary to gather data on insignificant hosts such as horses.
20 Conclusions and Future Work
- Relational probabilistic modeling provides a natural framework for investigating ecological data.
- Working from the system's relational database, relational learning methods provide the opportunity to learn comprehensive models directly from the data sources.
- There are still limitations in the current synthetic variable construction methods.