Title: Statistical Analysis
1Statistical Analysis and Dissemination of Census
Data
2Statistical Analysis and Dissemination of Census
Data
- Outline
- The Power of Maps
- Introduction and Example
- Dynamic Census Atlases
- Overview Example
- Spatial Analysis Techniques
- Overview Examples
- Digital Geographic Data for Dissemination
- Overview Cost and Benefits
- Digital Data Dissemination Strategies and Users
- Overview of Users
3Anyone or anything can be associated with a known
location in the world
4CHILE HOUSING AND POPULATION CENSUS DISTRICTS
2002
5Tsunami Affected Areas in Gizo, Solomon Islands
6The power of maps
- Maps communicate a concept or an idea.
- Maps are often meant to support textual information.
- Maps appeal to the viewer's curiosity.
- Maps summarize large amounts of information concisely.
- Maps can be used for description, exploration, confirmation, and tabulation.
- Maps encourage comparisons:
- Between different areas on the same map: where are population densities highest?
- Between different maps: is child mortality higher in the districts of province A than in province B?
- Between different maps of the same area: where and by how much do literacy rates for males and females differ in the districts?
- Between maps for different time periods: did fertility rates decline since the last census?
7Dynamic census atlases
- An alternative to a static census atlas.
- Publishing a digital map and database together with mapping software allows users to produce custom maps of census indicators.
- Normally includes digital boundary files at a lower resolution than the full census database, to allow fast drawing and low disk usage.
- The closely integrated attribute table should contain only a selected number of census indicators.
- Densities and ratios that are appropriate for mapping should already be calculated.
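As a minimal sketch of pre-calculating mapping indicators for an atlas attribute table, the snippet below derives a population density and a sex ratio (males per 100 females) for one unit. The field names and the sample district are hypothetical, not taken from any census database.

```python
def add_mapping_indicators(unit):
    """Return a copy of the unit with density and sex ratio added."""
    derived = dict(unit)  # copy, so the source record is left unchanged
    derived["density_per_km2"] = unit["population"] / unit["area_km2"]
    # Sex ratio expressed as males per 100 females.
    derived["sex_ratio"] = 100.0 * unit["males"] / unit["females"]
    return derived


# Hypothetical district record, for illustration only.
district = {
    "name": "District X",
    "population": 12000,
    "area_km2": 300.0,
    "males": 5900,
    "females": 6100,
}
atlas_row = add_mapping_indicators(district)
```

Shipping the derived columns with the atlas means users can map them directly instead of mapping raw counts, which are rarely appropriate for choropleth maps.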
8Dynamic census atlases
- The data provider should therefore provide an easy-to-use package together with the boundaries and data.
- Using that package should require minimal training and experience.
- The application should be plug-and-play: after installation, the user should immediately be able to produce maps.
- Drill-down options for different geographic selections.
- Interactive area delineation options (e.g. select schools in a district).
9A screenshot of Ukraine's dynamic census atlas
10Spatial Analysis Techniques
- The main use of spatial analysis is for census products and services.
- Techniques include buffering, linear interpolation, point pattern analysis, and cartograms.
- All offer functionality beyond standard thematic (choropleth) mapping, with many tools now available in both commercial and open-source software.
11Spatial Analysis Techniques
- Some prevalent forms of spatial analysis that are especially useful with population data include:
- Queries
- Distance measurements
- Transformations
- Buffering
- Point-in-polygon analysis
- Polygon overlay analysis
12Spatial Analysis Techniques
- Queries
- Often the first step in an analysis, where one seeks to create a subset of units, such as populated places with certain characteristics, allowing the user to check how typical an observation is against other observations.
- Queries use a GIS program to answer simple questions posed by the user, with no changes to the database and no new data produced.
- An example of a query using geocoded census data is "select all towns with a population greater than 1,000 persons". These towns can then have their attributes summarized, for instance to compare their total fertility rates against those of smaller towns and villages, and the results can be mapped.
- The term "exploratory data analysis" refers to investigating patterns and trends in data using techniques such as querying.
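The query described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the town records, field names, and fertility values are hypothetical, and a real GIS would run the same selection against its attribute table.

```python
# Hypothetical geocoded census records; names and values are illustrative only.
towns = [
    {"name": "A", "population": 5200, "tfr": 2.1},
    {"name": "B", "population": 800, "tfr": 3.4},
    {"name": "C", "population": 1500, "tfr": 2.6},
    {"name": "D", "population": 300, "tfr": 3.9},
]


def select(records, predicate):
    """Return the records matching the predicate; the input is not modified."""
    return [r for r in records if predicate(r)]


def mean_tfr(records):
    """Unweighted mean of the total fertility rate over the given records."""
    return sum(r["tfr"] for r in records) / len(records)


# "Select all towns with a population greater than 1,000 persons."
large = select(towns, lambda r: r["population"] > 1000)
small = select(towns, lambda r: r["population"] <= 1000)

# Summarize the two subsets so they can be compared or mapped.
comparison = {"large": mean_tfr(large), "small": mean_tfr(small)}
```

The query only reads the data: it produces a subset and a summary, and the underlying database is unchanged.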
13Area delineation
- E.g. Interactive determination of school
districts with the same number of children in
each school grade by aggregating census
dissemination areas
14Spatial Analysis Techniques
- Distance measurements
- Easily done in any GIS program, using the centroids (center points) of cities, towns, and villages.
- An analysis can select villages located more than a kilometer from a school, clinic, or water source.
- These villages can then be analyzed further using the attribute information for the populated places themselves.
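A minimal sketch of the distance selection above, assuming village centroids and service locations are given as latitude/longitude pairs in decimal degrees. It uses the haversine great-circle formula; the coordinates and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def far_from_service(villages, services, limit_km=1.0):
    """Return names of villages whose nearest service is farther than limit_km."""
    result = []
    for name, lat, lon in villages:
        nearest = min(haversine_km(lat, lon, s_lat, s_lon) for s_lat, s_lon in services)
        if nearest > limit_km:
            result.append(name)
    return result


# Hypothetical centroids (name, lat, lon) and one school location (lat, lon).
villages = [("V1", 0.000, 0.000), ("V2", 0.020, 0.000), ("V3", 0.005, 0.005)]
schools = [(0.001, 0.001)]
remote = far_from_service(villages, schools)
```

The resulting list of village names can then be joined back to the attribute table for further analysis.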
15Spatial Analysis Techniques
- Transformations
- Methods of spatial analysis that use simple geometric, arithmetic, or logical rules to create new datasets.
- Transformations can include operations that convert raster into vector data, or a stream of GPS coordinates into a route or a boundary.
- Of all the transformation techniques, buffering is the best known and most important.
16Spatial Analysis Techniques
- Buffering (transformation)
- Involves building a new data layer by identifying all areas that are within a specified distance of the original features.
- Buffering can be performed on points, lines, and polygons, and can be weighted by attribute values.
- Buffering can be used to model travel time, for instance by creating a catchment area around a particular feature such as a school or a clinic.
- This provides a measure of accessibility that can be mapped across the extent of a country.
17Buffer Operations: "is near to"
- Point buffer
- Affected area around a hospital
- Catchment area of a water source
18Buffer Operations
- Line buffer
- How many people live near the polluted river?
- What is the area impacted by highway noise?
19Buffer Operations
- Polygon buffer
- Area around a reservoir where development should
not be permitted
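The line-buffer question "how many people live near the polluted river?" can be sketched without building the buffer polygon explicitly: a settlement is inside the buffer exactly when its distance to the line is at most the buffer width. The sketch assumes planar (projected) coordinates in meters; the river, settlements, and buffer width are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def population_in_line_buffer(settlements, line, distance):
    """Sum the population of settlements within `distance` of the polyline."""
    total = 0
    for x, y, pop in settlements:
        d = min(
            point_segment_distance((x, y), line[i], line[i + 1])
            for i in range(len(line) - 1)
        )
        if d <= distance:
            total += pop
    return total


# Hypothetical river (polyline vertices) and settlements (x, y, population).
river = [(0, 0), (1000, 0), (2000, 500)]
settlements = [(500, 200, 1200), (1500, 1500, 800), (2100, 500, 300)]
near_river = population_in_line_buffer(settlements, river, 250)
```

A point buffer is the special case of a one-vertex line, and a polygon buffer additionally counts everything inside the polygon itself.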
20Spatial Analysis Techniques
- Point-in-polygon analysis
- Determines whether a point lies inside or outside a polygon.
- Can be used to compare geocoded village centroids lying inside and outside hazardous areas such as tropical storm tracks or earthquake zones.
- Polygon overlay analysis
- Involves comparing the locations of two different polygonal data layers.
- For example, the boundaries of two administrative districts could be compared to troubleshoot errors in the field enumeration process.
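One common way to implement the point-in-polygon test is ray casting: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as "inside". The sketch below assumes a simple polygon given as a list of vertices; the hazard zone and village centroids are hypothetical, and points lying exactly on the boundary are not handled specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical hazard zone (square) and village centroids.
hazard_zone = [(0, 0), (10, 0), (10, 10), (0, 10)]
villages = {"V1": (5, 5), "V2": (12, 3), "V3": (1, 9)}
exposed = sorted(
    name for name, (x, y) in villages.items() if point_in_polygon(x, y, hazard_zone)
)
```

The two resulting groups (inside and outside the zone) can then be compared using their census attributes.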
21Spatial Analysis Techniques
- Spatial interpolation
- A spatial analysis method designed to fill in values that lie between observations.
- A variety of methods, including inverse-distance weighting and kriging, are used to estimate the values at unsampled sites.
- Based on Tobler's first law of geography: nearby objects are more similar than distant objects.
- Kriging: an interpolation technique for obtaining statistically unbiased estimates of spatial variation, such as surface elevations or yield measurements, from a set of known control points.
- In kriging, the general properties of a surface are modeled to estimate the missing parts of the surface.
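Of the methods named above, inverse-distance weighting is the simplest to sketch: the estimate at an unsampled site is a weighted mean of the sampled values, with weights proportional to 1/d^p. The power p = 2 is a common default and is an assumption here, as are the sample points; kriging is not shown because it additionally requires fitting a model of spatial covariance.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            # The site coincides with a sample: return the observed value.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical observations (x, y, value) in planar coordinates.
samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 30.0)]
estimate = idw(5, 0, samples)
```

Because the weights are positive and sum to one after normalization, the estimate always lies between the smallest and largest sampled values.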
22Example of linear interpolation creating contours
23Thiessen polygons illustrated
Spatial Analysis Techniques
- Thiessen polygons
- Have the unique property that each polygon contains only one input point (e.g. a settlement), and any location within a polygon is closer to its associated point than to the point of any other polygon.
- This method assumes that the values at unsampled locations are equal to those of the nearest sampled point.
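The defining property of Thiessen polygons gives a simple membership test: a location belongs to the polygon of whichever input point is nearest. The sketch below uses that test directly rather than constructing polygon boundaries; the settlement names and coordinates are hypothetical, and locations exactly equidistant from two points (on a polygon boundary) are assigned to whichever point comes first.

```python
import math


def nearest_point(x, y, points):
    """Return the name of the input point whose Thiessen polygon contains (x, y)."""
    return min(
        points,
        key=lambda name: math.hypot(x - points[name][0], y - points[name][1]),
    )


def assign_value(x, y, points, values):
    """Give an unsampled location the value observed at its nearest sampled point."""
    return values[nearest_point(x, y, points)]


# Hypothetical settlements (planar coordinates) and an observed value at each.
settlements = {"A": (0, 0), "B": (10, 0), "C": (5, 10)}
observed = {"A": 120, "B": 85, "C": 40}
value_at_site = assign_value(9, 1, settlements, observed)
```

Applying `nearest_point` to every cell of a grid would rasterize the full Thiessen tessellation.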
24Areas of influence
- Commuting distances: daily commuter flows
25Spatial Analysis Techniques
- Descriptive summaries are a spatial equivalent of descriptive statistics (such as the mean and standard deviation) that capture the essence of a dataset in one or two numbers.
- Centers of population are the two-dimensional equivalent of a statistical mean, computed as the population-weighted average of the x and y coordinates of populated points.
- Point pattern or cluster analysis examines the distribution of points in space, irrespective of their actual locations, to determine whether patterns are random, clustered, or dispersed.
- Hot spots are places where high values are surrounded by high values; cold spots are places where low values are surrounded by low values. Both are particularly useful for identifying populations at risk.
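The center of population described above is a weighted mean and can be sketched directly. The sketch assumes planar (projected) coordinates; the populated points are hypothetical.

```python
def center_of_population(points):
    """Population-weighted mean of x and y for (x, y, population) points."""
    total = sum(pop for _, _, pop in points)
    cx = sum(x * pop for x, _, pop in points) / total
    cy = sum(y * pop for _, y, pop in points) / total
    return cx, cy


# Hypothetical populated points: (x, y, population).
places = [(0.0, 0.0, 1000), (10.0, 0.0, 3000), (0.0, 10.0, 1000)]
center = center_of_population(places)
```

Computing the center for two successive censuses and comparing the results is one way to summarize how the population distribution has shifted.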
26Spatial Analysis Techniques
- Cartograms
- Sometimes used to display census results
- The areas of the original polygons are expanded
or contracted based on their attribute values
such as population size or voting habits
27Modelling smoothing
- Evolution of the population between two censuses
28Digital Geographic Data for Dissemination
- Demand for digital databases that consist of extractions from the census agency's digital geographic master database will only increase.
- Census data are an important input in policy planning and academic analysis in many fields.
- Health service provision, educational resource allocation, design of utilities and infrastructure, and electoral planning are some applications where government agencies require spatially referenced small-area population statistics.
- Commercial users employ such data for marketing applications and location decisions.
29Digital Geographic Data for Dissemination
- Benefits and costs
- Benefits: unsurpassed detail and precision, the potential use of census data in numerous applications (especially when overlaid on other geographic data such as terrain), and the relative ease of management and storage of thousands of units.
- Costs: expense in processing and data management, possible data disclosure issues, and quality control. Costs of metadata production should be factored into the equation as well.
30Digital Data Dissemination Strategies and Users
- The wide range of potential users of disaggregated census data means that the national statistical office (NSO) needs to pursue a multi-level digital data dissemination strategy.
- Broadly, we can distinguish between the following types of users:
- Advanced GIS users
- Computer literate users
- Novice users
31Digital Data Dissemination Strategies and Users
- Advanced GIS users
- Work easily with large datasets and can use FTP to access them.
- Require extensive metadata. Sometimes called data extractors or power users.
- They will want access to spatial and attribute information in a comprehensive digital geographic format.
- The census office needs to supply comprehensive documentation on the geographic parameters used for the geographic database as well as on the individual census variables.
- The spatial information will be distributed in an open geographic format that can easily be converted into any number of commercial GIS formats.
32Digital Data Dissemination Strategies and Users
- Computer literate users
- Government, commercial, or private sector users who want to be able to browse the thematic information in a census database spatially.
- Want to produce thematic maps and thus need to be able to perform simple manipulation of cartographic parameters.
- Simple analytical functions, such as aggregation of census units to custom-designed regions, should also be possible.
- This group of users is best served with a comprehensive, pre-packaged application that is designed for a commercial or freely available desktop mapping package.
- Documentation requirements are somewhat smaller, since the users are unlikely to change the geographic parameters of the database or perform more advanced GIS operations.
33Digital Data Dissemination Strategies and Users
- Novice users
- Largely want to view pre-prepared maps on a computer and perhaps perform some basic queries.
- The best data distribution strategy is often to produce a self-contained digital census atlas.
- This atlas could consist of a series of static map images, for example in the form of a slide show.
- Or it could be a very simple mapping interface with pre-designed map views that allow basic queries.
- Both static maps and a simple map interface can be made accessible through the Internet.
34THANK YOU FOR YOUR ATTENTION