Title: Demand Point Aggregation for Location Models
1Demand Point Aggregation for Location Models
- R. L. Francis University of Florida
- francis_at_ise.ufl.edu
- T. J. Lowe University of Iowa
- tlowe_at_blue.weeg.uiowa.edu
2Acknowledgement
- We are happy to thank Hulya Emir-Farinas and Brenda Rayco for their help, particularly computational work summarized in many of the figures and tables.
- This presentation is a shortened version of a tutorial given at the Istanbul EURO-INFORMS meeting in July 2003.
3Outline
- Location problems
- Location analysis
- Location model typology
- References
- MIP location models, MP aggregation
- Demand point modeling, aggregation
- Common DP aggregation approach
- Law of diminishing returns
- Aggregation error measures
- SAND location models
- Error bounds
- Paradox of aggregation
- Overview of some aggregation algorithms
- Example aggregations
- Conclusions
4Quote (Frank Plastria, 2002)
- "A location problem arises whenever a question is raised like: Where are we going to put the thing(s)?"
- The next two questions then immediately follow:
- Which places are available?
- On what basis do we choose?
5Example location problems
- house or apartment
- branch banks
- automobile dealerships
- ATMs
- tax offices
- grocery stores
- schools
- lock boxes for periodic payments
- warehouses/distribution centers
- factories
Choosing a house or apartment is a problem you have solved. Tradeoffs include 1) rent and 2) travel time to UF.
6Distances (or times) matter
- distance to work
- distance to bank
- distance to shopping
- time check is in mail
- transportation distance to/from factory
- transportation distance to/from warehouse/DC
7Location models
- Location models often try to capture some of the above distance aspects, as well as fixed costs in many cases.
- We seek a location, or locations, from either a finite or infinite set, with distances appropriately defined, to minimize some cost expression.
8Purpose of location analysis
- Suggest and identify options for
- Number of facilities (servers)
- Locations of facilities
- Size of facilities
- Allocation of demands (supplies) to facilities
9Types of location models: tradeoffs
- Discrete/MIP models: most accurate, least computationally tractable, better for numbers than insights.
- Continuous planar models: least accurate, often very computationally tractable, better for insight than accuracy.
- Network models: a compromise between MIP and planar models; use shortest-path distances; require a network database; may discretize to solve.
10Strategic, tactical, operational management
(courtesy of Stefan Nickel; Bender et al., 2001)
- Strategic Management (little data)
- Long planning horizon, high aggregation level, planar location models.
- Tactical Management (more data)
- Medium planning horizon, medium aggregation level, network location models.
- Operational Management (much data)
- Short planning horizon, low aggregation level, scheduling and routing models.
11References
- Discrete Location Theory, Mirchandani and Francis, eds., Wiley, 1990.
- Facility Location, Z. Drezner, ed., Springer, 1995.
- Network and Discrete Location, M. Daskin, Wiley, 1995.
- Facility Location: Applications and Theory, Z. Drezner and H. Hamacher, eds., Springer, 2002. Includes chapter on demand point aggregation.
- Various published papers (see handout)
- Papers from ISOLDE IX, June 2002
- SOLA e-mail: sola_at_bobcat.ent.ohiou.edu
- EURO Working Group on Locational Analysis
- http://www.vub.ac.be/EWGLA/homepage.htm
13Demand point (DP) modeling background
- Many location problems deal with locating facilities with respect to demand points.
- In urban settings, there can be more than 100,000 demand points.
- Demand point data is often readily available: CD-ROM phone books, GIS address matching, the U.S. Post Office Delivery Point Validation (DPV) database with 145 million addresses, and many commercial suppliers.
14DP aggregation: benefits and costs
- aggregation reduces
- data collection cost
- statistical uncertainty
- modeling cost
- computing cost
- confidentiality concerns
- aggregation increases
- modeling error (our focus)
15DP aggregation Basic question
- How do we aggregate DPs to keep the modeling
error low, yet have a tractable model?
16Common aggregation modeling approach used in
practice
- Replace every DP in each postal code region/zip code area by the centroid of the region.
- This is inexpensive, but may cause a large aggregation modeling error.
17Basic aggregation idea
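[Figure: two regions. DPs p1, p2, p3, ..., p999, p1000 lie in the first region, whose aggregate point is c1; DPs p1001, p1002, ..., p1999, p2000 lie in the second region, whose aggregate point is c2.]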
18Basic aggregation idea
- Choose some single point in each region, and aggregate every demand point in the region into this single point: replace each $p_i$ by $p_i'$.
- Example:
- $p_i' = c_1$, $i = 1, \dots, 1000$
- $p_i' = c_2$, $i = 1001, \dots, 2000$.
- $c_1$, $c_2$ are aggregate demand points (ADPs).
- Note: the $p_i$ are distinct, but the $p_i'$ are not.
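19Fundamental Aggregation Insight: Law of Diminishing Returns (LDR)
[Figure: some aggregation error measure (vertical axis) plotted against the number of aggregate DPs (horizontal axis). Points on the curve are labeled "bad choice", "better choice", and "costly choice".]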
20Law of diminishing returns
- Our experience with DP aggregation is that this well-known law usually applies, and is practically important.
- Too few ADPs give a high error; many ADPs may not accomplish much more than somewhat fewer.
21Numerical example LDR for covering problem,
using RC-Cen (explained later)
22Aggregation decisions to make
- (D-1) The number of ADPs
- (D-2) The locations of the ADPs
- (D-3) The replacement rule: replace each $p_i$ by some $p_i'$.
- Choosing ADPs is itself a location problem. DP aggregation is a kind of second-order location problem.
23Location model notation
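- $d(x,y)$: some distance/metric (e.g., shortest path, Euclidean, rectilinear)
- $X = \{x_1, \dots, x_n\}$: collection of $n$ facilities to locate
- $P = (p_1, \dots, p_m)$: vector of DPs
- $P' = (p_1', \dots, p_m')$: vector of ADPs
- $M = \{1, \dots, m\}$: DP index set
- Each $p_i$ is aggregated into (replaced by) $p_i'$.
- Note: commonly $p$ is used for the number of facilities (as in "p-median"), but we use $p$ for a demand point and $n$ for the number of facilities.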
24Notation comment
- We have
- $P = (p_1, \dots, p_m)$: vector of DPs
- $P' = (p_1', \dots, p_m')$: vector of ADPs
- Before solving an aggregated problem, we use the fact that the ADPs are not distinct by combining terms and doing away with redundancies. For analytical purposes, it is useful to refer to ADP $p_i'$ for each DP $p_i$.
25Location model notation
- $D(X,p_i)$: distance between DP $p_i$ and a closest new facility in $X$.
- $D(X,p_i')$: distance between ADP $p_i'$ and a closest new facility in $X$.
- $D(X,P) = (D(X,p_i))$, $D(X,P') = (D(X,p_i'))$: corresponding $m$-vectors of all closest distances.
26Location model notation
- $g$: a costing function that maps each of $D(X,P)$, $D(X,P')$ into a cost
- $f(X:P) = g(D(X,P))$: original model
- $f(X:P') = g(D(X,P'))$: aggregated model
- Because the ADPs are not distinct, $f(X:P')$ has fewer distinct DPs than $f(X:P)$, and is a smaller model. Some algebraic steps are typically necessary to simplify $f(X:P')$.
27Examples of $f(X:P) = g(D(X,P))$
- $D(X,P) = (D(X,p_1), \dots, D(X,p_m))$. Use this as the $Y$ vector in $g(Y)$.
- $g(Y) = w_1 y_1 + \dots + w_m y_m$ or
- $g(Y) = \max\{w_1 y_1, \dots, w_m y_m\}$
- $f(X) = g(D(X,P)) = w_1 D(X,p_1) + \dots + w_m D(X,p_m)$: n-median model, $|X| = n$
- $f(X) = g(D(X,P)) = \max\{w_1 D(X,p_1), \dots, w_m D(X,p_m)\}$: n-center model, $|X| = n$
28Simplifying the agg. model $f(X:P')$
- 1) $w_1 D(X,p_1') + \dots + w_m D(X,p_m')$: aggregated n-median model
- 2) $\max\{w_1 D(X,p_1'), \dots, w_m D(X,p_m')\}$: aggregated n-center model
- Extreme examples: if all $p_i' = c$,
- 1) becomes $W D(X,c)$, with $W = w_1 + \dots + w_m$.
- 2) becomes $W D(X,c)$, with $W = \max\{w_1, \dots, w_m\}$.
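The simplification on this slide can be illustrated with a minimal Python sketch. It assumes planar points stored as tuples and rectilinear distances; the function names and the toy data are hypothetical and chosen only for illustration. Repeated ADPs are merged by summing their weights for the n-median model and by taking the largest weight for the n-center model, which leaves the aggregated objective value unchanged while shrinking the model.

```python
from collections import defaultdict

def rectilinear(a, b):
    """Rectilinear distance between two planar points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def closest_distance(X, p, d):
    """D(X, p): distance from point p to a closest facility in X."""
    return min(d(x, p) for x in X)

def n_median(X, points, weights, d):
    """Weighted sum of closest distances."""
    return sum(w * closest_distance(X, p, d) for p, w in zip(points, weights))

def n_center(X, points, weights, d):
    """Largest weighted closest distance."""
    return max(w * closest_distance(X, p, d) for p, w in zip(points, weights))

def combine_for_median(adps, weights):
    """Merge repeated ADPs by summing their weights."""
    total = defaultdict(float)
    for p, w in zip(adps, weights):
        total[p] += w
    return list(total.keys()), list(total.values())

def combine_for_center(adps, weights):
    """Merge repeated ADPs by keeping the largest weight."""
    best = {}
    for p, w in zip(adps, weights):
        best[p] = max(w, best.get(p, w))
    return list(best.keys()), list(best.values())

# Hypothetical toy data: five DPs aggregated into two ADPs.
P = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
W = [1.0, 2.0, 1.0, 3.0, 1.0]
P_agg = [(0, 0), (0, 0), (0, 0), (5, 5), (5, 5)]
X = [(1, 1), (5, 4)]  # two facility locations

Q_med, V_med = combine_for_median(P_agg, W)
Q_cen, V_cen = combine_for_center(P_agg, W)
print(n_median(X, P_agg, W, rectilinear), n_median(X, Q_med, V_med, rectilinear))
print(n_center(X, P_agg, W, rectilinear), n_center(X, Q_cen, V_cen, rectilinear))
```

Each printed pair should agree: the five-term aggregated objective and its two-term simplified form are the same function of X.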
30Aggregation error
- There is no generally accepted way of measuring aggregation error.
- Note, however, that we have objective function value error unless $f(X:P) = f(X:P')$ for all $X$.
Here is an error, but not the kind we are
interested in.
31Various error measures
- ABC Error (Hillsman and Rhoda, 1978) for the n-median model.
- Compares $D(X,p_i)$ with $D(X,p_i')$: three cases depending on $X$, $p_i$, $p_i'$.
- This is a myopic error measure. The model objective structure is ignored.
32More error measures: n-median model
- DP $i$ Error:
- $e_i(X) = w_i D(X,p_i) - w_i D(X,p_i')$, $i \in M$.
- These errors can be negative or positive.
- Total Error:
- $e(X) = e_1(X) + \dots + e_m(X) = f(X:P) - f(X:P')$
- Self-Canceling Error: because each DP $i$ error $e_i(X)$ can be negative or positive, $e(X)$ can be nearly zero.
33Self-canceling error
- Very useful for n-median, and similar models. There is some experimental and theoretical evidence that centroids work well for such models if there are enough centroids.
- For other types of models, such as center and covering models, there is no self-canceling error.
34More error measures: any location model
- Absolute Error:
- $ae(X) = |f(X:P) - f(X:P')|$ for all $X$.
- The closer $ae(X)$ is to zero, the better.
- Relative Error:
- $rel(X) = ae(X)/f(X:P)$, all $X$.
- Maximum Absolute Error:
- $mae = \max\{ae(X) : X\}$. This is difficult to compute if minimizing $f(X:P)$ is NP-hard.
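A small numerical illustration of these error measures, as a hedged sketch: it assumes points on a line with distance |x - p|, and the data and function names are hypothetical. Two DPs at 0 and 2 are aggregated to their centroid 1. For a facility at 5, the two DP errors cancel in the n-median model, while the n-center model shows a nonzero absolute error.

```python
def D(X, p):
    """Closest distance from point p to the facilities in X (points on a line)."""
    return min(abs(x - p) for x in X)

def median_obj(X, P, W):
    return sum(w * D(X, p) for p, w in zip(P, W))

def center_obj(X, P, W):
    return max(w * D(X, p) for p, w in zip(P, W))

def dp_errors(X, P, P_agg, W):
    """e_i(X) = w_i D(X, p_i) - w_i D(X, p_i') for each DP i."""
    return [w * D(X, p) - w * D(X, q) for p, q, w in zip(P, P_agg, W)]

def abs_error(f, X, P, P_agg, W):
    """ae(X) = |f(X:P) - f(X:P')| for objective f."""
    return abs(f(X, P, W) - f(X, P_agg, W))

def rel_error(f, X, P, P_agg, W):
    """rel(X) = ae(X) / f(X:P)."""
    return abs_error(f, X, P, P_agg, W) / f(X, P, W)

# Hypothetical data: two DPs replaced by their centroid.
P = [0.0, 2.0]
P_agg = [1.0, 1.0]
W = [1.0, 1.0]
X = [5.0]  # a single facility

print(dp_errors(X, P, P_agg, W))              # one positive, one negative error
print(abs_error(median_obj, X, P, P_agg, W))  # errors cancel in the sum
print(abs_error(center_obj, X, P, P_agg, W))  # no cancellation in the max
```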
35Logical difficulty computing error measures
- Error measures logically involve both $f(X:P)$ and $f(X:P')$. However, we have to aggregate $P$ into $P'$ because $f(X:P)$ is difficult to compute for many choices of $X$.
- Thus it may be difficult to compute error measure values for many choices of $X$.
36Error bounds (EBs)
- Recall
- $mae = \max\{ae(X) : X\} = \max\{|f(X:P) - f(X:P')| : X\}$.
- An error bound is a number $eb$ for which $mae \le eb$.
- That is,
- $|f(X:P) - f(X:P')| \le eb$ for all $X$.
- A small $eb$ value gives a small $mae$ value.
37Mathematical programming results due to Geoffrion
(1977)
- Suppose $|f(X:P) - f(X:P')| \le eb$ for all $X \in S$. Let $X^*$ solve $\min\{f(X:P) : X \in S\}$, and let $X'$ solve $\min\{f(X:P') : X \in S\}$. We have
- $|f(X^*:P) - f(X':P')| \le eb$
- $f(X':P) - f(X^*:P) \le 2\,eb$.
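Both inequalities follow in a few lines from the assumed error bound and the optimality of the two solutions. The derivation below is a sketch, writing $X^*$ for a minimizer of the original model and $X'$ for a minimizer of the aggregated model over $S$.

```latex
\begin{align*}
f(X^*:P) &\le f(X':P) \le f(X':P') + eb
  && \text{(optimality of } X^* \text{, then the bound at } X'\text{)}\\
f(X':P') &\le f(X^*:P') \le f(X^*:P) + eb
  && \text{(optimality of } X' \text{, then the bound at } X^*\text{)}\\
\Rightarrow\ |f(X^*:P) - f(X':P')| &\le eb,\\[4pt]
f(X':P) &\le f(X':P') + eb \le f(X^*:P') + eb \le f(X^*:P) + 2\,eb .
\end{align*}
```

The last line says that using the aggregated solution $X'$ in the original model costs at most $2\,eb$ more than the true optimum.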
38Zemel early asymptotic work
- Zemel (1985) gives error bounds for planar
n-median and n-center problems with Euclidean
distances. He was not interested in aggregation,
but in finding asymptotically optimal solutions.
- For large $n$, and many DPs approximated by a planar set of area $A$, each minimal objective function is of the form $k\sqrt{A/n}$, for a given positive constant $k$. We call this a square root formula.
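A short worked consequence of the square root form, as an illustration only: the symbol $v(n)$ for the minimal objective value is notation introduced here, not taken from the slides.

```latex
\[
v(n) = k\sqrt{A/n}
\quad\Longrightarrow\quad
\frac{v(4n)}{v(n)} = \sqrt{\frac{n}{4n}} = \frac{1}{2},
\qquad
v(n) - v(n+1) = k\sqrt{A}\left(\frac{1}{\sqrt{n}} - \frac{1}{\sqrt{n+1}}\right)
\approx \frac{k\sqrt{A}}{2\,n^{3/2}} .
\]
```

Halving the value requires four times as many points, and the gain from each additional point shrinks roughly like $n^{-3/2}$. This is the diminishing-returns shape that a square root formula predicts.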
39Square root formulas, theoretical basis for LDR: Zemel; Francis and Rayco (96)
40Later error bound work
- Francis and Lowe, 1992. Explicit aggregation error bounds for n-median, n-center, covering problems, with any distance. (At the time, they were unaware of Zemel's work.)
- Francis, Lowe and Tamir, 2000. Error bounds for any location model of the form
- $f(X:P) = g(D(X,P))$, where $D(X,P)$ is the vector of closest distances, when $g$ is subadditive and nondecreasing, i.e., SAND.
- Closely related work: Carrizosa, E., H. W. Hamacher, S. Nickel and R. Klein, 2000.
41SAND model error bounds (g is SAND)
- $f(X:P) = g(D(X,P))$, $f(X:P') = g(D(X,P'))$.
- $T(P,P') = (d(p_1,p_1'), \dots, d(p_m,p_m'))$, the vector of distances between DPs and ADPs.
- $eb = g(T(P,P'))$. $|f(X:P) - f(X:P')| \le eb$ for all $X$.
- This error bound is nonnegative, is nondecreasing in the distances between each $p_i$ and $p_i'$, and (when $g(0) = 0$) is zero if and only if there is no aggregation ($p_i = p_i'$ for all $i$). It applies to a large class of models.
42Underlying error bound basis: triangle inequality for distance
- Assume the distance $d$ satisfies the triangle inequality and symmetry:
- $d(x,p) \le d(x,p') + d(p',p)$
- $d(x,p') \le d(x,p) + d(p,p')$.
- Thus
- $-d(p,p') \le d(x,p') - d(x,p) \le d(p,p') \iff$
- $|d(x,p') - d(x,p)| \le d(p',p)$, so $d(p',p)$ is an eb.
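The same bound carries over from a single facility $x$ to the closest-distance function $D(X,\cdot)$. The step below is a sketch of that argument, with $x'$ introduced here to denote a facility in $X$ closest to $p'$.

```latex
\begin{align*}
D(X,p) &\le d(x',p) \le d(x',p') + d(p',p) = D(X,p') + d(p,p'),\\
D(X,p') &\le D(X,p) + d(p,p') \qquad \text{(same argument with } p,\ p' \text{ exchanged)},\\
\Rightarrow\ |D(X,p) - D(X,p')| &\le d(p,p') \quad \text{for every } X .
\end{align*}
```

Applying this to each DP and its ADP, weighting by $w_i$, and then summing or taking the maximum gives bounds of the n-median and n-center type.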
43Example error bounds
- n-center model: $\max\{w_i\, d(p_i,p_i') : i \in M\}$
- n-median model: $\sum\{w_i\, d(p_i,p_i') : i \in M\}$
- Basic DP aggregation insight: each DP should have a nearby ADP.
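The two bounds can be checked numerically. The sketch below makes several assumptions that are not from the slides: random planar DPs, rectilinear distances, and a hypothetical aggregation rule that snaps each DP to the center of a square grid cell. It samples random facility sets X and records the worst absolute error seen, which should never exceed the corresponding bound.

```python
import math
import random

def rect(a, b):
    """Rectilinear distance between two planar points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def D(X, p):
    """Closest distance from p to the facilities in X."""
    return min(rect(x, p) for x in X)

def median_obj(X, P, W):
    return sum(w * D(X, p) for p, w in zip(P, W))

def center_obj(X, P, W):
    return max(w * D(X, p) for p, w in zip(P, W))

def median_eb(P, P_agg, W):
    """Sum of weighted DP-to-ADP distances."""
    return sum(w * rect(p, q) for p, q, w in zip(P, P_agg, W))

def center_eb(P, P_agg, W):
    """Largest weighted DP-to-ADP distance."""
    return max(w * rect(p, q) for p, q, w in zip(P, P_agg, W))

def snap(p, h=2.0):
    """Hypothetical aggregation rule: center of the h-by-h grid cell containing p."""
    return ((math.floor(p[0] / h) + 0.5) * h, (math.floor(p[1] / h) + 0.5) * h)

rng = random.Random(0)
P = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(100)]
W = [rng.uniform(0.5, 2.0) for _ in P]
P_agg = [snap(p) for p in P]

worst_med = worst_cen = 0.0
for _ in range(200):
    X = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(3)]
    worst_med = max(worst_med, abs(median_obj(X, P, W) - median_obj(X, P_agg, W)))
    worst_cen = max(worst_cen, abs(center_obj(X, P, W) - center_obj(X, P_agg, W)))

print(worst_med, "<=", median_eb(P, P_agg, W))
print(worst_cen, "<=", center_eb(P, P_agg, W))
```

Sampling only gives a lower estimate of mae, so the gap between the sampled worst error and the bound says nothing definite about how tight the bound is.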
44Covering example: error bounds can apply to constraints also
- Original problem:
- $\min |X|$ s. to $f(X:P) \le 1$, $X \in S$
- $f(X:P) = \max\{D(X,p_i)/r_i : i \in M\}$.
- Aggregated problem:
- $\min |X|$ s. to $f(X:P') \le 1$, $X \in S$
- $f(X:P') = \max\{D(X,p_i')/r_i : i \in M\}$.
- We know $|f(X:P) - f(X:P')| \le eb$ for all $X \in S$, with
- $eb = \max\{d(p_i,p_i')/r_i : i \in M\}$ (n-center structure).
45Reformulated MIP versions of models, error bounds
- Many of the models of interest can be reformulated as MIPs, using various finite dominating set principles, to obtain numerical solutions.
- If an error bound applies to the original model, it also applies to the reformulated MIP model.
46Using error bounds for aggregation
- Basic Idea: aggregate to make the error bound small.
- We want a small error.
47Ideal aggregation approach: center problem
- Minimize the error bound:
- Find $Q$ to $\min_{Q,\,|Q| = q}\ eb(Q:P)$,
- i.e., $\min_{Q,\,|Q| = q}\ \max\{D(Q,p_i) : i = 1, \dots, m\}$.
- Think of $Q$ as the set of $q$ (distinct) ADPs; $p_i'$ is the closest ADP in $Q$ to $p_i$ for each DP $i$.
- Paradox of aggregation: the latter problem is an NP-hard q-center problem.
48Avoiding the paradox
- Use a low-order heuristic algorithm to solve $\min_Q \max\{D(Q,p_i) : i = 1, \dots, m\}$ approximately.
- For network problems, instead of using shortest-path/network distances, use simpler Euclidean or rectilinear distances.
- A similar paradox, and way around it, occur with the n-median and the covering location models.
49Conceptual aggregation algorithm
- Problem Inputs
- Location Model
- Aggregation Algorithm
- Outputs
- Solution to Aggregated Model
- Error feedback loop
50Evaluation criteria: DP aggregation algorithms
- EC-1: aggregation error
- EC-2: cost to a) get DP data, b) develop, run algorithm, c) solve aggregated model
- EC-3: ease of explanation
- EC-4: problem structure exploitation
- EC-5: robustness/flexibility
- EC-6: GIS implementability
- Tradeoffs: EC-1 & 2, EC-1 & 3, EC-4 & 5
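51Reminder: planar distances
- $X = (x_1,x_2)$, $Y = (y_1,y_2)$
- $a = |x_1 - y_1|$, $b = |x_2 - y_2|$, $c = \sqrt{a^2 + b^2}$
- Euclidean: $d(X,Y) = c$
- Rectilinear: $d(X,Y) = a + b$
- Tchebyshev: $d(X,Y) = \max\{a, b\}$
[Figure: right triangle joining X and Y, with legs a and b and hypotenuse c.]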
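For reference, the three planar distances can be written as a minimal Python sketch. The function names are hypothetical; points are assumed to be (x1, x2) tuples.

```python
import math

def euclidean(X, Y):
    """Straight-line distance: sqrt(a^2 + b^2)."""
    return math.hypot(X[0] - Y[0], X[1] - Y[1])

def rectilinear(X, Y):
    """Sum of coordinate differences: a + b."""
    return abs(X[0] - Y[0]) + abs(X[1] - Y[1])

def tchebyshev(X, Y):
    """Largest coordinate difference: max(a, b)."""
    return max(abs(X[0] - Y[0]), abs(X[1] - Y[1]))

X, Y = (0, 0), (3, 4)
print(euclidean(X, Y), rectilinear(X, Y), tchebyshev(X, Y))
```

For any pair of points the three values are ordered as Tchebyshev, then Euclidean, then rectilinear, from smallest to largest.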
52Assumed underlying problem structure, aggregation
algorithms
- Large metropolitan areas with a rectilinear
street structure.
53Some row-column aggregation algorithms
- RC-Med: planar n-median problem, rectilinear distances. Francis, Lowe and Rayco (1996)
- RC-Cen: planar n-center problem, rectilinear distances. Rayco, Francis and Lowe (1997)
- RC-Cov: planar covering location problem, rectilinear distances. Emir-Farinas and Francis (2002)
54Basic idea: row-column aggregation
- Imagine we impose a grid on the DP data.
- Project data onto x-axis, solve a related location problem on x-axis to adjust column spacing.
- Project data onto y-axis, solve a related location problem on y-axis to adjust row spacing.
- Solve a related location problem for each cell and use solution as the ADP for the cell.
59RC-Med for planar n-median problem, with r rows, c columns
- Project all DP data onto x-axis.
- Solve c-median problem on x-axis to adjust column spacing.
- Project all DP data onto y-axis.
- Solve r-median problem on y-axis to adjust row spacing.
- Use rows, columns to create a grid.
- Choose as ADP in each cell the 1-median for that cell.
- Aggregate all DPs in each cell to the 1-median.
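The row-column steps above can be sketched in Python under a loud simplification: instead of solving a c-median and an r-median problem on the axes, this sketch places column and row boundaries at equal-count quantiles of the projected data. That stand-in is an assumption made here for brevity and is not the RC-Med spacing rule. The per-cell step uses the fact that, with rectilinear distances, a 1-median can be found one coordinate at a time as a weighted median. All function names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def quantile_cuts(values, k):
    """k-1 cut points splitting the values into k near-equal-count groups.
    Stand-in for solving a k-median problem on the line (assumption)."""
    s = sorted(values)
    return [s[(j * len(s)) // k] for j in range(1, k)]

def weighted_median(values, weights):
    """A value v minimizing sum_i w_i * |v - v_i|."""
    half = sum(weights) / 2.0
    acc = 0.0
    for v, w in sorted(zip(values, weights)):
        acc += w
        if acc >= half:
            return v

def rc_aggregate(P, W, r, c):
    """Return the list of ADPs p_i', one per DP, for an r-by-c grid."""
    col_cuts = quantile_cuts([p[0] for p in P], c)  # projection onto x-axis
    row_cuts = quantile_cuts([p[1] for p in P], r)  # projection onto y-axis
    cells = defaultdict(list)
    for i, p in enumerate(P):
        key = (bisect_right(row_cuts, p[1]), bisect_right(col_cuts, p[0]))
        cells[key].append(i)
    P_agg = [None] * len(P)
    for members in cells.values():
        ws = [W[i] for i in members]
        adp = (weighted_median([P[i][0] for i in members], ws),
               weighted_median([P[i][1] for i in members], ws))
        for i in members:  # aggregate every DP in the cell to its 1-median
            P_agg[i] = adp
    return P_agg

# Hypothetical data: two clusters of four DPs each, unit weights.
P = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
W = [1.0] * len(P)
print(rc_aggregate(P, W, 2, 2))
```

Only nonempty cells produce an ADP, so the number of distinct ADPs is at most r times c.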
60RC-Cen for planar n-center problem, with r rows, c columns
- Apply a 45-degree rotation to all the DP data.
- Project DP data onto x-axis.
- Solve c-center problem on x-axis to adjust column spacing.
- Project DP data onto y-axis.
- Solve r-center problem on y-axis to adjust row spacing.
- Use rows, columns to create a grid.
- Choose as ADP in each cell the (Tchebyshev) 1-center for that cell. Aggregate all DPs in each cell to the 1-center.
- Apply the inverse 45-degree rotation to the grid to get the aggregation.
- Note: given a 45-degree rotation (combined with scaling by $\sqrt{2}$), the rectilinear distance between any two points equals the Tchebyshev distance between the transformed points.
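The distance identity behind the rotation step can be checked with a short sketch. It assumes the map (x, y) -> (x - y, x + y), which is a 45-degree rotation combined with scaling by the square root of 2; with a pure rotation the two distances agree only up to that constant factor. The bounding-box midpoint used for the cell 1-center assumes unweighted DPs. Function names are hypothetical.

```python
def rotate45(p):
    """Map (x, y) to (x - y, x + y): 45-degree rotation scaled by sqrt(2)."""
    return (p[0] - p[1], p[0] + p[1])

def rotate45_inverse(q):
    """Undo rotate45."""
    return ((q[0] + q[1]) / 2, (q[1] - q[0]) / 2)

def rectilinear(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def tchebyshev(a, b):
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def tchebyshev_one_center(points):
    """Unweighted Tchebyshev 1-center: midpoint of the bounding box (one optimal choice)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)

pts = [(0, 0), (3, 1), (-2, 5), (4, -4)]
for a in pts:
    for b in pts:
        # Rectilinear distance before the map equals Tchebyshev distance after it.
        assert rectilinear(a, b) == tchebyshev(rotate45(a), rotate45(b))

center = tchebyshev_one_center([rotate45(p) for p in pts])
print(rotate45_inverse(center))  # 1-center mapped back to the original coordinates
```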
61RC-Cov for planar covering problem
- Apply a 45-degree rotation to all the DP data.
- Project DP data onto x-axis.
- Pick a small covering radius rs and solve a covering problem on the x-axis to get c centers. Use the c centers to adjust the grid column spacing.
- Project DP data onto y-axis.
- Use the small covering radius rs and solve a covering problem on the y-axis to get r centers. Use the r centers to adjust the grid row spacing.
- Use rows, columns to create a grid.
- Choose as an ADP in each cell the (Tchebyshev) 1-center for that cell. Aggregate all DPs in each cell to the 1-center.
- Apply the inverse 45-degree rotation to the grid to get the aggregation.
62Computational Order: RC Methods
- With m DPs, ordering the projected DPs on each axis takes O(m log m).
- With t = r or c, a t-median, t-center, or covering problem on the line is solved on each axis. The order depends on the method used, but O(m log m) or less is typical.
- There are r c cells, with one ADP per (nonempty) cell. Solving the location problem in each cell is linear order in the number of DPs in the cell. This is basically O(r c).
- Typically O(m log m) is dominant, since $m \gg r c$.
63D-F: another aggregation method for center, covering problems
- D-F: for q ADPs, use the Dyer-Frieze q-center algorithm (Dyer and Frieze, 1985) with the m original DPs.
- It works nicely for network or for planar problems, but is $O(q m) \ge O(q^2)$.
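To show where an O(q m) running time can come from, here is a minimal farthest-point greedy sketch for choosing q ADPs among the DPs. It is written in the spirit of simple q-center heuristics and is not claimed to reproduce the Dyer-Frieze algorithm: it ignores weights and starts from the first DP, both of which are assumptions made here. The function name is hypothetical.

```python
def farthest_point_adps(P, q, d):
    """Greedy choice of q ADPs from the DPs in P under distance d.

    Starts at P[0] (arbitrary assumption), then repeatedly adds the DP that is
    farthest from the ADPs chosen so far. Each round scans all m DPs once,
    so the total work is O(q m) distance evaluations.
    Returns the ADPs and the largest DP-to-closest-ADP distance.
    """
    chosen = [0]
    dist = [d(P[0], p) for p in P]  # distance of each DP to its closest ADP
    while len(chosen) < min(q, len(P)):
        nxt = max(range(len(P)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(dist[i], d(P[nxt], P[i])) for i in range(len(P))]
    return [P[j] for j in chosen], max(dist)

def rectilinear(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Hypothetical data.
P = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 5)]
for q in (1, 2, 3):
    adps, radius = farthest_point_adps(P, q, rectilinear)
    print(q, adps, radius)
```

The returned radius is the unweighted center-type error bound for the resulting aggregation, so running the loop for increasing q shows directly how the bound shrinks as ADPs are added.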
65Application of D-F: 6600 DPs, 198 ADPs, aggregation for rectilinear distance covering location problem, Alachua Co., Florida
66 8-cover of DPs using ADPs of last slide, rectilinear distances
- A 3-center based on this approach was implemented.
67Application of RC-Med LDR with sample average
absolute error
68Application of RC-Med LDR with sample average
relative error
69Example LDR for covering problem, using RC-Cen
70Example aggregation for a covering problem
- The next slide shows an aggregation produced with
RC-Cov. A restriction of the aggregated covering
problem provided a provably optimal solution to
the original problem with 69,960 DPs. The DP
data is from Palm Beach County, Florida.
71Application of RC-Cov 69,960 DPs (blue), 703
ADPs (yellow)
[Map labels: Lake Okeechobee, Atlantic Ocean]
72From RC-Cov: lower, upper bounds on covering objective function value (ofv) for 69,960 DP problem
73 14-cover of aggregated covering problem with smallest upper bound
74From RC-Cov actual aggregation eb versus square
root model prediction
75Summary
- Location problems
- Location analysis
- Location model typology
- References
- MIP location models, MP aggregation
- Demand point modeling, aggregation
- Common DP aggregation approach
- Law of diminishing returns
- Aggregation error measures
- SAND location models
- Error bounds
- Paradox of aggregation
- Overview of some aggregation algorithms
- Example aggregations
76Questions?