Title: STING: A Statistical Information Grid Approach to Spatial Data Mining
1 STING: A Statistical Information Grid Approach to Spatial Data Mining
- Wei Wang
- Jiong Yang
- Richard Muntz
???? ???? ?????
2????? - Spatial Data Bases
- ??? ??????? ? DB ?? ????? ????? ???????? ??????,
????? ???? ????????. - ?????
- DB ?? ?? ???????? ????.
- DB ?????? ?????? ???????.
3????? - Spatial Data Mining
- ?????? ????? ?????? ??????? ????????, ?????
??????? ???????, ????? ?????? ??? ????? ??????. - ?????? ???????
- ???????? ????? ????????? ?????.
- Clustering.
- ??????? ?????? ????.
4?????? ??????
- Spatial Data Mining - ?????? ??????.
- ??????? ?????? ???????. DBSCAN.
- STING
- ???? ???????
- ???? ??????? ??????
- ????????? ( ???????? ?????? ?????? )
- ?????? ??????? ??????? ?? STING ?- DBSCAN
- ?????
5Spatial Data Mining - ??????
- Region Query-
- ????? ?????? ???????? ????? ???????.
- ?????? ???? ?? ?? ????? ?????? > X ???? region query!
- Function of Region -
- ????? ???? ?? ????? ?????? ??????? ????.
- ?????? ???? ?? ???? ????? ????? ????? ??????
???? X.
6??????? ??????
- ??? ?????? ?????? ???? spatial data mining ,??
????? ???????? ?? Clustering. - ??? ??????????? ???? ?? ????? 3 ?????
- 1. ??? ?? ?????? ???????.
- 2. ??? ???? ?????? ?????.
- 3. ??? Clustering.
- ????????? ??????? (???????) ????? ????? ????? ??
??? DBSCAN.
7??????? ?????? - ???????
- ?? ?????? ???? ?????? ??? ?????? ???? ??????
?????? ????? ?????????? ? DB. - ??? ?????? ?? ???? ????? ???? (????? ????
??????).
8 DBSCAN (Ester, Kriegel, Sander, Xu)
- DBSCAN ??? ???????? Clustering.
- ????????? ????? ?? ??? ???????
- MinPoint - ???? ??????? ???????? ?????? ??.
- Epsilon - ???? ????? ???? ????? MinPoint.
- ??? ??? ??????? ??? ?????? Density Reachable.
- ????? ?????? ?? ????? ?????? ?????? ? Cluster.
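The two parameters named on this slide (MinPoint and Epsilon) and the idea of density reachability can be sketched as follows. This is a minimal illustration of the general DBSCAN idea with a brute-force neighbourhood search, not the authors' implementation; the names `dbscan`, `region_query`, `eps` and `min_points` are chosen here for illustration only.

```python
from math import dist

NOISE = -1  # label used here for points that belong to no cluster (assumption)


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (brute force)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_points):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_points:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_points:  # j is a core point: keep growing
                queue.extend(reachable)
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_points=3)
```

Every call to `region_query` scans all points, so this sketch is quadratic in the number of objects. It shows the definition, not an efficient implementation.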
9DBSCAN
- ???????
- ????? ????? Clusters ??? ????.
- ????? ?????? ??? ?? ? Clusters.
- ???????
- ???????? ????????? ???
??????. - ????????? ??? ???? ??????.
10STING Statistical Information Grid
- ????? ????? ???? ???????.
- ?????? ????? ???? ???? ??????? ?? ?????????? ?
DB. - ??????? ???? ?????? ?? ???? ????? (??????) ??
????? ?? ??????? ?????? ????. - ???????? ????????? ??? ??-????????.
11STING Statistical Information Grid
- ??????? ??????? ????? ???????
- ?????? ?? ????? ?????? ????? ???????.?? ????
???? ?????. - ????? ??? ???? ?? ???? ?? ????.?????? ??? ?? 4
????. - ???? ??? ??? ??? ????? ????? ?? ???? ?????????
??????. - ????? ???????? ????? ??? ???.
12STING Statistical Information Grid
????
????
13STING Statistical Information Grid
- ??? ???? ??? ????? ?? ???? ?????????? ???
n_objects - ??? ??????? ???? ?????? (????).
- ??? ????? ????? ?????????? ????????
- mean - ????? ?????? (????? ??? ??????? ??????).
- std - ????? ???.
- min - ??????? ?? ?????? ??? ?????????? ?????.
- max - ??????? ?? ?????? ?????????? ?????.
- distribution - ??? ???????? ( normal, uniform,
none )
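The per-cell parameters listed above can be kept in a small record, and a parent cell's values can be derived from its children without going back to the individual objects. The sketch below assumes population (not sample) standard deviation and leaves `distribution` as "none" for parents, because the rule for combining distribution types is not recoverable from this slide. `CellStats`, `leaf_stats` and `merge_children` are illustrative names.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import fmean, pstdev


@dataclass
class CellStats:
    n: int            # number of objects in the cell
    mean: float       # mean of the attribute
    std: float        # population standard deviation of the attribute
    min: float        # smallest attribute value in the cell
    max: float        # largest attribute value in the cell
    distribution: str = "none"  # "normal", "uniform" or "none"


def leaf_stats(values):
    """Statistics of a bottom-level cell, computed directly from its objects."""
    if not values:
        return CellStats(0, 0.0, 0.0, 0.0, 0.0)
    return CellStats(len(values), fmean(values), pstdev(values),
                     min(values), max(values))


def merge_children(children):
    """Statistics of a parent cell, computed from its children only."""
    kids = [c for c in children if c.n > 0]
    n = sum(c.n for c in kids)
    if n == 0:
        return CellStats(0, 0.0, 0.0, 0.0, 0.0)
    mean = sum(c.mean * c.n for c in kids) / n
    # E[x^2] of the parent is the weighted average of (std^2 + mean^2).
    second_moment = sum((c.std ** 2 + c.mean ** 2) * c.n for c in kids) / n
    std = sqrt(max(0.0, second_moment - mean ** 2))  # guard against rounding
    return CellStats(n, mean, std,
                     min(c.min for c in kids), max(c.max for c in kids))


# House prices (in K) from the example slide, split over four child cells.
groups = [[100, 150, 170], [110, 140], [200, 220, 240], [190, 260]]
parent = merge_children([leaf_stats(g) for g in groups])
direct = leaf_stats([p for g in groups for p in g])
```

Because `parent` matches `direct` up to rounding, the upper levels of the grid can be built bottom-up in one pass over the cells.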
14STING????? ???
- ????? ??? ????? ?? ????? ????? (?????????)
- ????
- distribution ???? ?? ??? ????? ?? ????? ?????
?????????. ??? ??????? ????? ????????. - ???????
- ??????
- ??? ???? ???? ??? ??????? ???????.
- ???? distribution - ???? ?? ????? ??????? ?????
?? ???????? ??? ??????? ?????. ?? ?? ????
????? ??? ????? ???????? ???? ? none.
15????? ??? - ?????
[Figure: example grid of house prices: 100 K, 150 K, 170 K, 110 K, 140 K, 200 K, 220 K, 240 K, 190 K, 260 K]
16 STING: Supported Queries
- STING ????? ??? ??? ????? ?????? ?? ????? ???????
???? Region Queries , Function of Region. - ????? ??? ?? ??????? ?????????? ????????
- ????? 100 ???? ?????? ??? (??????).
- ????? 70 ?????? ????? ?? ??? 400,000.
- ??? ????? ??? ????? 120 ?????? ???.
- ???????? ?? ????? 0.9 (?????)
17????? house-map
SQL Query:
Select Region From house-map
Where Density in (3, ∞)
And price Range (150000, ∞) with Percent (0.7, 1)
And Area (1, ∞)
And with Confidence 0.9
[Figure: house-map grid with prices 100 K, 150 K, 170 K, 110 K, 140 K, 200 K, 220 K, 240 K, 190 K, 260 K]
?????? ???? ?? ?? ???????, ???? ??? ?? ????? 1
????? ???, ?? ???? ????? ??? ?? ????? 3 ???? ?
70 ??? ????? 150,000 ????? ?????? ?? 0.9 .
18 STING Algorithm: General Idea
- ???? ????????? ??? ?????? SQL ?????.
- ????????? ???? ?? ??? ?????? ?????.
- ??? ?? ?????? ???????? ???? ??????? ??????? ?????
????? ??????. - ??????? ????? ?? ?? ????? ?? ?????? ??????????
- ?????, ?????? ?? ?? ??????? ??????? ?? ????
????????? ???????? ????.
19 STING Algorithm
??? O(n_nodes), n_nodes << n_obj
- 1. ???? ?? ????? ???????
- 2. ??? ?? ????? ??? ?? ???????? ??? ???? ???????.
??? ?? ??? ??????? ( ?? ?????? ?????) - 3. ?? ?? ???? ????? ???????, ?? ???. ???? ? (2).
- ???? (???? ??????)
- 3.1 ?? ???????? ?????? ???????
- ??? ?? ??????? ?? ???? ????????? ????? ????.
- 3.2 ????
- ???? ?? ? DB ?? ?? ??????? ??????? ?????????
???? ????? ????. - ???? ?? ??????? ?????? ?? ?????? ???????.
[Per-step cost annotations from the slide, in order: O(1), O(n_nodes), O(n_nodes), ?, O(n_nodes)]
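The top-down pass whose cost is given as O(n_nodes) can be sketched as a layer-by-layer walk that descends only into cells judged relevant. This is a simplified illustration: it assumes a tree of uniform depth, takes the relevance test as a caller-supplied function, and uses a monotone test on `max` so that pruning never loses a qualifying cell. `Cell`, `sting_query` and `is_relevant` are hypothetical names, and the cell values are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    n: int                      # number of objects in the cell
    max: float                  # largest attribute value in the cell
    children: list = field(default_factory=list)


def sting_query(root, is_relevant):
    """Return (relevant bottom-layer cells, number of cells examined)."""
    layer, visited = [root], 0
    while True:
        relevant = []
        for cell in layer:
            visited += 1
            if is_relevant(cell):
                relevant.append(cell)
        # Stop when nothing qualifies or the bottom layer has been reached.
        if not relevant or all(not c.children for c in relevant):
            return relevant, visited
        # Only children of relevant cells are examined on the next layer.
        layer = [child for c in relevant for child in c.children]


def leaf(name, price):
    return Cell(name, 1, price)


root = Cell("root", 8, 260, [
    Cell("A", 2, 170, [leaf("a1", 100), leaf("a2", 170)]),
    Cell("B", 2, 140, [leaf("b1", 110), leaf("b2", 140)]),
    Cell("C", 2, 240, [leaf("c1", 200), leaf("c2", 240)]),
    Cell("D", 2, 260, [leaf("d1", 190), leaf("d2", 260)]),
])

hits, visited = sting_query(root, lambda c: c.n > 0 and c.max > 150)
names = [c.name for c in hits]
```

Each cell is examined at most once, and subtrees under irrelevant cells (here `B`) are never touched. The work is therefore bounded by the number of grid cells, not the number of objects.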
20 STING Algorithm?????
SQL Query:
Select Region From house-map
Where Density in (3, ∞)
And price Range (150000, ∞) with Percent (0.6, 1)
And Area (2, ∞)
And with Confidence 0.8
- ??????? ?? ????? ???????.
- ?????? ??? ?? ?? ?????? ??? ??? ????? ????? > 150000.
- ????? ??-???? ??? ????? ???, ???? ???? ??-???- low,high (?????? 0.8) ????? ????? ????? > 150K.
- ??? area ??? ?? ???. ?? ????? ???, ?? ????? ??? ??????? ?? high * n_obj < area * 3 * 0.6
- ????? ???? ????? ?? ???? ?? ???? ?????????.
- ????? ?????, ????? ?? ?? ??????? ??????? ??
????? ??? ???? ????????.
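The per-cell test in this example can be sketched in two parts: estimating from a cell's stored mean and std what fraction of its objects exceed a price threshold, and the pruning inequality on `high`. The first part assumes the cell's distribution is "normal". The second reads the slide's inequality as high * n_obj < area * density * percent, which is an interpretation of garbled text. How the slide derives the interval [low, high] at a given confidence is not recoverable, so `high` is simply passed in. All function names are illustrative.

```python
from math import erfc, sqrt


def fraction_above(mean, std, threshold):
    """Estimated fraction of a normally distributed attribute above threshold."""
    if std == 0:
        return 1.0 if mean > threshold else 0.0
    z = (threshold - mean) / std
    return 0.5 * erfc(z / sqrt(2))  # equals 1 - Phi(z)


def can_prune(high, n_obj, area, density, percent):
    """True if even the optimistic estimate cannot satisfy the query."""
    return high * n_obj < area * density * percent


# Query from the slide: Density in (3, inf), Percent (0.6, 1), Area (2, inf).
half = fraction_above(mean=150_000, std=20_000, threshold=150_000)
pruned = can_prune(high=0.5, n_obj=4, area=2, density=3, percent=0.6)
kept = can_prune(high=1.0, n_obj=4, area=2, density=3, percent=0.6)
```

With these numbers the right-hand side is 2 * 3 * 0.6 = 3.6. A cell expected to hold at most 0.5 * 4 = 2 qualifying objects is dropped, while one that may hold 4 is kept for the next level.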
21STING Algorithm????? ??????
- ???? ?????? ?? ?? ????? ????????? ???????, ????
????? ??????, ?? ????? ???? ???? BFS - ??? ?? ???????, ?????? ?? ?? ????? ????? ?????
????? ??? (?????). - ?? ??????? ??????? ???? ?? ?????, ?????? ?? ???
???????? ?? ?? ????? ?????????? ?????? ????. - ???? ???? ???, ?????? ????.
22STING????? ??????
- STING ???? ?? ??????? ???????? ?????? ?? ????
???????. - ?????? ??????? ???? ?????? ??????? ?????? ??
STING. - ?????? ????? ( ???? ???? ?? min_area ) ?????? ??
??????? ?????? ?????? ??? ?? ???????.
min_area 4
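The region-forming step, which the slides describe with BFS and a `min_area` threshold, can be sketched as a breadth-first search over the relevant bottom-level cells. The sketch assumes cells are addressed by (row, column), that two cells are neighbours when they share an edge, and that area is measured in number of cells. None of these details is recoverable from the slide text.

```python
from collections import deque


def relevant_regions(relevant_cells, min_area=1):
    """Group edge-connected relevant cells; keep groups of >= min_area cells."""
    remaining = set(relevant_cells)
    regions = []
    while remaining:
        start = remaining.pop()
        region, queue = [start], deque([start])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)  # mark as visited
                    region.append(nb)
                    queue.append(nb)
        if len(region) >= min_area:
            regions.append(sorted(region))
    return sorted(regions)


cells = {(0, 0), (0, 1), (1, 0), (1, 1), (3, 3)}
big_only = relevant_regions(cells, min_area=4)
everything = relevant_regions(cells, min_area=1)
```

With `min_area=4` the isolated cell at (3, 3) is discarded and only the 2 x 2 block is returned.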
23STING????? ??????
- ?????? ?? ????? ???? ??????, ?? ?????? ?????
??? ???? ???? ????? ?? ????? ???? (???).
24STING??????? ????????
- ???????
- ???????? ??? ???? ??-???????.
- ???? ?????? ????? ?????.
- ???????
- ????? ?????? ??????? ????.
- ?? ???? ????? ?? ??????? / ????????.
- ???? ????? ??? ?? ? DB ?????.
25 STING vs. DBSCAN: In Theory
- ???? ?????? ?? STING ? DBSCAN ?? ?? ????? ?????
?????????? ???. - ??????? ???????? ?? STING ?? ??????? ????
???????? ?? DBSCAN. - ???? ????? ?? ????? ??????? ?? STING ?? ?????????
??????? (?? ???? ?? ????? ???? ????), ???? ??? ??
DBSCAN.
26STING ???????
- ??????? ????? ?? ??? ?????? ?????? ????? ????.
- STING ????? ????? ?? Region Query.
- ???? ?? ?? ??????? 100,000 ?????????.
- ????? ???????? ???? 7 ?????.
- ?????? 1 ??????? ??????? ??????? ?? ???????
?????. - ?????? 2 ??????? ??????? ??????? ?? ???????
?????. - ??? ????? ??? 10 ?????.
- ??? ????? ??????? 0.2 ????.
27STING ???????
28STING Vs. DBSCAN?????? ???????
- ??????? ?????? ?????? ?? ???? ?????? clustering
(?????? ?? ??????). - ??? ????? ?????? ?? DBSCAN ???? ????.
- ???? STING ????? ??????? ???? ?????? ???????.
- ??????? ????? ??????? SEQUOIA 2000 (benchmark).
- ?????? ???? DBSCAN ????? ?? ????? ??????.
- ??? ?? STING ???? 6 ????.
29STING Vs. DBSCAN?????? ???????
30?????
- Spatial Data Mining ????? ???????? ?????? ????.
- ??? ??????????? ??????? ??? ?????? clustering
(?????? DBSCAN) - STING ????? ????? ??????? ??? ????? ????? ???? ??
? DB ??????. - STING ????? ???? ??????? ??? ?????? ???????
???????? ?????? ????. - STING ???? ????? ? DBSCAN.