Using Crime to predict crime

Slides: 17
Provided by: cdaMor

1
Using Crime to predict crime
2
Data Collection
  • Obtained data from the Area Connect web site,
    which is set up to give people a way to
    compare different cities' crime rates. This data
    is based on 2004 police reports.
  • Used a list of the largest American urban areas.
  • Collected data for 55 cities.

3
Variables
  • Murder, Forcible Rape, Robbery, Assault,
    Burglary, and Theft are the predictor variables.
  • Auto Theft was used as the response.

4
Linear analysis.
  • Ran a simple linear regression to check whether
    each of the variables was a significant
    predictor of auto theft.
  • Only the statistic for robbery was significant on
    its own, but the p-value for the overall
    regression is .0004.

5
Using Arc to linearize the data
  • Arc is a tool designed for linear
    analysis of data.
  • Transformed the data set using these
    different transformations
  • The scatter plot matrix shows a clear correlation
    between the variables.

6
Experimenting with Joone
  • The first networks tried were built using the
    Joone libraries.
  • Achieved an accuracy of 60% passing, using a
    test criterion of the output being within
    ± (.25 × correct value).
  • Determined that two-layer networks would provide
    results with around the same accuracy, but the
    network would target the mean of all outputs
    rather than our target.

7
Problems with Joone
  • Joone proved to be too slow and would often run
    out of memory during a job. At times a job would
    take nearly 20 minutes.
  • It was then decided that Lens would be the new
    choice for the NN.

8
Using Lens
(Network diagram: Input Layer, Hidden Layer, Output Node)
  • Different combinations were tried to find the
    optimal network design.
  • The best combination appeared to be 10 hidden
    nodes, a learning rate of .5, and a momentum
    of .15.
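
The configuration on this slide (10 hidden nodes, learning rate .5, momentum .15) can be illustrated with a small stand-in network. This is a minimal sketch in plain Python, not Lens and not the presenters' code. The sigmoid units, mean squared error, full-batch updates, six inputs, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, hidden=10, lr=0.5, momentum=0.15, epochs=300, seed=0):
    """Train a one-hidden-layer network; return the mean squared error per epoch."""
    rng = random.Random(seed)
    n_in, n = len(X[0]), len(X)
    # The last weight of every unit is its bias.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    v_h = [[0.0] * (n_in + 1) for _ in range(hidden)]  # momentum terms
    v_o = [0.0] * (hidden + 1)
    losses = []
    for _ in range(epochs):
        g_h = [[0.0] * (n_in + 1) for _ in range(hidden)]
        g_o = [0.0] * (hidden + 1)
        loss = 0.0
        for x, t in zip(X, y):
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * xi for w, xi in zip(ws, xb))) for ws in w_h]
            hb = h + [1.0]
            out = sigmoid(sum(w * hi for w, hi in zip(w_o, hb)))
            loss += (out - t) ** 2
            # Backpropagate the squared error through the sigmoid output.
            d_out = 2 * (out - t) * out * (1 - out)
            for j in range(hidden + 1):
                g_o[j] += d_out * hb[j]
            for j in range(hidden):
                d_h = d_out * w_o[j] * h[j] * (1 - h[j])
                for i in range(n_in + 1):
                    g_h[j][i] += d_h * xb[i]
        losses.append(loss / n)
        # Gradient step with momentum on the averaged gradients.
        for j in range(hidden + 1):
            v_o[j] = momentum * v_o[j] - lr * g_o[j] / n
            w_o[j] += v_o[j]
        for j in range(hidden):
            for i in range(n_in + 1):
                v_h[j][i] = momentum * v_h[j][i] - lr * g_h[j][i] / n
                w_h[j][i] += v_h[j][i]
    return losses

# Synthetic inputs in [0, 1] (hypothetical, not the crime data).
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(6)] for _ in range(38)]
y = [(row[2] + row[4]) / 2 for row in X]
losses = train(X, y)
print(losses[0], losses[-1])
```

The printed pair shows the error before and after training, which gives a quick check that the chosen learning rate and momentum let the network converge on a given data set.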
9
Other optimizations done
  • In order to improve accuracy, a different method
    was used to normalize the data. Every value
    for each of the different crime types was
    divided by the largest value in that group. This
    caused the accuracy of the network to go from 75%
    pass to 80% pass.
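
The normalization described above (divide every value by the largest value in its group) might look like the following sketch. The function name and the sample rates are hypothetical and are not taken from the presentation's data.

```python
def normalize_by_max(columns):
    """Scale each crime-type column so that its largest value becomes 1.0."""
    normalized = {}
    for name, values in columns.items():
        largest = max(values)
        normalized[name] = [v / largest for v in values]
    return normalized

# Hypothetical per-city rates, for illustration only.
rates = {
    "robbery": [120.0, 300.0, 60.0],
    "burglary": [800.0, 400.0, 1000.0],
}
scaled = normalize_by_max(rates)
print(scaled["robbery"])  # [0.4, 1.0, 0.2]
```

After scaling, every column lies in the range 0 to 1 (for non-negative rates), so no single crime type dominates the network inputs purely because of its magnitude.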

10
How the network was run.
  • 38 cities were chosen out of the set of 55, at
    random with no repeats, for each of 10 different
    training sets and 10 different testing sets.
  • The network was run until it reached the global
    minimum for the network.
  • The corresponding test set was then run through
    the network.
  • If the output of the network was within 1
    standard deviation of the Auto Theft data, then
    it passed.
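
The sampling and the pass criterion on this slide can be sketched as follows. Two points are assumptions, because the slide does not state them: that the 17 cities left out of each draw form the matching test set, and the value used for the standard deviation (450.0 here is hypothetical). The function names are also illustrative.

```python
import random

def split_cities(cities, n_train=38, seed=None):
    """Draw n_train cities at random without repeats; the rest become the test set."""
    rng = random.Random(seed)
    train = rng.sample(cities, n_train)  # sampling without replacement
    chosen = set(train)
    test = [c for c in cities if c not in chosen]
    return train, test

def passes(predicted, actual, std):
    """Pass when the prediction is within one standard deviation of the actual value."""
    return abs(predicted - actual) <= std

cities = list(range(55))  # hypothetical city IDs
splits = [split_cities(cities, seed=i) for i in range(10)]  # 10 train/test pairs

auto_theft_std = 450.0  # hypothetical; the slides do not give this value
print(passes(1091.99575, 1303.8003125, auto_theft_std))   # True
print(passes(1027.8585625, 483.8010625, auto_theft_std))  # False
```

The two example calls reuse a passing and a failing pair of values from the results listing, so the assumed threshold is at least consistent with those two rows.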

11
The NN performance
.................
original        found           result  original - found
1303.8003125    1091.99575      pass      211.8045625
959.89975       1106.674875     pass     -146.775125
963.501         1090.377875     pass     -126.876875
1150.4005       1330.19425      pass     -179.79375
843.600875      1207.3351875    pass     -363.7343125
1124.6999375    1302.0910625    pass     -177.391125
385.7986875     547.438375      pass     -161.6396875
483.8010625     1027.8585625    fail     -544.0575
1208.101125     2663.2506875    fail    -1455.1495625
742.6986875     1062.874        pass     -320.1753125
1502.199625     1382.2215625    pass      119.9780625
309.798875      291.4674375     pass       18.3314375
687.5995625     836.4763125     pass     -148.87675
838.4005625     547.5485625     pass      290.852
421.3005625     830.647125      pass     -409.3465625
1512.90125      1195.2898125    pass      317.6114375
716.1999375     962.5335        pass     -246.3335625
956.6989375     1104.6485       pass     -147.9495625
376.599375      1013.2063125    fail     -636.6069375

percent pass: 0.805263157894737
avg error for passing entries: 138.226154967105
avg error for failing entries: 273.235211891447
posfail: 37   negfail: 37   pospass: 155   negpass: 151
number of entries: 380
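
The summary figures in this output can be cross-checked from the four pass/fail counters. The short sketch below only redoes the arithmetic; the variable names are illustrative.

```python
# Counters as reported in the network output.
pospass, negpass, posfail, negfail = 155, 151, 37, 37

total = pospass + negpass + posfail + negfail
percent_pass = (pospass + negpass) / total
print(total, percent_pass)

# The last value of each record is the original value minus the found value.
first_error = 1303.8003125 - 1091.99575
print(first_error)
```

The counters sum to the reported 380 entries, and 306 passing entries out of 380 gives the reported pass rate of about 80.5%.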
12
Using SAS to Analyze data
  • SAS is a statistical tool that allows for the
    analysis of any data set
  • Compared the results from SAS with those from
    the neural network

13
Principal component analysis

  • Correlation Matrix

           A        B        C        D        E        F        G
    A   1.0000   0.0362   0.6493   0.4389   0.2682   -.0228   0.1588
    B   0.0362   1.0000   0.3552   0.2232   0.5774   0.1841   0.1951
    C   0.6493   0.3552   1.0000   0.4956   0.5969   0.2168   0.2621
    D   0.4389   0.2232   0.4956   1.0000   0.3342   0.2278   -.0147
    E   0.2682   0.5774   0.5969   0.3342   1.0000   0.5472   0.3807
    F   -.0228   0.1841   0.2168   0.2278   0.5472   1.0000   0.1735
    G   0.1588   0.1951   0.2621   -.0147   0.3807   0.1735   1.0000

  • A = Murder, B = Rape, C = Robbery, D = Assault,
    E = Burglary, F = Theft, G = Auto Theft
  • From the correlation matrix above, Robbery had
    the highest correlation values, which indicates
    that it is the most significant variable in the
    data.
  • This corresponds with the results that we
    obtained from our network.
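
One way to read the correlation matrix is to list its largest off-diagonal entries. The sketch below copies the values from the matrix on this slide. The ranking approach is an illustration and not part of the original SAS analysis.

```python
names = ["Murder", "Rape", "Robbery", "Assault", "Burglary", "Theft", "Auto Theft"]
corr = [
    [1.0000, 0.0362, 0.6493, 0.4389, 0.2682, -0.0228, 0.1588],
    [0.0362, 1.0000, 0.3552, 0.2232, 0.5774, 0.1841, 0.1951],
    [0.6493, 0.3552, 1.0000, 0.4956, 0.5969, 0.2168, 0.2621],
    [0.4389, 0.2232, 0.4956, 1.0000, 0.3342, 0.2278, -0.0147],
    [0.2682, 0.5774, 0.5969, 0.3342, 1.0000, 0.5472, 0.3807],
    [-0.0228, 0.1841, 0.2168, 0.2278, 0.5472, 1.0000, 0.1735],
    [0.1588, 0.1951, 0.2621, -0.0147, 0.3807, 0.1735, 1.0000],
]

# Collect each pair once (upper triangle) and sort by correlation, largest first.
pairs = [
    (corr[i][j], names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
]
pairs.sort(reverse=True)

for value, a, b in pairs[:3]:
    print(f"{a} / {b}: {value}")
```

The two largest entries, Murder / Robbery (0.6493) and Robbery / Burglary (0.5969), both involve Robbery.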

14
Scatterplot matrix of Auto Theft vs Robbery
  • Y = Auto Theft, X = Robbery

15
What was learned
  • That auto theft can be predicted using other
    types of crime with a fair amount of accuracy.
  • NNs are cool.

16
What else can be done
  • Try different targets, possibly multiple.
  • Further statistical analysis.
  • Expand dataset.