Title: Better Analysis, Deeper Insights: A Public Sector Primer on Data Mining
1. Better Analysis, Deeper Insights: A Public Sector Primer on Data Mining
- Jennifer Galvan
- Manager, Sales Engineering
2. Commonly Asked Questions
- Will I be able to get copies of the slides after the event? Yes
- Is this webinar being taped, or can I view it after the fact? Yes
www.spss.com/events
3. Agenda
- Predictive Analytics
- Data Mining and Text Mining
- Government applications
- Data Mining Methodology
- Clementine
- Getting Started
4. SPSS Fundamentals
- Founded in 1968
- 30-year heritage as an innovator in analytics technologies
- IPO in 1993 (NASDAQ: SPSS)
- Operations in more than 60 countries
- 16 organizations acquired since 1993
- Leadership
  - Market leader in predictive analytics
  - Recognized as a leader by Forbes, BusinessWeek, Intelligent Enterprise, InfoWorld, CRM Magazine, and others
- Vital statistics
  - $261.5 million in revenue in fiscal year 2006, an 11% increase
  - 1,200 employees around the world
  - 250,000 customers in business, academia, and government
042806
5. What is Predictive Analytics?
"Predictive analysis helps connect data to effective action by drawing reliable conclusions about current conditions and future events."
Gareth Herschel, Research Director, Gartner Group
6. Predictive Analytics can be leveraged to enhance decisions within the business
[Diagram: the Predictive Analytics cycle]
- Capture: Store new data on customers, events, etc. for continuous improvement
- Predict: Analyze data to provide insight and predict the future
- Act: Recommend the most appropriate action to take
- Goals: improve customer retention, grow share of wallet, minimize risk, increase customer satisfaction, enhance market share
- People: customers, constituents, prospects, employees, students, patients
- Other diagram labels: Customer View, Decision Optimization, People Data, Enterprise Data Sources
7. Data Mining Defined
- Data-driven approach to problem solving
- Focused on Business Objectives
- Leverages organizational data
- Uncovers patterns using predictive analytics
- Uses results to help improve business decision
making and organizational performance
8. What is Data Mining?
- Discovering meaningful patterns in your data
9. What is Data Mining?
As the data grows, the relationships become more complicated.
10. Data Mining and Statistics
- Statistical Analysis
  - Tests for statistical correctness of models
  - Are statistical assumptions of models correct?
  - Hypothesis testing
  - Is the relationship significant?
  - Tends to rely on sampling
  - Techniques are not optimized for large amounts of data
  - Requires strong statistical skills
- Data Mining
  - Less interested in the mechanics of the technique
  - If it works and makes some sense, let's use it
  - No assumptions required
  - Can find patterns in very large amounts of data
  - Requires understanding of data and business problem
  - Focus on deploying results
11. Government and Citizen Understanding
- What are the voters saying about me?
- I know that people in this neighborhood are unhappy, but why?
- We always talk about crime and healthcare, but are these the main issues on people's minds?
- Young people don't vote. How can we get them engaged?
- What is the real problem with our schools?
12. Unstructured Data Holds the Keys, but do you use it effectively?
- Most just ignore qualitative customer input provided in surveys, emails, phone calls, and other sources; it's not even captured.
- Many store important business/customer text information but have no way to leverage it
- A few start mining text but are unable to match results to business data
13. State of Texas
- What data mining has done for the State of Texas
Challenge: Improve tax audit selection process
Recovered $400 million in unpaid taxes
14. Tax Gap
- Tax administrators need analytics to cope with an increasing tax gap with limited resources
- Technology adoption is on the rise
  - Non-filer discovery: TX, MA, CT, NM, IA, OK
  - Audit selection: IRS, TX, NY, (SC)
  - Collections: CA, VA, KS, MO
- Many benefits-based projects (self-funded)
- The scale of potential payback is very large
  - TX (discovery): $400M over 5 years
  - VA (collections): break even after 2 months
  - MA (discovery): $88M/week
15. Centers for Medicare & Medicaid Services (CMS)
- What data mining has done for CMS
Challenge: Develop a program that will isolate factors that lead to incorrect Medicare payments
Reduced payment errors by 50%.
16. Common Applications in Public Sector
- Voter concerns
- Law Enforcement
- Fraud, Waste and Abuse
- Education
17. Put yourself in this position
- You are totally new to data mining
- and none of your colleagues have done it before either
- but your agency has decided it's a Good Thing
- and they want you to lead the development of a predictive analytics approach to decision making.
- Where do you start?
18. The Methodology - CRISP-DM
- 6 Phases
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment
- Reflects iterative nature of data mining
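The six phases and their iterative nature can be sketched in a few lines of Python. This is only an illustration: the phase names come from the slide, while the loop logic, the `run_crisp_dm` function, and the `evaluation_passes` callback are assumptions made for the example. In practice a project can step back to any earlier phase, not only to the first one.

```python
# Illustrative sketch only. Phase names are from the slide; everything else
# (function name, callback, loop-back rule) is a hypothetical simplification.
PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]


def run_crisp_dm(evaluation_passes, max_iterations=5):
    """Walk the phases, returning to the start whenever evaluation fails.

    evaluation_passes: hypothetical callback that receives the iteration
    number (starting at 1) and returns True when the model meets the
    business objective. Returns the ordered list of phases visited.
    """
    visited = []
    for iteration in range(1, max_iterations + 1):
        # Every iteration covers the first five phases, ending in Evaluation.
        visited.extend(PHASES[:-1])
        if evaluation_passes(iteration):
            # Deployment happens only after a successful evaluation.
            visited.append(PHASES[-1])
            return visited
    return visited  # objective never met, so nothing is deployed


# Example: the first pass fails evaluation, the second one succeeds.
history = run_crisp_dm(lambda iteration: iteration >= 2)
print(len(history), "phases visited; last phase:", history[-1])
```

The point of the sketch is that Deployment is gated by Evaluation, and a failed evaluation sends the team back to revisit the business question and the data rather than forward.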
19. Getting started with Data Mining
- Follow CRISP-DM
- Begin with the end in mind
- Limit the scope of your initial project
- Define an executable data mining strategy
- Line up the right people
- Line up the right data
20. Clementine Market Position
- Clementine is the leading data mining workbench because it
  - Is easy to use
  - Is comprehensive
  - Supports the entire data mining process
  - Provides outstanding performance and scalability
- Therefore delivers
  - High productivity
  - Quick time-to-solution
  - High ROI
21. Clementine Ease of Use and Comprehensive Facilities
- Clementine's visual approach makes it easy to integrate a comprehensive range of facilities while remaining problem oriented
- Suitable for the Business Analyst as well as the Technical Expert
22. Visual Data Manipulation
Dedicated nodes for an extensive list of techniques
23. Automation: Numeric Predictor
- Automated modeling operations create and evaluate many different models in one step
- New Numeric Predictor node means automated modeling for numeric outcomes
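To make "create and evaluate many different models in one step" concrete, here is a minimal Python sketch of the general idea: fit several candidate models on training data, score each on held-out data, and rank them. It is not how the Numeric Predictor node is implemented. The three candidate models, the error measure (mean absolute error), the function names, and the data are all assumptions chosen for illustration.

```python
from statistics import mean, median


def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_median(xs, ys):
    """Baseline: always predict the training median."""
    m = median(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Least-squares straight line (assumes xs are not all identical)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def rank_numeric_models(train, holdout, candidates):
    """Fit each candidate on train; rank by mean absolute error on holdout."""
    train_x, train_y = zip(*train)
    results = []
    for name, fit in candidates.items():
        model = fit(train_x, train_y)
        mae = mean([abs(model(x) - y) for x, y in holdout])
        results.append((name, mae))
    return sorted(results, key=lambda item: item[1])  # lowest error first


# Made-up data in which y is roughly 2x + 1.
train = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.0)]
holdout = [(6, 13.1), (7, 14.8)]
candidates = {"mean": fit_mean, "median": fit_median, "line": fit_line}

ranking = rank_numeric_models(train, holdout, candidates)
for name, mae in ranking:
    print(f"{name}: MAE = {mae:.2f}")
```

Scoring on data the models did not see during fitting is what makes the ranking meaningful; ranking on the training data alone would favor whichever model memorizes it best.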
24. Binary Classifier
- Binary Classifier is automated modeling for
yes/no outcomes
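The same idea applies to yes/no outcomes. The sketch below is a hypothetical illustration, not Clementine's Binary Classifier: each candidate model is a single-field threshold rule, tuned on training rows and ranked by accuracy on held-out rows. The field names (`late_filings`, `deduction_ratio`), the audit scenario, and every data value are invented for the example.

```python
def fit_threshold(rows, field):
    """Choose the cut-off on one field that classifies the most rows correctly.

    The rule predicts yes (True) when row[field] >= threshold.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({row[field] for row in rows}):
        correct = sum((row[field] >= threshold) == row["outcome"] for row in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(rows, field, threshold):
    """Share of rows where the threshold rule matches the known outcome."""
    hits = sum((row[field] >= threshold) == row["outcome"] for row in rows)
    return hits / len(rows)


def rank_binary_models(train, holdout, fields):
    """Tune one rule per field on train; rank rules by holdout accuracy."""
    results = []
    for field in fields:
        threshold = fit_threshold(train, field)
        results.append((field, threshold, accuracy(holdout, field, threshold)))
    return sorted(results, key=lambda item: item[2], reverse=True)


# Invented records: outcome is True when an audit found unpaid tax.
train = [
    {"late_filings": 0, "deduction_ratio": 0.10, "outcome": False},
    {"late_filings": 1, "deduction_ratio": 0.45, "outcome": False},
    {"late_filings": 0, "deduction_ratio": 0.20, "outcome": False},
    {"late_filings": 3, "deduction_ratio": 0.15, "outcome": True},
    {"late_filings": 2, "deduction_ratio": 0.50, "outcome": True},
    {"late_filings": 4, "deduction_ratio": 0.30, "outcome": True},
]
holdout = [
    {"late_filings": 0, "deduction_ratio": 0.40, "outcome": False},
    {"late_filings": 3, "deduction_ratio": 0.25, "outcome": True},
    {"late_filings": 1, "deduction_ratio": 0.35, "outcome": False},
    {"late_filings": 2, "deduction_ratio": 0.12, "outcome": True},
]

ranking = rank_binary_models(train, holdout, ["late_filings", "deduction_ratio"])
for field, threshold, acc in ranking:
    print(f"{field} >= {threshold}: holdout accuracy {acc:.2f}")
```

A real workbench would compare much richer model families (trees, neural networks, regression), but the workflow of generating candidates, evaluating them on the same holdout data, and keeping the best performers is the same.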
25. Graphboard
- Graphboard provides a range of visualizations, both traditional and advanced
- Two modes
  - Basic mode acts like a wizard: select data and get offered a range of appropriate graph options
  - Detail mode follows the traditional pattern: select graph, select data, select options
26. Custom Tables
- Create complex reports / nested tables
- Designed using a pivot table or drag-and-drop style user interface
27. Getting Started
- Data Mining Jump Start
  - Reduce time to implement your first data mining project
  - Combination of training and coaching
  - 5 day on-site
28. Question and Answer
29. For More Information
- In case you missed it: recorded version and slides available at www.spss.com/events
- Visit www.spss.com/clementine to learn more about the platform
- Call us at 1-800-543-2185 or sales_at_spss.com
- Please fill out the post-event survey