Data Mining - PowerPoint PPT Presentation

About This Presentation
Title:

Data Mining

Description:

David Klein is a Senior Software Architect at SSW, specialising in .NET & SQL ... Current Clients Sally Knox Medical & Pisces ...

Slides: 21
Provided by: jatinva

Transcript and Presenter's Notes

1
Data Mining
  • David Klein Adam Cogan

2
Admin Stuff
  • Attendance
  • You initial sheet
  • Hands On Lab
  • You get me to initial sheet
  • Certificate
  • At end of 10 sessions
  • If I say you have completed successfully ?

3
About
  • David Klein is a Senior Software Architect at
    SSW, specialising in .NET & SQL Server BI
    solutions
  • Current clients: Sally Knox Medical & Pisces
  • Adam Cogan is Chief Architect at SSW and one of 2
    Microsoft Regional Directors in Australia,
    specialising in Office, SQL and .NET solutions

4
Course Overview
  • The 5 Sessions (Part B)
  • SSIS and Creating a Data Warehouse
  • Creating a Cube and Cube Issues
  • Reporting Services
  • Other Cube Browsers
  • Data Mining
  • http://www.ssw.com.au/ssw/events/2006SQL/

5
Session 5: Tonight's Agenda
  • Why Data Mining?
  • Uses
  • Algorithms
  • Implementation
  • SQL Server Management Studio (SSMS)
  • Reporting Services
  • SQL Server Integration Services
  • DMX
  • Demo ?
  • Hands on Lab

6
Why Data Mining?
  • Marketing
  • Who picks the movie? The kids, the wife, me.
  • Who are our customers and what sort of films do
    they hire?
  • Is a 30-year-old woman with 2 children going to
    hire Arnie's latest film?
  • Validation
  • Is this data sensible? Terminator 2 and Toy Story
  • Prediction
  • Sales Next Year

7
Complete Set Of Algorithms
8
Naïve Bayes
  • Quickly builds mining models that can be used for
    classification and prediction
  • It calculates probabilities for each possible
    state of the input attribute, given each state of
    the predictable attribute
  • This can later be used to predict an outcome of
    the predicted attribute based on the known input
    attributes
  • This makes the model a good option for exploring
    the data
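
The probability calculation described above can be sketched in a few lines of Python. This is only an illustration of the general Naïve Bayes idea, not the SQL Server Analysis Services implementation; the function names, the add-one smoothing and the rental data are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

def train_naive_bayes(cases, target):
    """Count how often each input state occurs for each state of the target."""
    class_counts = Counter(case[target] for case in cases)
    value_counts = defaultdict(Counter)  # (attribute, target state) -> input state counts
    states = defaultdict(set)            # all states seen for each input attribute
    for case in cases:
        for attr, value in case.items():
            if attr != target:
                value_counts[(attr, case[target])][value] += 1
                states[attr].add(value)
    return class_counts, value_counts, states

def predict(model, inputs):
    """Return normalised probabilities of each target state given known inputs."""
    class_counts, value_counts, states = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # prior P(target state)
        for attr, value in inputs.items():
            # P(input state | target state), with add-one smoothing (an assumption)
            count = value_counts[(attr, cls)][value]
            score *= (count + 1) / (n + len(states[attr]))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}

# Hypothetical rental history, for illustration only
rentals = [
    {"age_group": "adult", "has_children": "yes", "hired": "Toy Story 2"},
    {"age_group": "adult", "has_children": "yes", "hired": "Toy Story 2"},
    {"age_group": "adult", "has_children": "no", "hired": "Terminator 2"},
    {"age_group": "teen", "has_children": "no", "hired": "Terminator 2"},
    {"age_group": "teen", "has_children": "no", "hired": "Toy Story 2"},
]
model = train_naive_bayes(rentals, target="hired")
probabilities = predict(model, {"age_group": "adult", "has_children": "yes"})
print(probabilities)
```

Training is a single counting pass over the cases, which is why this kind of model is quick to build and handy for a first look at the data.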

9
Naïve Bayes: Toy Story 2
10
Decision Trees (1)
  • Decision trees assign (classify) each case to one
    of a few discrete, broad categories of a selected
    attribute (variable) and explain the
    classification with a few selected input variables
  • The process of building is recursive partitioning:
    splitting data into partitions and then
    splitting each partition further
  • Initially all cases are in one big box

11
Decision Trees (2)
  • The algorithm tries all possible breaks in
    classes using all possible values of each input
    attribute; it then selects the split that
    partitions the data into the purest classes of
    the searched variable
  • Several measures of purity
  • Then it repeats splitting for each new class
  • Again testing all possible breaks
  • Branches that are not useful can be pre-pruned
    or post-pruned
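
A minimal sketch of the split search and recursive partitioning described on these slides, in Python. It assumes Gini impurity as the purity measure (one of several possible measures), binary "equals value / does not equal value" breaks, and a depth limit as a simple form of pre-pruning. The function names and the customer data are hypothetical, and this is not the Microsoft Decision Trees algorithm itself.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(cases, target):
    """Try every (attribute, value) break and return the purest one, or None."""
    best = None
    for attr in cases[0]:
        if attr == target:
            continue
        for value in sorted({case[attr] for case in cases}):
            left = [c[target] for c in cases if c[attr] == value]
            right = [c[target] for c in cases if c[attr] != value]
            if not left or not right:
                continue  # this break does not separate anything
            # Weighted impurity of the two partitions (lower is purer)
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(cases)
            if best is None or score < best[0]:
                best = (score, attr, value)
    return best

def build_tree(cases, target, depth=0, max_depth=3):
    """Recursively partition the cases; leaves hold the majority class."""
    labels = [c[target] for c in cases]
    split = best_split(cases, target)
    # Stop when pure, when no break exists, or at max depth (simple pre-pruning)
    if gini(labels) == 0.0 or split is None or depth >= max_depth:
        return Counter(labels).most_common(1)[0][0]
    _, attr, value = split
    yes = [c for c in cases if c[attr] == value]
    no = [c for c in cases if c[attr] != value]
    return {
        "test": (attr, value),
        "yes": build_tree(yes, target, depth + 1, max_depth),
        "no": build_tree(no, target, depth + 1, max_depth),
    }

# Hypothetical customer data, for illustration only
customers = [
    {"age_group": "young", "gender": "F", "genre": "comedy"},
    {"age_group": "young", "gender": "F", "genre": "comedy"},
    {"age_group": "young", "gender": "M", "genre": "action"},
    {"age_group": "older", "gender": "M", "genre": "action"},
    {"age_group": "older", "gender": "F", "genre": "drama"},
]
tree = build_tree(customers, target="genre")
print(tree)
```

Each node of the resulting dictionary names the break that was chosen, so the tree can be read back as an explanation of why a case was classified the way it was.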

12
Decision Trees (3)
  • Decision trees are used for classification and
    prediction
  • Typical questions
  • Predict which customers will leave
  • Help in mailing and promotion campaigns
  • Explain reasons for a decision
  • What are the movies young female customers like
    to buy?

13
Decision Trees: Who Decides
14
Cluster Analysis (1)
  • Grouping data into clusters
  • Objects within a cluster have high similarity
    based on the attribute values
  • The class label of each object is not known
  • Several techniques
  • Partitioning methods
  • Hierarchical methods
  • Density based methods
  • Model based methods
  • And more
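
As an illustration of a partitioning method, here is a small k-means sketch in Python. K-means is only one example of the techniques listed above and is not claimed to be what Analysis Services runs. The function name, the fixed iteration count, the random seed and the (age, rentals per month) data are assumptions for the example.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly moving centroids to cluster means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]  # keep the old centroid if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers as (age, rentals per month); no class labels are given
customers = [(22, 8), (25, 9), (24, 7), (51, 2), (55, 1), (53, 3)]
centroids, clusters = k_means(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied: the groups emerge only from how similar the attribute values are, which is the point made on the slide.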

15
Cluster Analysis (2)
  • Segments a heterogeneous population into a number
    of more homogeneous subgroups or clusters
  • Some typical questions
  • Discover distinct groups of customers
  • Identification of groups of houses in a city
  • In biology, derive animal and plant taxonomies
  • Find outliers

16
Conclusion: When To Use What
17
Summary
  • Why Data Mining?
  • Uses
  • Algorithms
  • Implementation
  • SQL Server Management Studio (SSMS)
  • Reporting Services
  • SQL Server Integration Services
  • DMX
  • Demo ?
  • Hands on Lab

18
Book
  • Data Mining with SQL Server 2005
  • ZhaoHui Tang and Jamie MacLennan
  • Wiley Press

19
2 things
DavidKlein_at_ssw.com.au AdamCogan_at_ssw.com.au
20
Thank You!
BI is Cool