Data Mining - PowerPoint PPT Presentation

About This Presentation
Title:

Data Mining

Provided by: TV9

Transcript and Presenter's Notes

1
Data Mining
  • Database Systems
  • Timothy Vu

2
Mining
Mining is the extraction of valuable minerals or
other geological materials from the earth, such
as bauxite, coal, diamonds, iron, precious
metals, lead, limestone, nickel, phosphate, rock
salt, tin, uranium, petroleum, natural gas, and
even water. The material mined is usually
valuable, rare, or useful.
3
What is Data Mining
Data mining, also known as Knowledge Discovery in
Databases (KDD), is the process of automatically
searching large volumes of data for patterns. To
achieve this, data mining uses computational
techniques from statistics, machine learning, and
pattern recognition.
Machine learning: a method for creating computer
programs by the analysis of data sets.
Pattern recognition: classifying data (patterns)
based on either a priori knowledge or statistical
information extracted from the patterns.
4
Why Data Mining
  • Data mining is a technique that helps individuals
    or companies extract useful information from
    large amounts of data to make better decisions.
  • Reduce risks
  • Find problems and issues
  • Save money
  • Make high-confidence predictions
  • Simplify information

5
Discussion Topics
1) Classification
2) Regression
3) Association
4) Clustering

6
Classifiers
Decision-tree classifiers: each leaf node has an
associated class, and each internal node has a
predicate that routes a record toward a leaf.
Bayesian classifiers: find the distribution of
attribute values for each class in the training
data, then predict the class with the maximum
probability.
Neural-net classifiers: use the training data to
train artificial neural nets.
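
The Bayesian approach above can be sketched in a few lines. This is a minimal illustration rather than a production classifier: it assumes categorical attributes, treats attributes as independent given the class (the "naive" Bayes simplification), and uses add-one smoothing. The names train, predict, and the tiny credit data set are hypothetical and not taken from the slides.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count how often each class and each (attribute, class, value) occurs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # key: (attribute index, class)
    distinct = defaultdict(set)          # distinct values seen per attribute
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct[i].add(value)
    return class_counts, value_counts, distinct

def predict(model, row):
    """Return the class with the maximum (unnormalized) probability."""
    class_counts, value_counts, distinct = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(row):
            # P(value | class) with add-one smoothing, so a value never
            # seen with this class does not force the product to zero.
            score *= (value_counts[(i, label)][value] + 1) / (n + len(distinct[i]))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data: (income, education) -> credit rating.
rows = [("high", "degree"), ("high", "no_degree"), ("low", "degree"),
        ("low", "no_degree"), ("low", "no_degree")]
labels = ["good", "good", "good", "bad", "bad"]
model = train(rows, labels)
```

Calling predict(model, ("high", "degree")) picks the class whose prior times attribute likelihoods is largest.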

7
Regression

Regression deals with the prediction of a value
rather than a class.
Linear regression: predict values using a linear
function (more generally, a polynomial) fitted to
the data. Curve fitting means finding the
coefficients that give the best fit.
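
As a minimal sketch of curve fitting, the function below finds the slope and intercept of a straight line by ordinary least squares, one common way of defining "best". The function name fit_line and the sample points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept  # predicted value for x = 5
```

The fitted coefficients are then used to predict a value for an x that was not in the data.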
8
Associations
Association mining finds relationships between
two or more items, expressed as rules.
Support: a measure of what fraction of the
population satisfies both the antecedent and the
consequent of the rule.
Example: Milk => Screwdrivers
Confidence: a measure of how often the consequent
is true when the antecedent is true.
Example: Milk => Bread
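
Both measures can be computed directly from a list of transactions. The sketch below assumes each transaction is a set of item names; the functions support and confidence and the five sample baskets are hypothetical.

```python
def support(transactions, antecedent, consequent):
    """Fraction of all transactions containing antecedent and consequent."""
    both = antecedent | consequent
    return sum(1 for t in transactions if both <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        return 0.0
    return sum(1 for t in with_antecedent if consequent <= t) / len(with_antecedent)

# Hypothetical shopping baskets.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"milk", "screwdrivers"},
]

milk_bread_support = support(baskets, {"milk"}, {"bread"})
milk_bread_confidence = confidence(baskets, {"milk"}, {"bread"})
milk_screwdrivers_support = support(baskets, {"milk"}, {"screwdrivers"})
```

In this sample, milk and bread appear together in 2 of 5 baskets, and 2 of the 4 milk baskets contain bread, while milk and screwdrivers appear together only once.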
9
Clustering
Clustering is the grouping of similar objects,
or more precisely, the partitioning of a data set
into subsets (clusters) so that the data in each
subset (ideally) share some common trait, often
proximity according to some defined distance
measure.
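
One widely used way to partition by proximity is k-means, which the slides do not name; it is shown here only as an illustrative sketch. It assumes Euclidean distance, a fixed number of iterations, and the first k points as starting centroids. The function k_means and the sample points are hypothetical.

```python
import math

def k_means(points, k, iterations=10):
    """Partition points into k clusters by distance to the nearest centroid."""
    centroids = list(points[:k])  # assumption: first k points as initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(coord) / len(members)
                                     for coord in zip(*members))
    return centroids, clusters

# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (8, 8), (1, 2), (2, 1), (9, 8), (8, 9)]
centroids, clusters = k_means(points, k=2)
```

The result depends on the starting centroids and on the distance measure, which is why the definition above stresses "some defined distance measure".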
10
Applications of Data Mining
  • 1. Predictions
  • - Stock Market
  • - Earthquakes
  • - NBA games
  • 2. Association
  • - Store Inventory
  • - Fashion Trends
  • 3. Descriptive Patterns
  • - Disease Analysis
  • - Image Recognition
  • - Fraud Detection

11
Gather Data
12
Electrocardiogram
13
Disease Analysis
14
References
  • A. Silberschatz, H.F. Korth, S. Sudarshan. Database
    System Concepts, 5th ed., McGraw-Hill, 2006.
  • Runge, Marschall, Magnus Ohman, and Frank
    Netter. Netter's Cardiology (Netter Clinical
    Science). W.B. Saunders Company, 2004.
  • "Data mining". Wikipedia. 4/1/2006.
    <http://en.wikipedia.org/wiki/Data_Mining>.