Title: Research Interests


1
Research Interests
Juan E. Vargas
Computer Science & Engineering
vargasje_at_engr.sc.edu
September 14, 2001
2
Data Mining, Knowledge Discovery, Data Warehousing
Dynamic Uncertainty of Research Areas
3
Data Mining: Anticipating users' questions in a
flexible and scalable manner
DM is searching for strong patterns within big
data that can be generalized to make predictions
and to support future decisions ...
DM is a cooperative effort involving humans and
computers...
DM is a process, not a product.
Features are identified from a problem domain
and measured over many cases, to do
classification or regression.
Complexity is given by the number of cases, the
number of features, and the number of distinct
values that features can assume.
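As a minimal sketch of the three complexity measures named above, the following counts cases, features, and distinct values per feature. The dataset and the function name `complexity` are hypothetical and only serve as an illustration.

```python
# Hypothetical toy dataset: each dict is one case, each key one feature.
cases = [
    {"outlook": "sunny", "windy": False, "temp": "hot"},
    {"outlook": "rainy", "windy": True, "temp": "mild"},
    {"outlook": "sunny", "windy": True, "temp": "mild"},
    {"outlook": "overcast", "windy": False, "temp": "hot"},
]


def complexity(rows):
    """Return (number of cases, number of features, distinct values per feature)."""
    features = sorted({name for row in rows for name in row})
    distinct = {f: len({row[f] for row in rows if f in row}) for f in features}
    return len(rows), len(features), distinct


n_cases, n_features, distinct_values = complexity(cases)
print(n_cases, n_features, distinct_values)
```

All three numbers grow independently, so a dataset can be hard because it is long, because it is wide, or because individual features take many values.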
4
Knowledge Discovery
KD is prior to prediction. KD is necessary when
the available information is insufficient to
predict accurately.
Given a set of data (D), a language (L), and some
measurement of certainty (C), find statements (S)
or patterns (P) that describe relationships among
subsets of D with certainty C.
Interesting patterns having sufficient certainty
can be treated as new pieces of knowledge that
can be incorporated into a knowledge base.
KD: the nontrivial extraction of implicit,
previously unknown, and potentially useful
information from data.
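The formal statement above (data D, language L, certainty C) can be illustrated with a small sketch. It assumes that L is the set of rules "if a record contains A, it also contains B" over single items, and that certainty is the fraction of records containing A that also contain B, used as a minimum threshold. The records and the name `discover_rules` are hypothetical.

```python
from itertools import permutations


def discover_rules(data, min_certainty):
    """Find rules A -> B whose certainty over the data is at least min_certainty."""
    items = sorted({item for record in data for item in record})
    rules = []
    for a, b in permutations(items, 2):
        covered = [r for r in data if a in r]  # subset of D where A holds
        if not covered:
            continue
        certainty = sum(1 for r in covered if b in r) / len(covered)
        if certainty >= min_certainty:
            rules.append((a, b, certainty))
    return rules


# Hypothetical data set D.
records = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk"},
]
rules = discover_rules(records, min_certainty=0.75)
print(rules)
```

Rules that clear the threshold are the candidates that could be added to a knowledge base; whether they are also interesting is a separate judgment this sketch does not attempt.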
5
Underlying Principle
6
Data Mining
7
Bayesian Belief Networks
A Bayesian belief network (BBN) is a directed
acyclic graph (DAG) in which nodes represent
probabilistic variables and links represent
relations between the variables. Causal
relations are quantified by conditional
probabilities associated with each link. Belief
(probability) is computed using Bayes' rule,
propagating messages among nodes (for
singly connected networks) or among cliques in a
tree (for multiply connected networks).
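A minimal sketch of belief computation in a three-node network follows. It uses brute-force enumeration of the joint distribution rather than the message-passing or clique-tree propagation mentioned above, which is only practical for very small networks. The network (Rain -> WetGrass <- Sprinkler) and all probabilities are hypothetical.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
p_wet = {  # P(WetGrass=True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's conditional probability."""
    p_w = p_wet[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_w if wet else 1.0 - p_w)


def posterior_rain_given_wet():
    """Bayes' rule: P(Rain | WetGrass) = P(Rain, WetGrass) / P(WetGrass)."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence


posterior = posterior_rain_given_wet()
print(f"P(Rain | WetGrass) = {posterior:.3f}")
```

Observing wet grass raises the belief in rain above its prior, which is the kind of update that propagation algorithms compute efficiently on larger graphs.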
8
Bayesian Belief Networks
Bayesian Networks (BNs) offer the best
combination of formal representation, clear
semantics, and efficiency for reasoning under
uncertainty (incomplete, ambiguous, partially
available information).
Influence Diagrams (IDs) are knowledge
representations that combine probabilities with
utilities to offer the advantages of BNs plus a
uniform representation scheme for decision making.
We are furthering the state of the art on BNs and
IDs and applying these methodologies to Data
Mining and Data Warehousing.
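To illustrate how an influence diagram combines probabilities with utilities, here is a minimal sketch with one chance node, one decision node, and one utility table. The decision (take or leave an umbrella), the probabilities, and the utility values are all hypothetical.

```python
# Hypothetical utility table indexed by (action, it_rains).
utility = {
    ("take", True): 70,
    ("take", False): 80,
    ("leave", True): 0,
    ("leave", False): 100,
}


def expected_utility(action, p_rain):
    """Weight each outcome's utility by its probability."""
    return p_rain * utility[(action, True)] + (1 - p_rain) * utility[(action, False)]


def best_action(p_rain):
    """Choose the action with the highest expected utility."""
    return max(("take", "leave"), key=lambda a: expected_utility(a, p_rain))


for p in (0.3, 0.1):
    print(p, best_action(p), expected_utility(best_action(p), p))
```

The same belief that a BN computes becomes the input `p_rain` here, so the recommended decision changes as the evidence changes.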
9
Rationale
Closed-form, analytical solutions are not always
available
e = ma^2, e = mb^2, e = mc..
10
Current Projects
DARPA: Resource Allocation in Dynamic Uncertain
Domains (TargetShare) (Nagabushan Mahadevan,
Kiran Tvarlapati)
NCR/WalMart: Teradata Architecture for Data
Mining & Warehousing (Natalie Pakhomkina)
DOD/SCARNG: Vibration Monitoring Enhancement
Program (VMEP) (Natalie Pakhomkina, Elena Zagrai)
11
TargetShare: Allocating Resources via Negotiating
Agents That Use Bayesian Networks to Deal with
Uncertainty
12
Sensor and Process Models
Process model (unknown): State(T-1), State(T), State(T+1), State(T+2)
Sensor model (observed): Signals(T-1), Signals(T), Signals(T+1), Signals(T+2)
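The combination of an unknown state sequence and observed signals can be sketched as a discrete filtering step: predict the next state with a process model, then correct it with a sensor model. This assumes a first-order model with two states; the state names, signal names, and probabilities are hypothetical and are not taken from the slides.

```python
def forward_step(belief, transition, likelihood):
    """One filtering step: predict with the process model, correct with the sensor model."""
    states = list(belief)
    # Process model: P(State(T)) from P(State(T-1)).
    predicted = {s: sum(belief[p] * transition[p][s] for p in states) for s in states}
    # Sensor model: weight each state by how well it explains the observed signal.
    unnormalized = {s: predicted[s] * likelihood[s] for s in states}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical process model P(State(T) | State(T-1)).
transition = {
    "near": {"near": 0.7, "far": 0.3},
    "far": {"near": 0.2, "far": 0.8},
}
# Hypothetical sensor model P(Signal | State).
sensor = {
    "near": {"strong": 0.9, "weak": 0.1},
    "far": {"strong": 0.2, "weak": 0.8},
}

belief = {"near": 0.5, "far": 0.5}
for signal in ["strong", "strong", "weak"]:
    likelihood = {s: sensor[s][signal] for s in belief}
    belief = forward_step(belief, transition, likelihood)
print(belief)
```

Each pass through the loop consumes one observed signal and yields an updated belief over the hidden state at that time step.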
13
Sensor and Process Models (v2)
14
Monitoring the Sensors
  • A sensor is rewarded if its reading is a positive
    contribution towards the correct prediction.
  • A sensor is penalized (less trusted) if its reading is
    a negative contribution or if no reading is
    available for a long time.

Sensor1 Sensor2 Sensor3 Sensor4 Target Loc
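The reward and penalty scheme for sensors can be sketched as a bounded trust score. The slides do not give the update rule, so the step size, the starting trust, the silence limit, and the class name `SensorTrust` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SensorTrust:
    trust: float = 0.5      # hypothetical starting trust, kept within [0, 1]
    silent_steps: int = 0   # consecutive steps without a reading

    def update(self, contribution, step=0.1, max_silence=3):
        """contribution: +1 helped the prediction, -1 hurt it, None means no reading."""
        if contribution is None:
            self.silent_steps += 1
            # Penalize only after the sensor has been silent for a long time.
            if self.silent_steps > max_silence:
                self.trust = max(0.0, self.trust - step)
            return self.trust
        self.silent_steps = 0
        if contribution > 0:
            self.trust = min(1.0, self.trust + step)   # reward
        elif contribution < 0:
            self.trust = max(0.0, self.trust - step)   # penalty
        return self.trust


sensor1 = SensorTrust()
history = [sensor1.update(c) for c in (1, -1, None, None, None, None)]
print(history)
```

A trust score like this could then weight each sensor's reading when the readings are fused into a target location estimate.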
15
Tested Tracks
16
Results
17
Wal-Mart / NCR Teradata Architecture for
E-Commerce, Data Mining & Data Warehousing
18
Teradata at USC
  • 5100M System
  • 10 nodes with 8 Processors
  • 2 GB of memory/node
  • 400 disks of 4.2 GB, or 1.7 Terabytes of Disk
    Storage
  • Ultra wide SCSI
  • RAID 5
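As a quick check of the figures above, the following recomputes the raw totals. It assumes decimal units (1 TB = 1000 GB) and that the 2 GB of memory applies to each of the 10 nodes. Usable capacity under RAID 5 would be lower because of parity, but the group size is not stated, so it is not computed here.

```python
NODES = 10
MEMORY_PER_NODE_GB = 2
DISKS = 400
DISK_SIZE_GB = 4.2

raw_storage_gb = DISKS * DISK_SIZE_GB        # raw capacity before RAID 5 parity
raw_storage_tb = raw_storage_gb / 1000       # assumes 1 TB = 1000 GB
total_memory_gb = NODES * MEMORY_PER_NODE_GB

print(f"{raw_storage_gb:.0f} GB raw, about {raw_storage_tb:.1f} TB; {total_memory_gb} GB memory")
```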

19
VMEP
VMEP/VMU SYSTEM PROTOTYPE BETA TESTING AT SCARNG
[Diagram labels: SC-ARNG AASF; UH-60; AH-64; VMUs (VMU 1-50); Crew Chief's Laptop; Vibration Data; Condition Indicators; RTB and HUMS CIs; RTB Vibration Management; HUMS Vibration Diagnostics and Prognostics; USC Data Repository; Data Warehousing and Mining; Parts and Maintenance; Logistics Maintenance Data; ULLS-A; O&S Cost Benefit Analysis; Product Qualification; AMCOM; IAC; MIMOSA]
VMEP data must be:
  • Catalogued
  • Time/aircraft synchronized
  • Accessible and retrievable

20
Teradata at USC
21
Teradata at USC
22
Teradata at USC
23
Principles & Technologies
Probabilistic Networks & Influence Diagrams (C++,
Java, XML)
Distributed Systems and DB Design (SQL, ODBC,
JDBC, XML, C++, Java, SOAP, .NET, C#)