Why Data Mining Research Does Not Contribute to Business

1
Why Data Mining Research Does Not Contribute to
Business?
DMBiz05, Porto, Portugal, October 3, 2005
  • Mykola Pechenizkiy, Seppo Puuronen, Department of
    Computer Science, University of Jyväskylä,
    Finland
  • Alexey Tsymbal, Department of Computer Science,
    Trinity College Dublin, Ireland

2
Outline
  • Introduction: what is our message?
  • Where are we? Rigor vs. relevance in DM
  • Towards the new framework for DM research
  • DM system as an adaptive Information System (IS)
  • DM research as IS development: DM system as
    artefact
  • DM success model: success factors
  • Further plans and discussion

3
Our Message
  • DM is still a technology that carries great
    expectations of enabling organizations to benefit
    more from their huge databases.
  • There are some success stories in which
    organizations have gained a competitive advantage
    from DM.
  • Still, the strong focus of most DM researchers on
    technology-oriented topics does not support
    expanding the scope into less rigorous but
    practically very relevant sub-areas.
  • Research in the IS discipline has a strong
    tradition of taking into account the human and
    organizational aspects of systems alongside the
    technical ones.

4
Our Message
  • The maturation of DM-supporting processes that
    take into account human and organizational
    aspects is still in its infancy.
  • The DM community might benefit, at least from a
    practical point of view, from looking at other,
    older sub-areas of IT that have a tradition of
    considering solution-driven concepts with a focus
    also on human and organizational aspects.
  • By becoming more amenable to the research results
    of the IS community, the DM community might be
    able to increase its collective understanding of:
  • how DM artifacts are developed (conceived,
    constructed, and implemented),
  • how DM artifacts are used, supported and evolved,
  • how DM artifacts impact and are impacted by the
    contexts in which they are embedded.

5
Existing Frameworks for DM
  • Theory-oriented: databases, statistics, machine
    learning, data compression
  • Process-oriented: Fayyad's, CRISP-DM, Reinartz's
  • The reductionist approach of viewing DM as
    statistics has the advantages of a strong
    background and easily formulated problems.
  • DM tasks such as clustering, regression and
    classification fit easily into these approaches.
  • More recent (process-oriented) frameworks address
    the issues related to a view of DM as a process,
    and its iterative and interactive nature.

6
Rigor and Relevance in DM Research
  • Lin (in Wu et al.) notes that a new successful
    industry (such as DM) can follow five consecutive
    phases:
  • (1) discovering a new idea,
  • (2) ensuring its applicability,
  • (3) producing small-scale systems to test the
    market,
  • (4) better understanding of the new technology,
    and
  • (5) producing a fully scaled system.
  • At present there are several dozen DM systems,
    none of which can be compared in scale to a
    DBMS.
  • This indicates that we are still in the 3rd
    phase in the DM area!

7
Rigor vs Relevance in DM Research

8
Where is the focus?
  • Still on speeding up, scaling up, and increasing
    the accuracy of DM techniques.
  • Piatetsky-Shapiro: "we see many papers proposing
    incremental refinements in association rules
    algorithms, but very few papers describing how
    the discovered association rules are used".
  • Lin claims that the R&D goals of DM are quite
    different, since research is knowledge-oriented
    while development is profit-oriented.
  • Thus, DM research is concentrated on the
    development of new algorithms or their
    enhancements,
  • but DM developers in domain areas are aware of
    cost considerations: investment in research,
    product development, marketing, and product
    support.
  • However, we believe that the study of the DM
    development and DM use processes is as
    important as the technological aspects, and
    therefore such research activities are likely to
    emerge within the DM field.

Towards the new framework for DM research
9
DMS in the Kernel of an Organization
Environment
  • DM is fundamentally an application-oriented area
    motivated by business and scientific needs to
    make sense of mountains of data.
  • A DMS is generally used by human beings to
    support or perform some task(s) in an
    organizational environment, and both the users
    and the organization have their own desires
    regarding the DMS.
  • Further, the organization has its own
    environment, which has its own interests
    regarding the DMS, e.g. that people's privacy is
    not violated.

10
The ISs-based paradigm for DM
Ives B., Hamilton S., Davis G. (1980). A
Framework for Research in Computer-based MIS.
Management Science, 26(9), 910-934.
Information systems are powerful instruments for
organizational problem solving through formal
information processing.
Lyytinen, K. (1987). Different perspectives on
ISs: problems and solutions. ACM Computing
Surveys, 19(1), 5-46.

11
DM Artifact Development
A multimethodological approach to the
construction of an artefact for DM
Adapted from Nunamaker, W., Chen, M., and
Purdin, T. (1990-91). Systems development in
information systems research. Journal of
Management Information Systems, 7(3), 89-106.

12
The Action Research and Design Science Approach
to Artifact Creation

13
DM Artifact Use Success Model 1 of 3
Adapted from D&M IS Success Models

14
DM Artifact Use Success Model 2 of 3
  • What are the key factors of successful use and
    impact of a DMS at both the individual and
    organizational levels?
  • How is the system used, supported and evolved?
  • How does the system impact, and how is it
    impacted by, the contexts in which it is
    embedded?
  • Coppock: the failure factors of DM-related
    projects have nothing to do with the skill of the
    modeler or the quality of data, but they do
    include:
  • the persons in charge of the project did not
    formulate actionable insights,
  • the sponsors of the work did not communicate the
    insights derived to key constituents,
  • the results don't agree with institutional
    truths.

Leadership, communication skills and an
understanding of the organization's culture are no
less important than the traditionally emphasized
technological job of turning data into insights.

15
DM Artifact Use Success Model 3 of 3
  • Hermiz argues that there are four critical
    success factors for DM projects:
  • (1) having a clearly articulated business problem
    that needs to be solved and for which DM is a
    proper tool;
  • (2) ensuring that the problem being pursued is
    supported by the right type of data, of
    sufficient quality and in sufficient quantity for
    DM;
  • (3) recognizing that DM is a process with many
    components and dependencies; the entire project
    cannot be "managed" in the traditional business
    sense of the word;
  • (4) planning to learn from the DM process
    regardless of the outcome, and clearly
    understanding that there is no guarantee that
    any given DM project will be successful.

16
New Research Framework for DM Research

17
New Research Framework for DM Research

18
Further Work
  • Defining the concept of relevance in DM research
  • Revision of the book chapter
  • Further work on the new framework for DM
    research
  • Organization of a workshop/working conf. or ST on
    more social directions in DM research, likely
    with one of the focuses on IS as a sister
    discipline
  • SIAM DM 2006 interests include Human Factors and
    Social Issues: Ethics of Data Mining,
    Intellectual Ownership, Privacy Models, Privacy
    Preservation Techniques, Risk Analysis, User
    Interfaces, Data and Result Visualization

19
Thank You!
  • Feedback is very welcome: questions, suggestions,
    collaboration
  • Book chapter draft is available on request from
    Mykola Pechenizkiy, Department of Computer
    Science and Information Systems, University of
    Jyväskylä, FINLAND
  • E-mail: mpechen_at_cs.jyu.fi
  • Tel. 358 14 2602472, Fax 358 14 260 3011
  • http://www.cs.jyu.fi/mpechen