Evaluation for Web Mining Applications (transcript of a PowerPoint presentation)
1
Evaluation for Web Mining Applications
  • Bettina Berendt
  • Humboldt University Berlin
  • Ernestina Menasalvas
  • Universidad Politécnica de Madrid
  • Myra Spiliopoulou
  • Otto von Guericke University Magdeburg
  • www.wiwi.hu-berlin.de/berendt/Evaluation

2
Evaluation
  • the act of ascertaining the value and the functioning of an object
    according to specified criteria, operationalised by measures

→ to assess concrete achievements
→ to give feedback towards improvement
3
Evaluation for Web mining applications, or
Evaluation of Web applications
Is this a good Website?
4
Agenda
  • Evaluation and Web mining
  • Mining for evaluation: perspectives and measures
  • A case study
  • Outlook: evaluation of mining, Web mining as a project, towards a
    methodology, evaluation and experimentation
5
What is Web Mining?
  • Despite its success, one problem of the current WWW is that much
    knowledge lies dormant in the data.
  • Web mining tries to overcome this problem by applying data mining
    techniques to the content, (hyperlink) structure, and usage of Web
    resources.

Web mining areas: Web content mining, Web structure mining, Web usage mining
6
Application problems and typical pattern discovery techniques
  • Prediction of next event: Markov chains
  • Discovery of associated events / application objects: sequence mining,
    association rules
  • Discovery of visitor groups with common properties and interests:
    clustering
  • Discovery of visitor groups with common behaviour: session clustering
  • Characterization of visitors with respect to a set of predefined
    classes (e.g., card fraud detection): classification
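The association-rule row above can be illustrated with a small self-contained sketch; the session data, site-area names, and the support/confidence thresholds are all invented for illustration:

```python
from itertools import combinations
from collections import Counter

# Hypothetical sessions: each is the set of site areas a visitor requested.
sessions = [
    {"home", "search", "product", "cart"},
    {"home", "search", "product"},
    {"home", "help"},
    {"home", "search", "cart"},
    {"home", "product", "cart"},
]

n = len(sessions)
item_counts = Counter()
pair_counts = Counter()
for sess in sessions:
    for item in sess:
        item_counts[item] += 1
    for pair in combinations(sorted(sess), 2):
        pair_counts[pair] += 1

# Association rules X -> Y over single items:
# support = P(X and Y), confidence = P(Y | X).
for (x, y), count in pair_counts.items():
    for lhs, rhs in ((x, y), (y, x)):
        support = count / n
        confidence = count / item_counts[lhs]
        if support >= 0.4 and confidence >= 0.7:
            print(f"{lhs} -> {rhs}  (s={support:.2f}, c={confidence:.2f})")
```

On this toy data the surviving rules all point to "home", which is exactly the kind of trivial pattern that motivates pattern evaluation later in the deck.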
7
Knowledge Discovery steps: the Cross-Industry Standard Process for Data
Mining (CRISP-DM)
8
Agenda
  • Evaluation and Web mining
  • Mining for evaluation: perspectives and measures
  • A case study
  • Outlook: evaluation of mining, Web mining as a project, towards a
    methodology, evaluation and experimentation
9
Application problems and goals (1)
  • Top-level goal 1: The Web exists in order to be used.
  • → Evaluation focuses on usage.
  • Goals of usage depend on stakeholder and viewpoint.

10
Application problems and goals (2)
  • Stakeholders:
  • Site users
  • Site owners / sponsors (technical, marketing, management, ...)
  • Viewpoints: a Web site / a collection of Web sites or pages as ...
  • ... a piece of software → usability?
  • ... a distribution channel for a business or organization →
    profitability? market analysis, recommendations for cross-selling, ...
  • ... a collection of documents → frequency of use / public perception?
    competition analysis
  • ... a medium for a given content and tasks (e.g., e-Learning) →
    cf. distribution channel
  • ... a Web of connections (e.g., a social network) → what properties
    does the network have?

11
Is the site a good site? → Is it successful?
But what does "success" mean?
  • Before talking of success:
  • Why does the site exist?
  • Why should someone visit it?
  • Why should someone return to it?
  • After answering these questions:
  • Does the site satisfy its owner?
  • Does the site satisfy its users?
  • ALL the users?

12
The object of evaluation: usability
  • The effectiveness, efficiency, and satisfaction with which specified
    users achieve specified goals in particular environments.
  • Effectiveness: the accuracy and completeness with which specified
    users can achieve specified goals in particular environments.
  • Efficiency: the resources expended in relation to the accuracy and
    completeness of goals achieved.
  • Satisfaction: the comfort and acceptability of the work system to its
    users and other people affected by its use.

13
The measures: examples of usability metrics
  • Suitability for the task: effectiveness = percentage of goals
    achieved; efficiency = time to complete a task; satisfaction =
    rating scale for satisfaction
  • Appropriate for trained users: effectiveness = number of "power
    features" used; efficiency = relative efficiency compared with an
    expert user; satisfaction = rating scale for satisfaction with
    "power features"
  • Learnability: effectiveness = percentage of functions learned;
    efficiency = time to learn criterion; satisfaction = rating scale
    for "ease of learning"
  • Error tolerance: effectiveness = percentage of errors corrected
    successfully; efficiency = time spent on correcting errors;
    satisfaction = rating scale for error handling
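Two of these metrics become mechanical once task trials are recorded. A minimal sketch, assuming hypothetical trial records whose field names are invented here:

```python
# Hypothetical task-trial records (all field names are assumptions).
trials = [
    {"user": "u1", "task": "find_product", "completed": True,  "seconds": 42},
    {"user": "u2", "task": "find_product", "completed": True,  "seconds": 65},
    {"user": "u3", "task": "find_product", "completed": False, "seconds": 120},
]

# Effectiveness: percentage of goals achieved.
effectiveness = 100 * sum(t["completed"] for t in trials) / len(trials)

# Efficiency: mean time to complete the task, over successful trials only.
done = [t["seconds"] for t in trials if t["completed"]]
efficiency = sum(done) / len(done)

print(f"goals achieved: {effectiveness:.0f}%")     # 67%
print(f"mean completion time: {efficiency:.1f}s")  # 53.5s
```

Satisfaction, by contrast, needs a rating scale and therefore reactive data collection, a point the deck returns to in its final table.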
14
Examples of usability measures derived from Web
mining
  • Berendt & Spiliopoulou (2000): sequential patterns
  • Search criteria (interface):
  • Selection-based: most popular (→ user satisfaction), but least
    efficient
  • Type-in: least popular, most efficient
  • Search criteria (content):
  • Location: most popular
  • Kralisch & Berendt (2004): quasi-experimental design, support,
    sequential patterns → search-criteria popularity is influenced by
    country / culture
  • Poblete & Baeza-Yates (2004): query clustering → identify the need
    for hyperlinks and new content
  • Stojanovic et al. (2002): popularity → identify need for new content
    concepts / concepts to be dropped (ontology evolution); crawler
    obtains content

15
  • Before talking of success:
  • Why does the site exist? → Business goals
  • Why should someone visit it? → Value creation
  • Why should someone return to it? → Sustainable value
  • After answering these questions:
  • Does the site satisfy its owner? → Application-centric measures
  • Does the site satisfy its users? → User-centric measures
  • ALL the users? → User types
16
The object of evaluation: satisfaction of business goals
  • 1. Sale of products/services on-line (selling, cross-/up-selling,
    personalisation, site design): Amazon sells books (etc.) online. The
    site should help the users find the most suitable books for their
    needs, identify more related products of interest and, finally,
    purchase them in a secure and intuitive way.
  • 2. Marketing for products/services to be acquired off-line:
    insurances, banks, application service providers, etc. Providers of
    services based on a long-term relationship with the customer do not
    sell on-line to unknown users. The site should demonstrate to the
    users the quality of the product/service and the trustworthiness of
    its owner, and initiate an off-line contact.
  • 3. Reduction of internal costs, information dissemination, ...
17
The measures, an example: e-marketing metrics based on the sales process
Customer-company interaction phases: information acquisition,
negotiation / transaction, after-sales support
  • Ratio of persons going from one phase to the next → positive and
    negative measures
  • Example: conversion rate = customers / contacted prospects
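The phase-transition ratios and the conversion-rate example above can be sketched as follows; the phase names and the person counts are invented:

```python
# Invented counts of persons reaching each sales-process phase.
phases = ["information", "negotiation", "transaction", "after_sales"]
persons = {
    "information": 1000,
    "negotiation": 250,
    "transaction": 80,
    "after_sales": 60,
}

# Ratio of persons going from one phase to the next.
for a, b in zip(phases, phases[1:]):
    print(f"{a} -> {b}: {persons[b] / persons[a]:.2f}")

# Conversion rate: customers / contacted prospects.
conversion_rate = persons["transaction"] / persons["information"]
print(f"conversion rate: {conversion_rate:.1%}")  # 8.0%
```

A low ratio at one transition localizes the problem to that phase, which is what makes these "positive and negative measures" in the slide's sense.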

18
Agenda
Evaluation and Web mining
Mining for evaluation perspectives and measures
A case study
Outlook Evaluation of mining
Web mining as a project towards a methodology
Evaluation and experimentation
19
Objectives of the application: the largest European full multi-channel
e-tailer, selling consumer electronics online and in >5000 shops
  • General objectives: standard e-tailer goals; attract users/shoppers
    and convert them into customers
  • Specific objectives: assess the success of the Web site in relation
    to other distribution channels
  • → Questions of the evaluation:
  • What business metrics can be calculated from Web usage data,
    transaction and demographic data for determining online success?
  • Are there cross-channel effects between a company's e-shop and its
    physical stores?

Background: Internet market shares (BCG, 2002)
Teltzrow & Berendt, Proc. WebKDD 2003
Günther, Proc. 4th IBM eBusiness Conference 2003
20
Outline of the KDD process
  • Business understanding: see previous slide
  • Data: >90K Web server sessions, >10K transaction records, 21 days
    in 2002
  • Data understanding, main step: modelling the semantics of the site
    in terms of a hierarchy of service concepts that follows the phases
    of the sales process
  • Data preparation: session IDs, usual data cleaning steps; linking of
    sessions and transaction information (anonymized)
  • Modelling / pattern discovery: Web metrics, cluster analysis,
    association rules, sequence mining; correlation analysis,
    questionnaire study, qualitative market analysis
  • Pattern evaluation: interesting patterns
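The data-preparation step (session IDs plus the usual cleaning) can be sketched as below; the log fields, the 30-minute inactivity timeout, and the resource-filtering rule are common conventions assumed here, not details reported for this case study:

```python
from datetime import datetime, timedelta

# Hypothetical parsed log records: (visitor_id, timestamp, url).
records = [
    ("v1", datetime(2002, 5, 1, 10, 0), "/home"),
    ("v1", datetime(2002, 5, 1, 10, 5), "/product/42"),
    ("v1", datetime(2002, 5, 1, 11, 30), "/home"),      # new session: >30 min gap
    ("v2", datetime(2002, 5, 1, 10, 1), "/style.css"),  # removed by cleaning
    ("v2", datetime(2002, 5, 1, 10, 2), "/cart"),
]

TIMEOUT = timedelta(minutes=30)

def sessionize(records):
    """Group page requests into sessions per visitor, splitting on a
    30-minute inactivity timeout; drop non-page resources (cleaning)."""
    cleaned = [r for r in records
               if not r[2].endswith((".css", ".js", ".gif"))]
    cleaned.sort(key=lambda r: (r[0], r[1]))
    sessions, last = [], {}
    for visitor, ts, url in cleaned:
        prev = last.get(visitor)
        if prev is None or ts - prev[1] > TIMEOUT:
            sessions.append({"visitor": visitor, "pages": []})
            last[visitor] = (len(sessions) - 1, ts)
        idx, _ = last[visitor]
        sessions[idx]["pages"].append(url)
        last[visitor] = (idx, ts)
    return sessions

print(sessionize(records))
```

Linking these sessions to (anonymized) transaction records would then happen on the visitor ID, as in the case study's preparation step.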

21
Starting point: Web life-cycle metrics, micro-conversion rates
(Cutler and Sterne, 2001)
  • W (whole population)
  • S (suspects / site visitors) vs. ¬S
  • P (prospects / active investigators) vs. ¬P
  • C (customers) vs. ¬C; Cb (abandoned cart)
  • CR (repeat customers), CA (attrited customers), C1 (one-time
    customers)
Metrics example: click-through rate = M2 / M1
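Micro-conversion rates along this funnel are stage-to-stage ratios; a minimal sketch in the spirit of Cutler and Sterne's life-cycle metrics, with invented counts:

```python
# Invented funnel counts, ordered from the widest population inward.
funnel = [
    ("W", 100000),  # whole population (e.g., ad impressions)
    ("S", 5000),    # suspects / site visitors
    ("P", 800),     # prospects / active investigators
    ("C", 120),     # customers
]

# Each micro-conversion rate is the ratio of one stage to the previous one.
for (a, na), (b, nb) in zip(funnel, funnel[1:]):
    print(f"{a} -> {b}: {nb / na:.2%}")

# The slide's click-through-rate example (M2 / M1) has the same form:
# one measured count divided by the count of the preceding stage.
```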
22
Extension for application-oriented success measurement: multi-channel
metrics
  • Payment channel: customers C split into WM5 (paid online) and SM5
    (paid in store); migration metrics distinguish WM5 customers who
    belong to SM5 in at least one following transaction from WM5
    customers who belong to WM5 in every following transaction.
  • Delivery channel: customers C split into WM6 (direct delivery) and
    SM6 (pick up in store); migration metrics distinguish WM6 customers
    who belong to SM6 in at least one following transaction from WM6
    customers who belong to WM6 in every following transaction.
23
Internal consistency of preferences: payment and delivery preferences
  • Online payment → direct delivery (s=0.27, c=0.97):
    less than 1/3 "traditional" online users!
  • Online payment → in-store pickup (s=0.02, c=0.03)
  • Cash on delivery → direct delivery (s=0.02, c=0.03)
  • In-store payment → in-store pickup (s=0.69, c=0.94)
  • → The site is primarily used to collect information.

s = support, c = confidence of the sequence
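Support and confidence of such sequence rules can be computed directly from per-customer event sequences. A minimal sketch with invented data (not the case-study values):

```python
# Invented per-customer event sequences.
sequences = [
    ["online_payment", "direct_delivery"],
    ["in_store_payment", "in_store_pickup"],
    ["online_payment", "direct_delivery"],
    ["in_store_payment", "in_store_pickup"],
    ["online_payment", "in_store_pickup"],
]

def rule_stats(sequences, a, b):
    """For the sequence rule a -> b:
    support = fraction of sequences containing a followed by b;
    confidence = that count divided by the sequences containing a."""
    with_a = [seq for seq in sequences if a in seq]
    with_ab = [seq for seq in with_a if b in seq[seq.index(a) + 1:]]
    return len(with_ab) / len(sequences), len(with_ab) / len(with_a)

s, c = rule_stats(sequences, "online_payment", "direct_delivery")
print(f"online_payment -> direct_delivery: s={s:.2f}, c={c:.2f}")
```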
24
Development of preferences over time
  • Direct delivery → in-store pickup in ≥1 following transaction
    (s=0.001, c=0.15)
  • Direct delivery → direct delivery in all following transactions
    (s=0.003, c=0.85)
  • In-store pickup → direct delivery in ≥1 following transaction
    (s=0.001, c=0.10) (*)
  • In-store pickup → in-store pickup in all following transactions
    (s=0.004, c=0.90)
  • Results for payment migration are similar.
  • → 90% of repeat customers did not change transaction preferences
    at all.
  • → Rule (*) as an indicator of the development of trust?!

25
Agenda
  • Evaluation and Web mining
  • Mining for evaluation: perspectives and measures
  • A case study
  • Outlook: evaluation of mining, Web mining as a project, towards a
    methodology, evaluation and experimentation
26
Evaluation of Web mining applications, or Web
mining as a project
Is it worthwhile to do the mining project?
Are the data appropriate for the mining project?
Is the result valuable for the application?
Are the techniques appropriate for the expected
results?
Are (all) the tasks performed well?
27
Evaluation, its foci, and design of evaluation studies
(modes: formative vs. summative)
  • Purpose: formative = understand how something works, analyze
    strengths and weaknesses towards improvement, give feedback;
    summative = assess concrete achievements, give results and evidence
  • Conceptualisation: formative = holistic, interdependent system;
    summative = independent and dependent variables
  • Design: formative = naturalistic inquiry; summative = experimental
    design
  • Relationship to prior knowledge: formative = exploratory, hypothesis
    generating (→ pattern discovery); summative = confirmatory,
    hypothesis testing
  • Sampling: formative = purposeful, key informants (→ in mining:
    interesting patterns); summative = random, probabilistic
  • Analysis: formative = case studies, content and pattern analysis;
    summative = descriptive and inferential statistics
28
End of Part I
Questions thus far?
29
Evaluation of Web mining applications, or Web
mining as a project
Is it worthwhile to do the mining project?
Is the result valuable for the application?
Are (all) the tasks performed well?
30
For which measures are field data from Web server logs (in)adequate
data sources?
  • Suitability for the task (percentage of goals achieved, time to
    complete a task, rating scale for satisfaction): the logs do not
    record the user's task / intentions → assumptions can be made if
    there is background knowledge about site and users.
  • Appropriate for trained users (number of "power features" used,
    relative efficiency compared with an expert user, rating scale for
    satisfaction with "power features"): the logs do not record the
    user's level of expertise → requires (1) target-group-specific
    logins, (2) induction from requested content, or (3) other methods,
    usually involving reactive data collection.
  • Learnability (percentage of functions learned, time to learn
    criterion, rating scale for "ease of learning"): requires
    definitions of what there is to learn and measures of what the
    users learned → usually requires methods involving reactive data
    collection.
  • Error tolerance (percentage of errors corrected successfully, time
    spent on correcting errors, rating scale for error handling):
    requires a definition of what an error is, or what indicates an
    error → usually requires detailed knowledge of users' tasks and
    intentions, i.e. reactive data collection.
Internationalization, Accessibility, Personalization