Title: Evaluation for Web Mining Applications
1. Evaluation for Web Mining Applications
- Bettina Berendt, Humboldt University Berlin
- Ernestina Menasalvas, Universidad Politécnica de Madrid
- Myra Spiliopoulou, Otto von Guericke University Magdeburg
- www.wiwi.hu-berlin.de/berendt/Evaluation
2. Evaluation
- The act of ascertaining the value and the functioning of an object according to specified criteria, operationalised by measures.
- → to assess concrete achievements
- → to give feedback towards improvement
3. Evaluation for Web mining applications, or evaluation of Web applications
- Is this a good Website?
4. Agenda
- Evaluation and Web mining
- Mining for evaluation perspectives and measures
- A case study
- Outlook: Evaluation of mining
  - Web mining as a project: towards a methodology
  - Evaluation and experimentation
5. What is Web Mining?
- Despite its success, one problem of the current WWW is that much of the knowledge it contains lies dormant in the data.
- Web mining tries to overcome this problem by applying data mining techniques to the content, (hyperlink) structure, and usage of Web resources.
- Web mining areas: Web content mining, Web structure mining, Web usage mining
6. Application problems and typical pattern discovery techniques

| Application problem | Typical pattern discovery technique |
| --- | --- |
| Prediction of next event | Markov chains, sequence mining |
| Discovery of associated events / application objects | Association rules |
| Discovery of visitor groups with common properties / interests | Clustering |
| Discovery of visitor groups with common behaviour | Session clustering |
| Characterization of visitors with respect to a set of predefined classes (e.g., card fraud detection) | Classification |
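The first row can be made concrete with a few lines of code. Below is a minimal sketch of next-event prediction via a first-order Markov chain; the sessions and page names are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream sessions (page names are invented for illustration).
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "product", "cart"],
    ["home", "search", "product", "exit"],
]

# Count first-order transitions: page -> next page, across all sessions.
transitions = defaultdict(Counter)
for session in sessions:
    for current, nxt in zip(session, session[1:]):
        transitions[current][nxt] += 1

def predict_next(page):
    """Return the most likely next page and its estimated probability."""
    counts = transitions[page]
    if not counts:
        return None, 0.0
    best, freq = counts.most_common(1)[0]
    return best, freq / sum(counts.values())

print(predict_next("search"))   # ('product', 1.0)
print(predict_next("product"))  # ('cart', 0.5)
```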
7. Knowledge discovery steps: the Cross-Industry Standard Process for Data Mining (CRISP-DM)
8. Agenda
- Evaluation and Web mining
- Mining for evaluation perspectives and measures
- A case study
- Outlook: Evaluation of mining
  - Web mining as a project: towards a methodology
  - Evaluation and experimentation
9. Application problems and goals (1)
- Top-level goal 1: The Web exists in order to be used.
- → Evaluation focuses on usage.
- Goals of usage depend on stakeholder and viewpoint.
10. Application problems and goals (2)
- Stakeholders:
  - Site users
  - Site owners / sponsors (technical, marketing, management, ...)
- Viewpoints: a Web site / a collection of Web sites or pages as ...
  - ... a piece of software → usability?
  - ... a distribution channel for a business or organization → profitability? market analysis, recommendations for cross-selling, ...
  - ... a collection of documents → frequency of use / public perception? competition analysis
  - ... a medium for a given content and tasks (e.g., e-Learning) → cf. distribution channel
  - ... a web of connections (e.g., a social network) → what properties does the network have?
11. Is the site a good site? → Is it successful? But what does success mean?
- Before talking of success:
  - Why does the site exist?
  - Why should someone visit it?
  - Why should someone return to it?
- After answering these questions:
  - Does the site satisfy its owner?
  - Does the site satisfy its users?
  - ALL the users?
12. The object of evaluation: usability
- Usability: The effectiveness, efficiency, and satisfaction with which specified users achieve specified goals in particular environments.
- Effectiveness: The accuracy and completeness with which specified users can achieve specified goals in particular environments.
- Efficiency: The resources expended in relation to the accuracy and completeness of goals achieved.
- Satisfaction: The comfort and acceptability of the work system to its users and other people affected by its use.
13. The measures: examples of usability metrics

| Usability Objective | Effectiveness Measures | Efficiency Measures | Satisfaction Measures |
| --- | --- | --- | --- |
| Suitability for the Task | Percentage of goals achieved | Time to complete a task | Rating scale for satisfaction |
| Appropriate for trained users | Number of "power features" used | Relative efficiency compared with an expert user | Rating scale for satisfaction with "power features" |
| Learnability | Percentage of functions learned | Time to learn criterion | Rating scale for "ease of learning" |
| Error Tolerance | Percentage of errors corrected successfully | Time spent on correcting errors | Rating scale for error handling |
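As a toy illustration of how such metrics are computed, the following sketch derives one effectiveness, efficiency, and satisfaction value from hypothetical task observations (all records and numbers are invented).

```python
# Hypothetical task observations: (goal achieved?, seconds, satisfaction 1-5).
tasks = [
    (True, 42.0, 4), (True, 55.0, 5), (False, 120.0, 2), (True, 38.0, 4),
]

achieved = [t for t in tasks if t[0]]
effectiveness = len(achieved) / len(tasks)                 # share of goals achieved
efficiency = sum(t[1] for t in achieved) / len(achieved)   # mean time on successful tasks
satisfaction = sum(t[2] for t in tasks) / len(tasks)       # mean rating

print(f"effectiveness={effectiveness:.0%}, "
      f"efficiency={efficiency:.1f}s, satisfaction={satisfaction:.2f}/5")
```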
14. Examples of usability measures derived from Web mining
- Berendt & Spiliopoulou (2000): sequential patterns
  - Search criteria (interface): selection-based is most popular (→ user satisfaction), but least efficient; type-in is least popular, most efficient.
  - Search criteria (content): location is most popular.
- Kralisch & Berendt (2004): quasi-experimental design, support, sequential patterns → search criteria popularity is influenced by country / culture
- Poblete & Baeza-Yates (2004): query clustering → identify the need for hyperlinks and new content
- Stojanovic et al. (2002): popularity → identify need for new content / concepts, and concepts to be dropped (ontology evolution); crawler obtains content
15. What does success mean? (continued)
- Before talking of success:
  - Why does the site exist? → Business goals
  - Why should someone visit it? → Value creation
  - Why should someone return to it? → Sustainable value
- After answering these questions:
  - Does the site satisfy its owner? → Application-centric measures
  - Does the site satisfy its users? → User-centric measures
  - ALL the users? → User types
16. The object of evaluation: satisfaction of business goals
- 1. Sale of products/services online (personalisation, cross-/up-selling, site design, selling): Amazon sells books (etc.) online. The site should help the users find the most suitable books for their needs, identify more related products of interest and, finally, purchase them in a secure and intuitive way.
- 2. Marketing for products/services to be acquired offline: insurance companies, banks, application service providers, etc. Providers of services based on a long-term relationship with the customer do not sell online to unknown users. The site should demonstrate to the users the quality of the product/service and the trustworthiness of its owner, and initiate an offline contact.
- 3. Reduction of internal costs, information dissemination, ...
17. The measures: example e-marketing metrics based on the sales process
- Customer-company interaction phases: Information → Acquisition → Negotiation → Transaction → After-sales support
- Ratio of persons going from one phase to the next → positive and negative measures
- Example: conversion rate = customers / contacted prospects (see the sketch below)
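A minimal sketch of these ratios with invented phase counts (the phase names and numbers are assumptions, not data from the case study):

```python
# Hypothetical visitor counts per interaction phase (invented numbers).
phases = ["information", "acquisition", "negotiation", "transaction", "after-sales"]
visitors = {"information": 10000, "acquisition": 3200,
            "negotiation": 900, "transaction": 310, "after-sales": 120}

# Phase-to-phase ratios: how many people move on to the next phase.
for prev, nxt in zip(phases, phases[1:]):
    print(f"{prev} -> {nxt}: {visitors[nxt] / visitors[prev]:.1%}")

# Conversion rate = customers / contacted prospects.
print(f"overall conversion: {visitors['transaction'] / visitors['information']:.2%}")
```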
18. Agenda
- Evaluation and Web mining
- Mining for evaluation perspectives and measures
- A case study
- Outlook: Evaluation of mining
  - Web mining as a project: towards a methodology
  - Evaluation and experimentation
19. Objectives of the application: the largest European full multi-channel e-tailer, selling consumer electronics online and in >5000 shops
- General objectives: standard e-tailer goals, i.e., attract users/shoppers and convert them into customers
- Specific objectives: assess the success of the Web site in relation to other distribution channels
- → Questions of the evaluation:
  - What business metrics can be calculated from Web usage data, transaction and demographic data for determining online success?
  - Are there cross-channel effects between a company's e-shop and its physical stores?
- Background: Internet market shares (BCG 2002)
- Teltzrow & Berendt, Proc. WebKDD 2003; Günther, Proc. 4th IBM eBusiness Conference 2003
20. Outline of the KDD process
- Business understanding: see previous slide
- Data: >90K Web server sessions, >10K transaction records (21 days in 2002)
- Data understanding (main step): modelling the semantics of the site in terms of a hierarchy of service concepts that follows the phases of the sales process
- Data preparation: session IDs, usual data cleaning steps; linking of sessions and transaction information (anonymized); see the sessionization sketch below
- Modelling / pattern discovery: Web metrics, cluster analysis, association rules, sequence mining; correlation analysis, questionnaire study, qualitative market analysis
- Pattern evaluation: interesting patterns
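The data-preparation step can be illustrated with a minimal sessionization sketch: requests are grouped into sessions with a 30-minute inactivity timeout. The log format and entries are hypothetical; real preparation also involves cleaning and transaction linking.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)

# Illustrative pre-parsed log entries: (visitor_id, timestamp, url).
log = [
    ("v1", datetime(2002, 5, 1, 10, 0), "/home"),
    ("v1", datetime(2002, 5, 1, 10, 5), "/tv/overview"),
    ("v1", datetime(2002, 5, 1, 11, 30), "/home"),   # >30 min gap: new session
    ("v2", datetime(2002, 5, 1, 10, 2), "/checkout"),
]

sessions, last_seen, current = [], {}, {}
for visitor, ts, url in sorted(log, key=lambda e: (e[0], e[1])):
    # Start a new session on first sight of a visitor or after a timeout.
    if visitor not in current or ts - last_seen[visitor] > TIMEOUT:
        current[visitor] = []
        sessions.append(current[visitor])
    current[visitor].append(url)
    last_seen[visitor] = ts

print(sessions)  # [['/home', '/tv/overview'], ['/home'], ['/checkout']]
```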
21. Starting point: Web life-cycle metrics, micro-conversion rates (Cutler and Sterne, 2001)
- Funnel: W (whole population) → S (suspects / site visitors) → P (prospects / active investigators) → C (customers); at each stage the remainder drops out (non-suspects, non-prospects, non-customers; Cb: abandoned cart).
- C splits into C1 (one-time customers), CR (repeat customers), and CA (attrited customers).
- Metrics example: click-through rate = M2 / M1
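A minimal sketch of such funnel ratios with invented counts (M2/M1 refers to Cutler and Sterne's numbered metrics; the values and ratio names below are assumptions):

```python
# Hypothetical funnel counts (invented numbers).
funnel = {"S": 90000, "P": 12000, "C": 1500}  # suspects, prospects, customers

look_to_click = funnel["P"] / funnel["S"]  # visitors who become active investigators
click_to_buy = funnel["C"] / funnel["P"]   # prospects who become customers
conversion = funnel["C"] / funnel["S"]     # visitors who become customers

print(f"look-to-click: {look_to_click:.1%}, "
      f"click-to-buy: {click_to_buy:.1%}, conversion: {conversion:.2%}")
```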
22. Extension for application-oriented success measurement: multi-channel metrics
- Payment channel (M5): customers C split into WM5 (paid online) and SM5 (paid in store). Follow-up metrics: WM5 customers who belong to WM5 in every following transaction vs. those who belong to SM5 in at least one following transaction (and analogously for SM5 customers).
- Delivery channel (M6): customers C split into WM6 (direct delivery) and SM6 (pick up in store), with the analogous follow-up metrics.
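A minimal sketch of such a channel-migration metric, assuming per-customer transaction histories with a channel flag ('W' = online, 'S' = store); the histories are invented:

```python
# Hypothetical per-customer transaction histories (invented for illustration).
histories = {
    "c1": ["W", "W", "W"],   # stays online in every following transaction
    "c2": ["W", "S"],        # migrates to the store at least once
    "c3": ["S", "S"],
    "c4": ["W", "W", "S"],
}

# Customers whose first transaction was online (WM5-style group).
first_online = {c: h for c, h in histories.items() if h[0] == "W"}
stayed = sum(all(ch == "W" for ch in h[1:]) for h in first_online.values())
migrated = len(first_online) - stayed

print(f"started online: {len(first_online)}, "
      f"stayed online: {stayed}, migrated to store: {migrated}")
```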
23. Internal consistency of preferences: payment and delivery preferences
- Online payment → direct delivery (s=0.27, c=0.97): < 1/3 traditional online users!
- Online payment → in-store pickup (s=0.02, c=0.03)
- Cash on delivery → direct delivery (s=0.02, c=0.03)
- In-store payment → in-store pickup (s=0.69, c=0.94)
- → The site is primarily used to collect information.
(s = support, c = confidence of the sequence; the sketch below shows how such values are computed)
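How such support and confidence values are obtained can be sketched as follows; the transaction records here are invented, whereas the real analysis ran over the linked session/transaction data:

```python
# Hypothetical transactions: (payment method, delivery method).
transactions = [
    ("online", "direct"), ("online", "direct"), ("online", "pickup"),
    ("in-store", "pickup"), ("in-store", "pickup"), ("cash-on-delivery", "direct"),
]

def support_confidence(payment, delivery):
    """Support and confidence of the rule: payment -> delivery."""
    matching = sum(1 for p, d in transactions if p == payment and d == delivery)
    antecedent = sum(1 for p, _ in transactions if p == payment)
    support = matching / len(transactions)
    confidence = matching / antecedent if antecedent else 0.0
    return support, confidence

s, c = support_confidence("online", "direct")
print(f"online payment -> direct delivery: s={s:.2f}, c={c:.2f}")  # s=0.33, c=0.67
```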
24. Development of preferences over time
- Direct delivery → in-store pickup in ≥1 following transaction (s=0.001, c=0.15)
- Direct delivery → direct delivery in all following transactions (s=0.003, c=0.85)
- In-store pickup → direct delivery in ≥1 following transaction (s=0.001, c=0.10) (*)
- In-store pickup → in-store pickup in all following transactions (s=0.004, c=0.90)
- Results for payment migration are similar.
- → 90% of repeat customers did not change transaction preferences at all.
- → Rule (*) as an indicator of the development of trust?!
25. Agenda
- Evaluation and Web mining
- Mining for evaluation perspectives and measures
- A case study
- Outlook: Evaluation of mining
  - Web mining as a project: towards a methodology
  - Evaluation and experimentation
26. Evaluation of Web mining applications, or Web mining as a project
- Is it worthwhile to do the mining project?
- Are the data appropriate for the mining project?
- Is the result valuable for the application?
- Are the techniques appropriate for the expected results?
- Are (all) the tasks performed well?
27. Evaluation, its foci, and design of evaluation studies

| Mode | Formative | Summative |
| --- | --- | --- |
| Purpose | Understand how something works; analyze strengths and weaknesses towards improvement; give feedback | Assess concrete achievements; give results and evidence |
| Conceptualisation | Holistic, interdependent system | Independent and dependent variables |
| Design | Naturalistic inquiry | Experimental design |
| Relationship to prior knowledge | Exploratory, hypothesis-generating (→ pattern discovery) | Confirmatory, hypothesis-testing |
| Sampling | Purposeful, key informants (→ in mining: interesting patterns) | Random, probabilistic |
| Analysis | Case studies, content and pattern analysis | Descriptive and inferential statistics |
28. End of Part I
Questions thus far?
29. Evaluation of Web mining applications, or Web mining as a project
- Is it worthwhile to do the mining project?
- Is the result valuable for the application?
- Are (all) the tasks performed well?
30. For which measures are field data from Web server logs (in)adequate data sources?

| Usability Objective | Effectiveness Measures | Efficiency Measures | Satisfaction Measures | What server logs lack |
| --- | --- | --- | --- | --- |
| Suitability for the Task | Percentage of goals achieved | Time to complete a task | Rating scale for satisfaction | The user's task / intentions → assumptions can be made if there is background knowledge about site and users |
| Appropriate for trained users | Number of "power features" used | Relative efficiency compared with an expert user | Rating scale for satisfaction with "power features" | The user's level of expertise → requires (1) target-group-specific logins, (2) induction from requested content, or (3) other methods, usually involving reactive data collection |
| Learnability | Percentage of functions learned | Time to learn criterion | Rating scale for "ease of learning" | Definitions of what there is to learn and measures of what the users learned → usually requires methods involving reactive data collection |
| Error Tolerance | Percentage of errors corrected successfully | Time spent on correcting errors | Rating scale for error handling | A definition of what an error is, or what indicates an error → usually requires detailed knowledge of users' tasks and intentions, i.e., reactive data collection |

Further objectives: internationalization, accessibility, personalization.