Hotel Demand - PowerPoint PPT Presentation

About This Presentation
Title:

Hotel Demand

Description:

Estimating Demand in the Hotel Industry by Mining User-Generated and Crowdsourced Content. Anindya Ghose (with P. Ipeirotis and B. Li), Stern School of Business


Transcript and Presenter's Notes

Title: Hotel Demand


1
Estimating Demand in the Hotel Industry by Mining
User-Generated and Crowdsourced Content
Anindya Ghose (with P. Ipeirotis and B. Li)
Stern School of Business, New York University
2
Before We Start
  • How can I find a 5-star hotel in Miami, near
    the interstate highway, with easy access to
    the beach, in an area with lots of nightlife,
    that also offers a great price for what it
    provides?

3
Customer Search in Travel Search Engines
  • Rudimentary ranking facilities using a single
    criterion, e.g., name, price per night, class,
    or customer reviews.

These largely ignore the multidimensional
preferences of consumers and the location and
service characteristics of the hotels.
4
Introduction
Customers try to identify hotels with particular
characteristics, e.g., location and service.
  • Location: Near the Beach, Near Downtown
  • Service: Free internet access, 24-hour fitness
    center
  • Location and service influence desirability
    and, in turn, demand.
No empirical studies have focused on location,
service, and hotel demand. Which characteristics,
and how to get the data?
5
Research Agenda
Problem: Locate the hotels that satisfy specific
criteria and offer the best value for the
money.
Challenge: Need to quantify the economic weight
of the location- and service-based
characteristics of hotels.
Method: Combine structural modeling of demand
estimation with text mining of user-generated
content, on-demand annotations using
crowd-sourcing, and image classification to
identify and measure hotel characteristics.
6
New Ranking Approach for Hotels
  • Consumers would ideally like the best product
    shown first on the screen
  • Best product = highest value for money
  • Consumers gain utility from product
    characteristics (WTP)
  • Consumers lose utility by paying for the product
    (Price)
  • Value for money = difference of the two
  • Transaction data from travel search engines
  • Compute consumer surplus for each hotel using
    location and service characteristics minus price
  • Rank according to value for money
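
The ranking idea above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Hotel` class, the dollar-valued `wtp` field, and the three example hotels are all hypothetical, and in the presentation the willingness to pay would come from the estimated demand model rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class Hotel:
    name: str
    wtp: float    # hypothetical willingness to pay for the hotel's characteristics
    price: float  # price per night


def value_for_money(hotel: Hotel) -> float:
    """Utility gained from characteristics minus utility lost by paying."""
    return hotel.wtp - hotel.price


def rank_by_value(hotels):
    """Best value for money first."""
    return sorted(hotels, key=value_for_money, reverse=True)


# Made-up numbers, purely for illustration.
hotels = [
    Hotel("A", wtp=220.0, price=180.0),  # surplus 40
    Hotel("B", wtp=150.0, price=90.0),   # surplus 60
    Hotel("C", wtp=300.0, price=310.0),  # surplus -10
]
ranked_names = [h.name for h in rank_by_value(hotels)]
print(ranked_names)
```

Note that the most expensive hotel is not necessarily ranked last because of its price alone; it is ranked by the gap between what it offers and what it costs.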

7
Main Data: Travelocity hotel reservations
  • Our technique is validated on a unique panel
    dataset of 1,500 different hotels located in
    the United States over 3 months
    (2008/11 - 2009/02).
  • We supplemented this dataset with data from
    Microsoft, Tripadvisor, Geonames, Amazon and
    Google.

8
Identification of Hotel Characteristics
  • An online anonymous survey
  • 100 users on Amazon Mechanical Turk
    (AMT/MTurk)
  • What characteristics do you consider to be
    the most important when you choose a hotel?

9
Identification of Hotel Characteristics
Location-based hotel characteristics
  • Near the Beach
  • Near the Lake/River
  • Near Public Transportation
  • Near Downtown
  • Near Interstate Highway
  • Number of External Amenities (i.e., near
    Restaurants/Shops/Bars/Markets)
  • Safe Neighborhood
  • Number of local competitors
  • Convention center, airport, etc

10
Identification of Hotel Characteristics
Service-based hotel characteristics
  • Hotel Class
  • Number of Internal Amenities (aggregation of
    23 hotel internal amenities, e.g., free
    breakfast, business center, high-speed
    internet, swimming pool, parking, etc.)
  • Customer Review (Count, Valence, Text)
  • Text mining of hotel reviews on both
    Travelocity and Tripadvisor

The service-based hotel characteristics data were
crawled from www.Tripadvisor.com.
11
Acquiring Location-based Characteristics
However, not all location-based characteristics
can be derived in the same way, e.g., Near the
Beach vs. Near Restaurants.
12
Acquiring Location-based Characteristics
(1) Commercial characteristics, e.g., Near
Restaurants/Shops/Bars/Markets, are computed via
local search queries using the Virtual Earth
Interactive SDK, a new generation of interactive
online mapping services that provides both a main
mapping site and a JavaScript API.
13
Acquiring Location-based Characteristics
  • (2) Geographical characteristics with rich
    textural information are derived by image
    classification with Gabor feature extraction.
  • 256 × 256 pixels → 49 overlapping regions

SVM Classification Accuracy 0.912
SVM Classification Accuracy 0.807
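
One tiling that yields exactly 49 overlapping regions from a 256 × 256 image is a 7 × 7 grid of 64-pixel windows with a 32-pixel stride. The slide does not state the window size or stride, so both values are assumptions here, as are the Gabor parameters (`sigma`, `theta`, `wavelength`, `gamma`). The sketch only shows how regions and one Gabor kernel could be built; it does not reproduce the authors' feature pipeline or the SVM step.

```python
import math


def overlapping_regions(image_size=256, window=64, stride=32):
    """Top-left corner and size of each square window (assumed 50% overlap)."""
    starts = range(0, image_size - window + 1, stride)
    return [(row, col, window) for row in starts for col in starts]


def gabor_kernel(size=9, sigma=2.0, theta=0.0, wavelength=4.0, gamma=0.5):
    """Real part of a 2-D Gabor filter, returned as a size x size nested list."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


regions = overlapping_regions()
kernel = gabor_kernel()
print(len(regions), "regions; kernel is", len(kernel), "x", len(kernel[0]))
```

In a full pipeline, each region would be convolved with a bank of such kernels at several orientations and scales, and summary statistics of the responses would form the feature vector passed to the classifier.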
14
Acquiring Location-based Characteristics
  • (3) Geographical characteristics too hard even
    for image classification algorithms are
    classified using on-demand human annotation
    through an AMT survey.
  • 4 different zoom levels for each location,
    shown to 5 Turkers
  • Public Transportation, Lake/River, Highway
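
The slide does not say how the five answers per location are combined. A common choice, assumed here purely for illustration, is a majority vote; the function name and the example votes are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Most common label among annotators, or None if the top two tie."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no clear majority
    return counts[0][0]


# Hypothetical answers from 5 Turkers to "Is this hotel near a highway?"
votes_near_highway = [True, True, False, True, False]
near_highway = majority_label(votes_near_highway)
print(near_highway)
```

With an odd number of annotators and a yes/no question, a tie cannot occur, which is one practical reason to ask 5 workers rather than 4.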

(4) Characteristics related to neighborhood
safety are acquired from the FBI online
statistics (http://www.FBI.com).
  • City Annual Crime Rate over last 6 years

15
(No Transcript)
16
Summary Statistics
17
Acquiring Hotel Characteristics
Goal: Locate the hotel that meets specific criteria
and offers the best value for the money.
Estimate the economic value of those
characteristics.
  • What characteristics
  • Collection of the data
18
Framework of Structural Model
  • First, the consumer finds a subset of hotels
    that matches her own travel category.
  • Each hotel belongs to one of the following
    travel categories: Family Trip, Business
    Trip, Romantic Trip, Tourists Trip, Trip with
    Kids, Trip with Seniors, Pets Friendly and
    Disabilities Friendly.
  • To capture heterogeneity in consumers' travel
    categories, we introduce an idiosyncratic
    taste shock similar in flavor to the BLP (1995)
    model.
  • Second, once the consumer has picked a specific
    travel category, she makes a decision based
    on her evaluation of the quality of the hotels.
  • A pure characteristics model (Berry and Pakes
    2007) captures the differentiation among hotels
    within the same category.
  • Summary: combine BLP (1995) and Berry and Pakes
    (2007).

19
Structural Modeling
We propose a two-step random-coefficient-based
structural model in the following form
(equation not transcribed), where:
  • jk represents hotel j with category type k
    (1 ≤ k ≤ 7)
  • β and α are random coefficients that capture
    consumers' heterogeneous tastes towards
    different observed hotel characteristics, X, and
    towards price per night, P.
  • ξ represents the set of hotel characteristics
    that are unobservable to the econometrician.
  • ε with a superscript k represents a travel
    category level taste shock with a Type-I EV
    distribution.
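
The utility equation itself did not survive in this transcript. One form that is consistent with the symbol definitions above and with standard random-coefficient demand notation is sketched below. The consumer index i, the exact subscripting, and the minus sign on price are assumptions, not the authors' verbatim specification.

```latex
u_{ij^k} = X_{j^k}\,\beta_i \;-\; \alpha_i\,P_{j^k} \;+\; \xi_{j^k} \;+\; \varepsilon_i^{k}
```

Read this way, consumer i's utility from hotel j of category k is the value of its observed characteristics, minus the disutility of price, plus the unobserved quality term and the category-level taste shock.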

20
Estimation
  • Step 1: Calculating market share.
  • Step 2: Solving mean utility.
  • Solution is based on the contraction mapping
    technique.
  • Step 3: Solving variance of β and α.
  • Instrumental variables (IV) for price: average
    price of the same-star-rating hotels in the
    same market/other markets (Hausman 1994).
  • Form a GMM objective function using moment
    conditions.
  • Minimize the GMM objective function.

21
Estimation
Step 1: Calculating market share.
  • Market share for a hotel within a travel
    category type (PCM-based model)
  • Market share for each travel category type as a
    whole (BLP-based model)
  • Final market share for a hotel with a travel
    category type.
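
The three shares in Step 1 can be combined as a product: the final share of a hotel is its within-category share times the share of its category. The sketch below assumes a plain logit form for the category shares with an outside option normalized to utility 0, and it takes the within-category shares as given inputs rather than deriving them from a pure characteristics model. All names and numbers are hypothetical.

```python
import math


def category_shares(category_utilities):
    """Logit shares across travel categories (outside option has utility 0)."""
    denom = 1.0 + sum(math.exp(v) for v in category_utilities.values())
    return {k: math.exp(v) / denom for k, v in category_utilities.items()}


def final_shares(within_shares, cat_shares):
    """Final share = category share x within-category share."""
    return {
        (k, j): cat_shares[k] * s
        for k, hotels in within_shares.items()
        for j, s in hotels.items()
    }


# Made-up inputs: two categories with equal mean utility.
category_utilities = {"business": 0.0, "family": 0.0}
within_shares = {
    "business": {"H1": 0.75, "H2": 0.25},
    "family": {"H3": 1.0},
}
cat = category_shares(category_utilities)
shares = final_shares(within_shares, cat)
print(shares)
```

Because of the outside option, the final shares sum to less than one; the remainder is the share of consumers who book none of the listed hotels.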

22
Estimation
Within-category Market share
Market share for a particular category
23
Estimation
Step 2: Solving mean utility.
  • Solve for the mean utility such that the
    model-predicted market share equals the
    observed market share.
  • Solution is based on the contraction mapping
    technique.
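
A minimal sketch of the contraction mapping idea follows. It uses the well-known BLP-style update, delta_new = delta + ln(observed share) - ln(predicted share), with a plain logit share function and an outside good of utility 0. The share function in the presentation's hybrid model is more involved, so this illustrates the technique under simplifying assumptions rather than the authors' estimator.

```python
import math


def logit_shares(delta):
    """Plain logit shares; the outside good has mean utility 0."""
    denom = 1.0 + sum(math.exp(d) for d in delta)
    return [math.exp(d) / denom for d in delta]


def solve_mean_utility(observed, tol=1e-12, max_iter=1000):
    """Iterate delta <- delta + ln(s_obs) - ln(s_pred) until it stops moving."""
    delta = [0.0] * len(observed)
    for _ in range(max_iter):
        predicted = logit_shares(delta)
        new_delta = [
            d + math.log(s) - math.log(p)
            for d, s, p in zip(delta, observed, predicted)
        ]
        if max(abs(n - d) for n, d in zip(new_delta, delta)) < tol:
            return new_delta
        delta = new_delta
    raise RuntimeError("contraction mapping did not converge")


# Hypothetical observed shares; the outside good takes the remaining 0.4.
observed_shares = [0.2, 0.3, 0.1]
delta_hat = solve_mean_utility(observed_shares)
print(delta_hat)
```

For the plain logit case the answer is also available in closed form, ln(s_j) - ln(s_0), which makes this toy version easy to check; the iterative form matters once random coefficients make the shares impossible to invert analytically.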

Step 3: Solving variance of β and α.
  • Instrumental variables (IV) for price: average
    price of the same-star-rating hotels in the
    same market.
  • Form a GMM objective function using moment
    conditions.
  • Iterate steps 1, 2 and 3 to minimize the GMM
    objective function.
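
The GMM objective in Step 3 can be illustrated as a quadratic form in the sample moments between instruments and the unobserved term. The sketch assumes an identity weighting matrix by default and uses made-up instruments (a constant and an average same-star price) and made-up residuals; the function name and data are hypothetical.

```python
def gmm_objective(xi, Z, W=None):
    """Q = g' W g, where g = Z' xi / n are the sample moment conditions."""
    n = len(xi)
    m = len(Z[0])
    g = [sum(Z[i][l] * xi[i] for i in range(n)) / n for l in range(m)]
    if W is None:
        # Identity weighting matrix (an assumption made for simplicity).
        W = [[1.0 if a == b else 0.0 for b in range(m)] for a in range(m)]
    return sum(g[a] * W[a][b] * g[b] for a in range(m) for b in range(m))


# Hypothetical instruments: a constant and the average same-star price.
Z = [[1.0, 100.0], [1.0, 120.0], [1.0, 140.0]]

xi_orthogonal = [1.0, -2.0, 1.0]   # uncorrelated with both instruments
xi_correlated = [-1.0, 0.0, 1.0]   # correlated with the price instrument

q_orthogonal = gmm_objective(xi_orthogonal, Z)
q_correlated = gmm_objective(xi_correlated, Z)
print(q_orthogonal, q_correlated)
```

The estimation loop searches over the nonlinear parameters, recomputing shares and mean utilities each time, until this objective is as close to zero as possible.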

24
Identification (BLP (1995) and PCM (2007) models)
  • (i) Monotonicity: s_j is weakly increasing and
    continuous in ξ_j and weakly decreasing in
    ξ_-j, where ξ_-j denotes the unobserved
    characteristics of the rival products.
  • (ii) Linearity of utility in ξ: if ξ for every
    good increases by an equal amount, then no
    market share changes.
  • (iii) Substitutes with some other good: every
    product must be a strict substitute with some
    other good.

25
Economic Value of Characteristics
Samples: I. At least 1 review from either TA or
TL. II. Reviews > 5. III. Reviews > 10.
26
Hotel Characteristic Impact
  • Positive Impact
      • Beach
      • Interstate Highway
      • Downtown
      • Public Transportation
      • Hotel Class
      • Hotel External Amenities
      • Hotel Internal Amenities
  • Negative Impact
      • Price
      • Annual crime rate
      • Number of competitors
      • Lake
      • Spelling errors
      • Syllables
      • Complexity
      • Subjectivity


27
Marginal Effects
28
Marginal Effects
29
Robustness Checks
  • Sample consisting of those hotels that have at
    least one review from either Travelocity or
    TripAdvisor.
  • Estimations after extracting individual service
    features from the text of reviews.
  • Estimations with hotel brand, convention center,
    distance from airport, etc.
  • Estimations with Google Trends data to control
    for endogeneity of WoM and sales.
  • Estimations with the BLP (1995) and PCM (2007)
    models.
  • Estimations across only those cities where all
    location features are present.

30
Robustness Test (I) - Using Alternative Sample Split
Samples: IV. At least 1 review from TA. V. At least 1
review from TL. VI. At least 1 review from both.
31
Robustness Test (II) - Using an Alternative Model - BLP
32
Text mining method to extract and score service
features
  • Use a POS (part-of-speech) tagger to identify
    frequently mentioned nouns and noun phrases,
    which we consider candidate hotel features.
  • Cluster them, using WordNet and a
    context-sensitive hierarchical agglomerative
    clustering algorithm (Manning and Schutze 1999),
    into sets of similar nouns and noun phrases.
  • We keep the top-5 features since they covered
    80% of the hotels in our data.
  • Hotel staff, food quality, bathroom, parking
    facilities, and bed quality.
  • Extract all the adjectives and adverbs that are
    being used to evaluate the individual features.
  • Used AMT to create the ontology with scores for
    each evaluation phrase (Ghose et al. 2008).
  • AMT workers look at the pair of the evaluation
    phrase together with the product feature, and
    assign a grade from -3 (strongly negative) to 3
    (strongly positive) to the evaluation.
  • Dropped the highest and lowest evaluation
    scores, and used the average of the remaining
    evaluations as the externally imposed score.
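
The last step above, dropping the highest and lowest grade and averaging the rest, is a trimmed mean. A small sketch follows; the function name, the example phrase, and the five grades are hypothetical.

```python
def trimmed_score(grades):
    """Drop one highest and one lowest grade, then average the remainder."""
    if len(grades) < 3:
        raise ValueError("need at least three grades to trim both ends")
    kept = sorted(grades)[1:-1]
    return sum(kept) / len(kept)


# Hypothetical grades from five AMT workers for the pair
# ("hotel staff", "extremely helpful") on the -3..3 scale.
grades = [3, 2, 3, 1, 3]
score = trimmed_score(grades)
print(round(score, 3))
```

Trimming both ends limits the influence of a single careless or adversarial annotator on the score assigned to an evaluation phrase.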

33
Robustness Test (III) - Using Additional Features
Consistent with Pakes (2003) and Archak et al.
(2008)
34
Model Fit With UGC vs. Without UGC
35
Model Validation
36
Counterfactual Experiments
  • Simulate a dataset with 6000 observations based
    on the distribution of the original hotel data
  • Compute the corresponding utility for hotels in
    the simulated dataset based on our model, using
    our prior set of estimates.

37
Counterfactual Experiments (1)
1. Marginal Effects Under Different Location
Environments.
  • Goal: Examine the robustness of the rank order
    of marginal effects of the location features in
    areas with no beach, no transportation, etc.
  • Treatment:
      • Generate 6 derivative samples by assuming
        each of the 6 location features (beach,
        downtown, highway, lake, trans, external)
        to be absent, one at a time.
      • Re-compute the corresponding utility for
        hotels, with the absent location feature
        set to zero.
      • Re-estimate with the updated utilities and
        the remaining features.
  • Finding: The rank order of marginal effects for
    the remaining location features stays
    consistent with our original baseline
    estimates.
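
The treatment in this experiment can be sketched as follows: build one derivative sample per location feature by setting that feature to zero, then recompute utility. The linear utility function, the coefficient values, and the two example hotels are hypothetical stand-ins; the presentation uses its estimated structural model and a simulated dataset of 6000 observations.

```python
LOCATION_FEATURES = ["beach", "downtown", "highway", "lake", "trans", "external"]


def derivative_samples(hotels):
    """One copy of the data per location feature, with that feature zeroed."""
    return {
        absent: [{**hotel, absent: 0} for hotel in hotels]
        for absent in LOCATION_FEATURES
    }


def utility(hotel, coefficients):
    """Linear utility from location features (illustrative only)."""
    return sum(coefficients[f] * hotel[f] for f in coefficients)


# Made-up hotels and coefficients.
hotels = [
    {"beach": 1, "downtown": 0, "highway": 1, "lake": 0, "trans": 1, "external": 4},
    {"beach": 0, "downtown": 1, "highway": 0, "lake": 1, "trans": 1, "external": 9},
]
coefficients = {
    "beach": 0.5, "downtown": 0.25, "highway": 0.25,
    "lake": -0.25, "trans": 0.5, "external": 0.125,
}

samples = derivative_samples(hotels)
baseline_utility = utility(hotels[0], coefficients)
no_beach_utility = utility(samples["beach"][0], coefficients)
print(baseline_utility, no_beach_utility)
```

Each derivative sample would then be passed back through estimation with the zeroed feature dropped, so that the rank order of the remaining marginal effects can be compared with the baseline.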

38
Counterfactual Experiments (2)
2. Effects of Competition Under Different
Location Environments.
  • Goal: Examine the effect on demand of the entry
    of one local competitor under different
    location environments.
  • Treatment:
      • Consider 2 different types of location
        feature combinations:
        Type 1: beach and highway (typical
        west/south coast setting);
        Type 2: downtown, transportation and
        external amenities (typical big city
        setting).
      • Generate 2 derivative samples
        correspondingly, by assuming the unrelated
        location features in each of the two types
        to be absent (value = zero).
      • Re-compute the utility and re-estimate the
        model.
  • Finding: The demand drop in the big city
    setting is about 1.5 times that in the coastal
    setting.

39
Counterfactual Experiments (3)
3. Effects of Changes in Pricing Policy Under
Different Location Environments.
  • Goal: Examine how a price change affects hotel
    demand under different location environments.
  • Treatment:
      • Consider the same two derivative samples as
        in Experiment (2).
      • Assume a price cut of 20.
  • Finding: The increase in demand is lower in the
    big city setting than in the coastal setting.

Consumers in big cities are less sensitive to
price.
40
3 - Effects of competition under different
location environments.

Location environment                       # of Competitors increases by 1   Price Cut 20
(i) Beach, Highway                         -0.46                             1.43
(ii) Downtown, Transportation, Amenity     -0.70                             1.18
Baseline                                   -0.59                             2.31
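
The two findings reported for Experiments (2) and (3) can be checked directly against the numbers in this table. The short calculation below only uses the four values transcribed above; the variable names are illustrative, and the units of the effects are not stated in the transcript.

```python
# Effect on demand when the number of competitors increases by 1.
competition_coast = -0.46   # (i) beach and highway
competition_city = -0.70    # (ii) downtown, transportation, amenities

# Effect on demand of the price cut.
price_cut_coast = 1.43
price_cut_city = 1.18

# How much larger is the demand drop in the big city setting?
competition_ratio = competition_city / competition_coast
print(round(competition_ratio, 2))

# Is the demand gain from the price cut smaller in the big city setting?
city_less_price_sensitive = price_cut_city < price_cut_coast
print(city_less_price_sensitive)
```

The ratio of roughly 1.5 matches the competition finding, and the smaller response to the price cut in the city setting matches the statement that consumers in big cities are less sensitive to price.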

41
Value for Money Based Ranking
  • We propose a ranking approach for hotels based on
    the value for money of each hotel for consumers
    on an aggregate level.
  • This ranking idea is based on how much extra
    value consumers can obtain after paying for that
    hotel.
  • If a hotel provides a comparably higher value for
    money for consumers on an aggregate level, then
    it should appear on the top part of our ranking
    list.
  • Higher-ranked hotels provide consumers with
    higher surplus (WTP) value, and thus should be
    recommended to consumers more often.

42
Results Based on Consumer Surplus Estimation
(Best Value for Money)

43
Ranking Evaluation - User Study
(1) Comparison with blinded lists: Hide all the
titles, and conduct pairwise comparisons with
each of the other 9 competing alternatives.

44
Ranking Evaluation - User Study
New York City Los Angeles San Francisco
Orlando New Orleans Salt Lake City
45
Ranking Evaluation - User Study
Explanations from users:
  • Diversity: 30% 5-star, 40% 4-star, 30% 3-star
    (and lower)
  • Price is not the only factor: multidimensional
    preferences are taken into account.
Our reasoning: Based on qualitative opinions of
users, diversity is indeed an important factor
that improves the satisfaction of consumers. Our
economic-based ranking approach seems to
introduce diversity more naturally.
46
On-going Work
  • Personalized ranking

Derive personalized consumer surplus by
incorporating consumer demographics (e.g., age
group, travel purpose).
47
Personalized Model
  • Goal
  • Examine the interaction effect between consumer
    demographics and hotel characteristics
  • Derive personalized ranking based on individual
    utility, conditional on consumer demographics
    (i.e., age group, travel purpose).
  • Model

48
Weights of Hotel Characteristics Based on
Different Travel Purposes
Consumers with different travel purposes assign
different weight distributions on the same set of
hotel characteristics.
49
User Study
Experiment 2: Blind pair-wise comparisons by 100
anonymous AMT users. Baseline: generalized
CS-based ranking (for an average consumer). E.g.,
business trip and family trip AMT user study
results in the NYC experiment.
Conclusion: Personalized CS-based ranking is
overwhelmingly preferred.
Reasoning: It captures consumers' specific
expectations and dovetails with their real
purchase motivation.
50
Estimation Results Capture Consumers' Real
Motivation
For example, in the user study, business travelers
indicated that they prefer a quiet inner
environment and easy access to highways and
public transportation. This was fully captured in
our estimation results, see (b).
51
Conclusion
  • We empirically estimate the economic impact of
    hotel characteristics, using:
  • user-generated and crowd-sourced content
  • structural modeling, automated image
    classification, automatic text mining, and
    on-demand surveys

New Ranking System for Hotels on Travel Search
Engines
http://hyperion.stern.nyu.edu/mturk/travel.html
52
AMT demographics survey
  • Surveyed AMT workers about their place of origin
    and residence, gender, age, education, income,
    marital status, household size, and number of
    children.
  • We also asked them about the time that they spend
    every week on AMT, the amount of work that they
    complete, the payment they receive, and their
    reasons for participating on AMT.
  • To ensure consistency in results, we conducted
    the survey six times, once a month in 2009.
  • The results of the surveys suggest that AMT
    participants are representative of the
    overall Internet population.
  • We also asked them about their experience with
    visits to the online travel search engines
    Tripadvisor and Travelocity.