Mining Airfare Data to Minimize Ticket Purchase Price

1 / 42
About This Presentation
Title:

Mining Airfare Data to Minimize Ticket Purchase Price

Description:

Airline industry often use dynamic pricing strategies to ... Flights depart around holidays or weekends appear to change more. ... Weekend Boosts. Example ... – PowerPoint PPT presentation

Number of Views:336
Avg rating:3.0/5.0
Slides: 43
Provided by: bill545

less

Transcript and Presenter's Notes

Title: Mining Airfare Data to Minimize Ticket Purchase Price


1
Mining Airfare Data to Minimize Ticket Purchase
Price
  • Oren Etzioni
  • Craig Knoblock
  • Alex Yates
  • Rattapoom Tuchinda

2
Outline
  • Introduction and Motivation
  • Data Mining Methods for Airfare Data
  • Experiments
  • Conclusion

3
Introduction
  • Corporations often use complex polices to vary
    product prices over time.
  • Airline industry often use dynamic pricing
    strategies to maximize its revenue. (based on
    seasons, seats available, pricing of other
    airlines)
  • In 1988, Bell Laboratories patented a
    mathematical formula that could perform rapid
    calculations on fare problems with thousands of
    variables.

4
Introduction Cont.
  • Airlines have learned to more effectively price
    their product, instead of attacking each other to
    gain market share.
  • Today, airlines use sophisticated software to
    track their competitors pricing history and
    propose adjustments that optimize their overall
    revenue.

5
Pricing Strategies
  • Flights depart around holidays or weekends appear
    to change more.
  • 7 and 14 day advance purchase (airlines
    usually increase ticket price two weeks before
    departure date and ticket prices are at a maximum
    on departure dates.)
  • Weekend Boosts

6
Example
  • Price change over time for American Airlines
    flight 192223, LAX-BOS, departing on Jan. 2.
  • (This example shows the rapid price change in
    the days priori to the New Year)

7
Categories of airlines
  • Big players (United Airlines, American Airlines)
  • Smaller airlines which concentrate on selling
    low-price tickets ( Air Trans, Southwest)
  • Different types of tickets (economic, business,
    first class, restricted or unrestricted)

8
Is Airfare Prediction Possible?
  • Airlines have tons of historical data.
  • Airlines have many fare classes for seats on the
    same flight, use different sales channels (e.g.,
    travel agents, priceline.com)
  • Airfare prices information become increasingly
    available on the Web.
  • But some information is not available (e.g., of
    unsold seats in a particular flight).

9
Motivation
  • The goal of this paper is to learn whether to buy
    a ticket or wait at a particular time point, for
    a particular flight, given the current price
    history recorded.

10
Advisor Model
  • Consumer wants to buy a ticket.
  • Hamlet buy (this is a good price).
  • Or wait (a better price will emerge).
  • Notify consumer when price drops.

11
Will Flights sell out?
  • Watch the number of empty seats.
  • Upgrade to business class.
  • Place on another flight
  • In our experiment upgrades were sufficient

12
Flight Data
  • Airlines typically user the same flight number
    (e.g., NW17) to refer to multiple flights with
    the same route that depart at the same time on
    different dates.
  • In this paper, a particular flight is referred to
    a combination of its flight number and date.
  • In the training dataset, same class is the set of
    states with the same flight number and the same
    hours but different department dates.

13
Data Set
  • Used Fetch.coms data collection infrastructure.
  • Collected over 12,000 price observations
  • 41 day period, every 3 hours.
  • Lowest available fare for a one-week roundtrip.
  • LAX-BOS and SEA-IAD.
  • 6 airlines including American, United, etc.

14
Learning Task Formulation
  • Input price observation data (built a flight
    data collection agent that runs at a scheduled
    internal, extracts the pricing data, and stores
    the results in a database.
  • Algorithm label observations run learner--
    Hamlet
  • Output for each purchase point ?
  • buy VS wait

15
Formulation cont.
  • Want to use the latest information.
  • Run learner daily to produce new model.
  • Learner is trained on data gathered to date.
  • Learned policy is a collection of these models.
  • Our experiments evaluate the savings on the
    flights.

16
Candidate Approaches
  • By hand an expert looks at the data.
  • Time series
  • Not effective at price jumps!
  • Reinforcement learning Q-learning.
  • Used in computational finance.
  • Rule learning Ripper,

17
RIPPER
  • RIPPER (Repeated Incremental Pruning to Produce
    Error Reduction) is an efficient rule learning
    algorithm that can process noisy datasets
    containing thousands of examples.
  • First, the algorithm partition examples into a
    growing set and a pruning set. Next, a rule is
    grown by adding and pruning features. Repeat the
    process until the rule set output by RIPPER.
  • RIPPER is suitable to handle two-class learning
    problem.

18
RIPPER Cont.
  • In this study, features include price, airline,
    route, hours-before-takeoff, etc.
  • Learned 20-30 rules

19
Simple Time Series
  • Predict price using a fixed window of k price
    observations weighted by a.
  • We used a linearly increasing function for a
  • The time series model makes its decision based on
    a one-step prediction of the ticket price change
  • IF Pt1gtPt THEN buy, ELSE wait.

20
Reinforcement Learning
  • Reinforcement learning is learning what to
    do---how to map situations to actions---so as to
    maximize a numerical reward signal.
  • Supervised learning is learning from examples
    provided by some knowledgeable external
    supervisor.

21
Q-learning
  • Q learning is a method addressing the task of
    Reinforcement Learning. The main idea is to
    approximate the function assigning each
    state-action pair the highest possible payoff.
  • Q denote a transition function, mapping each
    state-action pair to a successor state
  • S denote a finite set of states
  • A denote a finite set of actions
  • R denote a reward function

22
Q-learning Cont.
  • Standard Q-Learning formula
  • s is the state resulting from taking action a in
    state s.
  • r is the discount factor for future rewards, in
    this paper, r1

23
Hand-Crafted Rule
  • A fairly simple policy consulted with travel
    agents, using it to compare with other data
    mining algorithms
  • IF ExpPrice(s0,t0)ltCurPrice AND s0gt7 days
  • THEN wait ELSE buy
  • ExpPrice(s0,t0) denotes the average over all
    MinPrice(s,t) for flights in the training set
    with that flight number, where MinPrice(s,t) is
    the minimum price of that flight over the
    interval starting from s days before departure up
    until time t

24
Hamlet
  • Stacking with three base learners
  • Ripper (e.g., Rwait)
  • Time series
  • Q-learning (e.g., Qbuy)
  • Using multiple data mining methods to combine the
    outputs
  • Output classifies each purchase point as buy
    or wait.

25
A Sample Rule Generated by Hamlet
  • IF hours-before-takeoffgt480
  • AND airlineUnited
  • AND pricegt360
  • AND TSbuy
  • AND QLwait
  • THEN wait
  • TS is the output of the Time Series algorithm,
    QL is the output of Q-Learning.

26
Ticket Purchasing Simulation
  • Real price data (collected from the Web)
  • Simulated passengers (a passenger is a person
    wanting to buy a ticket on a particular flight at
    a particular date and time)
  • Hamlet run once per day (training data all data
    gathered in the past)

27
Saving Experiments
  • Savings for a simulated passenger is the
    difference between the price of a ticket at the
    earliest purchase point and the price of the
    ticket at the point when the predictive model
    recommends buying.
  • Net savings is savings net of both losses and
    upgrade costs.

28
Effectiveness
  • HAMLETs savings that were 61.8 of optimal!
  • Savings buy immediately VS Hamlet.
  • Optimal buy at the best possible time (knowing
    the future price information)

29
Savings by Method
  • Savings over buy now.
  • Penalty for sell out upgrade cost.
  • Total ticket cost is 4,579,600.

30
Savings by Method Cont.
  • Table below shows the savings, losses, upgrade
    costs, and net savings achieved in the simulation
    by each predictive model

31
Sensitivity Analysis
  • Varying two key parameters to test the robustness
    of the results to changes in simulation
  • Change the distribution of passengers requesting
    flight tickets (e.g., uniform, linear
    decrease/increase) to check the performance on
    multiple flights in three hour interval.
  • Allow a passenger to purchase a ticket at any
    time during a three hour internal (e.g.,
    specifies fly in the morning, afternoon or
    evening)

32
Sensitivity Analysis Cont.
Legend
Time Series Q-Learning By Hand Ripper Hamlet Optim
al
33
Upgrade Penalty
  • Most algorithms avoided the costly upgrades
    almost all the time. ( Upgrades as a fraction
    of the number of test passengers 4488 of them)

34
Discussion
  • 76 of the time --- no savings possible.
  • (Prices never dropped from the earliest
    purchase point until the flight departed)
  • Uniform distribution over 21 days.
  • 33 of the passengers arrived in the last week.
  • No passengers arrived 28 days before.
  • Simulation understates possible savings!

35
Savings on Feasible Flights
  • Comparison of Net Savings (as a percent of total
    ticket price) on Feasible Flights (price saving
    is possible)

36
Related Work
  • Trading agent competition.
  • Auction strategies
  • Temporal data mining.
  • Time Series.
  • Computational finance.

37
Future Work
  • More tests are necessary international,
    multi-routes, hotels, etc.
  • Cost sensitive learning
  • Additional base learners
  • Bagging/boosting
  • Refined predictions
  • Commercialization patent, license.

38
Conclusions
  • Dynamic pricing is prevalent.
  • Price mining a-la-Hamlet is feasible.
  • Price drops can be surprisingly predictable.
  • Need additional studies and algorithms.
  • Great potential to help consumers!?

39
But
  • Airlines may introduce noise into their pricing
    patterns to fool a price miner.
  • Demand and supply of seats are uncertain. (Good
    prediction based on good assumptions)
  • Who can earn most benefit in the end?
  • (Airlines, consumers, price miner?)

40
John Nash said
Its a GAME!
41
References
  • Fast Effective Rule Induction, In A. Prieditis
    and S. Russell, editors, Proc of the 12th ICML,
    1995.
  • A General Method for making classifiers
    cost-sensitive, In Proc. of Fifth ACM SIGKDD,
    1999.
  • Reinforcement Learning An Introduction. MIT
    Press, Cambridge, MA, 1998.
  • The Analysis of Time Series An Introduction.
    Chapman and Hall, London, UK, 1989.
  • Airlines Rely on Technology To Manipulate Fare
    Structure By Scott Mccartney, Wall Street
    Jounal, November 3, 1997.

42
End
  • Thank you!!
Write a Comment
User Comments (0)